---
language: en
license: apache-2.0
tags:
  - summarization
  - t5
  - academic
  - arxiv
  - finetuned
datasets:
  - arxiv
metrics:
  - rouge
widget:
  - text: "summarize: We present a novel approach to neural network optimization using adaptive learning rates. Our method dynamically adjusts the learning rate based on gradient statistics during training. Experiments on ImageNet show 15% improvement over standard SGD with minimal computational overhead."
    example_title: "Example 1"
---

# T5-Small Fine-tuned for Academic Paper Summarization

## Model Description

This is a **T5-small** model fine-tuned on 30,000 arXiv papers for academic text summarization.

### Performance

Relative improvement over the base T5-small model:

- **ROUGE-1:** +28.29%
- **ROUGE-2:** +46.45% ⭐
- **ROUGE-L:** +27.85%

Additional metrics:

- **BERTScore:** +2.14%
- **BARTScore:** +6.62%
- **FactCC:** +28.24%

**Overall:** 6 of 8 evaluated metrics improved (75%).

## Intended Use

This model is specifically designed for:
- Summarizing academic papers
- Generating abstracts from research articles
- Scientific document summarization
- Technical content summarization

## How to Use

### Quick Start
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load model
tokenizer = T5Tokenizer.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")
model = T5ForConditionalGeneration.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")

# Prepare input
text = "summarize: Your academic paper text here..."
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

# Generate summary
outputs = model.generate(
    **inputs,
    max_length=128,
    num_beams=4,
    early_stopping=True
)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
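
For more than one document, the Quick Start steps can be wrapped in a small batch helper. The sketch below is illustrative rather than part of the released code: `build_prompt` and `summarize_batch` are hypothetical names, and the whitespace cleanup is an assumption about reasonable preprocessing, not something the model was confirmed to be trained with.

```python
import re

# Task prefix used in the Quick Start and widget examples
PREFIX = "summarize: "


def build_prompt(text: str) -> str:
    """Collapse runs of whitespace and prepend the task prefix exactly once."""
    cleaned = re.sub(r"\s+", " ", text).strip()
    if cleaned.startswith(PREFIX.strip()):
        return cleaned
    return PREFIX + cleaned


def summarize_batch(texts, tokenizer, model,
                    max_input_length=512, max_target_length=128):
    """Summarize a list of texts in one padded batch (hypothetical helper)."""
    prompts = [build_prompt(t) for t in texts]
    inputs = tokenizer(
        prompts,
        return_tensors="pt",
        padding=True,            # pad to the longest prompt in the batch
        truncation=True,         # cut anything past max_input_length
        max_length=max_input_length,
    )
    outputs = model.generate(
        **inputs,
        max_length=max_target_length,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the `tokenizer` and `model` loaded as in the Quick Start, `summarize_batch(["First paper text...", "Second paper text..."], tokenizer, model)` returns one summary string per input.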

### Inference API
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/Bashaarat1/t5-small-arxiv-summarizer"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    # POST the prompt to the hosted model and return the parsed JSON response
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({"inputs": "summarize: Your paper text..."})
print(output)
```
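
The JSON returned by `query` still has to be unpacked. The helper below is a minimal sketch that assumes the response is either a list of dicts carrying a `summary_text` or `generated_text` field, or a dict with an `error` field (for example while the model is still loading). `extract_summary` is a hypothetical name, and the exact response shape should be checked against the current Inference API documentation.

```python
def extract_summary(payload):
    """Return the generated text from an Inference API response, or raise."""
    # Assumed error shape: {"error": "..."}
    if isinstance(payload, dict) and "error" in payload:
        raise RuntimeError(f"Inference API error: {payload['error']}")

    # Assumed success shape: [{"summary_text": "..."}] or [{"generated_text": "..."}]
    if isinstance(payload, list) and payload and isinstance(payload[0], dict):
        first = payload[0]
        for key in ("summary_text", "generated_text"):
            if key in first:
                return first[key]

    raise ValueError(f"Unexpected response shape: {payload!r}")
```

Usage: `summary = extract_summary(query({"inputs": "summarize: Your paper text..."}))`.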

## Training Details

### Training Data

- **Dataset:** arXiv papers
- **Size:** 30,000 training samples
- **Validation:** 2,000 samples
- **Test:** 1,000 samples

### Training Procedure

- **Base Model:** t5-small (60M parameters)
- **Epochs:** 3
- **Batch Size:** 8 (effective: 32 with gradient accumulation)
- **Learning Rate:** 5e-5
- **Optimizer:** AdamW (8-bit)
- **Hardware:** NVIDIA A100-80GB
- **Training Time:** ~3 hours
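
As a rough check on these numbers, the arithmetic below estimates the gradient accumulation steps and the number of optimizer updates. It assumes a single GPU, and the warmup value is the one listed under Hyperparameters. The exact step count depends on how the trainer handles the last partial batch, so treat the totals as estimates.

```python
import math

train_samples = 30_000   # training set size
per_device_batch = 8     # batch size per forward pass
effective_batch = 32     # batch size per optimizer update
epochs = 3
warmup_steps = 500       # from the Hyperparameters section

# Assumption: one GPU, so accumulation alone bridges 8 -> 32
grad_accum_steps = effective_batch // per_device_batch

# Estimated optimizer updates per epoch and overall
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

# Share of training spent in learning-rate warmup
warmup_fraction = warmup_steps / total_steps

print(f"gradient accumulation steps: {grad_accum_steps}")
print(f"optimizer steps per epoch:   ~{steps_per_epoch}")
print(f"total optimizer steps:       ~{total_steps}")
print(f"warmup share of training:    ~{warmup_fraction:.1%}")
```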

### Hyperparameters
```yaml
max_input_length: 512
max_target_length: 128
num_beams: 4
learning_rate: 5e-5
warmup_steps: 500
weight_decay: 0.01
```
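
One way these values could map onto `Seq2SeqTrainingArguments` from `transformers` is sketched below. This is a reconstruction from the numbers on this card, not the original training script. The `optim` value assumes the bitsandbytes 8-bit AdamW (the card only says "AdamW (8-bit)"), the accumulation steps assume a single GPU, and `build_training_args` and the output directory are hypothetical. `max_input_length` is not a training argument; it would apply at tokenization time.

```python
# Keyword arguments reconstructed from the Training Procedure and Hyperparameters sections
TRAINING_KWARGS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,   # 8 x 4 = 32 effective (single GPU assumed)
    "learning_rate": 5e-5,
    "warmup_steps": 500,
    "weight_decay": 0.01,
    "optim": "adamw_bnb_8bit",          # assumption: bitsandbytes 8-bit AdamW
    "predict_with_generate": True,      # generate summaries during evaluation
    "generation_max_length": 128,       # max_target_length
    "generation_num_beams": 4,          # num_beams
}


def build_training_args(output_dir="t5-small-arxiv-summarizer"):
    """Build the arguments object; transformers is imported lazily (hypothetical helper)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```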

## Evaluation

Evaluated on 1,000 arXiv test papers:

| Metric | Base T5-small | Fine-tuned | Improvement |
|--------|--------------|------------|-------------|
| ROUGE-1 | 0.2200 | 0.2823 | **+28.29%** |
| ROUGE-2 | 0.0564 | 0.0826 | **+46.45%** |
| ROUGE-L | 0.1405 | 0.1796 | **+27.85%** |
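
The Improvement column is a relative change, computed as (fine-tuned − base) / base. The short sketch below recomputes it from the rounded scores in the table. Because those scores are rounded to four decimals, the recomputed values can differ from the reported ones by a few hundredths of a percentage point.

```python
def relative_improvement(base: float, tuned: float) -> float:
    """Relative change from base to tuned, in percent."""
    return (tuned - base) / base * 100.0


# (base, fine-tuned) scores copied from the table above
scores = {
    "ROUGE-1": (0.2200, 0.2823),
    "ROUGE-2": (0.0564, 0.0826),
    "ROUGE-L": (0.1405, 0.1796),
}

for name, (base, tuned) in scores.items():
    print(f"{name}: {relative_improvement(base, tuned):+.2f}%")
```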

## Limitations

- Optimized for academic/scientific text
- May not perform as well on general-domain text
- Maximum input length: 512 tokens (longer inputs are truncated, so text past that point is ignored)
- Works best with English text
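
A common workaround for the 512-token limit is to split a long paper into overlapping token windows, summarize each window, and join the results. The sketch below shows that idea. It is not a method evaluated for this model. `chunk_token_ids` and `summarize_long` are hypothetical helpers, and the default window of 500 tokens is an assumption that leaves room for the `summarize: ` prefix and the end-of-sequence token.

```python
def chunk_token_ids(token_ids, max_len=500, overlap=50):
    """Split a list of token ids into windows of at most max_len with some overlap."""
    if overlap >= max_len:
        raise ValueError("overlap must be smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the text
    return chunks


def summarize_long(text, tokenizer, model, max_len=500, overlap=50):
    """Summarize each window separately and join the partial summaries."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    partial = []
    for chunk in chunk_token_ids(ids, max_len, overlap):
        prompt = "summarize: " + tokenizer.decode(chunk)
        inputs = tokenizer(prompt, return_tensors="pt",
                           max_length=512, truncation=True)
        out = model.generate(**inputs, max_length=128,
                             num_beams=4, early_stopping=True)
        partial.append(tokenizer.decode(out[0], skip_special_tokens=True))
    return " ".join(partial)
```

Joined partial summaries may repeat content or lose the paper's overall thread, so the output is worth reviewing before use.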

## Citation

If you use this model, please cite:
```bibtex
@misc{t5-arxiv-summarizer,
  author = {Bashaarat1},
  title = {T5-Small Fine-tuned for Academic Summarization},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Bashaarat1/t5-small-arxiv-summarizer}}
}
```

## License

This model is released under the Apache 2.0 License, the same license as the T5-small base model.

## Contact

For questions or issues, please open an issue on the model repository.

---

**Model trained and uploaded:** December 2024