---
language: en
license: apache-2.0
tags:
- summarization
- t5
- academic
- arxiv
- finetuned
datasets:
- arxiv
metrics:
- rouge
widget:
- text: "summarize: We present a novel approach to neural network optimization using adaptive learning rates. Our method dynamically adjusts the learning rate based on gradient statistics during training. Experiments on ImageNet show 15% improvement over standard SGD with minimal computational overhead."
  example_title: "Example 1"
---

# T5-Small Fine-tuned for Academic Paper Summarization

## Model Description

This is a **T5-small** model fine-tuned on 30,000 arXiv papers for academic text summarization.

### Performance

Compared to base T5-small:

- **ROUGE-1:** +28.29% improvement
- **ROUGE-2:** +46.45% improvement ⭐
- **ROUGE-L:** +27.85% improvement

Additional metrics:

- **BERTScore:** +2.14% improvement
- **BARTScore:** +6.62% improvement
- **FactCC:** +28.24% improvement

**Overall:** 6/8 metrics improved (75% win rate)

## Intended Use

This model is specifically designed for:

- Summarizing academic papers
- Generating abstracts from research articles
- Scientific document summarization
- Technical content summarization

## How to Use

### Quick Start

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load model
tokenizer = T5Tokenizer.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")
model = T5ForConditionalGeneration.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")

# Prepare input (the "summarize: " prefix is required)
text = "summarize: Your academic paper text here..."
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

# Generate summary
outputs = model.generate(
    **inputs,
    max_length=128,
    num_beams=4,
    early_stopping=True
)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```

### Inference API

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/Bashaarat1/t5-small-arxiv-summarizer"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({"inputs": "summarize: Your paper text..."})
print(output)
```

## Training Details

### Training Data

- **Dataset:** arXiv papers
- **Size:** 30,000 training samples
- **Validation:** 2,000 samples
- **Test:** 1,000 samples

### Training Procedure

- **Base Model:** t5-small (60M parameters)
- **Epochs:** 3
- **Batch Size:** 8 (effective: 32 with gradient accumulation)
- **Learning Rate:** 5e-5
- **Optimizer:** AdamW (8-bit)
- **Hardware:** NVIDIA A100-80GB
- **Training Time:** ~3 hours

### Hyperparameters

```yaml
- max_input_length: 512
- max_target_length: 128
- num_beams: 4
- learning_rate: 5e-5
- warmup_steps: 500
- weight_decay: 0.01
```

## Evaluation

Evaluated on 1,000 arXiv test papers:

| Metric | Base T5-small | Fine-tuned | Improvement |
|--------|--------------|------------|-------------|
| ROUGE-1 | 0.2200 | 0.2823 | **+28.29%** |
| ROUGE-2 | 0.0564 | 0.0826 | **+46.45%** |
| ROUGE-L | 0.1405 | 0.1796 | **+27.85%** |

## Limitations

- Optimized for academic/scientific text
- May not perform as well on general-domain text
- Maximum input length: 512 tokens (longer inputs are truncated)
- Works best with English text

## Citation

If you use this model, please cite:

```bibtex
@misc{t5-arxiv-summarizer,
  author = {Bashaarat1},
  title = {T5-Small Fine-tuned for Academic Summarization},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Bashaarat1/t5-small-arxiv-summarizer}}
}
```

## License

This model is released under the Apache 2.0 License (same as the T5-small base model).

## Contact

For questions or issues, please open an issue on the model repository.

---

**Model trained and uploaded:** December 2024
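
The Improvement column in the Evaluation table appears to be a relative change over the base score. The card does not state the formula, so the following is a minimal sketch under that assumption; `SCORES`, `relative_improvement`, and `improvements` are illustrative names, not part of the model repository. Because the table scores are rounded to four decimals, the recomputed ROUGE-1 and ROUGE-L figures can differ from the reported ones by a few hundredths of a percentage point.

```python
# (base, fine-tuned) scores copied from the Evaluation table; values there are rounded.
SCORES = {
    "ROUGE-1": (0.2200, 0.2823),
    "ROUGE-2": (0.0564, 0.0826),
    "ROUGE-L": (0.1405, 0.1796),
}


def relative_improvement(base: float, tuned: float) -> float:
    """Return the percentage change from the base score to the fine-tuned score.

    Assumed definition: (tuned - base) / base * 100.
    """
    if base == 0:
        raise ValueError("base score must be non-zero")
    return (tuned - base) / base * 100.0


def improvements(scores: dict) -> dict:
    """Map each metric name to its relative improvement in percent."""
    return {name: relative_improvement(b, t) for name, (b, t) in scores.items()}


if __name__ == "__main__":
    for name, pct in improvements(SCORES).items():
        print(f"{name}: {pct:+.2f}%")
```

The same helper could be applied to the additional metrics (BERTScore, BARTScore, FactCC) if their raw base and fine-tuned scores were available; the card reports only their percentage changes.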