ccdv/arxiv-summarization
How to use mehdielg/primera-billsum-arxiv with PEFT:
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM
base_model = AutoModelForSeq2SeqLM.from_pretrained("allenai/PRIMERA")
model = PeftModel.from_pretrained(base_model, "mehdielg/primera-billsum-arxiv")

How to use mehdielg/primera-billsum-arxiv with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0"'
from transformers import pipeline
pipe = pipeline("summarization", model="mehdielg/primera-billsum-arxiv")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("mehdielg/primera-billsum-arxiv", dtype="auto")

A LoRA adapter for allenai/PRIMERA trained via 2-stage sequential chain fine-tuning: BillSum → arXiv. Starts from a BillSum-adapted PRIMERA and continues training on arXiv scientific papers.
Note: Earlier docs called this "QLoRA". 4-bit quantization caused NaN gradients with LED/Longformer in-place attention ops, so quantization was disabled. Training is standard LoRA in bf16.
| Stage | Dataset |
|---|---|
| 1 | BillSum |
| 2 | arXiv |
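The two-stage chain in the table can be sketched roughly as follows. This is a minimal illustration, not the training script used for this adapter: the `adapters/stageN-name` directory layout, the `Stage` tuple, and both helper functions are hypothetical, and it assumes each stage resumes from the previous stage's saved LoRA adapter rather than from merged weights.

```python
from typing import List, NamedTuple, Optional


class Stage(NamedTuple):
    number: int
    dataset: str
    init_adapter: Optional[str]  # adapter to resume from; None means fresh LoRA weights


def build_chain(datasets: List[str], adapter_dir: str = "adapters") -> List[Stage]:
    """Plan a sequential chain: each stage resumes from the adapter saved by the previous one."""
    stages = []
    previous = None
    for number, name in enumerate(datasets, start=1):
        stages.append(Stage(number, name, previous))
        # Hypothetical output path where this stage's adapter would be saved.
        previous = f"{adapter_dir}/stage{number}-{name.lower()}"
    return stages


def load_for_stage(stage: Stage, base_model_id: str = "allenai/PRIMERA"):
    """Load the base model and, if a previous adapter exists, attach it in trainable mode."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM

    base = AutoModelForSeq2SeqLM.from_pretrained(base_model_id, torch_dtype=torch.bfloat16)
    if stage.init_adapter is None:
        return base  # the caller attaches a fresh LoRA adapter with its own config
    # is_trainable=True keeps the loaded LoRA weights unfrozen so training can continue.
    return PeftModel.from_pretrained(base, stage.init_adapter, is_trainable=True)


chain = build_chain(["BillSum", "arXiv"])
for stage in chain:
    print(stage.number, stage.dataset, "resumes from:", stage.init_adapter)
```

Only `build_chain` runs without downloading anything; `load_for_stage` fetches the base model and expects the previous stage's adapter to exist on disk.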
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel
import torch
tokenizer = AutoTokenizer.from_pretrained("xNoper/primera-billsum-arxiv")
base = AutoModelForSeq2SeqLM.from_pretrained(
    "allenai/PRIMERA", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "xNoper/primera-billsum-arxiv")
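Once the adapter is loaded, generating a summary might look like the sketch below. It assumes `model` and `tokenizer` come from the loading snippet above, that the tokenizer defines a `<doc-sep>` token, and that global attention belongs on the first token and on each document separator, as is usual for LED-style models. The helper names, the 4096-token input limit, and the generation settings are illustrative assumptions, not values taken from this model card.

```python
def global_attention_positions(input_ids, global_token_ids):
    """Return one 0/1 mask per sequence: 1 at position 0 and at every token in global_token_ids."""
    masks = []
    for seq in input_ids:
        mask = [1 if tok in global_token_ids else 0 for tok in seq]
        if mask:
            mask[0] = 1  # the first token always gets global attention
        masks.append(mask)
    return masks


def summarize(model, tokenizer, text, max_input_tokens=4096, max_new_tokens=256):
    """Summarize one document; hypothetical helper with assumed length limits."""
    import torch  # imported lazily so the mask helper stays dependency-free

    device = next(model.parameters()).device
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_input_tokens)
    docsep_id = tokenizer.convert_tokens_to_ids("<doc-sep>")  # assumed separator token
    mask = global_attention_positions(enc["input_ids"].tolist(), {docsep_id})
    inputs = {name: tensor.to(device) for name, tensor in enc.items()}
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            global_attention_mask=torch.tensor(mask, device=device),
            max_new_tokens=max_new_tokens,
            num_beams=4,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would then be `summarize(model, tokenizer, paper_text)`, where `paper_text` is the full text of an article as a single string.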
If you use this model, please also cite the underlying base model:
@inproceedings{xiao-etal-2022-primera,
title = "{PRIMERA}: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization",
author = "Xiao, Wen and Beltagy, Iz and Carenini, Giuseppe and Cohan, Arman",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
year = "2022",
}