---
base_model: allenai/PRIMERA
library_name: peft
license: apache-2.0
language:
- en
tags:
- base_model:adapter:allenai/PRIMERA
- lora
- transformers
- summarization
- primera
- chain-finetuning
datasets:
- billsum
- ccdv/arxiv-summarization
metrics:
- rouge
pipeline_tag: summarization
---

# PRIMERA-BillSum-arXiv (2-Stage Chain LoRA, bf16)

A LoRA adapter for [allenai/PRIMERA](https://huggingface.co/allenai/PRIMERA) trained via **2-stage sequential chain fine-tuning**: BillSum → arXiv. Training starts from a BillSum-adapted PRIMERA and continues on arXiv scientific papers.

## Model Details

- **Base model:** [allenai/PRIMERA](https://huggingface.co/allenai/PRIMERA)
- **Method:** LoRA (Low-Rank Adaptation), bf16 precision (no quantization)
- **Chain order:** BillSum → arXiv
- **Language:** English

> **Note:** Earlier documentation described this adapter as "QLoRA". 4-bit quantization caused NaN gradients with the in-place attention ops in LED/Longformer, so quantization was disabled. Training used standard LoRA in bf16.

### Training Stages

| Stage | Dataset |
|-------|---------|
| 1 | BillSum |
| 2 | arXiv |
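
The chain in the table can be sketched as a plan in which each stage starts from the output of the previous one. This is only an illustration: the card does not say whether stage 2 kept training the stage-1 LoRA weights or merged them into the base model first, so the sketch assumes the former. The `stage{n}-adapter` names and both helper functions are hypothetical; the dataset IDs come from this card's metadata.

```python
def chain_plan(base_model_id, datasets):
    """Return one entry per stage; each stage starts from the previous output."""
    plan = []
    init_from = base_model_id
    for stage, dataset in enumerate(datasets, start=1):
        output = f"stage{stage}-adapter"  # hypothetical output directory name
        plan.append(
            {"stage": stage, "dataset": dataset, "init_from": init_from, "output": output}
        )
        init_from = output  # the next stage continues from this adapter
    return plan


def load_adapter_for_next_stage(base_model_id, previous_adapter):
    """Load a previous stage's adapter with its LoRA weights left trainable."""
    # Imports are local so that building the plan does not require these libraries.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM

    base = AutoModelForSeq2SeqLM.from_pretrained(base_model_id, torch_dtype=torch.bfloat16)
    # is_trainable=True keeps the loaded LoRA weights unfrozen (assumed setup).
    return PeftModel.from_pretrained(base, previous_adapter, is_trainable=True)


plan = chain_plan("allenai/PRIMERA", ["billsum", "ccdv/arxiv-summarization"])
```

Under these assumptions, stage 2 would call `load_adapter_for_next_stage("allenai/PRIMERA", plan[0]["output"])` before training on the arXiv data.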

### Hyperparameters

- **LoRA rank (r):** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Precision:** bf16 (no quantization)
- **Target modules (LED/Longformer):**
  - Encoder: query, key, value, query_global, key_global, value_global, output
  - Decoder: q_proj, k_proj, v_proj, out_proj
  - Feed-forward: fc1, fc2

## Usage

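```python
import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained("xNoper/primera-billsum-arxiv")

# Load the base model in bf16, matching the training precision.
base = AutoModelForSeq2SeqLM.from_pretrained(
    "allenai/PRIMERA", torch_dtype=torch.bfloat16
)

# Attach the LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base, "xNoper/primera-billsum-arxiv")
model.eval()  # inference mode: disables dropout
```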
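
Once `model` and `tokenizer` are loaded as above, a summary could be generated along the following lines. This is a hedged sketch: the helper names, the 4096-token input limit, the beam size, and the output length are illustrative assumptions, not settings taken from this card. It follows the usual LED convention of putting global attention on the first token only.

```python
import torch


def build_global_attention_mask(input_ids: torch.Tensor) -> torch.Tensor:
    """Mark the first token of every sequence for global attention."""
    mask = torch.zeros_like(input_ids)
    mask[:, 0] = 1
    return mask


def summarize(model, tokenizer, text, max_input_tokens=4096, max_new_tokens=256):
    """Generate one summary for one input document (illustrative settings)."""
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=max_input_tokens
    )
    # Move the inputs to whatever device the model weights live on.
    device = next(model.parameters()).device
    input_ids = inputs["input_ids"].to(device)
    attention_mask = inputs["attention_mask"].to(device)

    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            global_attention_mask=build_global_attention_mask(input_ids),
            max_new_tokens=max_new_tokens,
            num_beams=4,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would then look like `summary = summarize(model, tokenizer, paper_text)`, where `paper_text` is a placeholder for the document to summarize.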

## Citation

If you use this model, please also cite the underlying base model:

```bibtex
@inproceedings{xiao-etal-2022-primera,
    title = "{PRIMERA}: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization",
    author = "Xiao, Wen and Beltagy, Iz and Carenini, Giuseppe and Cohan, Arman",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
    year = "2022",
}
```