Funding Extraction LoRA
LoRA adapter for extracting structured funding metadata (funder names and award IDs) from the funding statements of academic papers. Fine-tuned from Llama 3.1 8B Instruct with SFT followed by GRPO reinforcement learning.
Training Pipeline
Stage 1: Supervised Fine-Tuning (SFT)
- Base model: meta-llama/Llama-3.1-8B-Instruct
- Data: 5,264 real + 10,124 synthetic funding statements with gold-standard funder/award labels
- Data augmentation: 50% of training examples augmented with synthetic noise (OCR-like case errors, digit/letter swaps, Unicode artifacts, XML/HTML tags, LaTeX markup) for robustness to real-world document formats
- LoRA rank: 128
- Epochs: 2
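The noise augmentation described above could look roughly like the following. This is a minimal sketch, not the training code: `OCR_SWAPS`, `add_noise`, `augment_dataset`, the 5% per-character rate, and the `<p>` wrapper are all hypothetical choices made for illustration, and Unicode artifacts and LaTeX markup are left out for brevity.

```python
import random

# Hypothetical digit/letter confusion pairs of the kind OCR tends to produce.
OCR_SWAPS = {"0": "O", "O": "0", "1": "l", "l": "1", "5": "S", "S": "5"}


def add_noise(text: str, rng: random.Random, char_rate: float = 0.05) -> str:
    """Apply OCR-like swaps and case errors, then sometimes wrap the text in a tag."""
    chars = []
    for ch in text:
        corrupt = rng.random() < char_rate
        if corrupt and ch in OCR_SWAPS:
            chars.append(OCR_SWAPS[ch])   # digit/letter swap
        elif corrupt and ch.isalpha():
            chars.append(ch.swapcase())   # case error
        else:
            chars.append(ch)
    noisy = "".join(chars)
    if rng.random() < 0.5:
        noisy = f"<p>{noisy}</p>"         # XML/HTML tag residue
    return noisy


def augment_dataset(statements: list, fraction: float = 0.5, seed: int = 0) -> list:
    """Return a copy in which `fraction` of the statements pass through add_noise."""
    rng = random.Random(seed)
    n_noisy = int(len(statements) * fraction)
    noisy_idx = set(rng.sample(range(len(statements)), n_noisy))
    return [add_noise(s, rng) if i in noisy_idx else s for i, s in enumerate(statements)]
```

The gold labels stay untouched in a scheme like this: only the input statement is corrupted, so the model learns to recover clean funder names and award IDs from noisy text.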
Stage 2: Reinforcement Learning (GRPO)
- Algorithm: Group Relative Policy Optimization (GRPO) with importance sampling loss
- Reward: Hierarchical F0.5 scoring with binary funder/award-ID matching + flat award-ID association bonus
  reward = 0.50 * funder_F0.5 + 0.50 * hierarchical_award_id_F0.5 + 0.10 * flat_award_id_F0.5
  - Funder matching: fuzzy (token_sort_ratio ≥ 0.80 threshold, Hungarian optimal assignment)
  - Award ID matching: binary exact match after normalization (strip whitespace/hyphens/slashes, uppercase)
  - Flat award-ID term: awards partial credit when the correct award ID is extracted under the wrong funder, providing gradient on funder-award association errors
- KL penalty: 0.03 (anchored to SFT checkpoint)
- Group size: 8 rollouts per prompt
- Temperature: 0.8
- Learning rate: 3e-5
- Steps: 193 batches
- Checkpoint: final (batch 193)
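The reward above can be sketched in plain Python as follows. Several parts are assumptions rather than the actual training code: `token_sort_similarity` uses `difflib` as a stand-in for a `token_sort_ratio` implementation, `match_funders` uses greedy matching in place of Hungarian optimal assignment, and the reading of "hierarchical" (an award ID counts only under a matched funder) versus "flat" (award IDs pooled across funders) is an interpretation of the description. All function names are hypothetical, and empty-prediction edge cases are simply scored as zero.

```python
import re
from difflib import SequenceMatcher


def normalize_award_id(award_id: str) -> str:
    """Strip whitespace, hyphens, and slashes, then uppercase."""
    return re.sub(r"[\s\-/]", "", award_id).upper()


def f_beta(tp: int, n_pred: int, n_gold: int, beta: float = 0.5) -> float:
    """F-beta from counts; beta < 1 weights precision above recall."""
    if tp == 0:
        return 0.0
    p, r = tp / n_pred, tp / n_gold
    return (1 + beta**2) * p * r / (beta**2 * p + r)


def token_sort_similarity(a: str, b: str) -> float:
    """difflib stand-in for a token-sort ratio, scaled to [0, 1]."""
    a_sorted = " ".join(sorted(a.lower().split()))
    b_sorted = " ".join(sorted(b.lower().split()))
    return SequenceMatcher(None, a_sorted, b_sorted).ratio()


def match_funders(pred: dict, gold: dict, threshold: float = 0.80) -> list:
    """Greedy one-to-one matching (a simplification, not Hungarian assignment)."""
    scored = sorted(
        ((token_sort_similarity(p, g), p, g) for p in pred for g in gold),
        reverse=True,
    )
    pairs, used_pred, used_gold = [], set(), set()
    for score, p, g in scored:
        if score >= threshold and p not in used_pred and g not in used_gold:
            pairs.append((p, g))
            used_pred.add(p)
            used_gold.add(g)
    return pairs


def normalize(funders: dict) -> dict:
    """Map funder name -> set of normalized award IDs."""
    return {name: {normalize_award_id(a) for a in ids} for name, ids in funders.items()}


def reward(pred: dict, gold: dict) -> float:
    """pred and gold map funder name -> list of award ID strings."""
    pred, gold = normalize(pred), normalize(gold)
    pairs = match_funders(pred, gold)
    funder_f = f_beta(len(pairs), len(pred), len(gold))
    # Hierarchical: an ID is a true positive only under a matched funder pair.
    n_pred = sum(len(ids) for ids in pred.values())
    n_gold = sum(len(ids) for ids in gold.values())
    hier_tp = sum(len(pred[p] & gold[g]) for p, g in pairs)
    # Flat: IDs are pooled, ignoring which funder they were attached to.
    flat_pred = set().union(*pred.values())
    flat_gold = set().union(*gold.values())
    flat_tp = len(flat_pred & flat_gold)
    return (0.50 * funder_f
            + 0.50 * f_beta(hier_tp, n_pred, n_gold)
            + 0.10 * f_beta(flat_tp, len(flat_pred), len(flat_gold)))
```

Under these assumptions, a prediction with both funders right but the two award IDs swapped between them scores 0.50 + 0.00 + 0.10 = 0.60 rather than 0.50, which is the partial credit the flat term is meant to supply. A perfect prediction scores 1.10, since the three weights sum to 1.10.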
Evaluation Results
arxiv_test.jsonl (300 held-out examples)
Permissive (partial_ratio + token_set, no damping)
| Field | P | R | F1 | F0.5 | F1.5 |
|---|---|---|---|---|---|
| Funder | 0.9365 | 0.9408 | 0.9386 | 0.9374 | 0.9395 |
| Award ID | 0.9129 | 0.8968 | 0.9048 | 0.9096 | 0.9017 |
| Scheme | 0.6414 | 0.7686 | 0.6992 | 0.6633 | 0.7244 |
| Title | 0.6774 | 0.4375 | 0.5316 | 0.6105 | 0.4910 |
Balanced (length-damped + acronym detection)
| Field | P | R | F1 | F0.5 | F1.5 |
|---|---|---|---|---|---|
| Funder | 0.8932 | 0.9071 | 0.9001 | 0.8960 | 0.9028 |
| Award ID | 0.8859 | 0.8702 | 0.8780 | 0.8827 | 0.8750 |
| Scheme | 0.5931 | 0.7107 | 0.6466 | 0.6134 | 0.6699 |
| Title | 0.6774 | 0.4375 | 0.5316 | 0.6105 | 0.4910 |
Strict (token_sort_ratio only)
| Field | P | R | F1 | F0.5 | F1.5 |
|---|---|---|---|---|---|
| Funder | 0.8826 | 0.8962 | 0.8894 | 0.8853 | 0.8920 |
| Award ID | 0.8829 | 0.8673 | 0.8750 | 0.8797 | 0.8720 |
| Scheme | 0.5448 | 0.6529 | 0.5940 | 0.5635 | 0.6153 |
| Title | 0.6129 | 0.3958 | 0.4810 | 0.5523 | 0.4442 |
synthetic_edges test (1,288 augmented/adversarial examples)
Permissive
| Field | P | R | F1 | F0.5 | F1.5 |
|---|---|---|---|---|---|
| Funder | 0.9184 | 0.9343 | 0.9263 | 0.9215 | 0.9293 |
| Award ID | 0.8556 | 0.8643 | 0.8600 | 0.8573 | 0.8616 |
| Scheme | 0.7271 | 0.6774 | 0.7014 | 0.7166 | 0.6919 |
| Title | 0.7066 | 0.3430 | 0.4618 | 0.5830 | 0.4075 |
Balanced
| Field | P | R | F1 | F0.5 | F1.5 |
|---|---|---|---|---|---|
| Funder | 0.8922 | 0.9079 | 0.8999 | 0.8953 | 0.9030 |
| Award ID | 0.8434 | 0.8520 | 0.8477 | 0.8451 | 0.8493 |
| Scheme | 0.6604 | 0.6152 | 0.6370 | 0.6509 | 0.6285 |
| Title | 0.6287 | 0.3052 | 0.4110 | 0.5188 | 0.3626 |
Strict
| Field | P | R | F1 | F0.5 | F1.5 |
|---|---|---|---|---|---|
| Funder | 0.8746 | 0.8895 | 0.8820 | 0.8776 | 0.8849 |
| Award ID | 0.8366 | 0.8452 | 0.8409 | 0.8383 | 0.8425 |
| Scheme | 0.6072 | 0.5656 | 0.5857 | 0.5984 | 0.5778 |
| Title | 0.6108 | 0.2965 | 0.3992 | 0.5040 | 0.3523 |
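The F1, F0.5, and F1.5 columns follow from precision and recall through the standard F-beta formula, where beta < 1 favors precision and beta > 1 favors recall. As a quick check, the short sketch below (the helper name `f_beta` is illustrative) recomputes the Funder row of the permissive arxiv_test.jsonl table from its rounded P and R values.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Funder row, arxiv_test.jsonl, permissive matching.
p, r = 0.9365, 0.9408
for beta in (1.0, 0.5, 1.5):
    print(f"F{beta:g} = {f_beta(p, r, beta):.4f}")
# F1 = 0.9386
# F0.5 = 0.9374
# F1.5 = 0.9395
```

Because the inputs here are already rounded to four decimals, other rows may differ from the table in the last digit.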
Usage
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "adambuttrick/funding-extraction-llama-3.1-8b-instruct-grpo-stepfinal")
tokenizer = AutoTokenizer.from_pretrained("adambuttrick/funding-extraction-llama-3.1-8b-instruct-grpo-stepfinal")

prompt = """Extract funding information from the following statement:
This work was supported by the National Science Foundation under grant DMS-1613002 and by the NIH (R01-AI123456)."""

messages = [
    {"role": "system", "content": "You are an expert at extracting structured funding metadata from academic papers. Given a funding statement, extract all funders and their associated awards. Return a JSON array of funder objects. Each funder has:\n- \"funder_name\": string or null\n- \"awards\": array of objects with \"award_ids\" (array of strings), \"funding_scheme\" (array of strings), and \"award_title\" (array of strings)\nReturn ONLY the JSON array, no other text."},
    {"role": "user", "content": prompt},
]

# Render the chat template to tensors; return_dict=True also yields the attention mask.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True)
# Greedy decoding: temperature has no effect when do_sample=False, so it is omitted.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
Output Format
[
  {
    "funder_name": "National Science Foundation",
    "awards": [
      {
        "award_ids": ["DMS-1613002"],
        "funding_scheme": [],
        "award_title": []
      }
    ]
  },
  {
    "funder_name": "NIH",
    "awards": [
      {
        "award_ids": ["R01-AI123456"],
        "funding_scheme": [],
        "award_title": []
      }
    ]
  }
]
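Generated text is not guaranteed to be valid JSON, so downstream code may want to parse and check it before use. The following is a minimal sketch using only the standard library. `parse_funders` and `funder_award_pairs` are hypothetical helpers, and the checks cover only the fields shown in the output format above.

```python
import json


def parse_funders(raw: str) -> list:
    """Parse the model's reply into a list of funder dicts, or raise ValueError."""
    # Tolerate stray text around the array by slicing from the first "[" to the last "]".
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model output")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError subclasses ValueError
    for funder in data:
        if not isinstance(funder, dict) or "funder_name" not in funder:
            raise ValueError(f"malformed funder entry: {funder!r}")
        for award in funder.get("awards") or []:
            if not isinstance(award, dict):
                raise ValueError(f"malformed award entry: {award!r}")
            for key in ("award_ids", "funding_scheme", "award_title"):
                if not isinstance(award.get(key, []), list):
                    raise ValueError(f"{key} must be an array")
    return data


def funder_award_pairs(funders: list) -> list:
    """Flatten the nested structure into (funder_name, award_id) tuples."""
    return [
        (funder["funder_name"], award_id)
        for funder in funders
        for award in funder.get("awards") or []
        for award_id in award.get("award_ids", [])
    ]
```

Applied to the example output above, `funder_award_pairs(parse_funders(text))` would give `[("National Science Foundation", "DMS-1613002"), ("NIH", "R01-AI123456")]`.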
Model tree for cometadata/funding-extraction-llama-3.1-8b-instruct-artifact-data-mix-grpo-mixed-reward
- Base model: meta-llama/Llama-3.1-8B
- Finetuned: meta-llama/Llama-3.1-8B-Instruct