Funding Parsing LoRA — Llama 3.1 8B Instruct + GRPO

A LoRA adapter for extracting structured funding information from funding statements in scholarly works.

Model Details

Base model: meta-llama/Llama-3.1-8B-Instruct
Method: Supervised fine-tuning (SFT) followed by reinforcement learning (GRPO)
Task: Given a funding statement, extract a structured JSON array of funders with their award IDs
Training data: cometadata/funding-extraction-sft-data

Training Pipeline

Stage 1: Supervised Fine-Tuning

Data: cometadata/funding-extraction-sft-data, 1,316 real examples (train.jsonl) + 2,531 synthetic examples (synthetic.jsonl), with the synthetic set upsampled 2x (1,316 + 2 × 2,531 = 6,378 total)
Epochs: 2
LoRA rank: 64
LoRA alpha: 32
Learning rate: ~2.86e-4
Batch size: 128
Max sequence length: 4,096 tokens
LR schedule: Linear decay
Renderer: llama3 (Llama 3.1 Instruct chat template)
Train on: Last assistant message only
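
The arithmetic behind the SFT settings above can be checked with a short sketch. The constant names are illustrative, not taken from the training code. The step count assumes the final partial batch of each epoch is kept, and the scaling factor assumes the conventional LoRA definition alpha / rank, which the training platform may or may not apply in exactly this form.

```python
import math

# Values copied from the Stage 1 settings; names are illustrative only.
REAL_EXAMPLES = 1_316
SYNTHETIC_EXAMPLES = 2_531
SYNTHETIC_UPSAMPLE = 2
BATCH_SIZE = 128
EPOCHS = 2
LORA_RANK = 64
LORA_ALPHA = 32

# Real examples are used once, synthetic examples are repeated twice.
total_examples = REAL_EXAMPLES + SYNTHETIC_UPSAMPLE * SYNTHETIC_EXAMPLES

# Assumption: the last, partially filled batch of each epoch is not dropped.
steps_per_epoch = math.ceil(total_examples / BATCH_SIZE)
total_steps = EPOCHS * steps_per_epoch

# Assumption: conventional LoRA scaling, alpha divided by rank.
lora_scaling = LORA_ALPHA / LORA_RANK

print(total_examples, steps_per_epoch, total_steps, lora_scaling)
```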

Stage 2: Reinforcement Learning (GRPO)

Algorithm: Group Relative Policy Optimization (GRPO)
Starting checkpoint: SFT final weights
Data: 3,462 train / 385 eval examples
Learning rate: 3e-5
Temperature: 0.8
Batch size: 16, Group size: 8
KL penalty: 0.03 (against SFT reference policy)
Best checkpoint: Step 130 / 217 (selected by eval reward)
Eval reward at best step: 0.961
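
GRPO scores each sampled completion relative to the other completions drawn for the same prompt (here, groups of 8). The sketch below shows one common formulation of that group-relative advantage: mean-centering with optional standard-deviation scaling. The function name, the `eps` constant, and the choice to normalize by the standard deviation are assumptions for illustration; the exact form used in this training run is not stated here.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, normalize_std=True, eps=1e-6):
    """Return one advantage per completion in a single prompt's group.

    Illustrative only: centers rewards on the group mean and, optionally,
    divides by the group's population standard deviation.
    """
    mu = mean(rewards)
    centered = [r - mu for r in rewards]
    if not normalize_std:
        return centered
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [c / (sigma + eps) for c in centered]


# One group of 8 sampled completions for a single prompt (made-up rewards).
group_rewards = [1.0, 0.5, 0.75, 1.0, 0.0, 0.5, 1.0, 0.25]
advantages = group_relative_advantages(group_rewards)

# With 3,462 training prompts and a batch size of 16, one pass over the
# data takes ceil(3462 / 16) steps, which matches the 217 steps reported.
steps_one_pass = math.ceil(3462 / 16)
```

A group in which every completion earns the same reward produces all-zero advantages, so that prompt contributes no policy gradient signal for that step.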

Reward Function

See https://github.com/cometadata/funding-metadata-enrichment/tree/main/train for the full training code.

Matching is gated and hierarchical: predicted and gold funders are first paired 1:1 using the Hungarian algorithm, and the subordinate fields are then compared only within each matched funder pair. Components:

Funder name - Fuzzy token-sort matching with acronym and containment boosts, F0.5 score, weight 0.50
Award IDs - Normalized exact matching, F0.5 score, weight 0.50
Funding scheme - Not weighted
Award title - Not weighted
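
The weighted components above can be sketched as follows. This is a simplified illustration, not the training code: it covers only the F0.5 score, award ID comparison for a single already-matched funder pair, and the 0.50 / 0.50 weighting. The normalization rule (uppercase, strip non-alphanumerics) and all function names are assumptions, and the fuzzy funder matching and Hungarian pairing steps are omitted.

```python
import re


def normalize_award_id(award_id):
    # Hypothetical normalization: uppercase and drop non-alphanumerics.
    return re.sub(r"[^0-9A-Z]", "", award_id.upper())


def f_beta(true_positives, n_predicted, n_gold, beta=0.5):
    """F-beta from counts; beta < 1 weights precision above recall."""
    if n_predicted == 0 and n_gold == 0:
        return 1.0  # nothing to find and nothing predicted
    if true_positives == 0:
        return 0.0
    precision = true_positives / n_predicted
    recall = true_positives / n_gold
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def award_f05(predicted_ids, gold_ids):
    """Exact-match F0.5 over normalized award IDs for one funder pair."""
    predicted = {normalize_award_id(a) for a in predicted_ids}
    gold = {normalize_award_id(a) for a in gold_ids}
    return f_beta(len(predicted & gold), len(predicted), len(gold))


def combined_reward(funder_f05, awards_f05, w_funder=0.5, w_award=0.5):
    """Weighted sum of the two scored components."""
    return w_funder * funder_f05 + w_award * awards_f05
```

With beta = 0.5, a spurious extra prediction costs more than a missed one: `f_beta(1, 2, 1)` is about 0.556 while `f_beta(1, 1, 2)` is about 0.833. The equal weighting is also consistent with the reported metrics at step 130, where `combined_reward(0.974, 0.948)` gives 0.961.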

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "cometadata/funding-parsing-lora-Llama_3.1_8B-instruct-ep2-r64-a32-grpo")
tokenizer = AutoTokenizer.from_pretrained("cometadata/funding-parsing-lora-Llama_3.1_8B-instruct-ep2-r64-a32-grpo")

messages = [
    {"role": "system", "content": "Extract funding information from the text. Return a JSON array of funders."},
    {"role": "user", "content": "Extract funding information from the following statement:\n\nThis work was supported by the National Science Foundation (Grant No. 2045678) and the European Research Council (ERC-2021-StG-101039567)."}
]

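# return_dict=True returns input_ids together with the attention mask for generate()
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))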

Expected output:

[
  {
    "funder_name": "National Science Foundation",
    "awards": [
      {
        "award_ids": ["2045678"],
        "funding_scheme": [],
        "award_title": []
      }
    ]
  },
  {
    "funder_name": "European Research Council",
    "awards": [
      {
        "award_ids": ["ERC-2021-StG-101039567"],
        "funding_scheme": [],
        "award_title": []
      }
    ]
  }
]
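
Downstream code will usually want to parse the generated text and confirm it has the shape shown above before using it. The following is a minimal sketch of such a check; the function name and the exact rules are assumptions inferred from the example output, and the format-validity check used during training may differ.

```python
import json

# Fields that the example output shows as lists inside each award object.
AWARD_LIST_FIELDS = ("award_ids", "funding_scheme", "award_title")


def parse_funders(raw):
    """Return the parsed list of funders, or None if the shape is unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, list):
        return None
    for funder in data:
        if not isinstance(funder, dict):
            return None
        # Each funder needs a name key and a list of awards.
        if "funder_name" not in funder or not isinstance(funder.get("awards"), list):
            return None
        for award in funder["awards"]:
            if not isinstance(award, dict):
                return None
            if any(not isinstance(award.get(f), list) for f in AWARD_LIST_FIELDS):
                return None
    return data
```

Calling `parse_funders` on the decoded generation returns the list of funder objects when the output is well formed and `None` otherwise, so a caller can retry or skip the statement.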

Training Infrastructure

Trained on Tinker by Thinking Machines Lab

Eval Results

Step	Eval Reward	Funder F0.5	Award F0.5	Format Valid	KL
0	0.944	0.961	0.926	99.7%	0.0005
10	0.937	0.954	0.920	99.7%	0.0015
20	0.950	0.969	0.932	100%	0.0020
30	0.954	0.966	0.942	100%	0.0025
40	0.951	0.971	0.931	100%	0.0013
50	0.938	0.956	0.919	100%	0.0051
60	0.949	0.967	0.931	100%	0.0047
70	0.954	0.968	0.939	100%	0.0025
80	0.951	0.962	0.940	100%	0.0021
90	0.945	0.959	0.931	100%	0.0026
100	0.943	0.963	0.923	99.7%	0.0016
110	0.945	0.961	0.929	99.5%	0.0036
120	0.950	0.964	0.936	99.5%	0.0028
130	0.961	0.974	0.948	100%	0.0026
140	0.955	0.973	0.938	100%	0.0020
150	0.957	0.972	0.942	100%	0.0012
160	0.947	0.963	0.931	99.7%	0.0034
170	0.951	0.957	0.944	100%	0.0023
180	0.944	0.960	0.928	100%	0.0013
190	0.933	0.956	0.910	99.5%	0.0004
200	0.942	0.961	0.922	99.7%	0.0017
210	0.957	0.967	0.947	100%	0.0014
