
Funding Parsing LoRA — Llama 3.1 8B Instruct + GRPO

A LoRA adapter for extracting structured funding information from funding statements in scholarly works.

Model Details

Training Pipeline

Stage 1: Supervised Fine-Tuning

  • Data: cometadata/funding-extraction-sft-data, combining 1,316 real examples (train.jsonl) with 2,531 synthetic examples (synthetic.jsonl); the synthetic set is upsampled 2x (5,062 rows), giving 6,378 training examples in total
  • Epochs: 2
  • LoRA rank: 64
  • LoRA alpha: 32
  • Learning rate: ~2.86e-4
  • Batch size: 128
  • Max sequence length: 4,096 tokens
  • LR schedule: Linear decay
  • Renderer: llama3 (Llama 3.1 Instruct chat template)
  • Train on: Last assistant message only
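The data mix above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `build_sft_mix` and the placeholder records are hypothetical, and the actual loading and shuffling code lives in the training repository.

```python
def build_sft_mix(real, synthetic, synthetic_upsample=2):
    """Concatenate real examples with the synthetic set repeated N times."""
    return list(real) + list(synthetic) * synthetic_upsample


# Placeholder records standing in for rows of train.jsonl and synthetic.jsonl.
real = [{"source": "real"}] * 1316
synthetic = [{"source": "synthetic"}] * 2531

mix = build_sft_mix(real, synthetic)
n_synthetic = sum(1 for row in mix if row["source"] == "synthetic")

print(len(mix))      # 1,316 + 2 * 2,531
print(n_synthetic)   # synthetic rows after upsampling
```

With a 2x factor the synthetic rows make up roughly four fifths of the mix, which is worth keeping in mind when reading the evaluation numbers.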

Stage 2: Reinforcement Learning (GRPO)

  • Algorithm: Group Relative Policy Optimization (GRPO)
  • Starting checkpoint: SFT final weights
  • Data: 3,462 train / 385 eval examples
  • Learning rate: 3e-5
  • Temperature: 0.8
  • Batch size: 16, Group size: 8
  • KL penalty: 0.03 (against SFT reference policy)
  • Best checkpoint: Step 130 / 217 (selected by eval reward)
  • Eval reward at best step: 0.961
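As a rough illustration of the group-relative part of GRPO, the sketch below scores each of the 8 samples drawn for one prompt against the mean of its own group. It assumes the common mean-and-standard-deviation normalization; whether this run scaled by the standard deviation is not stated here, and the KL penalty is left out. The reward values are made up for the example.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Center and scale each reward against the other samples in its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for one prompt sampled 8 times (group size 8).
group = [1.0, 0.9, 0.9, 0.5, 1.0, 0.75, 1.0, 0.95]
advantages = group_relative_advantages(group)

# Batch size 16 with group size 8 means 128 sampled completions per step.
rollouts_per_step = 16 * 8
# A single pass over the 3,462 training examples at 16 prompts per step.
total_steps = math.ceil(3462 / 16)

print(rollouts_per_step, total_steps)
```

The step count computed this way is 217, which is consistent with the "130 / 217" figure above if training ran for one pass over the training split.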

Reward Function

See https://github.com/cometadata/funding-metadata-enrichment/tree/main/train for the full training code.

Scoring is gated and hierarchical. Predicted and reference funders are first paired 1:1 using the Hungarian algorithm, and subordinate fields (such as award IDs) are then compared only within matched funder pairs:

  • Funder name - Fuzzy token-sort matching with acronym and containment boosts, F0.5 score, weight 0.50
  • Award IDs - Normalized exact matching, F0.5 score, weight 0.50
  • Funding scheme - Not weighted (does not contribute to the reward)
  • Award title - Not weighted (does not contribute to the reward)
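The scoring scheme can be sketched in standard-library Python as follows. This is a simplified stand-in, not the training code: `difflib` replaces the fuzzy matcher, a brute-force search over permutations replaces the Hungarian algorithm (same optimum, only practical for a handful of funders), the acronym and containment boosts are omitted, and the `threshold` value and ID normalization are assumptions.

```python
from difflib import SequenceMatcher
from itertools import permutations


def f_beta(tp, n_pred, n_gold, beta=0.5):
    """F-beta from counts; beta < 1 weights precision above recall."""
    if n_pred == 0 and n_gold == 0:
        return 1.0  # nothing to find and nothing predicted
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def token_sort_similarity(a, b):
    """Word-order-insensitive name similarity in [0, 1]."""
    norm = lambda s: " ".join(sorted(s.lower().split()))
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def best_pairing(pred_names, gold_names):
    """Optimal 1:1 pairing by total similarity (brute force, not Hungarian)."""
    sim = [[token_sort_similarity(p, g) for g in gold_names] for p in pred_names]
    n_pred, n_gold = len(pred_names), len(gold_names)
    if n_pred <= n_gold:
        candidates = (list(zip(range(n_pred), perm))
                      for perm in permutations(range(n_gold), n_pred))
    else:
        candidates = (list(zip(perm, range(n_gold)))
                      for perm in permutations(range(n_pred), n_gold))
    best, best_score = [], -1.0
    for pairs in candidates:
        score = sum(sim[i][j] for i, j in pairs)
        if score > best_score:
            best, best_score = pairs, score
    return [(i, j, sim[i][j]) for i, j in best]


def award_ids(funder):
    """Normalized award IDs of one funder (normalization is an assumption)."""
    return {aid.strip().upper()
            for award in funder.get("awards", [])
            for aid in award.get("award_ids", [])}


def funding_reward(pred, gold, threshold=0.8):
    """0.5 * funder F0.5 + 0.5 * award-ID F0.5, gated on funder matches."""
    pairing = best_pairing([f["funder_name"] for f in pred],
                           [f["funder_name"] for f in gold])
    matched = [(i, j) for i, j, s in pairing if s >= threshold]
    funder_score = f_beta(len(matched), len(pred), len(gold))
    # Award IDs only count inside a matched funder pair (the gating step).
    tp = sum(len(award_ids(pred[i]) & award_ids(gold[j])) for i, j in matched)
    n_pred_ids = sum(len(award_ids(f)) for f in pred)
    n_gold_ids = sum(len(award_ids(f)) for f in gold)
    award_score = f_beta(tp, n_pred_ids, n_gold_ids)
    return 0.5 * funder_score + 0.5 * award_score


gold = [
    {"funder_name": "National Science Foundation",
     "awards": [{"award_ids": ["2045678"]}]},
    {"funder_name": "European Research Council",
     "awards": [{"award_ids": ["ERC-2021-StG-101039567"]}]},
]
print(funding_reward(gold, gold))  # identical prediction scores 1.0
```

Because F0.5 weights precision above recall, dropping a funder costs less than inventing one: a prediction that returns only the first funder above scores about 0.83 on both components.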

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_id = "cometadata/funding-parsing-lora-Llama_3.1_8B-instruct-ep2-r64-a32-grpo"

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

messages = [
    {"role": "system", "content": "Extract funding information from the text. Return a JSON array of funders."},
    {"role": "user", "content": "Extract funding information from the following statement:\n\nThis work was supported by the National Science Foundation (Grant No. 2045678) and the European Research Council (ERC-2021-StG-101039567)."}
]

# return_dict=True returns input_ids together with the attention mask.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True)
# Sampling is enabled explicitly so that the low temperature takes effect.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, not the prompt.
prompt_length = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))

Expected output:

[
  {
    "funder_name": "National Science Foundation",
    "awards": [
      {
        "award_ids": ["2045678"],
        "funding_scheme": [],
        "award_title": []
      }
    ]
  },
  {
    "funder_name": "European Research Council",
    "awards": [
      {
        "award_ids": ["ERC-2021-StG-101039567"],
        "funding_scheme": [],
        "award_title": []
      }
    ]
  }
]
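Downstream code will usually want to parse and sanity-check the generated text before using it. The helper below is a minimal sketch: `parse_funders` is a hypothetical name, and the checks it applies are an assumption based on the output shape shown above, not the exact "Format Valid" criterion used during training.

```python
import json


def parse_funders(text):
    """Return the parsed funder list, or None if the text is not usable."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, list):
        return None
    for funder in data:
        # Each entry must be an object with a name and a list of awards.
        if not isinstance(funder, dict) or "funder_name" not in funder:
            return None
        if not isinstance(funder.get("awards", []), list):
            return None
    return data


sample = """
[
  {"funder_name": "National Science Foundation",
   "awards": [{"award_ids": ["2045678"], "funding_scheme": [], "award_title": []}]},
  {"funder_name": "European Research Council",
   "awards": [{"award_ids": ["ERC-2021-StG-101039567"], "funding_scheme": [], "award_title": []}]}
]
"""

funders = parse_funders(sample)
names = [f["funder_name"] for f in funders]
print(names)
```

Returning `None` instead of raising keeps batch processing simple: invalid generations can be counted and retried without interrupting the run.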

Training Infrastructure

Trained on Tinker by Thinking Machines Lab

Eval Results

| Step | Eval Reward | Funder F0.5 | Award F0.5 | Format Valid | KL |
|---|---|---|---|---|---|
| 0 | 0.944 | 0.961 | 0.926 | 99.7% | 0.0005 |
| 10 | 0.937 | 0.954 | 0.920 | 99.7% | 0.0015 |
| 20 | 0.950 | 0.969 | 0.932 | 100% | 0.0020 |
| 30 | 0.954 | 0.966 | 0.942 | 100% | 0.0025 |
| 40 | 0.951 | 0.971 | 0.931 | 100% | 0.0013 |
| 50 | 0.938 | 0.956 | 0.919 | 100% | 0.0051 |
| 60 | 0.949 | 0.967 | 0.931 | 100% | 0.0047 |
| 70 | 0.954 | 0.968 | 0.939 | 100% | 0.0025 |
| 80 | 0.951 | 0.962 | 0.940 | 100% | 0.0021 |
| 90 | 0.945 | 0.959 | 0.931 | 100% | 0.0026 |
| 100 | 0.943 | 0.963 | 0.923 | 99.7% | 0.0016 |
| 110 | 0.945 | 0.961 | 0.929 | 99.5% | 0.0036 |
| 120 | 0.950 | 0.964 | 0.936 | 99.5% | 0.0028 |
| 130 | 0.961 | 0.974 | 0.948 | 100% | 0.0026 |
| 140 | 0.955 | 0.973 | 0.938 | 100% | 0.0020 |
| 150 | 0.957 | 0.972 | 0.942 | 100% | 0.0012 |
| 160 | 0.947 | 0.963 | 0.931 | 99.7% | 0.0034 |
| 170 | 0.951 | 0.957 | 0.944 | 100% | 0.0023 |
| 180 | 0.944 | 0.960 | 0.928 | 100% | 0.0013 |
| 190 | 0.933 | 0.956 | 0.910 | 99.5% | 0.0004 |
| 200 | 0.942 | 0.961 | 0.922 | 99.7% | 0.0017 |
| 210 | 0.957 | 0.967 | 0.947 | 100% | 0.0014 |
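Selecting the checkpoint by eval reward amounts to taking the argmax over the evaluated steps. The snippet below is a small sketch of that selection using the eval reward values reported above; the variable names are illustrative only.

```python
# Eval reward per evaluated step, copied from the results above.
eval_reward = {
    0: 0.944, 10: 0.937, 20: 0.950, 30: 0.954, 40: 0.951, 50: 0.938,
    60: 0.949, 70: 0.954, 80: 0.951, 90: 0.945, 100: 0.943, 110: 0.945,
    120: 0.950, 130: 0.961, 140: 0.955, 150: 0.957, 160: 0.947, 170: 0.951,
    180: 0.944, 190: 0.933, 200: 0.942, 210: 0.957,
}

# Pick the step whose eval reward is highest.
best_step = max(eval_reward, key=eval_reward.get)
best_reward = eval_reward[best_step]
# Improvement of the selected checkpoint over the SFT starting point (step 0).
gain_over_sft = round(best_reward - eval_reward[0], 3)

print(best_step, best_reward, gain_over_sft)
```

The gap between the best step and step 0 is small relative to the step-to-step variation in the results, so the selected checkpoint is best read as a modest improvement over the SFT model rather than a large one.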