Pritish92/ner-grit-llama31-8b-lora-best

This is a GRIT + LoRA adapter for meta-llama/Llama-3.1-8B, fine-tuned for instruction-following NER-style extraction. It is trained to emit a strict JSON list of the form:

[{"label":"...","text":"..."}]

Note: This repository contains adapter weights only (not the full base model weights). You must have access to meta-llama/Llama-3.1-8B on Hugging Face to run it.

Prompt format (exact)

### Instruction:
{instruction}
Maintain the JSON key order exactly as shown.
Output format: [{"label":"...","text":"..."}]

### Input:
{input_chunk}

### Response:
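
The template above can be assembled programmatically. The following is a minimal sketch: the names PROMPT_TEMPLATE and build_prompt are hypothetical, and the single trailing newline after "### Response:" is an assumption, since the card does not state the exact trailing whitespace used during training.

```python
# Hypothetical helper: fills the prompt template shown in this card.
# Literal braces in the output-format line are doubled so str.format keeps them.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "{instruction}\n"
    "Maintain the JSON key order exactly as shown.\n"
    'Output format: [{{"label":"...","text":"..."}}]\n'
    "\n"
    "### Input:\n"
    "{input_chunk}\n"
    "\n"
    "### Response:\n"  # assumption: one trailing newline after the header
)


def build_prompt(instruction: str, input_chunk: str) -> str:
    """Return the full prompt for one instruction and one input chunk."""
    return PROMPT_TEMPLATE.format(instruction=instruction, input_chunk=input_chunk)


if __name__ == "__main__":
    print(build_prompt("Extract all person names.", "Ada Lovelace wrote notes."))
```

Only the template itself is parsed by str.format, so braces inside the instruction or the input chunk are passed through unchanged.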

How to load

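import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "Pritish92/ner-grit-llama31-8b-lora-best"

# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token
tokenizer.padding_side = "left"  # left-pad so generation continues from the end of the prompt
tokenizer.truncation_side = "left"  # if truncated, keep the end of the prompt ("### Response:")

# Reads the adapter config, loads the base model it names, then attaches the LoRA weights.
# device_map="auto" requires the accelerate package.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()  # inference mode: disables dropout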
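
Once the model and tokenizer are loaded, a completion can be generated along these lines. This is a sketch, not the author's inference script: generate_response is a hypothetical helper, greedy decoding is an assumption, and the default of 1024 new tokens is chosen only because it matches the output budget listed under Training details.

```python
def generate_response(model, tokenizer, prompt: str, max_new_tokens: int = 1024) -> str:
    """Greedy-decode a completion and return only the newly generated text."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumption: mirrors the 1024-token output budget
        do_sample=False,  # assumption: deterministic greedy decoding
        pad_token_id=tokenizer.pad_token_id,
    )
    # generate() returns prompt + completion; slice off the prompt tokens.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Usage, given the objects created above and a prompt in the exact format of this card: raw_output = generate_response(model, tokenizer, prompt).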

Training details

  • Date: 2026-01-02
  • Sequence length cap (max_length): 20
  • Chunking strategy: token_overlap
    • prompt overhead tokens reserved: 256
    • output overhead tokens reserved: 1024
    • max input chunk tokens: 2048
    • overlap chunk tokens: 256
    • min chunk tokens: 256
  • Batch size: 1
  • Gradient accumulation: 8 (effective batch: 8)
  • Learning rate: 5e-05
  • Planned epochs: 2 (early stopping may stop sooner)
  • Loss masking: response-only (prompt + input chunk tokens masked with -100)
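
The token_overlap chunking and response-only loss masking listed above can be illustrated as follows. This is a sketch under stated assumptions, not the training code: chunk_token_ids and build_labels are hypothetical names, the sliding-window stride (max chunk tokens minus overlap) is an assumed interpretation of token_overlap, and the "min chunk tokens" setting is left out because the card does not describe how it is applied.

```python
from typing import List

IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy loss by default


def chunk_token_ids(token_ids: List[int], max_tokens: int = 2048, overlap: int = 256) -> List[List[int]]:
    """Split token ids into windows of at most max_tokens; neighbours share `overlap` tokens."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")
    stride = max_tokens - overlap
    chunks: List[List[int]] = []
    start = 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # this window already reaches the end of the input
        start += stride
    return chunks


def build_labels(prompt_ids: List[int], response_ids: List[int]) -> List[int]:
    """Mask every prompt token so that only response tokens contribute to the loss."""
    return [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)


if __name__ == "__main__":
    print(chunk_token_ids(list(range(10)), max_tokens=4, overlap=1))
    print(build_labels([5, 6, 7], [8, 9]))
```

Here prompt_ids stands for the tokenized instruction plus input chunk, and response_ids for the tokenized target JSON.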

LoRA / PEFT

  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.1
  • Target modules: up_proj, v_proj, down_proj, o_proj, k_proj, gate_proj, q_proj
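
For reference, the settings above map onto keyword arguments of peft's LoraConfig roughly as shown below. This is a sketch only: the GRIT training code is not part of this card and may build its configuration differently, and task_type is an assumption that does not appear in the list above.

```python
# LoRA settings from this card, expressed as keyword arguments.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

# In standard LoRA the adapter update is scaled by lora_alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]

# With peft installed, the config object could be created as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)

if __name__ == "__main__":
    print(f"LoRA scaling factor: {scaling}")
```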

GRIT hyperparameters

  • kfac_min_samples: 256
  • kfac_update_freq: 100
  • kfac_damping: 0.005
  • reprojection_warmup_steps: 500
  • reprojection_freq: 100
  • use_two_sided_reprojection: True
  • rank_adaptation_start_step: 500
  • rank_adaptation_threshold: 0.85
  • ng_warmup_steps: 300
  • regularizer_warmup_steps: 500
  • lambda_kfac: 1e-05
  • lambda_reproj: 0.0001

Training data

Local CSVs:

  • NER/NER-Data/ner_train_dataset.csv
  • NER/NER-Data/ner_dev_dataset.csv
  • NER/NER-Data/ner_test_dataset.csv

Example counts: raw train=18,115, raw val=2,010; after chunking train examples=24,620

Evaluation

  • Best checkpoint metric: eval_entity_f1=0.187876 (best checkpoint: step 3078)
  • Train runtime: 34690.8s (9h 38m 10s)
  • eval_entity_f1: 0.187876
  • eval_entity_micro_f1: 0.175875
  • eval_entity_parse_fail_rate: 0.651071
  • eval_entity_precision: 0.291457
  • eval_entity_recall: 0.167590
  • eval_loss: 0.138082
  • eval_runtime: 22803.049s (6h 20m 3s)
  • eval_samples_per_second: 0.123
  • eval_steps_per_second: 0.031
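
A few quantities can be derived from the numbers reported in this card. The arithmetic below is a sketch under assumptions: a single training device, a final partial accumulation group counted as one optimizer step, and throughput figures that are rounded to three decimals, so the evaluation sample count is only approximate.

```python
import math

# Training steps per epoch from the reported example count and batch settings.
train_examples = 24_620
per_device_batch = 1
grad_accum = 8
effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(train_examples / effective_batch)

# Harmonic mean of the reported precision and recall.
precision, recall = 0.291457, 0.167590
harmonic_f1 = 2 * precision * recall / (precision + recall)

# Approximate number of evaluation samples: runtime times throughput.
eval_runtime_s = 22803.049
eval_samples_per_second = 0.123
approx_eval_samples = eval_runtime_s * eval_samples_per_second

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"harmonic mean of P and R: {harmonic_f1:.4f}")
    print(f"approx. eval samples: {approx_eval_samples:.0f}")
```

Under these assumptions, the best checkpoint (step 3078) falls at about the end of the first epoch. The harmonic mean of the reported precision and recall is about 0.213, which differs from both reported F1 values; one possible explanation is that the reported scores are averaged per example or per label, but the card does not say.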

Limitations / notes

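  • Outputs are not guaranteed to be valid JSON: on the evaluation set, about 65% of outputs failed to parse (eval_entity_parse_fail_rate=0.651071). Validate and parse every output, and handle failures robustly.
  • Performance depends on the entity schema and labels used in the training data; labels the model has not seen may be extracted poorly.
  • meta-llama/Llama-3.1-8B is a gated model: accept its license on Hugging Face and authenticate (for example with huggingface-cli login) before downloading.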
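
Since outputs may not be valid JSON, downstream code should parse defensively. The following is a minimal sketch: parse_entities is a hypothetical helper, and returning None on any failure (rather than salvaging partial output) is a design assumption.

```python
import json
from typing import Dict, List, Optional


def parse_entities(raw: str) -> Optional[List[Dict[str, str]]]:
    """Return a list of {"label", "text"} dicts, or None if the output is unusable."""
    start = raw.find("[")
    if start == -1:
        return None  # no JSON array in the output
    try:
        # raw_decode parses the first JSON value and ignores any trailing text.
        parsed, _ = json.JSONDecoder().raw_decode(raw[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, list):
        return None
    entities: List[Dict[str, str]] = []
    for item in parsed:
        # Every item must be an object with string "label" and "text" fields.
        if not (
            isinstance(item, dict)
            and isinstance(item.get("label"), str)
            and isinstance(item.get("text"), str)
        ):
            return None
        entities.append({"label": item["label"], "text": item["text"]})
    return entities


if __name__ == "__main__":
    print(parse_entities('[{"label":"PER","text":"Ada"}] extra text'))
    print(parse_entities("not json"))
```

Callers can treat a None result as a parse failure and, for example, retry generation or log the raw output for inspection.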