--- base_model: meta-llama/Llama-3.1-8B library_name: peft pipeline_tag: text-generation tags: - lora - grit - ner - information-extraction - transformers --- ## Pritish92/ner-grit-llama31-8b-lora-best This is a **GRIT + LoRA adapter** fine-tuned from **`meta-llama/Llama-3.1-8B`** to do **instruction-following NER-style extraction** into a strict JSON list format: ```json [{"label":"...","text":"..."}] ``` **Note:** This repository contains **adapter weights only** (not the full base model weights). You must have access to `meta-llama/Llama-3.1-8B` on Hugging Face to run it. ## Prompt format (exact) ```text ### Instruction: {instruction} Maintain the JSON key order exactly as shown. Output format: [{"label":"...","text":"..."}] ### Input: {input_chunk} ### Response: ``` ## How to load ```python import torch from peft import AutoPeftModelForCausalLM from transformers import AutoTokenizer adapter_id = "Pritish92/ner-grit-llama31-8b-lora-best" tokenizer = AutoTokenizer.from_pretrained(adapter_id, use_fast=True) tokenizer.pad_token = tokenizer.eos_token tokenizer.padding_side = "left" tokenizer.truncation_side = "left" model = AutoPeftModelForCausalLM.from_pretrained( adapter_id, torch_dtype=torch.bfloat16, device_map="auto", ) model.eval() ``` ## Training details - **Date**: 2026-01-02 - **Sequence length cap (`max_length`)**: 20 - **Chunking strategy**: token_overlap - prompt overhead tokens reserved: 256 - output overhead tokens reserved: 1024 - max input chunk tokens: 2048 - overlap chunk tokens: 256 - min chunk tokens: 256 - **Batch size**: 1 - **Gradient accumulation**: 8 (effective batch: 8) - **Learning rate**: 5e-05 - **Planned epochs**: 2 (early stopping may stop sooner) - **Loss masking**: response-only (prompt + input chunk tokens masked with -100) ### LoRA / PEFT - **LoRA rank (r)**: 16 - **LoRA alpha**: 32 - **LoRA dropout**: 0.1 - **Target modules**: up_proj, v_proj, down_proj, o_proj, k_proj, gate_proj, q_proj ### GRIT hyperparameters - **kfac_min_samples**: 256 - **kfac_update_freq**: 100 - **kfac_damping**: 0.005 - **reprojection_warmup_steps**: 500 - **reprojection_freq**: 100 - **use_two_sided_reprojection**: True - **rank_adaptation_start_step**: 500 - **rank_adaptation_threshold**: 0.85 - **ng_warmup_steps**: 300 - **regularizer_warmup_steps**: 500 - **lambda_kfac**: 1e-05 - **lambda_reproj**: 0.0001 ## Training data Local CSVs: - `NER/NER-Data/ner_train_dataset.csv` - `NER/NER-Data/ner_dev_dataset.csv` - `NER/NER-Data/ner_test_dataset.csv` **Example counts:** raw train=18,115, raw val=2,010; after chunking train examples=24,620 ## Evaluation - **Best checkpoint metric**: eval_entity_f1=0.187876 (best checkpoint: step 3078) - **Train runtime**: 34690.8s (9h 38m 10s) - **eval_entity_f1**: 0.187876 - **eval_entity_micro_f1**: 0.175875 - **eval_entity_parse_fail_rate**: 0.651071 - **eval_entity_precision**: 0.291457 - **eval_entity_recall**: 0.167590 - **eval_loss**: 0.138082 - **eval_runtime**: 22803.049000 - **eval_samples_per_second**: 0.123000 - **eval_steps_per_second**: 0.031000 ## Limitations / notes - Outputs are **not guaranteed** to be valid JSON; validate/parse and handle failures robustly. - Model performance depends on the entity schema/labels in your training data. - If `meta-llama/Llama-3.1-8B` is gated, you must authenticate to download it.