---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B
library_name: transformers
pipeline_tag: text-generation
tags:
- nexa
- scientific-reasoning
- claim-verification
- biomedical-qa
- retrieval-reranking
- lora
- merged
---

# Nexa Llama-3 8B Science Multitask (Merged)

This repository contains the full merged model: LoRA adapters trained for scientific multitask instruction tuning were fused into the `meta-llama/Meta-Llama-3-8B` base weights.

## Model Details
- Base model: `meta-llama/Meta-Llama-3-8B`
- Method: QLoRA/LoRA adapter training, followed by merging the adapters into the full weights with `merge_and_unload`
- Timestamp (UTC): `2026-02-24T03:56:05+00:00`
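
As a rough illustration of what the merge step does, the sketch below folds a low-rank update into one base weight matrix using the standard LoRA formulation `W' = W + (alpha / r) * B A`. The tiny matrices and the `alpha` and `r` values are made up for illustration and are not taken from this release; `merge_lora` and `matmul` are hypothetical helper names.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight for one layer.

    w:      base weight, shape (out, in)
    lora_a: LoRA A matrix, shape (r, in)
    lora_b: LoRA B matrix, shape (out, r)
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values (hypothetical): a 2x2 base weight with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (r=1, in=2)
lora_b = [[0.5], [-0.5]]   # shape (out=2, r=1)
merged = merge_lora(w, lora_a, lora_b, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [-1.0, -1.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why this repo ships plain full weights.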

## Tasks
- `<TASK:VERIFY>`: claim verification with labels SUPPORTS, REFUTES, or NEI (not enough information)
- `<TASK:QA>`: abstract-grounded QA with answers yes, no, or maybe
- `<TASK:RERANK>`: relevance scoring on a 0-3 scale, used for ranking
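
A minimal sketch of how a prompt might be assembled and sent to the model is shown below. Only the three task tokens and the repo id come from this card; the field layout after the token (`claim:`, `evidence:`, and so on) is an assumption, so check the scripts in `code/` for the template actually used in training. `build_prompt` and `generate` are hypothetical helper names.

```python
TASK_TOKENS = {
    "verify": "<TASK:VERIFY>",
    "qa": "<TASK:QA>",
    "rerank": "<TASK:RERANK>",
}


def build_prompt(task, **fields):
    """Build a prompt that starts with the task token.

    The "name: value" lines after the token are an assumed layout,
    not the documented training template.
    """
    if task not in TASK_TOKENS:
        raise ValueError(f"unknown task: {task!r}")
    lines = [TASK_TOKENS[task]]
    lines += [f"{name}: {value}" for name, value in fields.items()]
    return "\n".join(lines) + "\n"


def generate(prompt,
             repo_id="Allanatrix/nexa-llama3-8b-science-multitask-merged",
             max_new_tokens=32):
    """Greedy-decode a completion for `prompt` with the merged model."""
    # Imported lazily so that build_prompt works without transformers installed.
    # device_map="auto" additionally requires the accelerate package.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the tokens produced after the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_prompt("verify", claim="X", evidence="Y")
print(prompt)
```

Calling `generate(prompt)` downloads the 8B weights, so it is left out of the snippet's top level.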

## Training Data
- Dataset: Nexa science multitask mixture (balanced short rerun release)
- Format: text-to-text with explicit task tokens and JSON outputs
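
Because the targets are JSON, generated text usually needs a tolerant parser before scoring. The sketch below extracts the first JSON object from a completion and ignores surrounding text. The output schema is not documented in this card, so the `label` and `score` keys in the examples are hypothetical, as is the helper name `parse_json_output`.

```python
import json


def parse_json_output(text):
    """Return the first JSON object found in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            start = text.find("{", start + 1)
    return None


# Hypothetical completions; the real key names may differ.
print(parse_json_output('{"label": "SUPPORTS"} trailing text'))
print(parse_json_output("no json here"))
```

Returning `None` rather than raising lets an evaluation loop count unparseable generations as errors instead of crashing.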

## Evaluation Snapshot

### Balanced split (trusted)
| Metric | Baseline (pre-rerun) | Post-train |
|---|---:|---:|
| Verify Accuracy | 0.5333 | 0.6667 |
| Verify Macro-F1 | 0.5385 | 0.6592 |
| QA Accuracy | 0.4000 | 0.5333 |
| QA Majority Baseline | 0.4000 | 0.4000 |
| Rerank Pair Accuracy | 0.3500 | 0.4667 |
| Rerank MRR@10 | 0.2667 | 0.5708 |
| Rerank Recall@1 | 0.0000 | 0.5000 |
| Rerank Recall@3 | 0.3333 | 0.5000 |
| Rerank Recall@5 | 0.5000 | 0.6667 |
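
For readers who want to recompute numbers of this kind, here is a minimal sketch of macro-F1, MRR@k, and Recall@k. It is not the evaluation script from `code/`. It treats Recall@k as a per-query hit rate (at least one relevant item in the top k), which is an assumption about how the table was computed, and all inputs below are toy data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def reciprocal_rank_at_k(ranked_relevance, k=10):
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, relevant in enumerate(ranked_relevance[:k], start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


def hit_at_k(ranked_relevance, k):
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(ranked_relevance[:k]) else 0.0


def mean(values):
    return sum(values) / len(values)


# Toy data: one list per query, ordered by model score, True = relevant.
queries = [[False, True, False], [True, False, False]]
mrr_at_10 = mean([reciprocal_rank_at_k(q, k=10) for q in queries])
recall_at_1 = mean([hit_at_k(q, k=1) for q in queries])
print(mrr_at_10, recall_at_1)  # 0.75 0.5
```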

### Mixed split (diagnostic only)
- Verify Accuracy: 0.5833
- Verify Macro-F1: 0.6667
- QA Accuracy: 0.6667 (mixed split is label-skewed)
- Rerank MRR@10: 0.4352

## Intended Use
Research and prototyping for scientific assistant workflows that mix verification, QA, and reranking.

## Limitations
- The model can still hallucinate or overstate confidence in biomedical and scientific outputs.
- Not validated for clinical, legal, or high-stakes decision making.
- The mixed validation split has a known QA label imbalance and should not be used as the sole quality signal.
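
One way to keep label skew in view is to report a majority-class baseline next to accuracy, as the balanced-split table does for QA. The sketch below computes that baseline. The label counts are invented for illustration and do not describe either split, and `majority_baseline_accuracy` is a hypothetical helper name.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


# Hypothetical gold labels for a skewed QA split.
gold = ["yes"] * 6 + ["no"] * 3 + ["maybe"] * 3
baseline = majority_baseline_accuracy(gold)
print(baseline)  # 0.5
```

A model accuracy that is close to this baseline on a skewed split says little about actual task skill.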

## Artifacts in This Repo
- Merged model weights and tokenizer
- `eval/` metrics JSON files
- `code/` dataset/training/eval scripts used in this release

## Notes
Merged from the `Nexa_Tune_Balanced_Rerun` adapter after the balanced short rerun.

HF repo: https://huggingface.co/Allanatrix/nexa-llama3-8b-science-multitask-merged