Libraries

How to use Allanatrix/nexa-llama3-8b-science-multitask-merged with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Allanatrix/nexa-llama3-8b-science-multitask-merged")

# Alternatively, load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Allanatrix/nexa-llama3-8b-science-multitask-merged")
model = AutoModelForCausalLM.from_pretrained("Allanatrix/nexa-llama3-8b-science-multitask-merged")

Local Apps

vLLM

How to use Allanatrix/nexa-llama3-8b-science-multitask-merged with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Allanatrix/nexa-llama3-8b-science-multitask-merged"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Allanatrix/nexa-llama3-8b-science-multitask-merged",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

SGLang

How to use Allanatrix/nexa-llama3-8b-science-multitask-merged with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Allanatrix/nexa-llama3-8b-science-multitask-merged" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Allanatrix/nexa-llama3-8b-science-multitask-merged",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Allanatrix/nexa-llama3-8b-science-multitask-merged" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown in the pip instructions above.

Docker Model Runner
How to use Allanatrix/nexa-llama3-8b-science-multitask-merged with Docker Model Runner:
```
docker model run hf.co/Allanatrix/nexa-llama3-8b-science-multitask-merged
```

Nexa Llama-3 8B Science Multitask (Merged)

Full-weight model produced by merging LoRA adapters, trained for scientific multitask instruction tuning, into the base model.

Model Details

Base model: meta-llama/Meta-Llama-3-8B
Method: QLoRA/LoRA adapter training, with the adapter then merged into the full weights via merge_and_unload
Timestamp (UTC): 2026-02-24T03:56:05+00:00
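
The merge step named above can be sketched with PEFT as follows. This is a minimal sketch, not the release's actual script: the function name `merge_lora_adapter`, the adapter directory, and the output directory are hypothetical, and loading the base model in bfloat16 before merging is an assumption.

```python
def merge_lora_adapter(base_id: str, adapter_dir: str, out_dir: str) -> None:
    """Fuse a LoRA adapter into its base model and save the full weights."""
    # Heavy dependencies are imported lazily so the function can be defined
    # (and inspected) on machines without torch/peft installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    # Assumption: the base model is loaded unquantized in bfloat16 for the merge.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

    # Attach the trained adapter, then fold its low-rank updates into the base weights.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # Write plain Transformers weights in safetensors format.
    merged.save_pretrained(out_dir, safe_serialization=True)


# Hypothetical usage (paths are placeholders, not files from this repo):
# merge_lora_adapter("meta-llama/Meta-Llama-3-8B", "path/to/adapter", "merged-model")
```

The tokenizer is not handled here; it would need to be saved into the same output directory from whichever source was used during training.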

Tasks

<TASK:VERIFY>: SUPPORTS/REFUTES/NEI claim verification
<TASK:QA>: yes/no/maybe abstract-grounded QA
<TASK:RERANK>: 0-3 relevance scoring used for ranking

Training Data

Dataset: Nexa science multitask mixture (balanced short rerun release)
Format: text-to-text with explicit task tokens and JSON outputs
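
As an illustration of the text-to-text format, the sketch below builds a prompt that starts with a task token and validates a JSON answer against the label sets listed under Tasks. The prompt layout (`field: value` lines) and the JSON keys (`label`, `answer`, `score`) are assumptions made for this example; the scripts under code/ define the format actually used in training.

```python
import json

TASK_TOKENS = {"verify": "<TASK:VERIFY>", "qa": "<TASK:QA>", "rerank": "<TASK:RERANK>"}

# Allowed values per task, taken from the Tasks list.
ALLOWED = {
    "verify": {"SUPPORTS", "REFUTES", "NEI"},
    "qa": {"yes", "no", "maybe"},
    "rerank": {0, 1, 2, 3},
}

# Hypothetical JSON key per task; the real key names may differ.
OUTPUT_KEY = {"verify": "label", "qa": "answer", "rerank": "score"}


def build_prompt(task: str, **fields: str) -> str:
    """Return a prompt made of the task token followed by 'name: value' lines."""
    body = "\n".join(f"{name}: {value}" for name, value in fields.items())
    return f"{TASK_TOKENS[task]}\n{body}\n"


def parse_output(task: str, text: str):
    """Extract the first JSON object from generated text; return None if invalid."""
    start = text.find("{")
    if start == -1:
        return None
    try:
        obj, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return None
    value = obj.get(OUTPUT_KEY[task]) if isinstance(obj, dict) else None
    # Reject booleans (True == 1 in Python) and non-scalar values.
    if isinstance(value, bool) or not isinstance(value, (str, int)):
        return None
    return value if value in ALLOWED[task] else None


prompt = build_prompt("verify", claim="Aspirin reduces fever.", evidence="...")
label = parse_output("verify", 'Output: {"label": "SUPPORTS"}')
```

Returning None for anything outside the allowed set lets a caller count malformed generations separately from wrong answers.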

Evaluation Snapshot

Balanced split (trusted)

Metric	Baseline (pre-rerun)	Post-train
Verify Accuracy	0.5333	0.6667
Verify Macro-F1	0.5385	0.6592
QA Accuracy	0.4000	0.5333
QA Majority Baseline	0.4000	0.4000
Rerank Pair Accuracy	0.3500	0.4667
Rerank MRR@10	0.2667	0.5708
Rerank Recall@1	0.0000	0.5000
Rerank Recall@3	0.3333	0.5000
Rerank Recall@5	0.5000	0.6667
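
For readers unfamiliar with the rerank metrics in this table, here is a small sketch of how such numbers are commonly computed. The definitions are assumptions: an item counts as relevant when its graded label is at least `min_rel`, and Recall@k is treated as the fraction of queries with a relevant item in the top k. The release's eval scripts may define them differently, and the data below is a toy example, not taken from the evaluation.

```python
from itertools import combinations


def mrr_at_k(rankings, k=10, min_rel=1):
    """Mean reciprocal rank of the first relevant item within the top k.

    `rankings` holds, per query, the gold labels ordered by model score (best first).
    """
    total = 0.0
    for labels in rankings:
        for rank, rel in enumerate(labels[:k], start=1):
            if rel >= min_rel:
                total += 1.0 / rank
                break
    return total / len(rankings)


def hit_rate_at_k(rankings, k, min_rel=1):
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(any(rel >= min_rel for rel in labels[:k]) for labels in rankings)
    return hits / len(rankings)


def pair_accuracy(gold, pred):
    """Share of item pairs with different gold labels that the scores order correctly."""
    correct = total = 0
    for i, j in combinations(range(len(gold)), 2):
        if gold[i] == gold[j]:
            continue  # pairs with equal gold labels carry no ordering information
        total += 1
        if (gold[i] - gold[j]) * (pred[i] - pred[j]) > 0:
            correct += 1
    return correct / total if total else 0.0


# Toy data: three queries, gold labels listed in the order the model ranked them.
toy_rankings = [[0, 2, 0], [3, 0, 0], [0, 0, 0]]
print(mrr_at_k(toy_rankings))          # 0.5
print(hit_rate_at_k(toy_rankings, 1))  # 0.333...
print(pair_accuracy([3, 1, 0], [2, 2, 0]))  # 0.666...
```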

Mixed split (diagnostic only)

Verify Accuracy: 0.5833
Verify Macro-F1: 0.6667
QA Accuracy: 0.6667 (mixed split is label-skewed)
Rerank MRR@10: 0.4352

Intended Use

Research and prototyping for scientific assistant workflows that mix verification, QA, and reranking.

Limitations

The model can still hallucinate or overstate confidence in biomedical/scientific outputs.
Not validated for clinical, legal, or high-stakes decision making.
The mixed validation split has a known QA label imbalance and should not be used as the sole quality signal.
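
To see why a label-skewed split is a weak quality signal, the sketch below compares accuracy, a majority-class baseline, and macro-F1 on toy data. The labels are invented for illustration and are not drawn from the release's splits; the macro-F1 here averages over every label that appears in either the gold or predicted lists, which is an assumption about the convention.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of predictions that match the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def majority_baseline(gold):
    """Accuracy obtained by always predicting the most frequent gold label."""
    return Counter(gold).most_common(1)[0][1] / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over labels seen in gold or pred."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy skewed split: a model that always answers "yes".
gold = ["yes", "yes", "yes", "yes", "no", "maybe"]
pred = ["yes"] * 6

print(accuracy(gold, pred))       # 0.666...
print(majority_baseline(gold))    # 0.666...
print(macro_f1(gold, pred))       # 0.266...
```

In this toy case accuracy equals the majority baseline while macro-F1 stays low, which is the pattern that makes accuracy alone misleading on an imbalanced split.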

Artifacts in This Repo

Merged model weights and tokenizer
eval/: metrics JSON files
code/: dataset, training, and eval scripts used in this release

Notes

Merged from the Nexa_Tune_Balanced_Rerun adapter after the balanced short rerun.

HF repo: https://huggingface.co/Allanatrix/nexa-llama3-8b-science-multitask-merged

Safetensors: 8B params, BF16 tensor type