Libraries

How to use rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForMultimodalLM

tokenizer = AutoTokenizer.from_pretrained("rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x")
model = AutoModelForMultimodalLM.from_pretrained("rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x

SGLang

How to use rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x" \
        --host 0.0.0.0 \
        --port 30000

hep-agent-qwen-qwen3-5-9b-mi300x

HEP domain expert — Fine-tuned Qwen/Qwen3.5-9B on High Energy Physics data.

This model is a full fine-tune of Qwen/Qwen3.5-9B on a curated corpus of High Energy Physics literature, experimental data, and synthetic Q&A. It was trained on a single AMD MI300X (192 GB HBM3, ROCm 7.0).

Model Overview

Property	Value
Base model	`Qwen/Qwen3.5-9B`
Fine-tuning type	Full fine-tune (NOT LoRA)
Hardware	1× AMD MI300X (192 GB HBM3, ROCm 7.0)
Precision	bfloat16
Context length	2048 tokens
Training data	~50K–100K HEP examples
Optimizer	AdamW 8-bit (bitsandbytes)
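
As a rough back-of-the-envelope check on why a full fine-tune of a 9B-parameter model can fit on one 192 GB device, the sketch below adds up bytes per parameter for bf16 weights, bf16 gradients, and two 8-bit AdamW moment buffers. The byte counts are assumptions for illustration (no fp32 master weights, no quantization block overhead, activations ignored). They are not measurements from the actual training run.

```python
def full_finetune_state_gb(
    num_params: float,
    weight_bytes: int = 2,       # bf16 weights
    grad_bytes: int = 2,         # bf16 gradients (assumed)
    optim_state_bytes: int = 2,  # two 8-bit AdamW moments, 1 byte each (assumed)
) -> float:
    """Estimate memory for weights + gradients + optimizer state, in GB (1e9 bytes)."""
    bytes_per_param = weight_bytes + grad_bytes + optim_state_bytes
    return num_params * bytes_per_param / 1e9


params = 9e9  # "9B" from the model name
state_gb = full_finetune_state_gb(params)
print(f"Estimated static training state: {state_gb:.0f} GB of 192 GB")
```

Under these assumptions the static state comes to about 54 GB. The remainder is available for activations, which depend on sequence length and batch size and are not modeled here.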

Evaluation Results

All scores are accuracy (%) unless noted, and each is compared against the unmodified Qwen/Qwen3.5-9B base.

General Benchmarks

Benchmark	Shots	Metric	Base (%)	Fine-tuned (%)	Δ (pp)
MMLU Full	5	acc	69.8	70.6	+0.7
ARC-Challenge	25	acc_norm	71.1	71.8	+0.7

No significant regressions were detected (threshold: −3 pp).
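
A regression check of this kind can be sketched in a few lines. The function name and data layout below are hypothetical, not the evaluation harness's actual code. Only the −3 pp threshold and the scores come from this card.

```python
REGRESSION_THRESHOLD_PP = -3.0  # threshold stated in this card

# (base %, fine-tuned %) copied from the General Benchmarks table
scores = {
    "MMLU Full": (69.8, 70.6),
    "ARC-Challenge": (71.1, 71.8),
}


def find_regressions(scores, threshold_pp=REGRESSION_THRESHOLD_PP):
    """Return benchmarks whose fine-tuned score dropped by more than the threshold."""
    return {
        name: round(tuned - base, 1)
        for name, (base, tuned) in scores.items()
        if tuned - base < threshold_pp
    }


regressions = find_regressions(scores)
print(regressions or "No significant regressions")
```

Deltas recomputed from the rounded table values can differ from the listed Δ by about 0.1 pp, presumably because the listed values were computed before rounding.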

MMLU Physics Subsets (extracted from MMLU Full run)

Subset	Base (%)	Fine-tuned (%)	Δ (pp)
Conceptual Physics	77.0	77.9	+0.9
College Physics	57.8	58.8	+1.0
High School Physics	60.9	62.9	+2.0
Astronomy	80.3	80.9	+0.7
Physics avg	69.0	70.1	+1.1

MMLU STEM aggregate: Base 68.3% → Fine-tuned 68.7% (+0.4 pp).
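
The "Physics avg" row is consistent with an unweighted mean over the four subsets, which the short check below reproduces from the rounded table values. Treating it as an unweighted mean is an inference from the numbers, not a documented detail of the evaluation.

```python
from statistics import mean

# (base %, fine-tuned %) for the four physics-related MMLU subsets in the table
subsets = {
    "Conceptual Physics": (77.0, 77.9),
    "College Physics": (57.8, 58.8),
    "High School Physics": (60.9, 62.9),
    "Astronomy": (80.3, 80.9),
}

base_avg = mean(base for base, _ in subsets.values())
tuned_avg = mean(tuned for _, tuned in subsets.values())
print(f"base {base_avg:.1f}  fine-tuned {tuned_avg:.1f}  delta {tuned_avg - base_avg:+.1f}")
```

A mean weighted by the number of questions per subset would generally give a different figure, so the two should not be compared directly.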

Custom Physics Calculations (8 problems)

Category	Base (%)	Fine-tuned (%)
Four-vectors	50.0	50.0
Invariant mass	0.0	0.0
Decay kinematics	0.0	0.0
Branching ratios	0.0	0.0
Kinematics (pT/η)	0.0	0.0
Overall (exact match)	12.5	12.5

Note: This custom benchmark covers only 8 problems and uses strict exact-match numeric scoring. Both models demonstrate correct reasoning in the response text but often fail the final answer-extraction step (e.g., outputting an intermediate value rather than the final result in the expected units). A lenient scoring pass would yield higher effective accuracy. The benchmark will be expanded in a future evaluation run.
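
To illustrate what a lenient scoring pass might look like, the sketch below takes the last number in a response and compares it to the expected value within a relative tolerance. This is a hypothetical scorer written for illustration, not the one used in this evaluation. The helper names and the 1% tolerance are assumptions.

```python
import math
import re

# Matches integers, decimals, and scientific notation such as 1.25e2
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def extract_last_number(text: str):
    """Return the last numeric literal in the text, or None if there is none."""
    matches = _NUMBER.findall(text)
    return float(matches[-1]) if matches else None


def lenient_match(response: str, expected: float, rel_tol: float = 0.01) -> bool:
    """Count a response as correct if its last number is within rel_tol of expected."""
    value = extract_last_number(response)
    return value is not None and math.isclose(value, expected, rel_tol=rel_tol)


response = "m^2 = 2 * 62.5 * 62.5 * (1 - cos(pi)) = 15625, so m = 125.0 GeV"
print(lenient_match(response, 125.0))
```

Taking the last number is fragile: a trailing unit such as GeV/c^2 would be read as 2, so a practical scorer would more likely require an explicit final-answer format.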

Benchmarks Not Yet Available

The following benchmarks encountered infrastructure errors during this evaluation run and will be included in a future update:

Benchmark	Intended Purpose	Blocker
SciQ	Science Q&A	HF dataset URI format incompatibility
GSM8K	Math reasoning	HF dataset URI format incompatibility
TruthfulQA mc1/mc2	Hallucination resistance	HF dataset URI format incompatibility
HellaSwag	Commonsense forgetting check	HF dataset URI format incompatibility
IFEval	Instruction following	Missing `immutabledict` package
Minerva MATH	Advanced math	Missing `antlr4` package (LaTeX parsing)
BBQ	Bias evaluation	Task not registered in harness version
HEP-QA (held-out)	Domain Q&A	Evaluation module path error

Intended Use

This model is designed for:

Answering questions about experimental and theoretical particle physics
Explaining detector physics, collision analysis, and data analysis
Solving quantitative physics problems (kinematics, cross-sections, decay calculations)
Summarizing HEP papers and explaining their methodology

Not intended for:

Real-time experimental analysis or ROOT file processing
Safety-critical applications
Medical or regulatory decisions

Training Data

Source	Volume	Description
arXiv hep-ph / hep-ex	~10K papers → Q&A	Theory, phenomenology, experimental
INSPIRE-HEP	~15K records	Paper summaries, detector data
CMS Open Data	~5K examples	Collision analysis, ROOT metadata
PDG (Particle Data Group)	~3K entries	Particle properties, decay modes
Synthetic Q&A	~20K generated	Kinematics, formulas, calculations

Training Configuration

Parameter	Value
learning_rate	8e-06
num_epochs	2
batch_size (effective)	32
sequence_length	4096
optimizer	adamw_8bit
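
For a sense of scale, the number of optimizer updates implied by this configuration can be estimated from the dataset size, epoch count, and effective batch size. The sketch assumes one training example per sequence (no packing) and brackets the dataset size with the range of roughly 50K to 100K examples given in the Model Overview. The function name is hypothetical.

```python
import math


def optimizer_steps(num_examples: int, num_epochs: int, effective_batch_size: int) -> int:
    """Number of optimizer updates, counting a final partial batch in each epoch."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    return steps_per_epoch * num_epochs


# Dataset size is only given as a range, so compute both ends.
low = optimizer_steps(50_000, num_epochs=2, effective_batch_size=32)
high = optimizer_steps(100_000, num_epochs=2, effective_batch_size=32)
print(f"Roughly {low} to {high} optimizer steps")
```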

Usage

Basic Generation

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Manually formatted ChatML prompt (the chat format used by the Qwen base model)
prompt = """<|im_start|>system
You are an expert particle physicist.<|im_end|>
<|im_start|>user
What is the invariant mass of two photons with energies 62.5 GeV each, traveling back-to-back?<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=300, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
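
Since the card recommends verifying numerical outputs independently, here is a small reference calculation for the diphoton prompt above. For two massless particles the invariant mass satisfies m² = 2·E₁·E₂·(1 − cos θ) in natural units (c = 1), where θ is the opening angle. The function name is an illustrative choice.

```python
import math


def diphoton_invariant_mass(e1_gev: float, e2_gev: float, opening_angle_rad: float) -> float:
    """Invariant mass (GeV) of two massless particles: m^2 = 2 E1 E2 (1 - cos(theta))."""
    m_squared = 2.0 * e1_gev * e2_gev * (1.0 - math.cos(opening_angle_rad))
    return math.sqrt(m_squared)


# Back-to-back photons have an opening angle of pi radians.
mass = diphoton_invariant_mass(62.5, 62.5, math.pi)
print(f"{mass:.1f} GeV")
```

For the prompt's values this gives 125.0 GeV, which is the number to compare the model's final answer against.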

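4-bit Quantized Loading (Notebook)

# Notebook cells: the leading "!" runs each line as a shell command.
# Install a pinned Transformers release
!pip install -U transformers==5.5.0

# Install the remaining dependencies
!pip install -U accelerate bitsandbytes sentencepiece protobuf peft trl

# Optional
!pip install -U unsloth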

from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
)
import torch

model_name = "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x"

# Quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True
)

# Model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    dtype=torch.float16,
    trust_remote_code=True,
    quantization_config=bnb_config,
)


prompt = "Explain what a jet detector is in particle physics."

messages = [
    {"role": "user", "content": prompt}
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(
    text,
    return_tensors="pt"
).to(model.device)

# Generate
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=2048,
        temperature=0.5,
        do_sample=True,
        top_p=0.9,
    )

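# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True
)

print(response)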

vLLM Server (Recommended for Production)

# Install vLLM with ROCm support
pip install vllm --extra-index-url https://download.pytorch.org/whl/rocm7.0

# Launch server
vllm serve rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x \
  --dtype bfloat16 \
  --max-model-len 4096 \
  --port 8000

# Query the running server from Python with the OpenAI-compatible client (pip install openai)
from openai import OpenAI
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
response = client.chat.completions.create(
    model="rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x",
    messages=[{"role": "user", "content": "Explain the CMS detector architecture."}],
    max_tokens=500,
)
print(response.choices[0].message.content)

Limitations

Knowledge cutoff reflects training data (primarily pre-2025 papers)
May hallucinate specific numerical values; always verify against the PDG listings or pdgLive
Not trained for function-calling or tool-use tasks
Quantitative calculations: the reasoning approach is often correct, but strict exact-match scores are low on the small custom test set; verify numerical outputs independently
Limited coverage of very recent experimental results
Several planned benchmarks (GSM8K, HellaSwag, TruthfulQA) could not run due to harness infrastructure issues; results will be added in a follow-up evaluation

Citation

@misc{hep-agent-mi300x-2026,
  title        = {HEP-Agent: Full Fine-Tuning of Qwen/Qwen3.5-9B on High Energy Physics Data},
  author       = {Rathod, Rajveer},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x}},
  note         = {Fine-tuned on AMD MI300X (ROCm 7.0) using Unsloth acceleration}
}

License

Apache License 2.0.

Base model weights are subject to their own license; see the Qwen/Qwen3.5-9B license.
