hep-agent-qwen-qwen3-5-9b-mi300x
HEP domain expert — Fine-tuned Qwen/Qwen3.5-9B on High Energy Physics data.
This model is a full fine-tune of Qwen/Qwen3.5-9B on a curated corpus of High Energy Physics literature, experimental data, and synthetic Q&A. It was trained on a single AMD MI300X GPU (192 GB HBM3, ROCm 7.0).
Model Overview
| Property | Value |
|---|---|
| Base model | Qwen/Qwen3.5-9B |
| Fine-tuning type | Full fine-tune (NOT LoRA) |
| Hardware | 1× AMD MI300X (192 GB HBM3, ROCm 7.0) |
| Precision | bfloat16 |
| Context length | 2048 tokens |
| Training data | ~50K–100K HEP examples |
| Optimizer | AdamW 8-bit (bitsandbytes) |
Evaluation Results
All scores are accuracy (%) unless noted, and each is compared against the unmodified Qwen/Qwen3.5-9B base model.
General Benchmarks
| Benchmark | Shots | Metric | Base (%) | Fine-tuned (%) | Δ |
|---|---|---|---|---|---|
| MMLU Full | 5 | acc | 69.8 | 70.6 | +0.7 |
| ARC-Challenge | 25 | acc_norm | 71.1 | 71.8 | +0.7 |
No significant regressions were detected (threshold: −3 pp).
MMLU Physics Subsets (extracted from MMLU Full run)
| Subset | Base (%) | Fine-tuned (%) | Δ |
|---|---|---|---|
| Conceptual Physics | 77.0 | 77.9 | +0.9 |
| College Physics | 57.8 | 58.8 | +1.0 |
| High School Physics | 60.9 | 62.9 | +2.0 |
| Astronomy | 80.3 | 80.9 | +0.7 |
| Physics avg | 69.0 | 70.1 | +1.1 |
MMLU STEM aggregate: Base 68.3% → Fine-tuned 68.7% (+0.4 pp).
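The physics average appears to be the unweighted mean of the four subset scores. The sketch below recomputes it from the rounded values in the table. Deltas recomputed this way can differ from the reported ones by 0.1 pp (Astronomy, for example), presumably because the reported deltas were taken from unrounded scores; that explanation is an assumption, not something the evaluation logs confirm.

```python
# Rounded scores copied from the MMLU physics table: (base, fine-tuned).
subsets = {
    "Conceptual Physics": (77.0, 77.9),
    "College Physics": (57.8, 58.8),
    "High School Physics": (60.9, 62.9),
    "Astronomy": (80.3, 80.9),
}

# Unweighted mean over the four subsets.
base_avg = sum(base for base, _ in subsets.values()) / len(subsets)
tuned_avg = sum(tuned for _, tuned in subsets.values()) / len(subsets)

for name, (base, tuned) in subsets.items():
    print(f"{name:22s} {base:5.1f} -> {tuned:5.1f}  delta {tuned - base:+.1f}")
print(f"{'Physics avg':22s} {base_avg:5.1f} -> {tuned_avg:5.1f}")
```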
Custom Physics Calculations (8 problems)
| Category | Base (%) | Fine-tuned (%) |
|---|---|---|
| Four-vectors | 50.0 | 50.0 |
| Invariant mass | 0.0 | 0.0 |
| Decay kinematics | 0.0 | 0.0 |
| Branching ratios | 0.0 | 0.0 |
| Kinematics (pT/η) | 0.0 | 0.0 |
| Overall (exact match) | 12.5 | 12.5 |
Note: This custom benchmark covers only 8 problems and uses strict exact-match numeric scoring. Both models demonstrate correct reasoning in the response text but often fail the final answer-extraction step (e.g., outputting an intermediate value rather than the final result in the expected units). A lenient scoring pass would yield higher effective accuracy. The benchmark will be expanded in a future evaluation run.
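To make the difference between strict and lenient scoring concrete, here is a minimal sketch. It is not the scorer used for the table above: the function names, the regular expression, and the tolerances are all hypothetical, and "strict" is modelled here as scoring only the last number in the response.

```python
import math
import re

# Matches plain and scientific-notation numbers such as 125, 62.5, or 1.2e-3.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")

def extract_numbers(text):
    """Return every numeric literal found in the response, in order."""
    return [float(match) for match in NUMBER.findall(text)]

def strict_match(response, expected, rel_tol=1e-3):
    """Strict pass (assumed): only the last number in the response is scored."""
    numbers = extract_numbers(response)
    return bool(numbers) and math.isclose(numbers[-1], expected, rel_tol=rel_tol)

def lenient_match(response, expected, rel_tol=0.01):
    """Lenient pass (assumed): any number within tolerance counts as correct."""
    return any(
        math.isclose(number, expected, rel_tol=rel_tol)
        for number in extract_numbers(response)
    )

# The correct value appears, but an intermediate quantity comes last.
response = "The invariant mass is 125 GeV, using E = 62.5 GeV per photon."
print(strict_match(response, 125.0))   # False
print(lenient_match(response, 125.0))  # True
```

A response like this one contains the right answer yet fails the strict pass, which is the kind of extraction failure the note describes.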
Benchmarks Not Yet Available
The following benchmarks encountered infrastructure errors during this evaluation run and will be included in a future update:
| Benchmark | Intended Purpose | Blocker |
|---|---|---|
| SciQ | Science Q&A | HF dataset URI format incompatibility |
| GSM8K | Math reasoning | HF dataset URI format incompatibility |
| TruthfulQA mc1/mc2 | Hallucination resistance | HF dataset URI format incompatibility |
| HellaSwag | Commonsense forgetting check | HF dataset URI format incompatibility |
| IFEval | Instruction following | Missing immutabledict package |
| Minerva MATH | Advanced math | Missing antlr4 package (LaTeX parsing) |
| BBQ | Bias evaluation | Task not registered in harness version |
| HEP-QA (held-out) | Domain Q&A | Evaluation module path error |
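Two of the blockers above are missing Python packages, which can be caught before a long evaluation run starts. A minimal preflight sketch follows; the helper name is hypothetical, and the import names `immutabledict` and `antlr4` are assumed from the table rather than taken from the harness configuration.

```python
from importlib.util import find_spec

# Import names assumed from the blockers listed in the table; adjust as needed.
REQUIRED_MODULES = ["immutabledict", "antlr4"]

def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Install before running the evaluation:", ", ".join(missing))
else:
    print("All required modules are importable.")
```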
Intended Use
This model is designed for:
- Answering questions about experimental and theoretical particle physics
- Explaining detector physics, collision analysis, and data analysis
- Solving quantitative physics problems (kinematics, cross-sections, decay calculations)
- Summarizing HEP papers and explaining their methodology
Not intended for:
- Real-time experimental analysis or ROOT file processing
- Safety-critical applications
- Medical or regulatory decisions
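As an illustration of the kinematics problems listed under intended use, the sketch below converts Cartesian momentum components into transverse momentum, pseudorapidity, and azimuthal angle. The function name is hypothetical, and the code is a reference calculation for checking model answers, not part of the model.

```python
import math

def pt_eta_phi(px, py, pz):
    """Convert momentum components (same units) to (pT, eta, phi).

    pT  = sqrt(px^2 + py^2)
    eta = asinh(pz / pT), equivalent to -ln(tan(theta / 2))
    phi = atan2(py, px)
    """
    pt = math.hypot(px, py)
    if pt == 0.0:
        raise ValueError("eta is undefined for a particle along the beam axis")
    eta = math.asinh(pz / pt)
    phi = math.atan2(py, px)
    return pt, eta, phi

pt, eta, phi = pt_eta_phi(3.0, 4.0, 0.0)
print(pt, eta)  # 5.0 0.0
```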
Training Data
| Source | Volume | Description |
|---|---|---|
| arXiv hep-ph / hep-ex | ~10K papers → Q&A | Theory, phenomenology, experimental |
| INSPIRE-HEP | ~15K records | Paper summaries, detector data |
| CMS Open Data | ~5K examples | Collision analysis, ROOT metadata |
| PDG (Particle Data Group) | ~3K entries | Particle properties, decay modes |
| Synthetic Q&A | ~20K generated | Kinematics, formulas, calculations |
Training Configuration
| Parameter | Value |
|---|---|
| learning_rate | 8e-06 |
| num_epochs | 2 |
| batch_size (effective) | 32 |
| sequence_length | 4096 |
| optimizer | adamw_8bit |
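For a rough sense of scale, the configuration above can be combined with the nominal volumes in the Training Data table to estimate the number of optimizer steps. This is a back-of-the-envelope sketch only: it assumes each listed paper, record, or entry yields exactly one training example and that nothing is filtered or packed, none of which the card states.

```python
import math

# Nominal "~" volumes from the Training Data table, one example per item (assumption).
source_examples = {
    "arXiv hep-ph / hep-ex": 10_000,
    "INSPIRE-HEP": 15_000,
    "CMS Open Data": 5_000,
    "PDG": 3_000,
    "Synthetic Q&A": 20_000,
}

effective_batch_size = 32  # from the Training Configuration table
num_epochs = 2             # from the Training Configuration table

total_examples = sum(source_examples.values())
steps_per_epoch = math.ceil(total_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(f"{total_examples} examples -> about {total_steps} optimizer steps")
```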
Usage
Basic Generation
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# ChatML format (for Qwen base)
prompt = """<|im_start|>system
You are an expert particle physicist.<|im_end|>
<|im_start|>user
What is the invariant mass of two photons with energies 62.5 GeV each, traveling back-to-back?<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding; gradients are not needed for inference
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=300, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
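The prompt above has a closed-form answer that is handy for checking the model's output. A minimal reference calculation, assuming natural units (c = 1) and four-vectors written as (E, px, py, pz); the function name is hypothetical:

```python
import math

def invariant_mass(*four_vectors):
    """Invariant mass of a system of (E, px, py, pz) four-vectors, with c = 1."""
    energy = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    mass_squared = energy**2 - px**2 - py**2 - pz**2
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mass_squared, 0.0))

# Two massless photons of 62.5 GeV each, back-to-back along the z axis.
photon_1 = (62.5, 0.0, 0.0, 62.5)
photon_2 = (62.5, 0.0, 0.0, -62.5)
print(invariant_mass(photon_1, photon_2))  # 125.0 (GeV)
```

Because the momenta cancel, the invariant mass equals the total energy, 125 GeV.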
4-bit Quantized Loading (bitsandbytes)
# Install a pinned Transformers release (notebook cell syntax; drop the leading "!" in a shell)
!pip install -U transformers==5.5.0
# Install remaining deps
!pip install -U accelerate bitsandbytes sentencepiece protobuf peft trl
# Optional
!pip install -U unsloth
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
)
import torch

model_name = "rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x"

# Quantization config: 4-bit NF4 weights with double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
)

# Model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    dtype=torch.float16,
    trust_remote_code=True,
    quantization_config=bnb_config,
)

prompt = "Explain what a jet detector is in particle physics."
messages = [
    {"role": "user", "content": prompt}
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(
    text,
    return_tensors="pt",
).to(model.device)

# Generate with nucleus sampling
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=2048,
        temperature=0.5,
        do_sample=True,
        top_p=0.9,
    )

# Decode the full sequence (prompt followed by the completion)
response = tokenizer.decode(
    outputs[0],
    skip_special_tokens=True,
)
print(response)
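The motivation for 4-bit loading is weight memory. A rough estimate, assuming a nominal 9 billion parameters (read off the "9B" in the model name) and ignoring quantization constants, layers left unquantized, activations, and the KV cache:

```python
NOMINAL_PARAMS = 9e9  # assumption: taken from "9B" in the model name

def weight_gib(num_params, bytes_per_param):
    """Approximate weight memory in GiB for a given storage width."""
    return num_params * bytes_per_param / 1024**3

for label, width in [("bfloat16", 2.0), ("4-bit NF4", 0.5)]:
    print(f"{label:10s} ~{weight_gib(NOMINAL_PARAMS, width):.1f} GiB")
```

Actual usage will be higher than these figures, so treat them as a lower bound when choosing a GPU.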
vLLM Server (Recommended for Production)
# Install vLLM with ROCm support
pip install vllm --extra-index-url https://download.pytorch.org/whl/rocm7.0
# Launch server
vllm serve rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x \
    --dtype bfloat16 \
    --max-model-len 4096 \
    --port 8000
# Python client: query the server through its OpenAI-compatible API
from openai import OpenAI

client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
response = client.chat.completions.create(
    model="rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x",
    messages=[{"role": "user", "content": "Explain the CMS detector architecture."}],
    max_tokens=500,
)
print(response.choices[0].message.content)
Limitations
- Knowledge cutoff reflects training data (primarily pre-2025 papers)
- May hallucinate specific numerical values; always verify against PDG/PDG Live
- Not trained for function-calling or tool-use tasks
- Quantitative calculations: correct reasoning approach observed but strict exact-match scores are low on small test sets; verify numerical outputs independently
- Limited coverage of very recent experimental results
- Several planned benchmarks (GSM8K, HellaSwag, TruthfulQA) could not run due to harness infrastructure issues; results will be added in a follow-up evaluation
Citation
@misc{hep-agent-mi300x-2026,
title = {HEP-Agent: Full Fine-Tuning of Qwen/Qwen3.5-9B on High Energy Physics Data},
author = {Rathod, Rajveer},
year = {2026},
howpublished = {\url{https://huggingface.co/rajveer43/hep-agent-qwen-qwen3-5-9b-mi300x}},
note = {Fine-tuned on AMD MI300X (ROCm 7.0) using Unsloth acceleration}
}
License
Apache License 2.0.
Base model weights are subject to their own license: Qwen/Qwen3.5-9B License