Qwen2.5-7B-Breast-CRAG

Breast Cancer Specialized LLM (Full Fine-tuned / LoRA-ready)

Qwen2.5-7B-Breast-CRAG is a 7B-parameter large language model fully fine-tuned for breast cancer clinical consultation, developed as part of the Breast-CRAG system. It is optimized for human-like doctor–patient dialogue and professional breast cancer domain knowledge.

This model can be loaded directly or as a LoRA adapter via LLaMA Factory.

🔗 Links

HF Repo: https://huggingface.co/MaxinT23/Qwen2.5-7B-Breast-CRAG
Paper: https://link.springer.com/chapter/10.1007/978-3-031-95841-0_19
Code: https://github.com/Maxin-C/Breast-CRAG
Framework: LLaMA Factory

✨ Model Description

Base Model: Qwen2.5-7B-Instruct
Task: Breast cancer medical dialogue generation
Parameters: 7B
Language: Chinese
Training: Full fine-tuned + LoRA (PEFT) optimized
Purpose: Clinical consultation assistance (research only)
System: Core generator of Breast-CRAG (RAG-enhanced)

📊 Training Data

Curated breast cancer dialogues: 268K
Train split: 91K (30K MedDialog-BC + 61K Huatuo-BC)
Data source: MedDialog-CN, Huatuo-26M (filtered/cleaned)
Pipeline: Keyword filter → GPT-4o quality filter → dialogue summarization

⚙️ Training Hyperparameters

LoRA alpha: 32
LoRA rank: 16
Dropout: 0.1
Learning rate: 5e-5
Scheduler: cosine
Batch size: 2
Gradient accumulation: 8
Epochs: 8
GPU: RTX 3090 (24GB)
Training time: ~34.2 hours

🧪 Key Results

Dialogue (Humanization)

Outperforms similar-size open-source LLMs
Matches/exceeds GPT-4o on 70% of dialogue metrics

Exam (Specialization)

USMLE-BC: 81% accuracy (on par with GPT-4o)
Exam-BC Simple: 63% | Hard: 52%

🚀 Usage (LLaMA Factory LoRA Load)

1. Install

pip install "llamafactory[torch]" transformers peft

2. Load via LLaMA Factory

from llamafactory import load_model
from transformers import AutoTokenizer

model_name = "MaxinT23/Qwen2.5-7B-Breast-CRAG"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = load_model(model_name, adapter_name="lora", device="auto")

prompt = "请以乳腺癌专科医生身份，回答以下问题：乳腺癌术后多久可以恢复正常生活？"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.95, top_p=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

⚠️ Limitations

Text-only (no multimodal support)
For research purposes only, not for direct clinical use
Weak performance on multiple-choice exams
Best used with Breast-CRAG Retriever + 1M knowledge chunks

📚 Citation

Chen, Z., Wang, Q., Liu, J., Sun, Y., Zheng, H., Li, H., Duan, H., Lu, X. (2025). Breast-CRAG: A Breast Cancer Large Language Model Leveraging Retrieval-Augmented Generation. In: Artificial Intelligence in Medicine. AIME 2025. Lecture Notes in Computer Science, vol 15735. Springer, Cham. https://doi.org/10.1007/978-3-031-95841-0_19

🙏 Acknowledgments

Supported by the National Key Technologies R&D Program of China (2022YFF1203002) and Sir Run Run Shaw Hospital.

Downloads last month: 18

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for MaxinT23/Qwen2.5-7B-Breast-CRAG

Base model

Qwen/Qwen2.5-7B

Finetuned

Qwen/Qwen2.5-7B-Instruct

Adapter

(2145)

this model