LoRA Adapter: LLaMA 3.2 3B Fine-Tuned on Medical Chain-of-Thought Reasoning

This repository contains a LoRA adapter for the 4-bit Unsloth build of the LLaMA 3.2 3B Instruct model, trained with parameter-efficient supervised fine-tuning on a medical Chain-of-Thought (CoT) dataset. The goal is for the model to produce step-by-step medical reasoning and structured responses to clinical queries.


Model Details


Uses

Direct Use

  • Medical QA: Generate step-by-step reasoning and final answers for medical questions.
  • Educational: Assist in medical education by providing structured clinical reasoning.

Downstream Use

  • Integrate into medical chatbots or virtual assistants for healthcare.
  • Use as a base for further fine-tuning on other medical reasoning datasets.

Out-of-Scope Use

  • Not for real-time clinical decision-making or diagnosis without human oversight.
  • Not suitable for non-medical domains.

Bias, Risks, and Limitations

  • The model may reflect biases present in the training data.
  • Not a substitute for professional medical advice.
  • May generate plausible-sounding but incorrect or unsafe medical reasoning.

Recommendations

  • Always have outputs reviewed by qualified medical professionals.
  • Do not use for critical or emergency medical decisions.

How to Use

from unsloth import FastLanguageModel

# Load the 4-bit base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    load_in_4bit=True,
    device_map="auto",
)

# Attach the LoRA adapter from the Hugging Face Hub
model.load_adapter("adifa-jahangir/lora-medical-cot")

# Switch Unsloth to its inference mode before generating
FastLanguageModel.for_inference(model)

# Example inference: the prompt ends with an opening <think> tag
prompt = "A 25-year-old male presents with fever and cough. <think>"
# Move the inputs to the model's device instead of hard-coding .cuda()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
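
The decoded text contains the prompt, the reasoning, and the answer in one string. The following is a minimal sketch of one way to separate them. It assumes the model closes its reasoning with a literal `</think>` tag, which this card does not confirm; `split_reasoning` is a hypothetical helper, not part of Unsloth or of this repository.

```python
import re
from typing import Tuple

# Assumption: reasoning starts at "<think>" and ends at "</think>".
# If the closing tag never appears (for example, generation hit
# max_new_tokens first), everything after "<think>" counts as reasoning.
THINK_BLOCK = re.compile(r"<think>(.*?)(?:</think>|\Z)", re.DOTALL)


def split_reasoning(decoded: str) -> Tuple[str, str]:
    """Return (reasoning, answer) extracted from decoded model output.

    Text before "<think>" (the original question) is discarded.
    If no "<think>" tag is found, the whole text is treated as the answer.
    """
    match = THINK_BLOCK.search(decoded)
    if match is None:
        return "", decoded.strip()
    reasoning = match.group(1).strip()
    answer = decoded[match.end():].strip()
    return reasoning, answer
```

With the example above, `split_reasoning(tokenizer.decode(output[0], skip_special_tokens=True))` would return the reasoning and the final answer as separate strings. An empty answer suggests the generation was cut off before the reasoning finished, in which case a larger `max_new_tokens` may help.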

Training Details


Evaluation

  • Metric: ROUGE-L
  • Baseline ROUGE-L: 0.0
  • After Fine-Tuning ROUGE-L: 0.005
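
For readers unfamiliar with the metric, ROUGE-L scores a generated text by the longest common subsequence (LCS) of tokens it shares with a reference, on a scale from 0 to 1. The sketch below is an illustrative implementation of the F1 variant, not the evaluation code used for this adapter. The card does not state which ROUGE library, variant, or tokenizer produced the numbers above; the lowercasing and whitespace tokenization here are assumptions.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 between a reference and a candidate string.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, identical texts score 1.0 and texts with no tokens in common score 0.0.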

Environmental Impact


Citation

If you use this adapter, please cite the base model and dataset as well as this repository.


Model Card Contact

For questions, contact Adifa Jahangir (adifajahangir99@gmail.com).


This model card was generated as part of a parameter-efficient supervised fine-tuning project for medical chain-of-thought reasoning.
