LoRA Adapter: LLaMA 3.2 3B Fine-Tuned on Medical Chain-of-Thought Reasoning

This repository contains a LoRA adapter for the LLaMA 3.2 3B Instruct model (4-bit, Unsloth), trained with parameter-efficient supervised fine-tuning on a medical Chain-of-Thought (CoT) dataset. The goal is for the model to produce step-by-step medical reasoning followed by a structured response to clinical queries.

Model Details

Base Model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
Adapter Type: LoRA (Low-Rank Adaptation)
Fine-tuned by: Adifa Jahangir
Dataset: easonlai/medical-common-diseases-reasoning-SFT
Task: Medical Chain-of-Thought Reasoning (step-by-step clinical reasoning and response generation)
Language: English, some Chinese (see dataset)
Frameworks: Unsloth, PEFT v0.14.0, Transformers
License: MIT (dataset); the base model is distributed under the Llama 3.2 Community License

Uses

Direct Use

Medical QA: Generate step-by-step reasoning and final answers for medical questions.
Educational: Assist in medical education by providing structured clinical reasoning.

Downstream Use

Integrate into medical chatbots or virtual assistants for healthcare.
Use as a base for further fine-tuning on other medical reasoning datasets.

Out-of-Scope Use

Not for real-time clinical decision-making or diagnosis without human oversight.
Not suitable for non-medical domains.

Bias, Risks, and Limitations

The model may reflect biases present in the training data.
Not a substitute for professional medical advice.
May generate plausible-sounding but incorrect or unsafe medical reasoning.

Recommendations

Always have outputs reviewed by qualified medical professionals.
Do not use for critical or emergency medical decisions.

How to Use

from unsloth import FastLanguageModel

# Load the 4-bit quantized base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    load_in_4bit=True,
    device_map="auto",
)

# Attach the LoRA adapter from this repository
model.load_adapter("adifa-jahangir/lora-medical-cot")

# Example inference: the trailing <think> cues step-by-step reasoning
prompt = "A 25-year-old male presents with fever and cough. <think>"

# Move the inputs to whichever device the model was placed on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
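
Because the example prompt ends with <think>, the decoded text may need to be split into a reasoning part and a final answer. The sketch below assumes the model closes its reasoning with a </think> tag, which this card does not confirm. The helper name split_reasoning and the sample string are hypothetical, for illustration only.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split decoded text into (reasoning, answer).

    Assumes the reasoning is wrapped in open_tag ... close_tag;
    that tag convention is an assumption, not documented behavior.
    """
    _, found_open, after_open = text.partition(open_tag)
    if not found_open:
        # No reasoning marker at all: treat everything as the answer.
        return "", text.strip()
    reasoning, found_close, answer = after_open.partition(close_tag)
    if not found_close:
        # Generation stopped (e.g. at max_new_tokens) before the closing tag.
        return reasoning.strip(), ""
    return reasoning.strip(), answer.strip()


# Hypothetical decoded output, not produced by the actual model.
sample = "Question text. <think>step-by-step reasoning here</think> final answer here"
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

With max_new_tokens=128 the reasoning can be cut off before any closing tag appears, in which case the helper returns an empty answer. Raising the token budget is one way to check whether that is happening.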

Training Details

Dataset: easonlai/medical-common-diseases-reasoning-SFT
Validation Split: 100 samples held out for validation; the remainder used for training
LoRA Config: r=16, lora_alpha=32, lora_dropout=0.05, bias="none"
Batch Size: 8
Learning Rate: 2e-4
Epochs: 3
Logging: wandb.ai/adifajahangir99/huggingface
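
To make the LoRA settings above concrete, the following sketch works through the arithmetic they imply. Standard LoRA scales the low-rank update by lora_alpha / r and adds r * (d_in + d_out) trainable parameters per adapted weight matrix. The 3072 x 3072 matrix shape is an assumption used for illustration; this card does not list which modules were adapted.

```python
def lora_scaling(r, lora_alpha):
    """Scaling factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r


def lora_trainable_params(d_in, d_out, r):
    """Trainable parameters added for one adapted weight of shape (d_out, d_in).

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * d_in + d_out * r


R = 16           # from the LoRA config above
LORA_ALPHA = 32  # from the LoRA config above
HIDDEN = 3072    # assumed projection size, for illustration only

print("scaling:", lora_scaling(R, LORA_ALPHA))
print("params per 3072x3072 matrix:", lora_trainable_params(HIDDEN, HIDDEN, R))
```

Under that assumed shape, each adapted matrix adds roughly 98 thousand trainable parameters, against about 9.4 million in the frozen matrix itself.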

Evaluation

Metric: ROUGE-L
Baseline ROUGE-L: 0.0
After Fine-Tuning ROUGE-L: 0.005
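
For readers unfamiliar with the metric, here is a minimal sketch of ROUGE-L as an F1 score over the longest common subsequence (LCS) of whitespace tokens. The card does not say which ROUGE implementation or tokenization produced the scores above, so this illustrates the metric itself and does not reproduce the evaluation.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 on lowercased whitespace tokens (a simplifying assumption)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


print(rouge_l_f1("the cat sat", "the cat sat"))
print(rouge_l_f1("a b c d", "a c e"))
```

A score near zero means the generated text shares almost no in-order tokens with the reference. One possible cause is a format mismatch between generations and references rather than wrong content, so comparing a few raw pairs by hand is a useful sanity check.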

Environmental Impact

Hardware Type: Kaggle GPU, e.g., NVIDIA Tesla T4
Hours used: 4-5
Cloud Provider: Kaggle
Compute Region: us-central1
Carbon Emitted: Not measured; can be estimated with the ML CO2 Impact calculator (https://mlco2.github.io/impact#compute)
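
A rough estimate follows the usual pattern of energy used multiplied by grid carbon intensity. The sketch below assumes a single NVIDIA T4 running at its rated 70 W board power for the whole 4-5 hours, and it uses a placeholder grid intensity that is purely hypothetical. Substitute the real figure for the compute region before quoting any number.

```python
def estimate_co2_kg(gpu_power_watts, hours, grid_kg_co2_per_kwh):
    """Estimate emissions as power (kW) x time (h) x grid intensity (kg CO2/kWh)."""
    energy_kwh = gpu_power_watts / 1000 * hours
    return energy_kwh * grid_kg_co2_per_kwh


T4_POWER_WATTS = 70.0        # rated board power; actual draw is assumed, not measured
PLACEHOLDER_INTENSITY = 0.5  # hypothetical kg CO2 per kWh, NOT a figure for us-central1

for hours in (4, 5):
    kg = estimate_co2_kg(T4_POWER_WATTS, hours, PLACEHOLDER_INTENSITY)
    print(f"{hours} h -> {kg:.3f} kg CO2 (placeholder intensity)")
```

This ignores CPU, memory, and datacenter overhead, so it is best read as a lower-bound sketch.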

Citation

If you use this adapter, please cite the base model and dataset as well as this repository.

Model Card Contact

For questions, contact Adifa Jahangir (adifajahangir99@gmail.com).

This model card was generated as part of a parameter-efficient supervised fine-tuning project for medical chain-of-thought reasoning.
