LoRA Adapter: LLaMA 3.2 3B Fine-Tuned on Medical Chain-of-Thought Reasoning
This repository contains a LoRA adapter for the LLaMA 3.2 3B Instruct model (4-bit, Unsloth), trained with parameter-efficient supervised fine-tuning on a medical Chain-of-Thought (CoT) dataset. The goal is for the model to generate step-by-step medical reasoning and structured responses to clinical queries.
Model Details
- Base Model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
- Adapter Type: LoRA (Low-Rank Adaptation)
- Fine-tuned by: Adifa Jahangir
- Dataset: easonlai/medical-common-diseases-reasoning-SFT
- Task: Medical Chain-of-Thought Reasoning (step-by-step clinical reasoning and response generation)
- Language: English, with some Chinese (see dataset)
- Frameworks: Unsloth, PEFT v0.14.0, Transformers
- License: MIT (as listed for the dataset); the Llama 3.2 base model is governed by Meta's Llama 3.2 Community License
Uses
Direct Use
- Medical QA: Generate step-by-step reasoning and final answers for medical questions.
- Educational: Assist in medical education by providing structured clinical reasoning.
Downstream Use
- Integrate into medical chatbots or virtual assistants for healthcare.
- Use as a base for further fine-tuning on other medical reasoning datasets.
Out-of-Scope Use
- Not for real-time clinical decision-making or diagnosis without human oversight.
- Not suitable for non-medical domains.
Bias, Risks, and Limitations
- The model may reflect biases present in the training data.
- Not a substitute for professional medical advice.
- May generate plausible-sounding but incorrect or unsafe medical reasoning.
Recommendations
- Always have outputs reviewed by qualified medical professionals.
- Do not use for critical or emergency medical decisions.
How to Use
from unsloth import FastLanguageModel

# Load the 4-bit base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    load_in_4bit=True,
    device_map="auto",
)

# Attach the LoRA adapter from this repository
model.load_adapter("adifa-jahangir/lora-medical-cot")

# Switch Unsloth to inference mode before generating
FastLanguageModel.for_inference(model)

# Example inference; the prompt ends with <think> to start the reasoning trace
prompt = "A 25-year-old male presents with fever and cough. <think>"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
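The example prompt ends with `<think>`, which suggests the generated text holds a reasoning trace followed by a final answer. The helper below is a minimal sketch for separating the two. It assumes the model closes the trace with `</think>`, which this card does not confirm, and `split_reasoning` is a hypothetical name, not part of Unsloth or PEFT.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split decoded output into (reasoning, answer).

    Assumes the reasoning trace is wrapped in <think>...</think>;
    adjust the tags if the adapter emits a different format.
    """
    _, found_open, after_open = text.partition(open_tag)
    if not found_open:
        # No reasoning tag at all: treat the whole text as the answer
        return "", text.strip()
    reasoning, found_close, answer = after_open.partition(close_tag)
    if not found_close:
        # Generation stopped before the closing tag: no final answer yet
        return reasoning.strip(), ""
    return reasoning.strip(), answer.strip()


decoded = "Fever and cough. <think> Consider infection first. </think> Likely viral URI."
reasoning, answer = split_reasoning(decoded)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If the reasoning comes back without an answer, the output was probably truncated, and raising `max_new_tokens` above 128 is one thing to try.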
Training Details
- Dataset: easonlai/medical-common-diseases-reasoning-SFT
- Validation Split: 100 samples held out for validation; the rest used for training
- LoRA Config: r=16, lora_alpha=32, lora_dropout=0.05, bias="none"
- Batch Size: 8
- Learning Rate: 2e-4
- Epochs: 3
- Logging: wandb.ai/adifajahangir99/huggingface
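To make the LoRA settings above concrete, the sketch below works out what `r=16` and `lora_alpha=32` imply for a single weight matrix. LoRA adds two low-rank factors, so an adapted `d_out x d_in` layer gains `r * (d_in + d_out)` trainable parameters, and the update is scaled by `lora_alpha / r`. The 3072 x 3072 layer shape is an illustrative assumption, not a value taken from this card, and the helper names are hypothetical.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    The update is B @ A with A of shape (r, d_in) and B of shape (d_out, r).
    """
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Factor applied to the low-rank update: W + (lora_alpha / r) * B @ A."""
    return lora_alpha / r


# Values from the training configuration above
R, LORA_ALPHA = 16, 32

# Assumed layer shape, for illustration only
D_IN = D_OUT = 3072

added = lora_param_count(D_IN, D_OUT, R)
full = D_IN * D_OUT
print(f"scaling factor: {lora_scaling(LORA_ALPHA, R)}")
print(f"LoRA params per layer: {added} ({added / full:.2%} of the full matrix)")
```

Under these assumptions each adapted matrix trains roughly 1% as many parameters as full fine-tuning of that matrix would.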
Evaluation
- Metric: ROUGE-L
- Baseline ROUGE-L: 0.0
- After Fine-Tuning ROUGE-L: 0.005
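For readers unfamiliar with the metric, ROUGE-L scores a candidate against a reference through their longest common subsequence (LCS) of tokens. The sketch below is a simplified pure-Python version that assumes lowercased whitespace tokenization and the F1 variant. The card does not say which ROUGE implementation or settings produced the scores above, so this illustrates the metric and does not reproduce the evaluation.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 with naive whitespace tokenization (an assumption)."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("the patient has a fever", "patient has fever")
print(f"ROUGE-L F1: {score:.3f}")
```

Scores range from 0 (no shared tokens in order) to 1 (identical token sequences).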
Environmental Impact
- Hardware Type: Kaggle GPU, e.g., NVIDIA Tesla T4
- Hours used: 4-5
- Cloud Provider: Kaggle
- Compute Region: us-central1
- Carbon Emitted: Estimate with https://mlco2.github.io/impact#compute
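Because the carbon figure is left to the calculator, a rough back-of-the-envelope estimate can be sketched as energy (power times hours) multiplied by the grid's carbon intensity. The numbers below are assumptions, not measurements: 70 W is the T4's rated board power and assumes full load for the whole run, and the carbon intensity is a placeholder to replace with the value for the actual region.

```python
def estimate_emissions_kg(power_watts, hours, kg_co2_per_kwh):
    """Rough CO2-equivalent estimate: energy in kWh times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


T4_POWER_WATTS = 70          # rated board power; assumes full load throughout
PLACEHOLDER_INTENSITY = 0.5  # hypothetical kg CO2eq per kWh; use your region's value

# The card reports 4-5 GPU hours, so estimate both ends of the range
for hours in (4, 5):
    kg = estimate_emissions_kg(T4_POWER_WATTS, hours, PLACEHOLDER_INTENSITY)
    print(f"{hours} h: about {kg:.3f} kg CO2eq (placeholder intensity)")
```

This ignores CPU, memory, and datacenter overhead, so treat it as a lower-bound sketch and prefer the linked calculator for reporting.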
Citation
If you use this adapter, please cite the base model and dataset as well as this repository.
Model Card Contact
For questions, contact Adifa Jahangir (adifajahangir99@gmail.com).
This model card was generated as part of a parameter-efficient supervised fine-tuning project for medical chain-of-thought reasoning.