MedGemma 1.5 4B SFT LoRA — Indian Medicines

A QLoRA/LoRA fine-tuned version of Google MedGemma 1.5 4B (instruction-tuned) for Indian medicine–centric question answering. The model is trained on text-only Indian medicine metadata (uses, side effects, drug interactions, composition, manufacturer, price) and is intended for non-commercial research and educational use only.

Model description

Architecture: Based on google/medgemma-1.5-4b-it (MedGemma 1.5 4B instruction-tuned), with LoRA adapters merged into the full model.
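Training: Supervised fine-tuning (SFT) with QLoRA (4-bit quantization + LoRA). Only the LoRA adapter weights and the modules listed in modules_to_save (lm_head, embed_tokens) were trained; the rest of the base model stayed frozen. The merged model is uploaded for direct use.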
Modalities: Text-only (no images). Trained and used in chat format (user/assistant turns).
Target use: Answering questions about Indian medicines (e.g., uses, side effects, drug interactions, salt composition, manufacturer, approximate price in India).

Training data

Dataset: Indian Medicine Data (Kaggle, by mohneesh7).
Source: https://www.kaggle.com/datasets/mohneesh7/indian-medicine-data?resource=download
Content: Indian medicine metadata CSV with columns: sub_category, product_name, salt_composition, product_price, product_manufactured, medicine_desc, side_effects, drug_interactions.
Preprocessing: Rows are converted into instruction–response pairs (e.g., “What is [product_name] used for and what are its important details for patients in India?” → answer built from description, side effects, interactions, composition, manufacturer, price). Train/validation split: 90% / 10%.
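The row-to-pair conversion described above might look roughly like the following. This is a minimal sketch, not the author's preprocessing script: the column names and the question template come from the dataset description, but the answer layout, the helper names `row_to_messages` and `split_rows`, the shuffle seed, and the placeholder row are all assumptions made for illustration.

```python
import random

# Question template quoted in the preprocessing description.
QUESTION_TEMPLATE = (
    "What is {product_name} used for and what are its important "
    "details for patients in India?"
)


def row_to_messages(row):
    """Turn one CSV row (a dict keyed by column name) into a chat pair."""
    # The exact answer layout is hypothetical; the card only says which
    # fields the answer is built from.
    answer = "\n".join([
        f"Description: {row['medicine_desc']}",
        f"Side effects: {row['side_effects']}",
        f"Drug interactions: {row['drug_interactions']}",
        f"Composition: {row['salt_composition']}",
        f"Manufacturer: {row['product_manufactured']}",
        f"Price: {row['product_price']}",
    ])
    question = QUESTION_TEMPLATE.format(product_name=row["product_name"])
    return {
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }


def split_rows(rows, val_fraction=0.1, seed=42):
    """Shuffle and split into (train, validation); 90/10 by default."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = round(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val]


# Placeholder row with made-up values, only to show the shape of the data.
example_row = {
    "sub_category": "<sub_category>",
    "product_name": "ExampleMed 500 Tablet",
    "salt_composition": "<salt_composition>",
    "product_price": "<product_price>",
    "product_manufactured": "<manufacturer>",
    "medicine_desc": "<medicine_desc>",
    "side_effects": "<side_effects>",
    "drug_interactions": "<drug_interactions>",
}
example = row_to_messages(example_row)
```

In practice the rows would come from something like `csv.DictReader` over the Kaggle CSV, and the resulting `messages` lists can be fed to a chat-template-aware SFT trainer.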

Training procedure

Key hyperparameters:
LoRA: r=16, lora_alpha=16, lora_dropout=0.05, target_modules="all-linear", modules_to_save=["lm_head", "embed_tokens"]
Quantization: QLoRA 4-bit NF4, double quantization, bfloat16
Optimizer: AdamW (fused), learning rate 2e-4, linear LR schedule, warmup ratio 0.03, max grad norm 0.3
Other: gradient checkpointing, gradient_accumulation_steps=4, bf16=True
Monitoring: Weights & Biases run.
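As a rough illustration, the hyperparameters listed above map onto PEFT and Transformers configuration objects as sketched below. This is a sketch under assumptions rather than the actual training script: `task_type="CAUSAL_LM"` is not stated in the card, "bfloat16" is read here as the 4-bit compute dtype, and `build_configs` is a hypothetical helper name. The library imports are kept inside the function so the dictionaries can be inspected without `peft` or `transformers` installed.

```python
# Keyword arguments mirroring the hyperparameters listed in the card.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "modules_to_save": ["lm_head", "embed_tokens"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    # Assumption: "bfloat16" in the card refers to the compute dtype.
    "bnb_4bit_compute_dtype": "bfloat16",
}

# Field names as used by transformers.TrainingArguments (and trl.SFTConfig).
TRAINING_KWARGS = {
    "learning_rate": 2e-4,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.03,
    "max_grad_norm": 0.3,
    "optim": "adamw_torch_fused",
    "gradient_checkpointing": True,
    "gradient_accumulation_steps": 4,
    "bf16": True,
}


def build_configs():
    """Create the PEFT and bitsandbytes config objects (needs both libraries)."""
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    lora_config = LoraConfig(**LORA_KWARGS)
    quant_config = BitsAndBytesConfig(**QUANT_KWARGS)
    return lora_config, quant_config
```

With the libraries installed, `quant_config` would typically be passed as `quantization_config=` when loading the base model, and `lora_config` as the PEFT configuration of the SFT trainer.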

Training metrics (W&B run)

Config	Value
learning_rate	2e-4
num_train_epochs	1
per_device_train_batch_size	4
gradient_accumulation_steps	4
eval_steps	50
logging_steps	50

Metric	Value
Train loss (final)	0.240
Eval loss (final)	0.0257
Train token accuracy	99.45%
Eval token accuracy	99.31%
Total steps	11,003
Train runtime	~12.7 h
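A few derived quantities help put these numbers in context. The short calculation below is a sketch that assumes "Total steps" counts optimizer steps on a single device (the card does not state the GPU count) and that the reported losses are mean per-token cross-entropy in nats; neither assumption is confirmed by the card.

```python
import math

# Values copied from the tables above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
total_steps = 11_003
final_train_loss = 0.240
final_eval_loss = 0.0257

# Sequences consumed per optimizer step on one device.
effective_batch_per_device = per_device_train_batch_size * gradient_accumulation_steps

# Assumption: single device, and "Total steps" counts optimizer steps.
sequences_seen = total_steps * effective_batch_per_device

# Perplexity is exp(mean cross-entropy), assuming the loss is in nats.
train_perplexity = math.exp(final_train_loss)
eval_perplexity = math.exp(final_eval_loss)

print(effective_batch_per_device)   # 16
print(sequences_seen)               # 176048
print(f"{train_perplexity:.3f}")    # 1.271
print(f"{eval_perplexity:.3f}")     # 1.026
```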

Evaluation

Validation ran during training on the held-out 10% split, every 50 steps (eval_steps=50).

How to use

from transformers import AutoModelForImageTextToText, AutoProcessor

# Merged checkpoint (base model + LoRA weights), so no separate adapter loading is needed.
model_id = "prapaa/medgemma-1.5-4b-it-sft-lora-indian-meds"
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

# Text-only chat prompt, phrased like the training questions.
messages = [
    {"role": "user", "content": "What is Paracetamol used for and what are its important details for patients in India?"}
]
# Render the chat template to a string, then tokenize it.
text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=text, return_tensors="pt").to(model.device)
# Sample a reply, then decode only the newly generated tokens (skip the prompt).
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
response = processor.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Intended uses and limitations

The model is trained only on the Indian medicine metadata CSV; it is not a general-purpose medical model and can be wrong or incomplete.
Do not use outputs for clinical or treatment decisions. Always rely on qualified healthcare providers and official product information.
Possible biases and errors from the dataset and base model may remain. Use only for non-commercial research and education.

License

Non-commercial use only. This model strictly prohibits any commercial use. You may use, copy, and modify the model only for personal non-commercial use, academic and scientific research, and educational purposes. You may not use this model (or any derivative) for any commercial purpose, including selling or licensing the model or its outputs, integrating it into commercial products or services, or using it to generate revenue. By using this model, you agree to comply with this restriction and with the terms of the base model google/medgemma-1.5-4b-it where applicable.

Citation

@misc{medgemma-4b-it-sft-lora-indian-meds,
  author = {prapaa},
  title = {MedGemma-4b-it SFT LoRA Indian Medicines},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/prapaa/medgemma-1.5-4b-it-sft-lora-indian-meds}
}