Instructions for using abnuel/fine-tuned-openbiollm-medical-coding with libraries and local apps.

Libraries

How to use abnuel/fine-tuned-openbiollm-medical-coding with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="abnuel/fine-tuned-openbiollm-medical-coding")

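# Load model directly (the causal-LM class keeps the language-modeling head needed for generation)
from transformers import AutoModelForCausalLM
# dtype= is accepted by recent transformers releases; older releases use torch_dtype= instead
model = AutoModelForCausalLM.from_pretrained("abnuel/fine-tuned-openbiollm-medical-coding", dtype="auto")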

Local Apps

vLLM

How to use abnuel/fine-tuned-openbiollm-medical-coding with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "abnuel/fine-tuned-openbiollm-medical-coding"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "abnuel/fine-tuned-openbiollm-medical-coding",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
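
The same request can be sent from Python with a clinical-coding prompt instead of the generic one. The following is a minimal sketch using only the standard library. It assumes the vLLM server from the commands above is listening on localhost:8000; the helper names `build_completion_request` and `request_codes` are hypothetical, and the prompt wording is copied from the How to Use section of this card rather than from any documented serving template.

```python
import json
import urllib.request

MODEL_ID = "abnuel/fine-tuned-openbiollm-medical-coding"


def build_completion_request(note: str, max_tokens: int = 64, temperature: float = 0.1) -> dict:
    """Build an OpenAI-style /v1/completions payload for one clinical note."""
    # Prompt wording mirrors the How to Use example; the model's preferred
    # template is not documented, so treat this as an assumption.
    prompt = (
        "You are a clinical coding assistant. Given the following clinical note, \n"
        "provide the most appropriate ICD-10 code(s).\n\n"
        f"Clinical note: {note}\n\n"
        "ICD-10 Code(s):"
    )
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def request_codes(note: str, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to a running server and return the raw completion text."""
    data = json.dumps(build_completion_request(note)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The OpenAI-compatible completions response carries the text in choices[0].text.
    return body["choices"][0]["text"]
```

Calling `request_codes("Patient diagnosed with essential hypertension.")` needs the server to be running; `build_completion_request` can be inspected on its own.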

Use Docker

docker model run hf.co/abnuel/fine-tuned-openbiollm-medical-coding

SGLang

How to use abnuel/fine-tuned-openbiollm-medical-coding with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "abnuel/fine-tuned-openbiollm-medical-coding" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "abnuel/fine-tuned-openbiollm-medical-coding",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "abnuel/fine-tuned-openbiollm-medical-coding" \
        --host 0.0.0.0 \
        --port 30000


fine-tuned-openbiollm-medical-coding

A fine-tuned version of aaditya/Llama3-OpenBioLLM-8B for automated ICD-10 medical coding from clinical text. It adds task-specific fine-tuning for ICD-10 code assignment on top of OpenBioLLM's biomedical language understanding.

Model Description

This model was developed as part of a research effort to evaluate multiple biomedical LLMs on the medical coding task. OpenBioLLM-8B provides a strong foundation in biomedical language understanding (pre-trained on PubMed, clinical notes, and biomedical corpora), and this fine-tune further specializes it for structured ICD-10 output from unstructured clinical text.

Base model: aaditya/Llama3-OpenBioLLM-8B
Fine-tuning method: SFT (Supervised Fine-Tuning) via TRL
Task: ICD-10 code generation from clinical text
Domain: Clinical NLP / Healthcare AI
Parameters: ~8B

Intended Uses

Automated medical coding assistance in clinical documentation workflows
Research benchmarking of biomedical LLMs on ICD coding tasks
Integration into clinical decision support pipelines (with human oversight)
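
For the benchmarking use case, one common way to score multi-label code predictions is micro-averaged precision, recall, and F1 over sets of codes. This card does not state which metrics the author used, so the sketch below is an assumption about how such a comparison could be set up, and the function name `micro_f1` is hypothetical. The codes in the usage note are illustrative labels only, not coding guidance.

```python
def micro_f1(predicted: list[set[str]], gold: list[set[str]]) -> dict[str, float]:
    """Micro-averaged precision/recall/F1 over per-note sets of ICD-10 codes."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of notes")

    tp = fp = fn = 0
    for pred, ref in zip(predicted, gold):
        tp += len(pred & ref)   # codes the model got right
        fp += len(pred - ref)   # codes the model added
        fn += len(ref - pred)   # codes the model missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, `micro_f1([{"I10", "N18.2"}, {"E11.9"}], [{"I10"}, {"E11.9", "I10"}])` counts two true positives, one false positive, and one false negative across the two notes.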

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "abnuel/fine-tuned-openbiollm-medical-coding"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

prompt = """You are a clinical coding assistant. Given the following clinical note, 
provide the most appropriate ICD-10 code(s).

Clinical note: Patient diagnosed with essential hypertension and stage 2 chronic kidney disease.

ICD-10 Code(s):"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
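# temperature only takes effect when sampling is enabled, so do_sample=True is set explicitly
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.1)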
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
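
The decoded string contains the prompt followed by the completion, so downstream code usually needs to pull the codes out of it. The card does not document the model's output format, so the sketch below assumes codes appear as plain tokens such as `I10` or `N18.2` and matches them with an approximate ICD-10-CM shape. The pattern and the helper name `extract_codes` are hypothetical, and the pattern can produce false positives on strings like lab abbreviations.

```python
import re

# Approximate ICD-10-CM shape: letter, digit, alphanumeric,
# then an optional dot followed by 1 to 4 alphanumerics.
ICD10_PATTERN = re.compile(r"\b[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")


def extract_codes(generated: str, prompt: str = "") -> list[str]:
    """Return unique code-shaped tokens from the completion, in order of appearance."""
    # Drop the echoed prompt when the decoded text starts with it.
    if prompt and generated.startswith(prompt):
        completion = generated[len(prompt):]
    else:
        completion = generated

    codes: list[str] = []
    for match in ICD10_PATTERN.findall(completion):
        if match not in codes:
            codes.append(match)
    return codes
```

With the example above, `extract_codes(tokenizer.decode(outputs[0], skip_special_tokens=True), prompt)` would return whatever code-shaped tokens the model generated after the prompt.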

Training Details

Fine-tuning framework: TRL (Transformer Reinforcement Learning)
Method: Supervised Fine-Tuning (SFT)
Base model: Llama3-OpenBioLLM-8B (biomedical-specialized Llama 3)
Hardware: GPU (CUDA)
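
The card names the framework and method but not the dataset, prompt format, or hyperparameters. The following is a hedged sketch of what an SFT run with TRL could look like, not the author's training script. The record format, the helper names `to_prompt_completion` and `train`, the output directory, and every hyperparameter value are assumptions chosen for illustration. A full fine-tune of an 8B model may also need memory-saving techniques that are not shown here.

```python
def to_prompt_completion(note: str, codes: list[str]) -> dict[str, str]:
    """Format one labeled note as a prompt/completion record (assumed format)."""
    prompt = (
        "You are a clinical coding assistant. Given the following clinical note, \n"
        "provide the most appropriate ICD-10 code(s).\n\n"
        f"Clinical note: {note}\n\n"
        "ICD-10 Code(s):"
    )
    return {"prompt": prompt, "completion": " " + ", ".join(codes)}


def train(records: list[dict[str, str]], output_dir: str = "openbiollm-icd-sft") -> None:
    """Run supervised fine-tuning on prompt/completion records with TRL."""
    # Imported lazily so the formatting helper works without the training stack.
    from datasets import Dataset
    from trl import SFTConfig, SFTTrainer

    dataset = Dataset.from_list(records)
    config = SFTConfig(
        output_dir=output_dir,
        num_train_epochs=1,               # placeholder value, not from the card
        per_device_train_batch_size=1,    # placeholder value, not from the card
    )
    trainer = SFTTrainer(
        model="aaditya/Llama3-OpenBioLLM-8B",  # base model named in this card
        args=config,
        train_dataset=dataset,
    )
    trainer.train()
    trainer.save_model(output_dir)
```

Keeping the formatting step separate from the trainer makes it easy to reuse the same prompt at inference time, which matters because the model sees only this template during fine-tuning.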

Limitations

As with all LLM-based coding tools, outputs should be reviewed by a certified medical coder before use in billing or clinical records.
The model may not generalize to all ICD-10-CM editions, regional coding conventions, or highly specialized subspecialties.
The model does not have access to real-time coding updates or payer-specific guidelines.
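
One way to act on these limitations is to check each suggested code against a locally maintained list of currently valid codes and to send every suggestion to a human coder regardless of the result. The sketch below assumes such a list is available as a Python set. The names `CodeSuggestion` and `triage` are hypothetical, and the code `A00.ZZ` in the usage note is a made-up string used only to show the unknown-code path.

```python
from dataclasses import dataclass


@dataclass
class CodeSuggestion:
    code: str
    known: bool                 # True if the code is in the local code set
    needs_review: bool = True   # every suggestion goes to a certified coder


def triage(codes: list[str], valid_codes: set[str]) -> list[CodeSuggestion]:
    """Mark each model-suggested code as known or unknown; none are auto-accepted."""
    return [CodeSuggestion(code=c, known=c in valid_codes) for c in codes]
```

For example, `triage(["I10", "A00.ZZ"], {"I10", "N18.2"})` marks the first code as known and the second as unknown, and both stay flagged for review.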

Related Models & Resources

abnuel/MedGemma-4b-ICD — MedGemma-4b fine-tuned on the same task
abnuel/MedGemma-4b-ICD-Coder — companion checkpoint
🚀 Live demo: spaces/abnuel/med-coding

Citation

@misc{adegunlehin2025openbiollm-coding,
  author = {Abayomi Adegunlehin},
  title  = {Fine-tuned OpenBioLLM-8B for ICD-10 Medical Coding},
  year   = {2025},
  url    = {https://huggingface.co/abnuel/fine-tuned-openbiollm-medical-coding}
}


Model tree for abnuel/fine-tuned-openbiollm-medical-coding

Base model: meta-llama/Meta-Llama-3-8B
Finetuned: aaditya/Llama3-OpenBioLLM-8B
Finetuned: this model