OHM Tanglish MedASR 1.7B v152

OHM Tanglish MedASR 1.7B v152 is the first checkpoint from our one-Indic-language + English medical ASR effort: a Tamil + English clinical speech transcription model at compact 1.7B scale, aimed at smartphone-class edge medical transcription.

This checkpoint is a research collaboration between osmAPI, OHM - Open Holistic Medicine, and the Terv Student Research Team.

We use Tanglish here to mean the Tamil + English speech environment common in many clinical conversations: Tamil, English, transliterated terms, medical vocabulary, patient symptom descriptions, and doctor-patient dialogue. The model is designed as the first language-pair checkpoint in a repeatable path toward one-Indic-language + English medical ASR systems that can eventually run near the patient on edge devices.

Base Model

Base model: knoveleng/polyglot-lion-1.7b-v1.5
Released checkpoint: OHM Tanglish MedASR 1.7B v152
Parameter scale: 1.7B
Training method: full-parameter finetune
Adapter type: none
Primary target: Tamil + English medical ASR
Deployment direction: smartphone-class edge transcription after quantization and runtime packaging

Why This Size

Medical ASR should not always require a cloud-scale model. Larger ASR systems can be stronger on broad benchmarks, but they are not the natural first target for phone-side or clinic-side transcription.

We target 1.7B parameters because this size is a practical middle ground:

large enough to preserve multilingual ASR behavior
small enough to target 4-bit or 8-bit quantized inference
plausible for offline or low-connectivity medical transcription workflows
suitable for privacy-preserving edge research
repeatable for future one-Indic-language + English medical ASR checkpoints

This repository contains the research checkpoint only. Supporting a claim such as "runs on any smartphone" would still require mobile export, quantization, runtime integration, and device-level benchmarking.
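As a rough illustration of why 1.7B parameters is a plausible quantization target, the sketch below estimates weight-only storage at several bit widths. It assumes exactly 1.7 billion parameters and ignores quantization metadata (scales, zero points), activations, caches, and runtime overhead, so real on-device footprints will be higher. The function name and the chosen bit widths are illustrative and not part of the released checkpoint.

```python
# Rough weight-only size estimate; real on-device footprints are larger.
PARAMS = 1_700_000_000  # assumption: taken from the stated 1.7B parameter scale


def weight_size_gb(num_params: int, bits_per_weight: int) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_size_gb(PARAMS, bits):.2f} GB")
```

Under these assumptions the weights alone take about 6.8 GB at 32-bit, 1.7 GB at 8-bit, and 0.85 GB at 4-bit, which is why the 4-bit and 8-bit settings are the ones of interest for phone-side research.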

Finetune Data

v152 was selected from a mixed Tamil + English + medical training recipe designed to preserve all three capabilities at once.

Training Source	Purpose	Rows	Hours
FLEURS English train wide	English speech coverage	6,000	17.3986
IISc-MILE Tamil train wide	Tamil ASR robustness	7,200	18.5397
FLEURS Tamil train	Tamil read-speech coverage	1,800	7.2667
PriMock57 train	clinical dialogue speech	1,200	0.8805
Medical Speech Intent train	medical symptom utterances	120	0.1835
Total		16,320	44.2690
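To make the balance of the mixture easier to see, the following sketch recomputes the totals and the share of audio hours per group from the table above. The grouping into "english", "tamil", and "medical" is our own labeling for illustration; the numbers themselves are copied from the table.

```python
# (source, group, rows, hours) copied from the training table.
# The group labels are an illustrative assumption, not part of the recipe.
TRAIN_SOURCES = [
    ("FLEURS English train wide", "english", 6000, 17.3986),
    ("IISc-MILE Tamil train wide", "tamil", 7200, 18.5397),
    ("FLEURS Tamil train", "tamil", 1800, 7.2667),
    ("PriMock57 train", "medical", 1200, 0.8805),
    ("Medical Speech Intent train", "medical", 120, 0.1835),
]

total_rows = sum(rows for _, _, rows, _ in TRAIN_SOURCES)
total_hours = sum(hours for _, _, _, hours in TRAIN_SOURCES)

# Accumulate audio hours per group.
hours_by_group = {}
for _, group, _, hours in TRAIN_SOURCES:
    hours_by_group[group] = hours_by_group.get(group, 0.0) + hours

print(f"rows={total_rows}, hours={total_hours:.4f}")
for group, hours in hours_by_group.items():
    print(f"{group}: {100 * hours / total_hours:.1f}% of hours")
```

By hours, the mixture is roughly 39% general English, 58% Tamil, and 2.4% medical speech, so the medical sources are a small fraction of the total audio.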

Validation used held-out English, Tamil, medical-intent, and clinical-dialogue data:

Validation Source	Rows	Hours
FLEURS English validation	60	0.1693
Medical Speech Intent validation	120	0.1634
PriMock57 validation	80	0.0864
FLEURS Tamil validation	60	0.2128
Total	320	0.6320

All test sets were held out for evaluation only.

Finetuning

OHM Tanglish MedASR 1.7B v152 was trained as a full-parameter finetune from Polyglot-Lion 1.7B. We selected the checkpoint with a conservative guard-suite approach:

train on a Tamil + English + medical mixture
evaluate on medical symptom speech, clinical dialogue, English speech, Tamil FLEURS, and IISc-MILE Tamil
retain only checkpoints that improve without introducing regressions across the mixed guard suite

v152 remained the local champion after additional dataset-focused passes using public Tamil and medical-dialogue data. Later candidates either tied v152 or regressed on at least one guard gate.
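The selection rule described above can be sketched as a simple comparison over per-gate WER. This is a minimal illustration, assuming WER is the only gated metric and that a candidate must not regress on any gate while improving on at least one. The function name, the `tolerance` parameter, and the dictionary layout are hypothetical; the actual gate definitions and thresholds are not stated here. The example values are the WER figures reported in the Results section.

```python
def passes_guard_suite(champion_wer, candidate_wer, tolerance=0.0):
    """Accept a candidate only if no gate regresses and at least one improves.

    Both arguments map gate name -> WER (lower is better). `tolerance` is a
    hypothetical slack for treating very small differences as ties.
    """
    deltas = {gate: candidate_wer[gate] - champion_wer[gate] for gate in champion_wer}
    no_regression = all(delta <= tolerance for delta in deltas.values())
    any_improvement = any(delta < -tolerance for delta in deltas.values())
    return no_regression and any_improvement


vanilla = {
    "Medical Speech Intent": 4.57,
    "PriMock57 full": 12.79,
    "FLEURS English": 6.08,
    "FLEURS Tamil": 33.07,
    "IISc-MILE Tamil / SLR127": 37.03,
}
v152 = {
    "Medical Speech Intent": 4.55,
    "PriMock57 full": 12.66,
    "FLEURS English": 5.73,
    "FLEURS Tamil": 31.70,
    "IISc-MILE Tamil / SLR127": 36.95,
}

print(passes_guard_suite(vanilla, v152))  # True: every gate improves
print(passes_guard_suite(v152, v152))     # False: a tie is not an improvement
```

Requiring both conditions mirrors the text: a candidate that merely ties the current champion, or that regresses on any single gate, is not promoted.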

Results

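Lower is better for both WER and CER. "Vanilla" is the unmodified Polyglot-Lion base model, and a negative WER Delta means the finetuned checkpoint has lower WER than vanilla.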

Evaluation Gate	Samples	Vanilla WER / CER	OHM Tanglish MedASR WER / CER	WER Delta
Medical Speech Intent	500	4.57 / 1.47	4.55 / 1.42	-0.02
PriMock57 full	969	12.79 / 8.26	12.66 / 8.15	-0.13
FLEURS English	100	6.08 / 3.02	5.73 / 3.26	-0.35
FLEURS Tamil	100	33.07 / 12.09	31.70 / 11.56	-1.37
IISc-MILE Tamil / SLR127	100	37.03 / 10.92	36.95 / 10.79	-0.08
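For readers unfamiliar with the two metrics, the sketch below shows the standard edit-distance definitions of WER and CER for a single utterance. It is not the evaluation code used for the table: the text normalization, tokenization, and corpus-level aggregation behind the reported numbers are not specified here, and the function names are illustrative. CER is computed over Unicode code points, which is an assumption that matters for Tamil, where one visible character can span several code points.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, relative to the reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, over Unicode code points."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# One inserted word against a five-word reference.
print(wer("patient has fever and headache", "patient has a fever and headache"))  # 20.0
# One substituted character against a five-character reference.
print(cer("fever", "fevar"))  # 20.0
```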

Macro WER across the five gates: 18.32.

Sample-weighted WER across 1,769 evaluation samples: 12.43.
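The two summary figures can be reproduced from the table. The sketch below assumes the macro WER is the unweighted mean of the five per-gate WER values and the sample-weighted WER weights each gate by its sample count; both assumptions match the reported numbers after rounding to two decimals.

```python
# (gate, samples, finetuned WER) copied from the results table.
GATES = [
    ("Medical Speech Intent", 500, 4.55),
    ("PriMock57 full", 969, 12.66),
    ("FLEURS English", 100, 5.73),
    ("FLEURS Tamil", 100, 31.70),
    ("IISc-MILE Tamil / SLR127", 100, 36.95),
]

# Macro average: every gate counts equally.
macro_wer = sum(gate_wer for _, _, gate_wer in GATES) / len(GATES)

# Sample-weighted average: gates with more samples count more.
total_samples = sum(n for _, n, _ in GATES)
weighted_wer = sum(n * gate_wer for _, n, gate_wer in GATES) / total_samples

print(f"macro WER: {macro_wer:.2f}")
print(f"sample-weighted WER over {total_samples} samples: {weighted_wer:.2f}")
```

The sample-weighted figure is lower than the macro figure because the two largest gates (PriMock57 and Medical Speech Intent) have much lower WER than the two Tamil gates.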

Claim

OHM Tanglish MedASR 1.7B v152 is our first validated Tamil + English medical ASR checkpoint for the one-Indic-language + English medical ASR direction.

We claim that this checkpoint:

improves WER over vanilla Polyglot-Lion on all five local Tamil + English + medical evaluation gates
is a compact 1.7B full-finetuned model selected for mixed clinical, English, and Tamil ASR
is sized for smartphone-class edge deployment research after quantization and runtime packaging
establishes the first checkpoint in a repeatable path toward other Indic-language + English medical ASR models

We do not claim that this checkpoint is:

global Tamil ASR state of the art
a certified medical device
ready for unsupervised clinical documentation
universally runnable on every smartphone without further mobile packaging and validation
superior to Whisper, Qwen3-ASR, MERaLiON, IndicWhisper, Canary, or commercial medical ASR systems without apples-to-apples evaluation on the same manifests

Intended Use

This model is intended for:

Tamil + English medical ASR research
clinical speech transcription prototypes
patient symptom transcription experiments
medical dialogue ASR benchmarking
offline and edge ASR studies
smartphone-oriented ASR quantization and packaging experiments
future one-Indic-language + English medical ASR research

Safety And Limitations

This checkpoint is for research and prototyping. It is not a certified medical device and should not be used as the sole source for clinical care, diagnosis, treatment, billing, or legal documentation.

Known limitations:

The benchmark suite is local and not directly comparable to every public ASR leaderboard.
Public Tamil-specialized systems may perform better on other Tamil benchmarks.
English WER improves over our local vanilla baseline (5.73 vs 6.08), while CER is slightly higher (3.26 vs 3.02); neither is claimed as state of the art.
Medical-dialogue evaluation used public/simulated benchmark material, not private real-world deployment audio.
Mobile deployment still requires export, quantization, runtime integration, and device testing.
Human review is required for any clinical workflow.

Collaboration

OHM Tanglish MedASR 1.7B v152 is a research collaboration between:

osmAPI
OHM - Open Holistic Medicine
Terv Student Research Team

Acknowledgements

Built from the Polyglot-Lion/Qwen3-ASR ecosystem and evaluated with public and local ASR research manifests. Please cite the upstream model, datasets, and this checkpoint when using it in research.


Model repository: osmapi/OHM-Tanglish-MedASR-1.7B-v152
Weights format: Safetensors
Model size: 2B params
Tensor type: F32
Model tree: Qwen/Qwen3-ASR-1.7B (base model), finetuned as knoveleng/polyglot-lion-1.7b-v1.5, finetuned as this model