OHM Tanglish MedASR 1.7B v152

OHM Tanglish MedASR 1.7B v152 is the first checkpoint from our one-Indic-language + English medical ASR effort: a Tamil + English clinical speech transcription model at compact 1.7B scale, aimed at smartphone-class edge medical transcription.

This checkpoint is a research collaboration between osmAPI, OHM - Open Holistic Medicine, and the Terv Student Research Team.

We use Tanglish here to mean the Tamil + English speech environment common in many clinical conversations: Tamil, English, transliterated terms, medical vocabulary, patient symptom descriptions, and doctor-patient dialogue. The model is designed as the first language-pair checkpoint in a repeatable path toward one-Indic-language + English medical ASR systems that can eventually run near the patient on edge devices.

Base Model

  • Base model: knoveleng/polyglot-lion-1.7b-v1.5
  • Released checkpoint: OHM Tanglish MedASR 1.7B v152
  • Parameter scale: 1.7B
  • Training method: full-parameter finetune
  • Adapter type: none
  • Primary target: Tamil + English medical ASR
  • Deployment direction: smartphone-class edge transcription after quantization and runtime packaging

Why This Size

Medical ASR should not always require a cloud-scale model. Larger ASR systems can be stronger on broad benchmarks, but they are not the natural first target for phone-side or clinic-side transcription.

We target 1.7B parameters because this size is a practical middle ground:

  • large enough to preserve multilingual ASR behavior
  • small enough to target 4-bit or 8-bit quantized inference
  • plausible for offline or low-connectivity medical transcription workflows
  • suitable for privacy-preserving edge research
  • repeatable for future one-Indic-language + English medical ASR checkpoints

This repository contains the research checkpoint only. Supporting a "runs on any smartphone" claim would additionally require mobile export, quantization, runtime integration, and device-level benchmarking.
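To make the quantization argument concrete, the following back-of-envelope sketch estimates the weight-only storage for a 1.7B-parameter model at several precisions. It assumes exactly 1.7e9 parameters and ignores activations, decoding caches, runtime overhead, and any per-group quantization metadata, so real on-device footprints will be larger. The helper name `weight_gb` is illustrative only.

```python
# Rough weight-only storage estimate for a 1.7B-parameter model.
# Assumption: exactly 1.7e9 parameters; activations, caches, and
# quantization metadata (scales, zero points) are NOT included.
PARAMS = 1.7e9


def weight_gb(bits_per_param: int, params: float = PARAMS) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


footprints = {bits: weight_gb(bits) for bits in (32, 16, 8, 4)}

for bits, gb in footprints.items():
    print(f"{bits:>2}-bit weights: ~{gb:.2f} GB")
```

Under these assumptions the weights alone take roughly 1.7 GB at 8-bit and 0.85 GB at 4-bit, which is why this size is a plausible target for phone-class memory budgets.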

Finetune Data

v152 was selected from checkpoints trained on a mixed Tamil + English + medical recipe designed to preserve all three capabilities at once.

| Training Source | Purpose | Rows | Hours |
|---|---|---|---|
| FLEURS English train | wide English speech coverage | 6,000 | 17.3986 |
| IISc-MILE Tamil train | wide Tamil ASR robustness | 7,200 | 18.5397 |
| FLEURS Tamil train | Tamil read-speech coverage | 1,800 | 7.2667 |
| PriMock57 train | clinical dialogue speech | 1,200 | 0.8805 |
| Medical Speech Intent train | medical symptom utterances | 120 | 0.1835 |
| Total | | 16,320 | 44.2690 |
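The mixture above can be sanity-checked with a short script. This is a minimal sketch that copies the row and hour counts from the table, recomputes the totals, and prints each source's share; the variable names are illustrative and not part of any released tooling.

```python
# Training mixture copied from the table above: (source, rows, hours).
train_mix = [
    ("FLEURS English train", 6000, 17.3986),
    ("IISc-MILE Tamil train", 7200, 18.5397),
    ("FLEURS Tamil train", 1800, 7.2667),
    ("PriMock57 train", 1200, 0.8805),
    ("Medical Speech Intent train", 120, 0.1835),
]

total_rows = sum(rows for _, rows, _ in train_mix)
total_hours = sum(hours for _, _, hours in train_mix)

# Share of each source by row count and by audio duration.
for name, rows, hours in train_mix:
    print(f"{name:<28} {rows / total_rows:6.1%} of rows  "
          f"{hours / total_hours:6.1%} of hours")

# The two medical sources (clinical dialogue + symptom utterances).
medical_hours = sum(
    hours for name, _, hours in train_mix
    if name.startswith(("PriMock57", "Medical Speech Intent"))
)
medical_share = medical_hours / total_hours
print(f"Medical share of training hours: {medical_share:.1%}")
```

The recomputed totals match the table (16,320 rows, 44.2690 hours). The two medical sources account for about 2.4% of the training hours, so most of the audio in the mix is general English and Tamil speech.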

Validation used held-out English, Tamil, medical-intent, and clinical-dialogue data:

| Validation Source | Rows | Hours |
|---|---|---|
| FLEURS English validation | 60 | 0.1693 |
| Medical Speech Intent validation | 120 | 0.1634 |
| PriMock57 validation | 80 | 0.0864 |
| FLEURS Tamil validation | 60 | 0.2128 |
| Total | 320 | 0.6320 |

All test sets were held out for evaluation only.

Finetuning

OHM Tanglish MedASR 1.7B v152 was trained as a full-parameter finetune from Polyglot-Lion 1.7B. We selected the checkpoint with a conservative guard-suite approach:

  • train on a Tamil + English + medical mixture
  • evaluate on medical symptom speech, clinical dialogue, English speech, Tamil FLEURS, and IISc-MILE Tamil
  • retain only checkpoints that improve without introducing regressions across the mixed guard suite

v152 remained the local champion after additional dataset-focused passes using public Tamil and medical-dialogue data. Later candidates either tied v152 or regressed on at least one guard gate.
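The guard-suite rule described above can be sketched as a small acceptance check. This is an illustration under stated assumptions, not the actual selection code: the function name `passes_guard_suite`, the WER-only comparison, and the `tolerance` parameter are all hypothetical. The WER values are taken from the Results table in this card.

```python
from typing import Dict


def passes_guard_suite(
    champion: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,  # hypothetical slack allowed per gate, in WER points
) -> bool:
    """Accept a candidate only if no gate regresses and at least one improves."""
    if champion.keys() != candidate.keys():
        raise ValueError("champion and candidate must cover the same gates")
    deltas = {gate: candidate[gate] - champion[gate] for gate in champion}
    no_regression = all(delta <= tolerance for delta in deltas.values())
    some_improvement = any(delta < 0 for delta in deltas.values())
    return no_regression and some_improvement


# Per-gate WER from the Results table (lower is better).
vanilla = {
    "Medical Speech Intent": 4.57,
    "PriMock57 full": 12.79,
    "FLEURS English": 6.08,
    "FLEURS Tamil": 33.07,
    "IISc-MILE Tamil / SLR127": 37.03,
}
v152 = {
    "Medical Speech Intent": 4.55,
    "PriMock57 full": 12.66,
    "FLEURS English": 5.73,
    "FLEURS Tamil": 31.70,
    "IISc-MILE Tamil / SLR127": 36.95,
}

# Hypothetical later candidate: better on Tamil, worse on clinical dialogue.
regressing = dict(v152, **{"FLEURS Tamil": 31.00, "PriMock57 full": 12.90})

print(passes_guard_suite(vanilla, v152))     # v152 vs vanilla
print(passes_guard_suite(v152, dict(v152)))  # a tie does not replace the champion
print(passes_guard_suite(v152, regressing))  # one regression rejects the candidate
```

This sketch gates on WER only. Whether the actual selection also gated on CER is not stated in this card.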

Results

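Lower is better for both WER and CER. A negative WER delta means the finetuned checkpoint improves on vanilla Polyglot-Lion.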

| Evaluation Gate | Samples | Vanilla WER / CER | OHM Tanglish MedASR WER / CER | WER Delta |
|---|---|---|---|---|
| Medical Speech Intent | 500 | 4.57 / 1.47 | 4.55 / 1.42 | -0.02 |
| PriMock57 full | 969 | 12.79 / 8.26 | 12.66 / 8.15 | -0.13 |
| FLEURS English | 100 | 6.08 / 3.02 | 5.73 / 3.26 | -0.35 |
| FLEURS Tamil | 100 | 33.07 / 12.09 | 31.70 / 11.56 | -1.37 |
| IISc-MILE Tamil / SLR127 | 100 | 37.03 / 10.92 | 36.95 / 10.79 | -0.08 |
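For readers unfamiliar with the metrics, the sketch below shows the standard definitions: WER is the word-level edit distance divided by the number of reference words, and CER is the same ratio at character level. It is a plain reference implementation for illustration. The text normalization and scoring tool used to produce the numbers in this card are not specified here, so this code will not necessarily reproduce them.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# One substitution ("has" -> "had") and one deletion ("a") over 5 reference words.
print(wer("the patient has a fever", "the patient had fever"))
```

Because both metrics divide by the reference length, they can exceed 100% when the hypothesis contains many insertions.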

Macro WER across the five gates: 18.32.

Sample-weighted WER across 1,769 evaluation samples: 12.43.
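The two aggregate figures can be recomputed from the per-gate rows. The sketch below treats macro WER as the unweighted mean of the five gate WERs and sample-weighted WER as the mean weighted by each gate's sample count. This weights by utterances rather than by reference words, so it is not the same as a pooled corpus-level WER.

```python
# Per-gate results from the table: (samples, vanilla WER, finetuned WER).
gates = {
    "Medical Speech Intent": (500, 4.57, 4.55),
    "PriMock57 full": (969, 12.79, 12.66),
    "FLEURS English": (100, 6.08, 5.73),
    "FLEURS Tamil": (100, 33.07, 31.70),
    "IISc-MILE Tamil / SLR127": (100, 37.03, 36.95),
}

total_samples = sum(samples for samples, _, _ in gates.values())

# Macro WER: every gate counts equally, regardless of size.
macro_wer = sum(ft for _, _, ft in gates.values()) / len(gates)

# Sample-weighted WER: each gate counts in proportion to its sample count.
weighted_wer = sum(s * ft for s, _, ft in gates.values()) / total_samples

print(f"samples: {total_samples}")
print(f"macro WER: {macro_wer:.2f}")
print(f"sample-weighted WER: {weighted_wer:.2f}")
```

The macro figure is pulled up by the two Tamil gates, whose WERs are above 30, while the sample-weighted figure is dominated by PriMock57 and Medical Speech Intent, which together contribute 1,469 of the 1,769 samples.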

Claim

OHM Tanglish MedASR 1.7B v152 is our first validated Tamil + English medical ASR checkpoint for the one-Indic-language + English medical ASR direction.

We claim that this checkpoint:

  • improves WER over vanilla Polyglot-Lion on all five local Tamil + English + medical evaluation gates
  • is a compact 1.7B full-finetuned model selected for mixed clinical, English, and Tamil ASR
  • is sized for smartphone-class edge deployment research after quantization and runtime packaging
  • establishes the first checkpoint in a repeatable path toward other Indic-language + English medical ASR models

We do not claim that this checkpoint is:

  • global Tamil ASR state of the art
  • a certified medical device
  • ready for unsupervised clinical documentation
  • universally runnable on every smartphone without further mobile packaging and validation
  • superior to Whisper, Qwen3-ASR, MERaLiON, IndicWhisper, Canary, or commercial medical ASR systems without apples-to-apples evaluation on the same manifests

Intended Use

This model is intended for:

  • Tamil + English medical ASR research
  • clinical speech transcription prototypes
  • patient symptom transcription experiments
  • medical dialogue ASR benchmarking
  • offline and edge ASR studies
  • smartphone-oriented ASR quantization and packaging experiments
  • future one-Indic-language + English medical ASR research

Safety And Limitations

This checkpoint is for research and prototyping. It is not a certified medical device and should not be used as the sole source for clinical care, diagnosis, treatment, billing, or legal documentation.

Known limitations:

  • The benchmark suite is local and not directly comparable to every public ASR leaderboard.
  • Public Tamil-specialized systems may perform better on other Tamil benchmarks.
  • English WER improves over our local vanilla baseline (6.08 to 5.73), but FLEURS English CER is slightly worse (3.02 to 3.26); neither is claimed as state of the art.
  • Medical-dialogue evaluation used public/simulated benchmark material, not private real-world deployment audio.
  • Mobile deployment still requires export, quantization, runtime integration, and device testing.
  • Human review is required for any clinical workflow.

Collaboration

OHM Tanglish MedASR 1.7B v152 is a research collaboration between:

  • osmAPI
  • OHM - Open Holistic Medicine
  • Terv Student Research Team

Acknowledgements

Built from the Polyglot-Lion/Qwen3-ASR ecosystem and evaluated with public and local ASR research manifests. Please cite the upstream model, datasets, and this checkpoint when using it in research.
