OHM Tanglish MedASR 1.7B v152
OHM Tanglish MedASR 1.7B v152 is the first checkpoint from our one-Indic-language + English medical ASR effort: a Tamil + English clinical speech transcription model at a compact 1.7B scale, aimed at smartphone-class edge medical transcription.
This checkpoint is a research collaboration between osmAPI, OHM - Open Holistic Medicine, and the Terv Student Research Team.
We use Tanglish here to mean the Tamil + English speech environment common in many clinical conversations: Tamil, English, transliterated terms, medical vocabulary, patient symptom descriptions, and doctor-patient dialogue. The model is designed as the first language-pair checkpoint in a repeatable path toward one-Indic-language + English medical ASR systems that can eventually run near the patient on edge devices.
Base Model
- Base model: knoveleng/polyglot-lion-1.7b-v1.5
- Released checkpoint: OHM Tanglish MedASR 1.7B v152
- Parameter scale: 1.7B
- Training method: full-parameter finetune
- Adapter type: none
- Primary target: Tamil + English medical ASR
- Deployment direction: smartphone-class edge transcription after quantization and runtime packaging
Why This Size
Medical ASR should not always require a cloud-scale model. Larger ASR systems can be stronger on broad benchmarks, but they are not the natural first target for phone-side or clinic-side transcription.
We target 1.7B parameters because this size is a practical middle ground:
- large enough to preserve multilingual ASR behavior
- small enough to target 4-bit or 8-bit quantized inference
- plausible for offline or low-connectivity medical transcription workflows
- suitable for privacy-preserving edge research
- repeatable for future one-Indic-language + English medical ASR checkpoints
This repository contains the research checkpoint only. Supporting a claim such as "runs on any smartphone" would additionally require mobile export, quantization, runtime integration, and device-level benchmarking.
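To make the sizing argument concrete, here is a rough weight-only estimate of what 1.7B parameters occupy at different precisions. It is a back-of-the-envelope sketch, not a measurement: the 16-bit baseline is an assumption (the checkpoint's storage dtype is not stated here), and the function name `weight_footprint_gib` is illustrative.

```python
PARAMS = 1.7e9  # parameter scale stated in this model card


def weight_footprint_gib(num_params: float, bits_per_weight: int) -> float:
    """Weight-only size in GiB.

    Ignores activations, decoder caches, and quantization metadata
    (scales, zero points), so real on-device usage will be higher.
    """
    return num_params * bits_per_weight / 8 / 2**30


# 16-bit is an assumed unquantized baseline; 8-bit and 4-bit are the
# quantization targets named in the list above.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_footprint_gib(PARAMS, bits):.2f} GiB")
```

Under these assumptions the weights alone come to roughly 3.2 GiB at 16-bit, 1.6 GiB at 8-bit, and 0.8 GiB at 4-bit, which is why quantization is treated as a prerequisite for phone-side use.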
Finetune Data
v152 was selected from a mixed Tamil + English + medical training recipe designed to preserve all three capabilities at once.
| Training Source | Purpose | Rows | Hours |
|---|---|---|---|
| FLEURS English train wide | English speech coverage | 6,000 | 17.3986 |
| IISc-MILE Tamil train wide | Tamil ASR robustness | 7,200 | 18.5397 |
| FLEURS Tamil train | Tamil read-speech coverage | 1,800 | 7.2667 |
| PriMock57 train | clinical dialogue speech | 1,200 | 0.8805 |
| Medical Speech Intent train | medical symptom utterances | 120 | 0.1835 |
| Total | | 16,320 | 44.2690 |
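The mixture proportions implied by the training table can be checked with a few lines of Python. The figures are copied from the table; the variable names are illustrative and not part of any released tooling.

```python
# (rows, hours) per training source, copied from the table above
TRAIN_SOURCES = {
    "FLEURS English train wide": (6000, 17.3986),
    "IISc-MILE Tamil train wide": (7200, 18.5397),
    "FLEURS Tamil train": (1800, 7.2667),
    "PriMock57 train": (1200, 0.8805),
    "Medical Speech Intent train": (120, 0.1835),
}

total_rows = sum(rows for rows, _ in TRAIN_SOURCES.values())
total_hours = sum(hours for _, hours in TRAIN_SOURCES.values())

# Share of each source by row count and by audio duration
for name, (rows, hours) in TRAIN_SOURCES.items():
    print(f"{name:<30} {rows / total_rows:6.1%} of rows  {hours / total_hours:6.1%} of hours")

print(f"total: {total_rows} rows, {total_hours:.4f} hours")
```

The totals reproduce the table's last row. The two medical sources together make up about 8% of rows but only about 2.4% of hours, because their utterances are short.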
Validation used held-out English, Tamil, medical-intent, and clinical-dialogue data:
| Validation Source | Rows | Hours |
|---|---|---|
| FLEURS English validation | 60 | 0.1693 |
| Medical Speech Intent validation | 120 | 0.1634 |
| PriMock57 validation | 80 | 0.0864 |
| FLEURS Tamil validation | 60 | 0.2128 |
| Total | 320 | 0.6320 |
All test sets were held out for evaluation only.
Finetuning
OHM Tanglish MedASR 1.7B v152 was trained as a full-parameter finetune from Polyglot-Lion 1.7B. We selected the checkpoint with a conservative guard-suite approach:
- train on a Tamil + English + medical mixture
- evaluate on medical symptom speech, clinical dialogue, English speech, Tamil FLEURS, and IISc-MILE Tamil
- retain only checkpoints that improve without introducing regressions across the mixed guard suite
v152 remained the local champion after additional dataset-focused passes using public Tamil and medical-dialogue data. Later candidates either tied v152 or regressed on at least one guard gate.
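The selection rule described above can be sketched as a simple gate check. This is a minimal illustration, not the actual selection code: it assumes the guard compares WER only, with zero tolerance, and the function name `passes_guard_suite` is hypothetical. The vanilla and v152 WER values come from the Results table; the regressed candidate is synthetic.

```python
from typing import Dict


def passes_guard_suite(champion: Dict[str, float], candidate: Dict[str, float],
                       tolerance: float = 0.0) -> bool:
    """True if the candidate improves WER on at least one gate and regresses on none."""
    if champion.keys() != candidate.keys():
        raise ValueError("champion and candidate must report the same gates")
    no_regression = all(candidate[g] <= champion[g] + tolerance for g in champion)
    some_improvement = any(candidate[g] < champion[g] for g in champion)
    return no_regression and some_improvement


# WER per gate, taken from the Results table
vanilla = {"medical_intent": 4.57, "primock57": 12.79, "fleurs_en": 6.08,
           "fleurs_ta": 33.07, "iisc_mile_ta": 37.03}
v152 = {"medical_intent": 4.55, "primock57": 12.66, "fleurs_en": 5.73,
        "fleurs_ta": 31.70, "iisc_mile_ta": 36.95}

# Synthetic later candidates, for illustration only
tied = dict(v152)
regressed = {**v152, "fleurs_ta": v152["fleurs_ta"] + 0.5}

print(passes_guard_suite(vanilla, v152))    # v152 replaces vanilla
print(passes_guard_suite(v152, tied))       # a tie does not replace the champion
print(passes_guard_suite(v152, regressed))  # a regression on any gate is rejected
```

Requiring a strict improvement on at least one gate means a candidate that merely ties the champion is not promoted, which matches how later candidates were handled.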
Results
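Lower is better for both WER and CER. WER Delta is the OHM Tanglish MedASR WER minus the vanilla WER, so negative values are improvements.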
| Evaluation Gate | Samples | Vanilla WER / CER | OHM Tanglish MedASR WER / CER | WER Delta |
|---|---|---|---|---|
| Medical Speech Intent | 500 | 4.57 / 1.47 | 4.55 / 1.42 | -0.02 |
| PriMock57 full | 969 | 12.79 / 8.26 | 12.66 / 8.15 | -0.13 |
| FLEURS English | 100 | 6.08 / 3.02 | 5.73 / 3.26 | -0.35 |
| FLEURS Tamil | 100 | 33.07 / 12.09 | 31.70 / 11.56 | -1.37 |
| IISc-MILE Tamil / SLR127 | 100 | 37.03 / 10.92 | 36.95 / 10.79 | -0.08 |
Macro WER across the five gates: 18.32.
Sample-weighted WER across 1,769 evaluation samples: 12.43.
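The two aggregate figures can be reproduced from the per-gate rows. This sketch assumes the sample-weighted number is a mean of per-gate WERs weighted by sample count, rather than a pooled word-level error rate; the reported values match under that assumption.

```python
# (samples, OHM Tanglish MedASR WER) per gate, copied from the table above
GATES = {
    "Medical Speech Intent": (500, 4.55),
    "PriMock57 full": (969, 12.66),
    "FLEURS English": (100, 5.73),
    "FLEURS Tamil": (100, 31.70),
    "IISc-MILE Tamil / SLR127": (100, 36.95),
}

# Macro average: every gate counts equally
macro_wer = sum(wer for _, wer in GATES.values()) / len(GATES)

# Sample-weighted average: gates with more samples count more
total_samples = sum(n for n, _ in GATES.values())
weighted_wer = sum(n * wer for n, wer in GATES.values()) / total_samples

print(f"macro WER: {macro_wer:.2f}")
print(f"sample-weighted WER over {total_samples} samples: {weighted_wer:.2f}")
```

The weighted figure is lower than the macro figure because the two largest gates (Medical Speech Intent and PriMock57) have the lowest WERs and dominate the sample count.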
Claim
OHM Tanglish MedASR 1.7B v152 is our first validated Tamil + English medical ASR checkpoint for the one-Indic-language + English medical ASR direction.
We claim that this checkpoint:
- improves WER over vanilla Polyglot-Lion on all five local Tamil + English + medical evaluation gates
- is a compact 1.7B full-finetuned model selected for mixed clinical, English, and Tamil ASR
- is sized for smartphone-class edge deployment research after quantization and runtime packaging
- establishes the first checkpoint in a repeatable path toward other Indic-language + English medical ASR models
We do not claim that this checkpoint is:
- global Tamil ASR state of the art
- a certified medical device
- ready for unsupervised clinical documentation
- universally runnable on every smartphone without further mobile packaging and validation
- superior to Whisper, Qwen3-ASR, MERaLiON, IndicWhisper, Canary, or commercial medical ASR systems without apples-to-apples evaluation on the same manifests
Intended Use
This model is intended for:
- Tamil + English medical ASR research
- clinical speech transcription prototypes
- patient symptom transcription experiments
- medical dialogue ASR benchmarking
- offline and edge ASR studies
- smartphone-oriented ASR quantization and packaging experiments
- future one-Indic-language + English medical ASR research
Safety And Limitations
This checkpoint is for research and prototyping. It is not a certified medical device and should not be used as the sole source for clinical care, diagnosis, treatment, billing, or legal documentation.
Known limitations:
- The benchmark suite is local and not directly comparable to every public ASR leaderboard.
- Public Tamil-specialized systems may perform better on other Tamil benchmarks.
- English WER improves over our local vanilla baseline (6.08 to 5.73) while English CER is slightly higher (3.02 to 3.26); neither is claimed as state of the art.
- Medical-dialogue evaluation used public/simulated benchmark material, not private real-world deployment audio.
- Mobile deployment still requires export, quantization, runtime integration, and device testing.
- Human review is required for any clinical workflow.
Collaboration
OHM Tanglish MedASR 1.7B v152 is a research collaboration between osmAPI, OHM - Open Holistic Medicine, and the Terv Student Research Team.
Acknowledgements
Built from the Polyglot-Lion/Qwen3-ASR ecosystem and evaluated with public and local ASR research manifests. Please cite the upstream model, datasets, and this checkpoint when using it in research.