---
language:
- en
- ta
license: mit
tags:
- automatic-speech-recognition
- tamil
- english
- tanglish
- medical-asr
- clinical-dialogue
- edge-ai
- smartphone
- polyglot-lion
pipeline_tag: automatic-speech-recognition
base_model: knoveleng/polyglot-lion-1.7b-v1.5
---

# OHM Tanglish MedASR 1.7B v152

**OHM Tanglish MedASR 1.7B v152 is the first checkpoint from our one-Indic-language + English medical ASR effort: a Tamil + English clinical speech transcription model at compact 1.7B scale, aimed at smartphone-class edge medical transcription.**

This checkpoint is a research collaboration between [osmAPI](https://osmapi.com), [OHM - Open Holistic Medicine](https://ohm.doctor), and the [Terv Student Research Team](https://terv.pro).

We use **Tanglish** here to mean the Tamil + English speech environment common in many clinical conversations: Tamil, English, transliterated terms, medical vocabulary, patient symptom descriptions, and doctor-patient dialogue. The model is designed as the first language-pair checkpoint in a repeatable path toward one-Indic-language + English medical ASR systems that can eventually run near the patient on edge devices.

## Base Model

- **Base model:** [knoveleng/polyglot-lion-1.7b-v1.5](https://huggingface.co/knoveleng/polyglot-lion-1.7b-v1.5)
- **Released checkpoint:** OHM Tanglish MedASR 1.7B v152
- **Parameter scale:** 1.7B
- **Training method:** full-parameter finetune
- **Adapter type:** none
- **Primary target:** Tamil + English medical ASR
- **Deployment direction:** smartphone-class edge transcription after quantization and runtime packaging

## Why This Size

Medical ASR should not always require a cloud-scale model. Larger ASR systems can be stronger on broad benchmarks, but they are not the natural first target for phone-side or clinic-side transcription.

We target 1.7B parameters because this size is a practical middle ground:

- large enough to preserve multilingual ASR behavior
- small enough to target 4-bit or 8-bit quantized inference
- plausible for offline or low-connectivity medical transcription workflows
- suitable for privacy-preserving edge research
- repeatable for future one-Indic-language + English medical ASR checkpoints
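
As a back-of-the-envelope illustration of the quantization point above, the sketch below estimates weight-only storage for a 1.7B-parameter model. It rests on simplifying assumptions: it ignores activations, caches, quantization scales, and runtime overhead, and `weight_gigabytes` is a hypothetical helper, not part of any released tooling for this checkpoint.

```python
# Rough weight-only memory estimate for a 1.7B-parameter model.
# Assumption: every weight is stored at the same bit width.
PARAMS = 1.7e9  # parameter scale stated in this card


def weight_gigabytes(params: float, bits_per_weight: int) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


footprints = {bits: weight_gigabytes(PARAMS, bits) for bits in (16, 8, 4)}
for bits, gb in footprints.items():
    print(f"{bits:>2}-bit weights: ~{gb:.2f} GB")
```

Under these assumptions the weights alone come to roughly 3.4 GB at 16-bit, 1.7 GB at 8-bit, and 0.85 GB at 4-bit, which is why quantization comes before any phone-side target.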

This repository contains the research checkpoint only. Running it on a smartphone still requires mobile export, quantization, runtime integration, and device-level benchmarking.

## Finetune Data

v152 was trained with a mixed Tamil + English + medical recipe designed to preserve all three capabilities at once.

| Training Source | Purpose | Rows | Hours |
|---|---|---:|---:|
| FLEURS English train wide | English speech coverage | 6,000 | 17.3986 |
| IISc-MILE Tamil train wide | Tamil ASR robustness | 7,200 | 18.5397 |
| FLEURS Tamil train | Tamil read-speech coverage | 1,800 | 7.2667 |
| PriMock57 train | clinical dialogue speech | 1,200 | 0.8805 |
| Medical Speech Intent train | medical symptom utterances | 120 | 0.1835 |
| **Total** |  | **16,320** | **44.2690** |
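
To make the balance of the mixture easier to see, the following sketch recomputes the totals and each source's share of rows and hours. The numbers are copied from the table above; the dictionary layout and variable names are illustrative assumptions, not the format of the actual training manifests.

```python
# (rows, hours) per training source, copied from the table above.
TRAIN_SOURCES = {
    "FLEURS English train wide": (6000, 17.3986),
    "IISc-MILE Tamil train wide": (7200, 18.5397),
    "FLEURS Tamil train": (1800, 7.2667),
    "PriMock57 train": (1200, 0.8805),
    "Medical Speech Intent train": (120, 0.1835),
}

total_rows = sum(rows for rows, _ in TRAIN_SOURCES.values())
total_hours = sum(hours for _, hours in TRAIN_SOURCES.values())

# Share of each source by row count and by audio duration.
for name, (rows, hours) in TRAIN_SOURCES.items():
    print(f"{name:<30} rows {rows / total_rows:6.1%}  hours {hours / total_hours:6.1%}")

# Combined share of the two medical sources.
medical = ("PriMock57 train", "Medical Speech Intent train")
medical_row_share = sum(TRAIN_SOURCES[m][0] for m in medical) / total_rows
medical_hour_share = sum(TRAIN_SOURCES[m][1] for m in medical) / total_hours
```

The two medical sources contribute about 8% of the rows but only about 2.4% of the audio hours, so most of the training audio is general English and Tamil speech.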

Validation used held-out English, Tamil, medical-intent, and clinical-dialogue data:

| Validation Source | Rows | Hours |
|---|---:|---:|
| FLEURS English validation | 60 | 0.1693 |
| Medical Speech Intent validation | 120 | 0.1634 |
| PriMock57 validation | 80 | 0.0864 |
| FLEURS Tamil validation | 60 | 0.2128 |
| **Total** | **320** | **0.6320** |

All test sets were held out for evaluation only.

## Finetuning

OHM Tanglish MedASR 1.7B v152 was trained as a full-parameter finetune from Polyglot-Lion 1.7B. We selected the checkpoint with a conservative guard-suite approach:

- train on a Tamil + English + medical mixture
- evaluate on medical symptom speech, clinical dialogue, English speech, Tamil FLEURS, and IISc-MILE Tamil
- retain only checkpoints that improve without introducing regressions across the mixed guard suite
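
The selection rule above can be sketched as a small gate check. This is a minimal illustration, not the actual selection script: the gate keys, the `tolerance` parameter, and the choice to gate on WER alone are assumptions made for the example.

```python
from typing import Dict


def passes_guard_suite(
    champion: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> bool:
    """Accept a candidate only if no gate regresses and at least one improves.

    Values are WER scores per gate (lower is better). `tolerance` is a
    hypothetical allowance for measurement noise.
    """
    if champion.keys() != candidate.keys():
        raise ValueError("champion and candidate must report the same gates")
    no_regression = all(candidate[g] <= champion[g] + tolerance for g in champion)
    some_improvement = any(candidate[g] < champion[g] for g in champion)
    return no_regression and some_improvement


# WER values from the results table in this card; the keys are illustrative.
vanilla = {"medical_intent": 4.57, "primock57": 12.79, "fleurs_en": 6.08,
           "fleurs_ta": 33.07, "iisc_mile_ta": 37.03}
v152 = {"medical_intent": 4.55, "primock57": 12.66, "fleurs_en": 5.73,
        "fleurs_ta": 31.70, "iisc_mile_ta": 36.95}

print(passes_guard_suite(vanilla, v152))  # True: every gate improves
print(passes_guard_suite(v152, v152))     # False: a tie is not an improvement
```

Requiring a strict improvement on at least one gate means a candidate that only ties the current champion is rejected, which matches how later candidates were handled.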

v152 remained the local champion after additional dataset-focused passes using public Tamil and medical-dialogue data. Later candidates either tied v152 or regressed on at least one guard gate.

## Results

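Lower is better for both WER and CER. "Vanilla" is the unmodified Polyglot-Lion base model, and WER Delta is the finetuned WER minus the vanilla WER.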

| Evaluation Gate | Samples | Vanilla WER / CER | OHM Tanglish MedASR WER / CER | WER Delta |
|---|---:|---:|---:|---:|
| Medical Speech Intent | 500 | 4.57 / 1.47 | **4.55 / 1.42** | -0.02 |
| PriMock57 full | 969 | 12.79 / 8.26 | **12.66 / 8.15** | -0.13 |
| FLEURS English | 100 | 6.08 / 3.02 | **5.73 / 3.26** | -0.35 |
| FLEURS Tamil | 100 | 33.07 / 12.09 | **31.70 / 11.56** | -1.37 |
| IISc-MILE Tamil / SLR127 | 100 | 37.03 / 10.92 | **36.95 / 10.79** | -0.08 |
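
For readers unfamiliar with the metrics, the sketch below shows the standard edit-distance definitions of WER and CER. This card does not state which scoring tool or text normalization was used, so the whitespace tokenization and percentage scaling here are assumptions, and the function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, counting every character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# One inserted word and one substituted word against a 5-word reference.
print(wer("patient has fever and cough", "patient has a fever and cuff"))  # 40.0
```

Because WER counts whole words and CER counts characters, the two can move in different directions on the same test set.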

Macro WER across the five gates: **18.32**.

Sample-weighted WER across 1,769 evaluation samples: **12.43**.
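
The two aggregate figures can be reproduced from the table as follows. This sketch assumes the sample-weighted figure is the average of per-gate WER weighted by sample count, rather than a WER pooled over all words; that assumption reproduces the reported numbers.

```python
# (samples, finetuned WER) per gate, copied from the results table.
GATES = {
    "Medical Speech Intent": (500, 4.55),
    "PriMock57 full": (969, 12.66),
    "FLEURS English": (100, 5.73),
    "FLEURS Tamil": (100, 31.70),
    "IISc-MILE Tamil / SLR127": (100, 36.95),
}

# Macro average: every gate counts equally.
macro_wer = sum(w for _, w in GATES.values()) / len(GATES)

# Sample-weighted average: every evaluation sample counts equally.
total_samples = sum(n for n, _ in GATES.values())
weighted_wer = sum(n * w for n, w in GATES.values()) / total_samples

print(f"macro WER: {macro_wer:.2f}")
print(f"sample-weighted WER over {total_samples} samples: {weighted_wer:.2f}")
```

PriMock57 accounts for 969 of the 1,769 samples, so the sample-weighted figure mostly reflects clinical-dialogue WER, while the macro figure gives the two Tamil gates equal weight.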

## Claim

OHM Tanglish MedASR 1.7B v152 is our first validated Tamil + English medical ASR checkpoint for the one-Indic-language + English medical ASR direction.

We claim that this checkpoint:

- improves WER over vanilla Polyglot-Lion on all five local Tamil + English + medical evaluation gates
- is a compact 1.7B full-finetuned model selected for mixed clinical, English, and Tamil ASR
- is sized for smartphone-class edge deployment research after quantization and runtime packaging
- establishes the first checkpoint in a repeatable path toward other Indic-language + English medical ASR models

We do not claim that this checkpoint is:

- global Tamil ASR state of the art
- a certified medical device
- ready for unsupervised clinical documentation
- universally runnable on every smartphone without further mobile packaging and validation
- superior to Whisper, Qwen3-ASR, MERaLiON, IndicWhisper, Canary, or commercial medical ASR systems without apples-to-apples evaluation on the same manifests

## Intended Use

This model is intended for:

- Tamil + English medical ASR research
- clinical speech transcription prototypes
- patient symptom transcription experiments
- medical dialogue ASR benchmarking
- offline and edge ASR studies
- smartphone-oriented ASR quantization and packaging experiments
- future one-Indic-language + English medical ASR research

## Safety And Limitations

This checkpoint is for research and prototyping. It is not a certified medical device and should not be used as the sole source for clinical care, diagnosis, treatment, billing, or legal documentation.

Known limitations:

- The benchmark suite is local and not directly comparable to every public ASR leaderboard.
- Public Tamil-specialized systems may perform better on other Tamil benchmarks.
- English WER improves over our local vanilla baseline (6.08 to 5.73), but English CER is slightly worse (3.02 to 3.26); neither is claimed as state of the art.
- Medical-dialogue evaluation used public/simulated benchmark material, not private real-world deployment audio.
- Mobile deployment still requires export, quantization, runtime integration, and device testing.
- Human review is required for any clinical workflow.

## Collaboration

OHM Tanglish MedASR 1.7B v152 is a research collaboration between:

- [osmAPI](https://osmapi.com)
- [OHM - Open Holistic Medicine](https://ohm.doctor)
- [Terv Student Research Team](https://terv.pro)

## Acknowledgements

Built from the Polyglot-Lion/Qwen3-ASR ecosystem and evaluated with public and local ASR research manifests. Please cite the upstream model, datasets, and this checkpoint when using it in research.