visualears-fastconformer-fa-full-ab-fp8
FP8 post-training quantization of Reza2kn/visualears-fastconformer-fa-full-ab via NVIDIA modelopt.
- Base architecture: EncDecHybridRNNTCTCBPEModel (NeMo)
- Calibration: 32 Persian clips from Reza2kn/persian-asr-eval-v0 (held out from eval).
- Hardware target: NVIDIA GPUs with FP8/TensorRT-family runtime support.
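Before loading the checkpoint it can help to check whether the local GPU is in the FP8 hardware class at all. The sketch below is only a rough guide: the 8.9 compute-capability threshold (Ada and Hopper generations) is an assumption about NVIDIA FP8 tensor-core support, not something this model card specifies, and the helper names are hypothetical.

```python
def supports_fp8(capability):
    """Return True if a (major, minor) CUDA compute capability is 8.9 or newer.

    The 8.9 threshold is an assumption, not a requirement stated by this repo.
    """
    return tuple(capability) >= (8, 9)


def current_gpu_supports_fp8():
    """Check GPU 0 with PyTorch; return False if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    return supports_fp8(torch.cuda.get_device_capability(0))


if __name__ == "__main__":
    print("FP8-capable GPU detected:", current_gpu_supports_fp8())
```

A `False` result does not necessarily mean the model fails to load; it only suggests the GPU is unlikely to benefit from FP8 kernels.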
Eval: Reza2kn/persian-asr-eval-v0 (FLEURS-fa, 200 clips)
| Variant | WER ↓ | CER ↓ | per-clip latency | peak VRAM |
|---|---|---|---|---|
| FP base | 18.38% | 6.58% | 31 ms | 588 MiB |
| FP8 (this repo) | 18.48% | 6.69% | 51 ms | 662 MiB |
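For readers who want to reproduce metrics of this kind, here is a minimal sketch of corpus-level WER and CER based on Levenshtein edit distance. It assumes errors are summed over all clips and divided by the total reference length, and it applies no text normalization; the evaluation behind the table may normalize and aggregate differently. All function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate: total word edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    words = sum(len(r.split()) for r in references)
    return errors / words


def cer(references, hypotheses):
    """Character error rate: total character edits divided by total reference characters."""
    errors = sum(edit_distance(list(r), list(h))
                 for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return errors / chars
```

The `hypotheses` list would typically come from `transcribe(...)` on the evaluation clips, with `references` being the matching ground-truth transcripts.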
Usage
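```python
import nemo.collections.asr as nemo_asr

# Restore the quantized checkpoint from a local .nemo file, move it to the GPU,
# and switch to inference mode.
m = nemo_asr.models.ASRModel.restore_from("visualears-fastconformer-fa-full-ab-FP8.nemo").cuda().eval()

# Transcribe a list of audio files; one result is returned per input file.
transcripts = m.transcribe(["clip.wav"])
print(transcripts[0])
```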
License
Inherits the base model's license.
Base Comparison
On the same 200 FLEURS-fa clips, FP8 WER retention vs the FP base was 99.47% and CER retention was 98.34%. Exact normalized transcript match was 54.0%; rough word-position agreement was 93.13%. See validation/fp8_vs_base_eval_summary.json.
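The retention figures appear consistent with dividing the base error rate by the FP8 error rate. That definition is an assumption, not something the card states; the authoritative values are in validation/fp8_vs_base_eval_summary.json. A quick check against the rounded numbers from the eval table, using a hypothetical `retention` helper:

```python
def retention(base_error, quantized_error):
    """Retention in percent, assumed here to be base error / quantized error * 100."""
    return 100.0 * base_error / quantized_error


# Rounded values taken from the eval table (percent).
wer_retention = retention(18.38, 18.48)
cer_retention = retention(6.58, 6.69)

print(f"WER retention: {wer_retention:.2f}%")
print(f"CER retention: {cer_retention:.2f}%")
```

With the rounded table values, both results land within a few hundredths of a percentage point of the reported 99.47% and 98.34%; the small gap is presumably because the reported figures were computed from unrounded error rates.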
Base model
nvidia/stt_fa_fastconformer_hybrid_large