---
language:
- fa
license: apache-2.0
library_name: nemo
tags:
- automatic-speech-recognition
- speech
- persian
- farsi
- nvfp4
- nemo
- modelopt
- quantized
base_model:
- Reza2kn/visualears-fastconformer-fa-full-ab
base_model_relation: quantized
datasets:
- Reza2kn/persian-asr-eval-v0
metrics:
- wer
- cer
pipeline_tag: automatic-speech-recognition
---

# visualears-fastconformer-fa-full-ab-nvfp4

NVFP4 (W4A4) post-training quantization of [`Reza2kn/visualears-fastconformer-fa-full-ab`](https://huggingface.co/Reza2kn/visualears-fastconformer-fa-full-ab) via NVIDIA `modelopt`.

- **Base architecture:** EncDecHybridRNNTCTCBPEModel (NeMo)
- **Calibration:** 32 Persian clips from `Reza2kn/persian-asr-eval-v0` (held out from eval).
- **Hardware target:** NVIDIA Blackwell tensor cores.

## Eval — `Reza2kn/persian-asr-eval-v0` (FLEURS-fa, 200 clips)

| Variant | WER ↓ | CER ↓ | per-clip latency | peak VRAM |
|---|---|---|---|---|
| **NVFP4 (this repo)** | **20.33%** | **7.38%** | 48 ms | 603 MiB |

## Usage

```python
import nemo.collections.asr as nemo_asr
m = nemo_asr.models.ASRModel.restore_from("visualears-fastconformer-fa-full-ab-NVFP4.nemo").cuda().eval()
transcripts = m.transcribe(["clip.wav"])
print(transcripts[0])
```

## License

Inherits the base model's license.