Whisper Medium Quran (LoRA Fine-Tuned)

This is a specialized automatic speech recognition (ASR) model for Quranic recitation that transcribes with tashkeel (diacritics). It is a fine-tuned version of openai/whisper-medium, optimized to recognize Quranic Arabic accurately while remaining robust across different recording conditions.

Model Performance

Word Error Rate (WER): 12.69% on the tarteel-ai/everyayah validation set.
Accuracy: The model captures Quranic vocabulary and the nuances of Uthmani script with high precision.
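
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate can be computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical, and this card does not state which tool or text normalization produced the reported figure, so this illustrates the metric rather than the exact evaluation script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Words are compared as exact strings, diacritics included.
reference = "بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ"
hypothesis = "بِسْمِ اللَّهِ الرَّحْمَٰنِ"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Because words are compared as exact strings, a single wrong diacritic makes the whole word count as an error under this definition, which is worth keeping in mind when comparing against models evaluated on undiacritized text.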

Training Details

The model was trained using LoRA (Low-Rank Adaptation) in a multi-stage curriculum learning process to ensure stability and precision.
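
To give a sense of why LoRA is attractive for a model of this size, here is a rough sketch of the parameter arithmetic for a single square projection matrix. LoRA keeps the original weight frozen and trains two low-rank factors instead. The width of 1024 is whisper-medium's hidden size; the rank of 8 is an assumption for illustration only, since this card does not state the rank or which layers were adapted.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_model = 1024  # hidden width of whisper-medium
rank = 8        # assumed for illustration; not stated in this card

full = d_model * d_model                             # frozen weight matrix
lora = lora_trainable_params(d_model, d_model, rank)  # trained adapter weights

print(f"full matrix:  {full} parameters")
print(f"LoRA factors: {lora} parameters ({lora / full:.4f} of the full matrix)")
```

Under these assumptions the adapter trains 16,384 parameters per adapted matrix against 1,048,576 in the frozen weight.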

Datasets

Training used a mix of professional and diverse recitations from two primary sources:

MohamedRashad/Quran-Recitations
tarteel-ai/everyayah (highly diverse professional recitations)

Methodology

Curriculum Learning: The model was trained gradually across these datasets to refine its understanding of Tajweed and Quranic sentence structures.
Data Augmentation: To ensure the model remains robust against real-world conditions (non-studio microphones, background noise, varying volumes), diverse audio augmentations including gain adjustments and spectral masking were applied during the training process.
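
The two augmentations named above can be sketched in plain Python as follows. This is an illustration under assumptions, not the training code: the function names, the decibel range, the mask widths, and the choice of zero as the mask value are all hypothetical, and "spectral masking" is interpreted here as SpecAugment-style masking of one frequency band and one time band. Plain lists are used to keep the sketch dependency-free; an actual pipeline would more likely operate on arrays or tensors.

```python
import random


def apply_gain(samples, gain_db):
    """Scale a waveform by gain_db decibels and clip to [-1.0, 1.0]."""
    factor = 10 ** (gain_db / 20)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Zero one random band of frequency bins and one random band of frames.

    spec is a list of rows, one per frequency bin, each with one value per frame.
    Returns a masked copy; the input is left unchanged.
    """
    n_freq, n_time = len(spec), len(spec[0])
    f_width = rng.randint(0, min(max_freq_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    t_width = rng.randint(0, min(max_time_width, n_time))
    t_start = rng.randint(0, n_time - t_width)

    masked = [row[:] for row in spec]
    for f in range(n_freq):
        in_freq_band = f_start <= f < f_start + f_width
        for t in range(n_time):
            in_time_band = t_start <= t < t_start + t_width
            if in_freq_band or in_time_band:
                masked[f][t] = 0.0  # assumed mask value
    return masked


rng = random.Random(0)                               # seeded for reproducibility
waveform = [0.1, -0.2, 0.3, -0.4]                    # toy audio samples
louder = apply_gain(waveform, rng.uniform(-6, 6))    # assumed gain range in dB
features = [[1.0] * 10 for _ in range(8)]            # toy 8-bin x 10-frame spectrogram
augmented = mask_spectrogram(features, max_freq_width=3, max_time_width=4, rng=rng)
```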

Usage

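This model is fully compatible with the Hugging Face transformers pipeline. Whisper processes audio in 30-second windows, so for verses longer than that, chunked inference with overlapping strides is recommended so that context is preserved across chunk boundaries.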

from transformers import pipeline

# Load the pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model="MaddoggProduction/whisper-m-quran-lora-dataset-mix",
    device=0,  # GPU index; use -1 for CPU
)

# Transcribe audio (chunking enabled for long verses)
result = pipe(
    "path_to_audio.mp3",
    chunk_length_s=30,   # critical for long verses like 2:282, to avoid hallucinations
    stride_length_s=5,   # overlap between neighbouring chunks, in seconds
    batch_size=8,
    return_timestamps=True
)

print(result["text"])
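
Since the call above sets return_timestamps=True, the returned dictionary also carries a "chunks" list, where each entry holds a "timestamp" pair and a "text" string. The helper below is a small sketch for printing those segments; the name `format_chunks` and the sample dictionary are hypothetical, with placeholder text standing in for a real transcription.

```python
def format_chunks(result):
    """Turn a pipeline result with timestamps into '[start - end] text' lines."""

    def label(seconds):
        # A boundary can be None when no timestamp was predicted for it.
        return f"{seconds:.2f}s" if seconds is not None else "?"

    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{label(start)} - {label(end)}] {chunk['text'].strip()}")
    return lines


# Hypothetical result in the shape the pipeline returns with timestamps enabled.
sample_result = {
    "text": " first segment second segment",
    "chunks": [
        {"timestamp": (0.0, 4.2), "text": " first segment"},
        {"timestamp": (4.2, None), "text": " second segment"},
    ],
}

for line in format_chunks(sample_result):
    print(line)
```

With a real transcription, pass the `result` returned by the pipeline in place of `sample_result`.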


Model size: 0.8B params (Safetensors, tensor type F16)
