How to use MaddoggProduction/whisper-m-quran-lora-dataset-mix with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="MaddoggProduction/whisper-m-quran-lora-dataset-mix")

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM
processor = AutoProcessor.from_pretrained("MaddoggProduction/whisper-m-quran-lora-dataset-mix")
model = AutoModelForMultimodalLM.from_pretrained("MaddoggProduction/whisper-m-quran-lora-dataset-mix")

This is a specialized Automatic Speech Recognition (ASR) model for Quranic recitation that transcribes with tashkeel (diacritics). It is a fine-tuned version of openai/whisper-medium, optimized to recognize Quranic Arabic with high accuracy while maintaining robustness across different recording conditions.
tarteel-ai/everyayah validation set.
The model was trained using LoRA (Low-Rank Adaptation) in a multi-stage curriculum learning process to ensure stability and precision.
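To give a rough sense of why LoRA keeps fine-tuning lightweight, the sketch below counts the trainable parameters LoRA adds to a single square projection matrix. LoRA freezes the original weight W and learns a low-rank update B·A. The width of 1024 matches whisper-medium, but the rank of 8 is a hypothetical value chosen for illustration; this card does not state the rank or target modules actually used.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_model = 1024  # hidden width of whisper-medium
rank = 8        # hypothetical LoRA rank (assumption, not taken from this card)

full = d_model * d_model                              # weights in one frozen projection
lora = lora_trainable_params(d_model, d_model, rank)  # weights LoRA trains instead
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  fraction: {fraction:.2%}")
```

Under these assumptions, only a small fraction of each adapted matrix is trained, while the base Whisper weights stay frozen.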
The training utilized a mix of professional and diverse recitations from two primary sources:
This model is fully compatible with the Hugging Face transformers pipeline. For longer verses, chunking is recommended to maintain context.
from transformers import pipeline

# Load the pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model="MaddoggProduction/whisper-m-quran-lora-dataset-mix",
    device=0,  # for GPU usage, -1 for CPU
)

# Transcribe audio (chunking enabled for long verses)
result = pipe(
    "path_to_audio.mp3",
    chunk_length_s=30,  # Critical for long verses like 2:282, to avoid hallucinations
    stride_length_s=5,
    batch_size=8,
    return_timestamps=True,
)
print(result["text"])
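To make the chunking settings above more concrete, here is a simplified sketch of how a long recording could be split into overlapping windows. It assumes the stride is applied on both sides of each chunk, so consecutive windows advance by chunk_length_s - 2 * stride_length_s. The helper chunk_windows is hypothetical and written only for illustration; it is not part of the transformers API, and the real pipeline works on audio samples rather than seconds.

```python
def chunk_windows(total_s, chunk_length_s=30.0, stride_length_s=5.0):
    """Return (start, end) times in seconds for overlapping chunks (illustrative only)."""
    step = chunk_length_s - 2 * stride_length_s  # how far each new window advances
    if step <= 0:
        raise ValueError("chunk_length_s must be larger than 2 * stride_length_s")

    windows = []
    start = 0.0
    while True:
        end = start + chunk_length_s
        windows.append((start, min(end, total_s)))  # clamp the final window
        if end >= total_s:
            break
        start += step
    return windows


# A 70-second recitation with the settings used above
for start, end in chunk_windows(70.0):
    print(f"{start:5.1f}s -> {end:5.1f}s")
```

With these settings, neighbouring windows share 10 seconds of audio, which gives the decoder context on both sides of each boundary when the chunk transcriptions are merged.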
Base model
openai/whisper-medium