Whisper base Arabic

This model is a fine-tuned version of openai/whisper-base for Arabic speech recognition. It achieves the following results on the evaluation set:

Loss: 0.44
Wer: 34.7
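
A minimal sketch of running the model for transcription, assuming the `transformers` library and an audio backend such as ffmpeg are installed, and that the checkpoint is published under the repository id `YazanSalameh/Whisper-base-Arabic`. The helper name `transcribe` is hypothetical, and pinning the decoding language to Arabic is an assumption about typical use, not a documented requirement of this model.

```python
MODEL_ID = "YazanSalameh/Whisper-base-Arabic"  # repository id as shown on the model page


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Assumption: force Arabic transcription instead of relying on language detection.
    result = asr(audio_path, generate_kwargs={"language": "arabic", "task": "transcribe"})
    return result["text"]
```

Calling `transcribe("sample.wav")` with a local recording (the file name is a placeholder) should return the recognized text as a string. Since the training transcripts had diacritics removed, the output is expected to be undiacritized Arabic.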

Training and evaluation data

Train set:

mozilla-foundation/common_voice_16_0 ar [train+validation]
BelalElhossany/mgb2_audios_transcriptions_non_overlap
nadsoft/Jordan-Audio

Cross-validation set: 600 samples in total, drawn from the three datasets above. The set was kept small to save time during training, because the model was trained on the Colab free tier. Note: evaluate the model's accuracy in whatever way suits your use case.
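
For readers who want to run their own evaluation, the following is a small dependency-free sketch of corpus-level word error rate (WER). It is not necessarily the implementation used to produce the numbers in this card, which does not say how WER was computed; the function names `word_errors` and `wer` are illustrative.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion (reference word missing)
                curr[j - 1] + 1,         # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),  # substitution, or free match
            ))
        prev = curr
    return prev[-1], len(ref)


def wer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level WER in percent: total edits / total reference words."""
    pairs = [word_errors(r, h) for r, h in zip(references, hypotheses)]
    total_errors = sum(errors for errors, _ in pairs)
    total_words = sum(words for _, words in pairs)
    return 100.0 * total_errors / total_words
```

For a fair comparison with the reported figures, it is probably sensible to strip diacritics from the reference transcripts before scoring, since the model was trained on undiacritized text.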

Training procedure

Arabic diacritics (حركات, harakat) were removed from the transcripts. The model was trained on the combined dataset for 6 epochs; the fifth epoch gave the best results, so the released model is the epoch-5 checkpoint.
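
The diacritic removal step could look roughly like the sketch below. The exact set of characters the author stripped is not stated, so the range used here (U+064B to U+0652 plus the superscript alef U+0670) is an assumption, and `strip_harakat` is a hypothetical helper name.

```python
import re

# Assumed character set: tanween, short vowels, shadda and sukun
# (U+064B to U+0652) plus the superscript alef (U+0670).
HARAKAT = re.compile("[\u064B-\u0652\u0670]")


def strip_harakat(text: str) -> str:
    """Remove Arabic diacritical marks while leaving the base letters intact."""
    return HARAKAT.sub("", text)


print(strip_harakat("مَرْحَبًا"))  # مرحبا
```

Applying the same normalization to reference transcripts at evaluation time keeps the comparison consistent with how the model was trained.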

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 1
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
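
To make the `linear` scheduler with 500 warmup steps concrete, here is a sketch of the learning rate as a function of the optimizer step: it rises linearly from 0 to `learning_rate` during warmup, then decays linearly to 0. The total step count is an assumption (6 epochs at the 1437 steps per epoch shown in the training results), and `linear_lr` is an illustrative name, not part of any library.

```python
BASE_LR = 1e-5          # learning_rate from the list above
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 6 * 1437  # assumption: 6 epochs at 1437 optimizer steps per epoch


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: ramp from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)
```

Under these assumptions the peak rate of 1e-05 is reached at step 500, roughly a third of the way through the first epoch.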

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.4603	1	1437	0.4931	45.8857
0.2867	2	2874	0.4493	36.9973
0.2494	3	4311	0.4219	43.5553
0.1435	4	5748	0.4408	40.2351
0.1345	5	7185	0.4407	34.7081
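
The choice of checkpoint can be checked against the table with a few lines of Python. This sketch assumes "best" means lowest WER on the validation set; the card does not state which criterion was used.

```python
# (epoch, step, training loss, validation loss, WER) copied from the table above
RESULTS = [
    (1, 1437, 0.4603, 0.4931, 45.8857),
    (2, 2874, 0.2867, 0.4493, 36.9973),
    (3, 4311, 0.2494, 0.4219, 43.5553),
    (4, 5748, 0.1435, 0.4408, 40.2351),
    (5, 7185, 0.1345, 0.4407, 34.7081),
]

best_by_wer = min(RESULTS, key=lambda row: row[4])
best_by_loss = min(RESULTS, key=lambda row: row[3])

print(f"lowest WER: epoch {best_by_wer[0]} ({best_by_wer[4]})")
print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[3]})")
```

The two criteria disagree: epoch 5 has the lowest WER, while epoch 3 has the lowest validation loss. With only 600 validation samples, differences between epochs should be read with some caution.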

Model details

Model: YazanSalameh/Whisper-base-Arabic
Base model: openai/whisper-base
Format: Safetensors
Model size: 72.6M params
Tensor type: F32
