nadsoft/Jordan-Audio
How to use YazanSalameh/Whisper-base-Arabic with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="YazanSalameh/Whisper-base-Arabic")

# Load model directly
# AutoModelForSpeechSeq2Seq is the auto class that covers Whisper checkpoints
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
processor = AutoProcessor.from_pretrained("YazanSalameh/Whisper-base-Arabic")
model = AutoModelForSpeechSeq2Seq.from_pretrained("YazanSalameh/Whisper-base-Arabic")

It achieves the following results on the evaluation set:
Train set:
Cross-validation set: 600 samples in total, drawn from the 3 sets. It was kept small to save time during training, since the model was trained on the Colab free tier. Note: evaluate accuracy in whatever way you see fit.
Arabic diacritics (حركات) were removed from the texts. The model was trained on the combined dataset for 6 epochs; the fifth epoch gave the best result, so the released model is essentially the epoch-5 checkpoint.
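
The diacritic removal mentioned above can be sketched roughly as follows. This is a minimal illustration, not the preprocessing script used for this model: the card does not say which marks were stripped, so the character range (U+064B to U+0652, covering the short-vowel marks, tanwin, shadda, and sukun) and the helper name `strip_harakat` are assumptions.

```python
import re

# Assumption: "harakat" here means the Arabic marks U+064B..U+0652
# (fathatan through sukun). The model card does not list the exact set.
HARAKAT = re.compile(r"[\u064B-\u0652]")


def strip_harakat(text: str) -> str:
    """Return `text` with the assumed set of Arabic diacritics removed."""
    return HARAKAT.sub("", text)


# Usage: a fully vocalized word is reduced to its bare letters.
vocalized = "\u0645\u064E\u0631\u0652\u062D\u064E\u0628\u064B\u0627"
bare = strip_harakat(vocalized)
```

If reference transcripts are normalized this way before training, applying the same normalization to references at evaluation time keeps the comparison consistent.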
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.4603 | 1 | 1437 | 0.4931 | 45.8857 |
| 0.2867 | 2 | 2874 | 0.4493 | 36.9973 |
| 0.2494 | 3 | 4311 | 0.4219 | 43.5553 |
| 0.1435 | 4 | 5748 | 0.4408 | 40.2351 |
| 0.1345 | 5 | 7185 | 0.4407 | 34.7081 |
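
For readers who want to check accuracy themselves, word error rate can be computed with a word-level edit distance. The sketch below is only an illustration under assumptions: the card does not say which tool produced the last column of the table, the values are assumed to be percentages, and `word_error_rate` is a hypothetical helper that scores a single utterance with whitespace tokenization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate in percent for one reference/hypothesis pair.

    WER = (substitutions + deletions + insertions) / reference word count,
    computed with a word-level Levenshtein distance.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Usage: one substituted word and one dropped word out of four.
score = word_error_rate("a b c d", "a x c")
```

A corpus-level score is usually obtained by summing the edit counts and reference lengths over all utterances before dividing, rather than averaging per-utterance scores.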