MERaLiON/MERaLiON-2-10B

Automatic Speech Recognition

YingxuHe committed on May 28, 2025

Commit 696961a · verified · 1 Parent(s): 222586f

Update README.md

Files changed (1)

README.md +3 -1

@@ -86,7 +86,8 @@ We benchmark MERaLiON-2 series models with extended [AudioBench benchmark](https
 **Better Automatic Speech Recognition (ASR) Accuracy**
-MERaLiON-2-10B-ASR and MERaLiON-2-10B demonstrate leading performance in Singlish, Mandarin, Malay, Tamil, and other Southeast Asian languages, while maintaining competitive results in English compared to `Whisper-large-v3`. The following table shows the average transcription `Word Error Rate` by language for the MERaLiON family and other leading AudioLLMs. The `Private Dataset` includes a collection of Singapore's locally accented speeches with code-switch.
 <style type="text/css">
 #T_0910c th {
@@ -265,6 +266,7 @@ MERaLiON-2-10B-ASR and MERaLiON-2-10B demonstrate leading performance in Singlis
 **Better Instruction Following and Audio Understanding**
 **MERaLiON-2-10B** exhibits substantial advancements in speech and audio understanding, as well as paralinguistic tasks. Notably, it adeptly handles complex instructions and responds with enhanced flexibility, effectively preserving the pre-trained knowledge from Gemma during the audio fine-tuning process. This capability enables MERaLiON-2-10B to provide detailed explanations regarding speech content and the speaker's emotional state. Furthermore, with appropriate prompt adjustments, the model can assume various roles, such as a voice assistant, virtual caregiver, or an integral component of sophisticated multi-agent AI systems and software solutions.
 <style type="text/css">
 #T_b6ba8 th {

 **Better Automatic Speech Recognition (ASR) Accuracy**
+MERaLiON-2-10B-ASR and MERaLiON-2-10B demonstrate leading performance in Singlish, Mandarin, Malay, Tamil, and other Southeast Asian languages, while maintaining competitive results in English compared to `Whisper-large-v3`. The following table shows the average transcription `Word Error Rate` by language for the MERaLiON family and other leading AudioLLMs. The `Private Dataset` includes a collection of Singapore's locally accented speeches with code-switch.
+Please visit [AudioBench benchmark](https://huggingface.co/spaces/MERaLiON/AudioBench-Leaderboard) for dataset-level evaluation results.
 <style type="text/css">
 #T_0910c th {
 **Better Instruction Following and Audio Understanding**
 **MERaLiON-2-10B** exhibits substantial advancements in speech and audio understanding, as well as paralinguistic tasks. Notably, it adeptly handles complex instructions and responds with enhanced flexibility, effectively preserving the pre-trained knowledge from Gemma during the audio fine-tuning process. This capability enables MERaLiON-2-10B to provide detailed explanations regarding speech content and the speaker's emotional state. Furthermore, with appropriate prompt adjustments, the model can assume various roles, such as a voice assistant, virtual caregiver, or an integral component of sophisticated multi-agent AI systems and software solutions.
+Please visit [AudioBench benchmark](https://huggingface.co/spaces/MERaLiON/AudioBench-Leaderboard) for dataset-level evaluation results.
 <style type="text/css">
 #T_b6ba8 th {