Commit 696961a (verified), committed by YingxuHe · 1 parent: 222586f

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -86,7 +86,8 @@ We benchmark MERaLiON-2 series models with extended [AudioBench benchmark](https
 
 **Better Automatic Speech Recognition (ASR) Accuracy**
 
-MERaLiON-2-10B-ASR and MERaLiON-2-10B demonstrate leading performance in Singlish, Mandarin, Malay, Tamil, and other Southeast Asian languages, while maintaining competitive results in English compared to `Whisper-large-v3`. The following table shows the average transcription `Word Error Rate` by language for the MERaLiON family and other leading AudioLLMs. The `Private Dataset` includes a collection of Singapore's locally accented speeches with code-switch.
+MERaLiON-2-10B-ASR and MERaLiON-2-10B demonstrate leading performance in Singlish, Mandarin, Malay, Tamil, and other Southeast Asian languages, while maintaining competitive results in English compared to `Whisper-large-v3`. The following table shows the average transcription `Word Error Rate` by language for the MERaLiON family and other leading AudioLLMs. The `Private Dataset` includes a collection of Singapore's locally accented speeches with code-switch.
+Please visit [AudioBench benchmark](https://huggingface.co/spaces/MERaLiON/AudioBench-Leaderboard) for dataset-level evaluation results.
 
 <style type="text/css">
 #T_0910c th {
@@ -265,6 +266,7 @@ MERaLiON-2-10B-ASR and MERaLiON-2-10B demonstrate leading performance in Singlis
 **Better Instruction Following and Audio Understanding**
 
 **MERaLiON-2-10B** exhibits substantial advancements in speech and audio understanding, as well as paralinguistic tasks. Notably, it adeptly handles complex instructions and responds with enhanced flexibility, effectively preserving the pre-trained knowledge from Gemma during the audio fine-tuning process. This capability enables MERaLiON-2-10B to provide detailed explanations regarding speech content and the speaker's emotional state. Furthermore, with appropriate prompt adjustments, the model can assume various roles, such as a voice assistant, virtual caregiver, or an integral component of sophisticated multi-agent AI systems and software solutions.
+Please visit [AudioBench benchmark](https://huggingface.co/spaces/MERaLiON/AudioBench-Leaderboard) for dataset-level evaluation results.
 
 <style type="text/css">
 #T_b6ba8 th {