Commit 92a7961 (verified), committed by patricklifixie
1 Parent(s): 50735b6

Update README.md

  1. README.md +10 -1
README.md CHANGED
@@ -102,7 +102,7 @@ Supervised speech instruction finetuning via knowledge-distillation. For more in
 
 ## Evaluation
 
-Evaluations are conducted [big bench audio](https://huggingface.co/blog/big-bench-audio-release) (audio reasoning measured in accuracy), [VoiceBench](https://github.com/MatthewCYM/VoiceBench) (overall score averaged across multiple evaluations), as well as on covost2 (speech translation measured in BLEU), and LibriSpeech (speech recognition measured in WER).
+Evaluations are conducted [big bench audio](https://huggingface.co/blog/big-bench-audio-release) (audio reasoning measured in accuracy) [^1], [VoiceBench](https://github.com/MatthewCYM/VoiceBench) (overall score averaged across multiple evaluations) [^2], as well as on covost2 (speech translation measured in BLEU), and LibriSpeech (speech recognition measured in WER).
 
 ### Audio Reasoning & General Understanding
 | | **v0_7-glm-4_6 w/ reasoning** | **v0_7-glm-4_6 w/o reasoning** | v0_6-llama-3_3-70b | v0_6-gemma-3-27b | v0_6-qwen-3-32b | **gpt4o-audio** |
@@ -121,3 +121,12 @@ Evaluations are conducted [big bench audio](https://huggingface.co/blog/big-benc
 | **covost2 ru_en** | **50.30** | 43.73 | 49.29 | 47.08 |
 | **covost2 zh_en** | **23.85** | 17.81 | 20.88 | 22.24 |
 | **librispeech** | **2.28** | 2.55 | 2.73 | 2.88 |
+
+
+[^1]: VoiceBench prompt used:
+You are a helpful assistant. When answering questions:
+For multiple choice questions (A/B/C/D options):
+- End your response with "The answer is [X]" where X is the letter (A, B, C, or D)
+
+[^2]: BigBenchAudio prompt used:
+You are a helpful assistant. Answer the question at the end of your response.