fixie-ai/ultravox-v0_7-glm-4_6

Audio-Text-to-Text

image-feature-extraction


zqhuang committed on Dec 2, 2025

Commit aec6738 · verified · 1 Parent(s): dde1ba6

Update README.md

Files changed (1)

README.md +3 -3

README.md CHANGED

@@ -128,14 +128,14 @@ Supervised speech instruction finetuning via knowledge-distillation. For more in
 Evaluations are conducted [big bench audio](https://huggingface.co/blog/big-bench-audio-release) (audio reasoning measured in accuracy), [VoiceBench](https://github.com/MatthewCYM/VoiceBench) (overall score averaged across multiple evaluations), as well as on covost2 (speech translation measured in BLEU), and LibriSpeech (speech recognition measured in WER).
 ### Audio Reasoning & General Understanding
-| | **v0_7-glm w/ reasoning** | **v0_7-glm w/o reasoning** | v0_6-llama-3_3-70b | v0_6-gemma-3-27b | v0_6-qwen-3-32b | **gpt4o-audio** |
+| | **v0_7-glm-4_6 w/ reasoning** | **v0_7-glm-4_6 w/o reasoning** | v0_6-llama-3_3-70b | v0_6-gemma-3-27b | v0_6-qwen-3-32b | **gpt4o-audio** |
 | --- | ---: | ---: | ---: | ---: | ---: | ---: |
 | **big bench audio** | **97.00** |  **91.80** | 85.48 | 83.84 | 84.22 | 82.80 |
 | **voicebench overall** | **90.75** | **87.05** | 81.81 | – | – | 86.75 |
-### Speech Understanding
-| | **v0_7-glm** | v0_6-llama-3_3-70b | v0_6-gemma-3-27b | v0_6-qwen-3-32b |
+### Speech Translation & Recognition
+| | **v0_7-glm-4_6** | v0_6-llama-3_3-70b | v0_6-gemma-3-27b | v0_6-qwen-3-32b |
 | --- | ---: | ---: | ---: | ---: |
 | **covost2 en_ar** | **22.89** | 18.92 | 22.68 | 16.91 |
 | **covost2 en_ca** | **41.48** | 38.73 | 39.67 | 33.63 |