VibeVoice 0.5B — streaming TTS (Core AI)

Real-time streaming text-to-speech built on a 0.5B Qwen2.5 backbone.

Source & export pipeline: github.com/gafiatulin/vibevoice-coreai

On-device performance (M4 Max, Core AI): 10.2× RTF.
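The arithmetic behind a figure like this can be sketched as follows. This assumes the 10.2× number means seconds of audio produced per second of wall-clock time (a speedup over real time); the card does not state the definition, and the function names are illustrative only.

```python
def realtime_speedup(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio generated per second of wall-clock time.

    Assumption: this is the quantity the card reports as "10.2x RTF".
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


def estimated_synthesis_time(audio_seconds: float, speedup: float = 10.2) -> float:
    """Wall-clock time needed to synthesize `audio_seconds` of speech."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


if __name__ == "__main__":
    # Under the assumption above, 30 s of speech takes roughly 2.94 s to generate.
    print(f"{estimated_synthesis_time(30.0):.2f} s")
```

Any value above 1.0 means audio is generated faster than it plays back, which is the condition streaming playback needs to avoid underruns.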

⚠️ Beta artifacts. These .aimodel bundles are compiled for macOS 27 / Xcode 27 beta (Core AI). They may need re-export on the GA toolchain. The original weights are Microsoft VibeVoice (see upstream for the model license).

Layout

vibevoice-0.5b-coreai/
  manifest.json        # role → {variant: path} + recommended flags
  embed_tokens.f16     # host-side embed table
  tokenizer/           # tokenizer files
  lm/model-int4.aimodel/
  lm/model.aimodel/
  diffusion/diffusion-head.aimodel/
  diffusion/fused-sampler.aimodel/
  codec/acoustic-decoder-streaming.aimodel/
  codec/acoustic-decoder.aimodel/
  connector/acoustic-connector.aimodel/
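After downloading, a quick sanity check that the directory matches the layout above might look like this. It is a minimal sketch: the entry list is copied from the tree, and `missing_entries` is a hypothetical helper, not part of the export pipeline.

```python
from pathlib import Path

# Relative paths copied from the layout listing above.
EXPECTED_ENTRIES = [
    "manifest.json",
    "embed_tokens.f16",
    "tokenizer",
    "lm/model-int4.aimodel",
    "lm/model.aimodel",
    "diffusion/diffusion-head.aimodel",
    "diffusion/fused-sampler.aimodel",
    "codec/acoustic-decoder-streaming.aimodel",
    "codec/acoustic-decoder.aimodel",
    "connector/acoustic-connector.aimodel",
]


def missing_entries(root: str) -> list[str]:
    """Return the expected entries that are absent under `root`.

    Path.exists() is true for both files and directories, so it covers
    plain files as well as the .aimodel bundles.
    """
    base = Path(root)
    return [rel for rel in EXPECTED_ENTRIES if not (base / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("vibevoice-0.5b-coreai")
    print("missing:", missing if missing else "nothing")
```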

Roles

Resolve assets by role via manifest.json (default = recommended variant):

{
  "lm": {
    "default": "lm/model.aimodel",
    "int4": "lm/model-int4.aimodel"
  },
  "diffusion": {
    "default": "diffusion/fused-sampler.aimodel",
    "per_step": "diffusion/diffusion-head.aimodel"
  },
  "acoustic_decoder": {
    "default": "codec/acoustic-decoder.aimodel",
    "streaming": "codec/acoustic-decoder-streaming.aimodel"
  },
  "acoustic_connector": {
    "default": "connector/acoustic-connector.aimodel"
  }
}
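Role resolution can be sketched in a few lines. The mapping below is copied from the JSON above; where exactly it sits inside manifest.json (top level or under a key) is not shown here, so the sketch takes the role table as an argument. `resolve_asset` is a hypothetical name.

```python
from pathlib import Path

# Role table as shown above; load it from manifest.json in real use.
ROLES = {
    "lm": {
        "default": "lm/model.aimodel",
        "int4": "lm/model-int4.aimodel",
    },
    "diffusion": {
        "default": "diffusion/fused-sampler.aimodel",
        "per_step": "diffusion/diffusion-head.aimodel",
    },
    "acoustic_decoder": {
        "default": "codec/acoustic-decoder.aimodel",
        "streaming": "codec/acoustic-decoder-streaming.aimodel",
    },
    "acoustic_connector": {
        "default": "connector/acoustic-connector.aimodel",
    },
}


def resolve_asset(roles: dict, root: str, role: str, variant: str = "default") -> Path:
    """Map (role, variant) to a path under the bundle root."""
    if role not in roles:
        raise KeyError(f"unknown role: {role!r}")
    variants = roles[role]
    if variant not in variants:
        raise KeyError(f"role {role!r} has no variant {variant!r}")
    return Path(root) / variants[variant]


if __name__ == "__main__":
    print(resolve_asset(ROLES, "vibevoice-0.5b-coreai", "lm"))
    print(resolve_asset(ROLES, "vibevoice-0.5b-coreai", "lm", "int4"))
```

Raising on an unknown variant, rather than silently falling back to the default, makes a typo in a variant name visible immediately.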

Recommended flags

{
  "fused": true,
  "steps": 10,
  "defer_decode": true,
  "decode_compute": "gpu"
}
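One plausible way to turn these flags into variant choices is sketched below. The flag-to-variant mapping is an assumption inferred from the names only (fused selects the fused sampler over the per-step diffusion head; defer_decode selects the non-streaming decoder); the card does not document it, and `pick_variants` is a hypothetical helper.

```python
# Flags copied from the block above.
RECOMMENDED_FLAGS = {
    "fused": True,
    "steps": 10,
    "defer_decode": True,
    "decode_compute": "gpu",
}


def pick_variants(flags: dict) -> dict:
    """Choose a manifest variant name for each role.

    Assumptions (not confirmed by the card):
      - fused=True        -> diffusion "default" (fused-sampler), else "per_step"
      - defer_decode=True -> acoustic_decoder "default", else "streaming"
    "steps" and "decode_compute" are runtime settings and do not affect
    which bundle is loaded in this sketch.
    """
    return {
        "lm": "default",
        "diffusion": "default" if flags.get("fused", True) else "per_step",
        "acoustic_decoder": "default" if flags.get("defer_decode", True) else "streaming",
        "acoustic_connector": "default",
    }


if __name__ == "__main__":
    print(pick_variants(RECOMMENDED_FLAGS))
```

With the recommended flags every role resolves to its default variant, which is consistent with the manifest treating default as the recommended choice.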


Model tree for gafiatulin/vibevoice-0.5b-coreai

Base model: Qwen/Qwen2.5-0.5B
Finetuned: microsoft/VibeVoice-Realtime-0.5B
Finetuned: this model
