---
license: cc-by-nc-4.0
base_model: liujiafeng/Khala-MusicGeneration-v1.0
pipeline_tag: text-to-audio
tags:
- music-generation
- audio
- apple-silicon
- mps
- khala
- acoustic-tokens
---

# Khala: Apple Silicon (MPS) pre-converted weights

Pre-converted, **vanilla-PyTorch** weights for running [**Khala**](https://huggingface.co/liujiafeng/Khala-MusicGeneration-v1.0) (a high-fidelity, unified acoustic-token song generator) on **Apple Silicon (MPS) or CPU**, with no NVIDIA / Megatron / TransformerEngine / FlashAttention stack.

These are **format conversions of the original weights**, not a retrain. They are produced by gathering the upstream Megatron `torch_dist` checkpoint and the DAC-RVQ decoder, then renaming tensors to the de-Megatron `KhalaModel` layout. The numerics match the original (backbone greedy decode is bit-identical to the CUDA reference, 64/64 tokens).

## Files

| File | What it is |
|---|---|
| `khala_backbone.safetensors` | Backbone GPT (q0/q1 coarse acoustic tokens), `KhalaModel` naming |
| `khala_superres.safetensors` | Super-resolution GPT (expands to q0…q63), non-causal |
| `decoder_weights.pt` | DAC-RVQ decoder (generator-only), stereo, 64 quantizers |
| `backbone_megatron_args.json` | Backbone config (consumed by `KhalaConfig.from_megatron_args`) |
| `superres_megatron_args.json` | Super-res config |
| `decoder_config.yaml` | Decoder config |

## Usage

Download into the directory the vanilla runtime reads (`_cuda_artifacts/`, or set `KHALA_VANILLA_WEIGHTS`):

```bash
hf download Vinpolar/Khala-MusicGeneration-v1.0-MPS --local-dir _cuda_artifacts
```

Then run the CLI generator or the Mac web stack from the Khala repo (Apple Silicon section of the README):

```bash
# one-off track from the command line
KHALA_BACKEND=vanilla .venv-mac/bin/python -u tools/generate_vanilla.py \
  --duration 3 --tags "upbeat, pop, piano"

# or the full web UI (worker + API)
bash backend/run_backend_mac.sh   # --device cpu also supported
```

## Attribution & license

- Original model, weights, and research: **liujiafeng / the Khala team**, [liujiafeng/Khala-MusicGeneration-v1.0](https://huggingface.co/liujiafeng/Khala-MusicGeneration-v1.0), [paper](https://arxiv.org/abs/2605.01790).
- Released under **CC BY-NC 4.0**, the same license as the original weights. This repository only re-packages those weights for Apple-Silicon inference.
- ⚠️ The upstream team noted (2026-05-07) a possible inference-quality precision issue under investigation; treat generation quality as not yet final.
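
As a quick sanity check after the download step in Usage, the following Python sketch reports which of the six files listed under Files are missing from the weights directory. It is not part of the Khala repo: the helper names `weights_dir` and `missing_files` are hypothetical, and it assumes that `KHALA_VANILLA_WEIGHTS`, when set, holds the path of the weights directory, with `_cuda_artifacts` (relative to the current directory) as the fallback.

```python
import os
from pathlib import Path

# File names copied from the Files table in this README.
REQUIRED_FILES = [
    "khala_backbone.safetensors",
    "khala_superres.safetensors",
    "decoder_weights.pt",
    "backbone_megatron_args.json",
    "superres_megatron_args.json",
    "decoder_config.yaml",
]


def weights_dir() -> Path:
    # Assumption: KHALA_VANILLA_WEIGHTS, when set, names the weights directory;
    # otherwise fall back to _cuda_artifacts relative to the current directory.
    return Path(os.environ.get("KHALA_VANILLA_WEIGHTS", "_cuda_artifacts"))


def missing_files(directory: Path) -> list[str]:
    """Return the required file names that are absent from `directory`."""
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    target = weights_dir()
    missing = missing_files(target)
    if missing:
        print(f"{target}: missing {', '.join(missing)}")
    else:
        print(f"{target}: all {len(REQUIRED_FILES)} files present")
```

Run it from the directory where you ran `hf download`; it only checks that the files exist and does not validate their contents.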