---
license: apache-2.0
library_name: transformers
pipeline_tag: text-to-audio
language:
- en
tags:
- feature-extraction
- audio
- music
- text2music
- custom_code
- ace-step
- acestep
- lora
- loha
- music-generation
- xl-base
- pop
- electro
base_model: ACE-Step/acestep-v15-xl-base
---

# ACE-Step v1.5 xl-base: Pop/Electro LoHA (1.3k tracks)

LoHA adapter trained on 1.3k pop/electro tracks against the [ACE-Step v1.5 xl-base](https://huggingface.co/ACE-Step/acestep-v15-xl-base) checkpoint. Trained with [Side-Step](https://github.com/koda-dernet/Side-Step) following the maintainer's [recommended config](https://github.com/koda-dernet/Side-Step/issues/57).

![Training loss curve](loss_curve.png)

- **Final**: epoch 100, MA5 **0.7955** (best of run)
- **Dataset**: 1.3k pop/electro tracks (same as [Nekochu/stable-audio-open-1.0-Music](https://huggingface.co/Nekochu/stable-audio-open-1.0-Music))
- **~1/3 common caption tags**: Chill & Relax, Feel-Good Vibes, Pop Dance, Slow Down & Relax, EDM, Pop, Workout Beats, Chill Vibes, Alt Z, Electro House
- **Adapter**: LoHA dim=128, alpha=256, target-mlp
- **Trainable**: 652M params (11.6%)
- **Hardware**: RTX 5090, ~24-30 min/epoch, ~48 hrs total
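To make the adapter settings concrete, here is a minimal sketch of what `dim=128, alpha=256` implies for a single linear layer. It assumes the usual LyCORIS-style LoHA formulation, where the weight update is the Hadamard product of two low-rank products scaled by `alpha / dim`. The layer shape below is hypothetical and chosen only for illustration; it is not taken from the xl-base architecture.

```python
def loha_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the LoHA update (assumed to be alpha / rank)."""
    return alpha / rank


def loha_linear_params(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters LoHA adds to one linear layer.

    Assumed update form:
        delta_W = (A1 @ B1) * (A2 @ B2) * (alpha / rank)
    with A1, A2 of shape (out_features, rank) and B1, B2 of shape
    (rank, in_features); `*` is the element-wise (Hadamard) product.
    """
    return 2 * rank * (in_features + out_features)


RANK, ALPHA = 128, 256  # values from this card

# Hypothetical MLP projection shape, for illustration only.
IN_FEATURES, OUT_FEATURES = 2048, 8192

scale = loha_scale(ALPHA, RANK)
adapter_params = loha_linear_params(IN_FEATURES, OUT_FEATURES, RANK)
full_params = IN_FEATURES * OUT_FEATURES

print(f"scale = {scale}")
print(f"adapter params = {adapter_params:,} vs full layer = {full_params:,}")
```

With these settings the update is scaled by 2.0, and the adapter cost grows linearly in `in_features + out_features` rather than with their product.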

## Pick your epoch

Multiple checkpoints are provided. All samples use the same prompt and seed (Alan Walker "Alone", 60s). Listen and pick the one whose tone you prefer.

| Epoch | Folder | Sample | Notes |
|---|---|---|---|
| base (no LoRA) | n/a | | |
| | | | Clearest, most "different" tone |

## Inference

The ACE-Step Gradio UI does **not** load LyCORIS LoHA adapters directly (its adapter loader is PEFT-only). Use the ACE-Step 1.5 REST API instead:

```bash
git clone https://github.com/ace-step/ACE-Step-1.5 && cd ACE-Step-1.5
uv run acestep-api  # starts REST server

# In another terminal:
curl -X POST http://localhost:8000/v1/lora/load \
  -H "Content-Type: application/json" \
  -d '{"lora_path": "/abs/path/to/lora_ep100/loha_weights.safetensors", "adapter_name": "pop_electro"}'
```
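The same load request can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint, port, and JSON field names shown in the curl call, and that the server accepts a JSON body. The helper names `build_lora_load_request` and `load_lora` are hypothetical and not part of ACE-Step.

```python
import json
import urllib.request

# Assumed endpoint, copied from the curl example.
API_URL = "http://localhost:8000/v1/lora/load"


def build_lora_load_request(
    lora_path: str, adapter_name: str, url: str = API_URL
) -> urllib.request.Request:
    """Build (but do not send) the POST request that loads an adapter."""
    payload = json.dumps(
        {"lora_path": lora_path, "adapter_name": adapter_name}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def load_lora(lora_path: str, adapter_name: str, timeout: float = 60.0) -> str:
    """Send the request; requires a running acestep-api server."""
    req = build_lora_load_request(lora_path, adapter_name)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Once the server is up, calling `load_lora("/abs/path/to/lora_ep100/loha_weights.safetensors", "pop_electro")` should be equivalent to the curl command.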

Or skip the REST server and load the adapter in-process via `AceStepHandler.add_lora()`, as in this Python inference script:

```python
from acestep.handler import AceStepHandler
from acestep.llm_inference import LLMHandler
from acestep.inference import GenerationParams, GenerationConfig, generate_music

dit = AceStepHandler()
dit.initialize_service(
    project_root="/path/to/ACE-Step-1.5",
    config_path="acestep-v15-xl-base",
    device="cuda",
)
dit.add_lora(
    lora_path="/abs/path/to/lora_ep100/loha_weights.safetensors",
    adapter_name="pop_electro",
)

llm = LLMHandler()
llm.initialize(
    checkpoint_dir="/path/to/checkpoints",
    lm_model_path="acestep-5Hz-lm-4B",
    backend="pt",
    device="cuda",
)

params = GenerationParams(
    task_type="text2music",
    caption="...",
    lyrics="...",
    bpm=97,
    keyscale="F Major",
    timesignature="4",
    vocal_language="en",
    duration=60.0,
    inference_steps=50,
    guidance_scale=7.0,
    thinking=True,
    infer_method="ode",
)
config = GenerationConfig(audio_format="wav", batch_size=1, seeds=[4178637441])

result = generate_music(dit, llm, params, config, save_dir="./output")
```

## Side-Step training command

```bash
uv run sidestep --yes --plain train \
  -d "../tmp/xl_tensors" \
  --checkpoint-dir "../tmp/checkpoints" \
  --model xl-base \
  --adapter loha \
  --loha-linear-dim 128 \
  --loha-linear-alpha 256 \
  --target-mlp \
  --batch-size 1 \
  --gradient-accumulation 8 \
  --chunk-duration 120 \
  --precision fp32 \
  --lr 5e-5 \
  --warmup-steps 300 \
  --optimizer-type adamw8bit \
  --scheduler-type cosine \
  --gradient-checkpointing \
  --gradient-checkpointing-ratio 0.5 \
  --epochs 100 \
  --save-every 5 \
  --output-dir "../tmp/train_runs/loha_xl_base"
```
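As a rough sanity check on the flags above, the sketch below works out the effective batch size, the approximate number of optimizer steps, and a warmup-plus-cosine learning-rate curve. Several things here are assumptions rather than facts about Side-Step: the dataset is taken as exactly 1300 samples (the card only says "1.3k"), each track is assumed to contribute one sample per epoch, warmup is assumed to be counted in optimizer steps, and the schedule is assumed to be linear warmup followed by cosine decay to zero.

```python
import math


def schedule_summary(num_samples: int, batch_size: int, grad_accum: int,
                     epochs: int, warmup_steps: int) -> dict:
    """Approximate step counts implied by the training flags."""
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_fraction": warmup_steps / total_steps,
    }


def lr_at(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup, then cosine decay to zero (assumed schedule shape)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))


# 1300 is an assumed sample count; the other values come from the command.
summary = schedule_summary(num_samples=1300, batch_size=1, grad_accum=8,
                           epochs=100, warmup_steps=300)
print(summary)
print(lr_at(0, 5e-5, 300, summary["total_steps"]))
print(lr_at(summary["total_steps"] // 2, 5e-5, 300, summary["total_steps"]))
```

Under these assumptions the run takes about 163 optimizer steps per epoch, and the 300 warmup steps cover roughly the first two epochs.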