---
library_name: audio-interv
tags:
  - activation-steering
  - all
  - audio
  - austeer
  - diffusion
  - interpretability
  - music
  - steering
  - vocal-gender
---

# AUSteer — `vocal_gender` (ACE-Step)

This repository holds sparse activation-momentum scores for the **vocal_gender** concept on ACE-Step, stored per (inference step, layer) pair. At inference, `AUSteerSteeringController` selects the top-`k` most concept-discriminative bins and adds `alpha` along them; all other bins are left unchanged.
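As a rough illustration of what additive top-`k` steering could look like, the NumPy sketch below picks the `k` highest-scoring bins of a score vector and shifts only those activations by `alpha`. It is not the `AUSteerSteeringController` implementation: the function name `steer_additive`, the flat score layout, and the use of the score sign as the shift direction are all assumptions made for illustration.

```python
import numpy as np

def steer_additive(activations, scores, alpha, k):
    """Hypothetical sketch: shift the k highest-|score| bins by alpha.

    activations: array of shape (..., D), hidden states at one (step, layer).
    scores: array of shape (D,), concept scores for that (step, layer).
    """
    k = min(k, scores.shape[0])
    # Indices of the k bins with the largest absolute score (unordered).
    top = np.argpartition(np.abs(scores), -k)[-k:]
    # Sparse direction: score sign on the selected bins, zero elsewhere (assumption).
    direction = np.zeros_like(scores, dtype=activations.dtype)
    direction[top] = np.sign(scores[top])
    return activations + alpha * direction

# Toy data standing in for real activations and scores.
rng = np.random.default_rng(0)
acts = rng.standard_normal((4, 16)).astype(np.float32)
scores = rng.standard_normal(16).astype(np.float32)

steered = steer_additive(acts, scores, alpha=15.0, k=3)
# Bins whose activations moved; only the k selected ones should appear here.
changed = np.flatnonzero(np.abs(steered - acts).max(axis=0) > 0)
```

In this toy setup only 3 of the 16 bins move, which is the sense in which the steering is sparse: a larger `k` touches more bins, while a larger `alpha` moves the selected bins further.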

## Paper

TADA! Tuning Audio Diffusion Models through Activation Steering — [https://huggingface.co/papers/2602.11910](https://huggingface.co/papers/2602.11910)

## Quickstart

```python
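# Assumes the repository that provides `src.steering` is on the Python path.
from src.steering import SteerableACEModel, AUSteerSteeringController

# Build the steerable ACE-Step wrapper and load its pipeline on the GPU.
model = SteerableACEModel(device="cuda")
model.pipeline.load()

# Load the vocal_gender scores; `alpha` sets the steering strength and
# `k` the number of bins that are steered.
ctrl = AUSteerSteeringController.from_pretrained(
    "lukasz-staniszewski/ace-step-austeer-vocal-gender-all",
    alpha=15.0,
    k=256,
    mode="additive",
)

# Generate with the controller attached for the duration of the `with` block.
with model.steer(ctrl):
    audio = model.generate(
        prompt="instrumental music", lyrics="[inst]",
        audio_duration=10.0, infer_step=30, manual_seed=0,
    )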
```

## Generation config

```json
{
  "method": "austeer",
  "concept": "vocal_gender",
  "lyrics": "",
  "layers": "all",
  "layers_collected": [
    "tf0",
    "tf1",
    "tf2",
    "tf3",
    "tf4",
    "tf5",
    "tf6",
    "tf7",
    "tf8",
    "tf9",
    "tf10",
    "tf11",
    "tf12",
    "tf13",
    "tf14",
    "tf15",
    "tf16",
    "tf17",
    "tf18",
    "tf19",
    "tf20",
    "tf21",
    "tf22",
    "tf23"
  ],
  "num_inference_steps": 30,
  "audio_duration": 30.0,
  "seed": 10,
  "guidance_scale": 5.0,
  "guidance_scale_text": 0.0,
  "guidance_scale_lyric": 0.0,
  "guidance_interval": 1.0,
  "guidance_interval_decay": 0.0
}
```
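
To make the shape of this config concrete, the snippet below parses a trimmed copy of it and counts the (step, layer) pairs it implies. It assumes one score entry per inference step and per collected layer, which this card suggests but does not spell out; the variable names are illustrative.

```python
import json

# Trimmed copy of the config above; "layers_collected" is rebuilt rather than retyped.
config = json.loads('{"num_inference_steps": 30, "audio_duration": 30.0, "seed": 10}')
config["layers_collected"] = [f"tf{i}" for i in range(24)]  # tf0 ... tf23

# Assumption: scores are stored for every (step, layer) combination.
pairs = [
    (step, layer)
    for step in range(config["num_inference_steps"])
    for layer in config["layers_collected"]
]
print(len(pairs))  # 30 steps x 24 layers = 720
```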