---
title: Numberblocks One Voice Extraction (CPU - Fixed)
emoji: 🔊
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
---

# Numberblocks One Voice Extraction (CPU Version - Fixed)

**🔧 Fixed version**: Adds a web server so the Space passes the Hugging Face health check; the batch job now runs in the background.

This Hugging Face Space automatically extracts **One's** voice from Numberblocks audio files using speaker diarization.

## What It Does

1. **Downloads** all audio files from the `ayf3/numberblocks-audio` dataset
2. **Analyzes** each file using `pyannote.audio` speaker diarization
3. **Identifies** which speaker is "One" using heuristic analysis
4. **Extracts** all speech segments belonging to One
5. **Saves** the clean audio segments to `/data/output/one_audio/`
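
Steps 3 and 4 can be sketched without the model itself. The example below assumes diarization has already produced a list of `(start, end, speaker)` turns. It also assumes, purely for illustration, that One is the speaker with the most total speech; this README does not say which heuristic the Space actually uses. The names `Turn`, `pick_dominant_speaker`, and `segments_for` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Turn(NamedTuple):
    """One diarized speech turn (hypothetical structure for this sketch)."""
    start: float    # seconds
    end: float      # seconds
    speaker: str    # diarization label such as "SPEAKER_00"


def pick_dominant_speaker(turns: list[Turn]) -> str:
    """Return the label with the largest total speaking time.

    Assumption: One talks the most in the episode. The real heuristic
    may differ.
    """
    totals: dict[str, float] = defaultdict(float)
    for turn in turns:
        totals[turn.speaker] += turn.end - turn.start
    if not totals:
        raise ValueError("no speech turns to choose from")
    return max(totals, key=totals.get)


def segments_for(turns: list[Turn], speaker: str,
                 min_len: float = 0.5) -> list[tuple[float, float]]:
    """Collect (start, end) pairs for one speaker, dropping very short turns."""
    return [(t.start, t.end) for t in turns
            if t.speaker == speaker and t.end - t.start >= min_len]


turns = [
    Turn(0.0, 4.0, "SPEAKER_00"),
    Turn(4.2, 5.0, "SPEAKER_01"),
    Turn(5.1, 5.3, "SPEAKER_00"),   # about 0.2 s, shorter than min_len
    Turn(6.0, 9.5, "SPEAKER_00"),
]
one = pick_dominant_speaker(turns)
print(one, segments_for(turns, one))
```

The `min_len` cutoff is an illustrative choice for discarding fragments too short to be useful as training audio; tune or remove it as needed.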

## 🔧 What Was Fixed

### Problem
- **Runtime Error**: "Launch timed out, workload was not healthy after 30 min"
- **Cause**: The original Dockerfile ran a long-running batch process (15-30 hours) without a web server
- **Issue**: Hugging Face's health check expects an HTTP response on port 7860

### Solution
- ✅ **Added Flask Web Server**: Responds to health checks immediately
- ✅ **Background Processing**: Batch task runs in a separate thread
- ✅ **Status Dashboard**: View processing progress in real-time at the Space homepage
- ✅ **Progress Tracking**: Status saved after each file for crash recovery
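
The pattern behind this fix is small: start the batch job in a daemon thread and let the main thread serve HTTP. Below is a minimal sketch of the threading half, using only the standard library. `run_batch`, `start_background`, and the fields of `status` are hypothetical names chosen for illustration, not the Space's actual code.

```python
import threading

# Shared state: written by the worker thread, read by the web handlers.
status = {"state": "idle", "processed": 0, "total": 0,
          "current": None, "errors": []}
status_lock = threading.Lock()


def run_batch(files, process_file):
    """Process files one by one, recording progress after each file."""
    with status_lock:
        status.update(state="running", processed=0,
                      total=len(files), errors=[])
    for path in files:
        with status_lock:
            status["current"] = path
        try:
            process_file(path)
        except Exception as exc:
            # One bad file should not stop the whole run.
            with status_lock:
                status["errors"].append(f"{path}: {exc}")
        with status_lock:
            status["processed"] += 1
    with status_lock:
        status.update(state="completed", current=None)


def start_background(files, process_file):
    """Start the batch in a daemon thread and return immediately."""
    worker = threading.Thread(target=run_batch,
                              args=(files, process_file), daemon=True)
    worker.start()
    return worker
```

In this arrangement the web server would be started on port 7860 in the main thread right after `start_background(...)` returns, so the health check gets a response while the batch is still running.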

## Features

- ✅ **CPU-friendly**: Runs on basic CPU (no GPU required)
- ✅ **Fully automated**: Runs on container startup, no user interaction needed
- ✅ **Web Dashboard**: Real-time progress tracking via browser
- ✅ **Health Checks**: Passes Hugging Face's health monitoring
- ✅ **Smart speaker identification**: Uses heuristics to identify One's voice
- ✅ **Error handling**: Continues processing even if individual files fail

## Usage

### Viewing Progress

Simply visit this Space's homepage to see:
- Current processing status (running/completed/error)
- Progress counter (processed X of Y files)
- Current file being processed
- Number of output files generated
- Total duration of extracted audio

### API Endpoints

- **`/`** - HTML status dashboard
- **`/status`** - JSON status API
- **`/health`** - Health check endpoint (used by Hugging Face)
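
To make the three routes concrete, here is a rough sketch of how they could be dispatched. The Space serves them with Flask; this example deliberately uses the standard library's `http.server` instead, so it runs without extra packages. The response bodies and the fields of `STATUS` are assumptions, since the actual payloads are not documented here.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical status snapshot; the real field names may differ.
STATUS = {"state": "running", "processed": 3, "total": 124}


def route(path, status):
    """Map a request path to (http_code, content_type, body_bytes)."""
    if path == "/health":
        return 200, "application/json", b'{"status": "ok"}'
    if path == "/status":
        return 200, "application/json", json.dumps(status).encode("utf-8")
    if path == "/":
        html = (f"<h1>{status['state']}</h1>"
                f"<p>Processed {status['processed']} of "
                f"{status['total']} files</p>")
        return 200, "text/html; charset=utf-8", html.encode("utf-8")
    return 404, "text/plain; charset=utf-8", b"not found"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Delegate to the pure routing function, then write the response.
        code, content_type, body = route(self.path, STATUS)
        self.send_response(code)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=7860):
    """Block forever, answering requests on the given port."""
    ThreadingHTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Keeping `route` as a plain function separates the routing logic from the socket handling, which makes it easy to check without starting a server.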

## Output

The extracted audio segments are saved in `/data/output/one_audio/` with the format:

```
S01E01_One_12.34_15.67.wav
```

Where:
- `S01E01_One`: Episode name
- `12.34`: Start time in seconds
- `15.67`: End time in seconds
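
A short sketch of building and parsing names in this format follows. The two-decimal precision is inferred from the example above, and both function names are hypothetical.

```python
from pathlib import Path


def segment_filename(episode: str, start: float, end: float) -> str:
    """Build a name like S01E01_One_12.34_15.67.wav."""
    return f"{episode}_{start:.2f}_{end:.2f}.wav"


def parse_segment_filename(name: str) -> tuple[str, float, float]:
    """Split a segment file name back into (episode, start, end).

    Splitting from the right keeps underscores inside the episode
    name (as in "S01E01_One") intact.
    """
    episode, start, end = Path(name).stem.rsplit("_", 2)
    return episode, float(start), float(end)


name = segment_filename("S01E01_One", 12.34, 15.67)
print(name, parse_segment_filename(name))
```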

## Technical Details

- **Model**: `pyannote/speaker-diarization-3.1`
- **Hardware**: CPU (no GPU required)
- **SDK**: Docker (for automated batch processing)
- **Processing time**: ~15-30 hours for 124 files (CPU)
- **Web Framework**: Flask (for health checks and status dashboard)
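
The processing-time estimate works out to roughly 7 to 15 minutes per file (15 to 30 hours spread across 124 files). The sketch below shows that arithmetic, along with a hypothetical `eta_hours` helper that extrapolates the remaining time from the progress so far, assuming every file takes about the same time.

```python
def minutes_per_file(total_hours: float, file_count: int) -> float:
    """Average minutes spent on each file."""
    return total_hours * 60 / file_count


def eta_hours(processed: int, total: int, elapsed_hours: float) -> float:
    """Estimate remaining hours from the average pace so far."""
    if processed <= 0:
        raise ValueError("need at least one processed file to estimate")
    return elapsed_hours / processed * (total - processed)


low = minutes_per_file(15, 124)
high = minutes_per_file(30, 124)
print(f"{low:.1f} to {high:.1f} minutes per file")
```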

## Progress Tracking

Current progress is saved in `/data/output/processing_report.json`:

```json
{
  "total_files": 124,
  "processed_files": 124,
  "total_one_audio_hours": 8.5,
  "segments": [...],
  "completed_at": "2026-03-18 18:30:00"
}
```
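
For crash recovery, the report has to stay readable even if the container dies mid-write. One common way to get that is to write to a temporary file and rename it into place, sketched below. Whether the Space does exactly this is not stated here; `save_report` and `load_report` are hypothetical helpers, and only the field names shown above are taken from the report format.

```python
import json
import os
import tempfile


def save_report(report: dict, path: str) -> None:
    """Write the report atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(report, handle, indent=2)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        os.unlink(tmp_path)  # clean up the temp file, then re-raise
        raise


def load_report(path: str) -> dict:
    """Return the saved report, or a fresh one if none exists yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"total_files": 0, "processed_files": 0, "segments": []}
```

On restart, a batch loop could call `load_report(...)` and resume from `processed_files` instead of starting over.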

## Logs

View real-time processing logs in the **Logs** tab of this Space.

## Next Steps

Once extraction is complete:

1. Download the extracted audio files from `/data/output/one_audio/`
2. Use them to train an RVC (Retrieval-based Voice Conversion) model
3. Generate new speech in One's voice

---

**Note**: This Space runs automatically on startup and provides a web dashboard for monitoring. No manual interaction required.

**Status**: 🔧 Fixed - Health checks passing, background processing enabled.