---
title: khAdI
emoji: 🌍
colorFrom: green
colorTo: yellow
sdk: docker
pinned: false
app_port: 7860
---

# khAdI: the AI that understands and speaks Wolof

An intelligent voice assistant for Wolof. It recognizes speech, understands Wolof, and answers aloud, covering administrative procedures, health, culture, and more.

---

## Tech stack

| Component | Model | Notes |
|---|---|---|
| ASR | M9and2M/whisper-small-wolof | Wolof speech recognition |
| LLM | Oolel-v0.1 Q4_K_M GGUF | Wolof LLM served through llama.cpp on CPU |
| TTS | Moustapha91/TTS_WOLOF_FINAL | SpeechT5 with prosodic post-processing |
| API | FastAPI | Python backend |
| Frontend | Node.js / React | Web interface |

---

## Prerequisites

- Python 3.11+
- Node.js 18+
- Conda (recommended)
- At least ~2 GB of RAM
- NVIDIA GPU (CUDA 11+), optional

---

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/Nabzozifo/wolof-bot.git
cd wolof-bot
```

### 2. Python environment

```bash
conda create -n wolof python=3.11 -y
conda activate wolof
# CPU-only PyTorch build
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install transformers==4.46.3 huggingface_hub
pip install llama-cpp-python
# Quoted so that shells such as zsh do not try to expand the brackets
pip install fastapi "uvicorn[standard]"
pip install soundfile scipy numpy psutil
pip install sentence-transformers==2.7.0
pip install faiss-cpu
```

### 3. Download the models

```bash
cd wolof_voice_agent
# ASR
python scripts/download_models.py
# LLM (Oolel GGUF)
# Place the file oolel-v0.1-q4_k_m.gguf in models/gguf/
# TTS (Moustapha91/TTS_WOLOF_FINAL + HiFi-GAN)
python scripts/download_speecht5.py
```
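
Because the Oolel GGUF file has to be placed by hand, a quick check before starting the backend can catch a missing file early. The sketch below is only an illustration: the helper names `find_missing_models` and `report` are hypothetical, and the expected path is taken from the comment in the commands above, relative to `wolof_voice_agent/`.

```python
from pathlib import Path

# Path taken from the install step above; adjust it if your layout differs.
EXPECTED_FILES = ["models/gguf/oolel-v0.1-q4_k_m.gguf"]


def find_missing_models(root, expected=EXPECTED_FILES):
    """Return the expected model files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


def report(root="."):
    """Print one status line per missing file and return True when none is missing."""
    missing = find_missing_models(root)
    for rel in missing:
        print(f"missing: {rel}")
    if not missing:
        print("all model files present")
    return not missing
```

Running `report(".")` from inside `wolof_voice_agent/` should list the GGUF file until it has been copied into place.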

### 4. Node.js frontend

```bash
cd frontend
npm install
npm run build
```

---

## Running the application

### Backend API

```bash
conda activate wolof
cd wolof_voice_agent
uvicorn app.main:app --reload --port 8000
```

### Frontend

```bash
cd frontend
npm run dev
```

The web interface is then available at [http://localhost:3000](http://localhost:3000).

---

## Configuration

The main configuration file is `wolof_voice_agent/config/models.yaml`.

```yaml
tts:
  provider: "moustapha"  # Main Wolof TTS
  model_name: "Moustapha91/TTS_WOLOF_FINAL"
llm:
  provider: "oolel_gguf"  # Wolof LLM via llama.cpp
asr:
  provider: "hf_whisper"
  model_name: "M9and2M/whisper-small-wolof"
```
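
To fail fast on a typo in this file, the parsed configuration can be checked before any model is loaded. This is a minimal sketch, assuming the file has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The function name `validate_models_config` and the rule that every section needs a `provider` are assumptions drawn from the excerpt above, not the project's actual loader.

```python
REQUIRED_SECTIONS = ("tts", "llm", "asr")


def validate_models_config(config):
    """Return a list of problems found in a parsed models.yaml dictionary."""
    problems = []
    for section in REQUIRED_SECTIONS:
        entry = config.get(section)
        if not isinstance(entry, dict):
            problems.append(f"missing section: {section}")
        elif not entry.get("provider"):
            problems.append(f"{section}: missing provider")
    return problems


# Same values as the YAML excerpt, written as the dictionary a parser would return.
example = {
    "tts": {"provider": "moustapha", "model_name": "Moustapha91/TTS_WOLOF_FINAL"},
    "llm": {"provider": "oolel_gguf"},
    "asr": {"provider": "hf_whisper", "model_name": "M9and2M/whisper-small-wolof"},
}

print(validate_models_config(example))  # prints []
```

Returning a list of problems rather than raising on the first one lets a startup script report every misconfigured section at once.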

### Environment variables

```bash
# Optional: cache directory for Hugging Face models
HF_HOME=./wolof_voice_agent/data/cache/huggingface
```

---

## Pipeline architecture

```
User audio (WAV/WebM)
↓
ASR → Wolof text
↓
LLM → natural Wolof reply
↓
Prosody split → spoken units
↓
TTS Moustapha → audio per segment
↓
Post-processing → filter + trim + fades
↓
Final audio (WAV, 16 kHz)
```
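
The stages after the LLM can be illustrated without loading any model. The sketch below is a toy under stated assumptions, not the project's code: `split_prosody` assumes that spoken units are cut at sentence punctuation, `apply_fades` applies a linear fade-in and fade-out to a list of float samples, and `fake_tts` is a hypothetical stand-in for the real TTS call. Filtering and trimming are left out.

```python
import re


def split_prosody(text):
    """Split a reply into short spoken units at sentence punctuation (assumed rule)."""
    parts = re.split(r"(?<=[.!?;:])\s+", text.strip())
    return [p for p in parts if p]


def apply_fades(samples, fade_len):
    """Apply a linear fade-in and fade-out over `fade_len` samples at each end."""
    out = list(samples)
    n = min(fade_len, len(out) // 2)  # never let the two fades overlap
    for i in range(n):
        gain = i / n
        out[i] *= gain        # fade-in: gain rises from 0
        out[-1 - i] *= gain   # fade-out: gain reaches 0 on the last sample
    return out


def run_tail(reply, synthesize, fade_len=160):
    """Split the reply, synthesize each unit, fade it, and concatenate the audio.

    160 samples correspond to 10 ms at the 16 kHz output rate.
    """
    audio = []
    for unit in split_prosody(reply):
        audio.extend(apply_fades(synthesize(unit), fade_len))
    return audio


def fake_tts(unit):
    """Hypothetical TTS stand-in: constant-amplitude audio, 10 samples per character."""
    return [1.0] * (10 * len(unit))


print(split_prosody("Jërëjëf. Naka nga def?"))
```

Fading each segment separately before concatenation is one way to avoid clicks at segment boundaries; whether the project fades per segment or only once on the final audio is not stated here.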

---

## API Endpoints

| Method | Route | Description |
|---|---|---|
| `POST` | `/v1/voice-chat` | Send audio, receive audio |
| `POST` | `/v1/text-chat` | Send text, receive text + audio |
| `GET` | `/health` | Application status |
| `GET` | `/v1/profiles` | Available profiles |

### Example

```bash
curl -X POST http://localhost:8000/v1/voice-chat \
  -F "audio=@question.wav" \
  -F "profile=administration" \
  --output reponse.wav
```
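
The same request can be issued from Python. The sketch below uses only the standard library and builds the `multipart/form-data` body by hand. The field names `audio` and `profile` come from the curl example; the helper names, the `audio/wav` content type, and the assumption that the response body is the raw WAV audio are illustrative guesses.

```python
import urllib.request
import uuid


def build_multipart(fields, files):
    """Encode text fields and {name: (filename, bytes, mime)} files as multipart/form-data."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    for name, (filename, data, mime) in files.items():
        header = (
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\nContent-Type: {mime}\r\n\r\n'
        )
        chunks.append(header.encode() + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode())
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def voice_chat(wav_path, profile, base_url="http://localhost:8000"):
    """POST a WAV file to /v1/voice-chat and return the response body as bytes."""
    with open(wav_path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"profile": profile}, {"audio": ("question.wav", audio, "audio/wav")}
    )
    request = urllib.request.Request(
        f"{base_url}/v1/voice-chat",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the backend running, writing the bytes returned by `voice_chat("question.wav", "administration")` to `reponse.wav` would mirror the curl command.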

---

## Available profiles

| Profile | Description |
|---|---|
| `administration` | Administrative procedures (NICAD, passport, civil registry) |
| `health_assistance` | Health advice in Wolof |
| `customer_support` | General customer support |
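
A client can reject an unknown profile before uploading any audio. This is a small sketch, assuming the three profiles in the table are the complete set (the live list is exposed by `GET /v1/profiles`). The names `Profile` and `parse_profile` are hypothetical.

```python
from enum import Enum


class Profile(str, Enum):
    ADMINISTRATION = "administration"
    HEALTH_ASSISTANCE = "health_assistance"
    CUSTOMER_SUPPORT = "customer_support"


def parse_profile(name):
    """Return the matching Profile, or raise ValueError listing the valid names."""
    try:
        return Profile(name.strip().lower())
    except ValueError:
        valid = ", ".join(p.value for p in Profile)
        raise ValueError(f"unknown profile {name!r}; expected one of: {valid}") from None


print(parse_profile("Administration").value)  # prints administration
```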

---

## Development

```bash
# Unit tests
pytest wolof_voice_agent/app/tests/ -v
# TTS benchmark
python benchmark_tts.py --models moustapha
# LLM benchmark
python benchmark_llm.py --models oolel
```

---

## License

MIT. A community project for African languages.