Mouhamed Naby NDIAYE

feat: rebrand to khAdI + general LLM + configurable GGUF

f593adc about 1 month ago

metadata

title: khAdI
emoji: 🌍
colorFrom: green
colorTo: yellow
sdk: docker
pinned: false
app_port: 7860

khAdI: the AI that understands and speaks Wolof

An intelligent voice assistant for Wolof. It recognizes speech, understands Wolof, and answers aloud, covering administrative procedures, health, culture, and more.

Tech stack

Component	Model	Notes
ASR	M9and2M/whisper-small-wolof	Wolof speech recognition
LLM	Oolel-v0.1 Q4_K_M GGUF	Wolof LLM via llama.cpp, CPU
TTS	Moustapha91/TTS_WOLOF_FINAL	SpeechT5 + prosodic post-processing
API	FastAPI	Python backend
Frontend	Node.js / React	Web interface

Prerequisites

Python 3.11+
Node.js 18+
Conda (recommended)
At least ~2 GB RAM
NVIDIA GPU optional (CUDA 11+)
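
The list above can be checked from a script before installing anything. This is a minimal sketch using only the standard library; the helper name `check_prerequisites` is hypothetical and not part of the repository. It checks the Python version and whether `node` and `conda` are on the PATH, but it does not verify the Node.js version, the available RAM, or the GPU.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # from the prerequisites list


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a dict of prerequisite name -> bool (hypothetical helper)."""
    return {
        "python>=3.11": tuple(version_info[:2]) >= MIN_PYTHON,
        "node": which("node") is not None,    # presence only, version not checked
        "conda": which("conda") is not None,  # recommended, not required
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```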

Installation

1. Clone the repository

git clone https://github.com/Nabzozifo/wolof-bot.git
cd wolof-bot

2. Python environment

conda create -n wolof python=3.11 -y
conda activate wolof

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install transformers==4.46.3 huggingface_hub
pip install llama-cpp-python
pip install fastapi "uvicorn[standard]"
pip install soundfile scipy numpy psutil
pip install sentence-transformers==2.7.0
pip install faiss-cpu
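
Two of the packages above are pinned to exact versions. The sketch below shows one way to confirm that the active environment matches those pins, using the standard library's `importlib.metadata`. The helper `check_pins` is hypothetical; only the package names and versions come from the commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the pip commands above.
PINS = {"transformers": "4.46.3", "sentence-transformers": "2.7.0"}


def check_pins(pins=PINS, get_version=version):
    """Return {package: problem} for every pin that is not satisfied."""
    problems = {}
    for package, expected in pins.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            problems[package] = "not installed"
            continue
        if installed != expected:
            problems[package] = f"found {installed}, expected {expected}"
    return problems


if __name__ == "__main__":
    issues = check_pins()
    print(issues if issues else "all pinned packages match")
```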

3. Download the models

cd wolof_voice_agent

# ASR
python scripts/download_models.py

# LLM (Oolel GGUF)
# Place the file oolel-v0.1-q4_k_m.gguf in models/gguf/

# TTS (Moustapha91/TTS_WOLOF_FINAL + HiFi-GAN)
python scripts/download_speecht5.py
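
Because the GGUF file is placed by hand, a quick check that it is where the backend expects it can save a failed start. The following is a sketch only: `find_gguf` is a hypothetical helper, and the assumption that `models/gguf/` is resolved relative to `wolof_voice_agent/` follows from the `cd` at the start of this step. With llama-cpp-python, a path found this way is typically passed as `Llama(model_path=...)`.

```python
from pathlib import Path


def find_gguf(models_dir="models/gguf"):
    """Return the .gguf files found in models_dir, sorted by path."""
    directory = Path(models_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.suffix.lower() == ".gguf")


if __name__ == "__main__":
    found = find_gguf()
    if found:
        print("GGUF model(s):", ", ".join(p.name for p in found))
    else:
        print("No .gguf file in models/gguf/; place the Oolel file there first.")
```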

4. Node.js frontend

cd frontend
npm install
npm run build

Running the application

Backend API

conda activate wolof
cd wolof_voice_agent
uvicorn app.main:app --reload --port 8000

Frontend

cd frontend
npm run dev

Access: http://localhost:3000

Configuration

The main configuration file is wolof_voice_agent/config/models.yaml.

tts:
  provider: "moustapha"          # Main Wolof TTS
  model_name: "Moustapha91/TTS_WOLOF_FINAL"

llm:
  provider: "oolel_gguf"         # Wolof LLM via llama.cpp

asr:
  provider: "hf_whisper"
  model_name: "M9and2M/whisper-small-wolof"
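
A configuration like this is easy to break with a missing key. The sketch below validates the dict that a YAML parser (for example PyYAML's `yaml.safe_load`) would produce for the excerpt above. The set of required keys is inferred from that excerpt alone and is an assumption; the real file may define more fields, and `validate_models_config` is a hypothetical helper.

```python
# Required keys per section, inferred from the excerpt (assumption).
REQUIRED = {
    "tts": ("provider", "model_name"),
    "llm": ("provider",),
    "asr": ("provider", "model_name"),
}


def validate_models_config(config):
    """Return a list of problems; an empty list means nothing is missing."""
    problems = []
    for section, keys in REQUIRED.items():
        block = config.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if not block.get(key):
                problems.append(f"missing key: {section}.{key}")
    return problems


# The dict a YAML parser would produce for the excerpt above.
example = {
    "tts": {"provider": "moustapha", "model_name": "Moustapha91/TTS_WOLOF_FINAL"},
    "llm": {"provider": "oolel_gguf"},
    "asr": {"provider": "hf_whisper", "model_name": "M9and2M/whisper-small-wolof"},
}
print(validate_models_config(example))  # prints []
```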

Environment variables

# Optional: cache directory for Hugging Face models
HF_HOME=./wolof_voice_agent/data/cache/huggingface
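
The same variable can be set from Python when no shell environment is prepared. This is a sketch under one assumption, namely that it runs before `transformers` or `huggingface_hub` is imported, since `huggingface_hub` reads `HF_HOME` when it is imported. The helper name `configure_hf_home` is hypothetical.

```python
import os
from pathlib import Path

DEFAULT_CACHE = "./wolof_voice_agent/data/cache/huggingface"  # value shown above


def configure_hf_home(environ=os.environ, default=DEFAULT_CACHE):
    """Set HF_HOME if it is unset and return the resolved absolute cache path."""
    cache = Path(environ.setdefault("HF_HOME", default)).expanduser().resolve()
    environ["HF_HOME"] = str(cache)  # store the absolute form
    return cache


if __name__ == "__main__":
    print("Hugging Face cache:", configure_hf_home())
```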

Pipeline architecture

User audio (WAV/WebM)
        ↓
      ASR  →  Wolof text
        ↓
      LLM  →  natural Wolof response
        ↓
  Prosody split  →  spoken units
        ↓
  TTS Moustapha  →  audio per segment
        ↓
  Post-processing  →  filter + trim + fades
        ↓
      Final audio (WAV, 16 kHz)
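
The flow above can be sketched as a chain of functions. This is an illustration, not the repository's implementation: every name here (`split_prosody`, `apply_fades`, `run_pipeline`) is hypothetical, the model stages are injected as plain callables, the prosody split is a simple punctuation split, and only the fade part of the post-processing is shown (filtering and trimming are left out).

```python
import re
from typing import Callable, List

Samples = List[float]  # mono audio samples in [-1.0, 1.0]


def split_prosody(text: str) -> List[str]:
    """Split a reply into short spoken units at sentence and clause punctuation."""
    parts = re.split(r"(?<=[.!?;,:])\s+", text.strip())
    return [p for p in parts if p]


def apply_fades(samples: Samples, fade_len: int = 4) -> Samples:
    """Linear fade-in and fade-out to avoid clicks at segment boundaries."""
    out = list(samples)
    n = min(fade_len, len(out) // 2)  # never let the two fades overlap
    for i in range(n):
        gain = i / n
        out[i] *= gain
        out[-1 - i] *= gain
    return out


def run_pipeline(
    audio: bytes,
    asr: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], Samples],
) -> Samples:
    """ASR -> LLM -> prosody split -> per-segment TTS -> fades -> concatenation."""
    text = asr(audio)
    reply = llm(text)
    final: Samples = []
    for unit in split_prosody(reply):
        final.extend(apply_fades(tts(unit)))
    return final


# Usage with stub stages standing in for the real models.
out = run_pipeline(
    b"",
    asr=lambda audio: "Na nga def?",
    llm=lambda text: "Maa ngi fi. Jërëjëf!",
    tts=lambda unit: [1.0] * 10,
)
print(len(out))  # prints 20 (two units of ten samples each)
```

Injecting the stages as callables keeps the orchestration testable without loading any model.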

API Endpoints

Method	Route	Description
`POST`	`/v1/voice-chat`	Sends audio, receives audio
`POST`	`/v1/text-chat`	Sends text, receives text + audio
`GET`	`/health`	Application status
`GET`	`/v1/profiles`	Available profiles
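
The `/health` route lends itself to a readiness check before sending the first request. The sketch below polls it with the standard library. It assumes that the backend listens on port 8000 as in the launch command and that a ready server answers `/health` with HTTP 200; the response body is not documented here, so it is ignored. Both function names are hypothetical.

```python
import time
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET on url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until_healthy(url="http://localhost:8000/health", attempts=30,
                       delay=1.0, probe=http_ok, sleep=time.sleep):
    """Poll url until probe succeeds; return False once attempts are exhausted."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_healthy() else "backend did not become healthy")
```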

Example

curl -X POST http://localhost:8000/v1/voice-chat \
  -F "audio=@question.wav" \
  -F "profile=administration" \
  --output reponse.wav
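
For callers that prefer Python to curl, an equivalent request can be built with the standard library alone. This is a sketch: the form field names `audio` and `profile` come from the curl command, while the function names and the `application/octet-stream` content type for the file part are assumptions. A client library with multipart support would shorten this considerably.

```python
import os
import urllib.request
import uuid


def build_voice_chat_request(audio_bytes, filename="question.wav",
                             profile="administration",
                             url="http://localhost:8000/v1/voice-chat"):
    """Build a multipart/form-data POST with the same fields as the curl call."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="profile"\r\n\r\n'
        f"{profile}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + audio_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def voice_chat(in_path="question.wav", out_path="reponse.wav"):
    """Send in_path to the API and save the returned audio to out_path."""
    with open(in_path, "rb") as f:
        request = build_voice_chat_request(f.read(), os.path.basename(in_path))
    with urllib.request.urlopen(request, timeout=120) as response:
        with open(out_path, "wb") as out:
            out.write(response.read())
```

Calling `voice_chat()` with the backend running mirrors the curl command, including the output file name.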

Available profiles

Profile	Description
`administration`	Administrative procedures (NICAD, passport, civil registry)
`health_assistance`	Health advice in Wolof
`customer_support`	General customer support
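
On the client side, the profile names in the table can be validated before a request is sent. The sketch below hard-codes the three names listed here as an enum; `Profile` and `parse_profile` are hypothetical, and the authoritative list is whatever `GET /v1/profiles` returns at runtime.

```python
from enum import Enum


class Profile(str, Enum):
    """Profile identifiers listed in the table above."""
    ADMINISTRATION = "administration"
    HEALTH_ASSISTANCE = "health_assistance"
    CUSTOMER_SUPPORT = "customer_support"


def parse_profile(value: str) -> Profile:
    """Normalize and validate a profile name, raising ValueError if unknown."""
    try:
        return Profile(value.strip().lower())
    except ValueError:
        valid = ", ".join(p.value for p in Profile)
        raise ValueError(
            f"unknown profile {value!r}; expected one of: {valid}"
        ) from None


print(parse_profile(" Administration ").value)  # prints administration
```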

Development

# Unit tests
pytest wolof_voice_agent/app/tests/ -v

# TTS benchmark
python benchmark_tts.py --models moustapha

# LLM benchmark
python benchmark_llm.py --models oolel

License

MIT. A community project for African languages.