---
license: mit
tags:
- keyword-spotting
- speech
- pytorch
- onnx
- bc-resnet
language:
- en
datasets:
- google/speech_commands
metrics:
- accuracy
---
SpeechGuard KWS — BC-ResNet-8 Keyword Spotter
Part of the SpeechGuard AI system, submitted to the Samsung EnnovateX AX Hackathon 2026.
Model Description
A BC-ResNet-8 keyword spotter trained on Google Speech Commands v2 with noise augmentation. It uses a PCEN (Per-Channel Energy Normalization) front end to improve robustness to background noise.
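For readers unfamiliar with PCEN, here is a minimal pure-Python sketch of the commonly published formulation: a per-channel smoothed energy estimate, followed by adaptive gain control and root compression. The function name `pcen` and all parameter defaults are illustrative assumptions, not the settings or code used by this model.

```python
def pcen(energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Apply PCEN to a [time][channel] grid of non-negative filterbank energies.

    Parameter defaults are illustrative assumptions, not this model's settings.
    """
    if not energy:
        return []
    # Initialise the smoother with the first frame so early frames are not over-amplified.
    smooth = list(energy[0])
    out = []
    for frame in energy:
        # First-order IIR smoother per channel: M[t] = (1 - s) * M[t-1] + s * E[t]
        smooth = [(1.0 - s) * m + s * e for m, e in zip(smooth, frame)]
        # Adaptive gain control (divide by M^alpha), then root compression with offset delta.
        out.append([
            (e / (eps + m) ** alpha + delta) ** r - delta ** r
            for e, m in zip(frame, smooth)
        ])
    return out
```

Because the gain adapts to the smoothed energy in each channel, a stationary noise floor is largely normalised away while sudden onsets such as speech stand out.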
Performance
| Metric | Value |
|---|---|
| TA Clean | 99.0% |
| TA Noisy (SNR -5 to +30 dB) | 98.5% |
| Parameters | 2,444 |
| Latency (CPU) | 1.1 ms |
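A latency figure like the one above depends heavily on how it is measured. The following is one possible way to time a single forward pass, using only the standard library; it is a sketch and not necessarily the procedure behind the reported number. The helper name `measure_latency_ms` and its defaults are assumptions.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=10, runs=100):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialisation before timing
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to scheduler hiccups than the mean.
    return statistics.median(timings)
```

With a loaded model, this could be called as `measure_latency_ms(lambda: model(x))` inside a `torch.no_grad()` block, where `x` is a dummy input of whatever shape the model expects.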
Usage
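```python
import torch
from huggingface_hub import hf_hub_download

# BCResNet8 comes from the SpeechGuard codebase, which must be importable.
from speechguard.kws.bc_resnet import BCResNet8

# Download the checkpoint from the Hugging Face Hub (cached after the first call).
ckpt_path = hf_hub_download(
    repo_id="MADHAV-SAMDANI/speechguard-kws",
    filename="best_kws.pt",
)

# weights_only=False unpickles arbitrary Python objects, so only load
# checkpoints from a source you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# Rebuild the network with one output per stored class label, then restore the weights.
model = BCResNet8(num_classes=len(ckpt["classes"]), n_mels=80)
model.load_state_dict(ckpt["model_state"])
model.eval()  # inference mode: batch norm uses its running statistics
```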
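Once the model is loaded, its output scores still have to be mapped back to keyword labels. The sketch below assumes the model emits one logit per entry of `ckpt["classes"]`, in the same order; that ordering and the helper name `decode_prediction` are assumptions, not part of the published code. It works on a plain list of floats, for example `logits[0].tolist()`.

```python
import math


def decode_prediction(logits, classes):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(classes):
        raise ValueError("expected one logit per class")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    # Index of the largest logit (first one wins on ties).
    best = max(range(len(logits)), key=lambda i: logits[i])
    return classes[best], exps[best] / total
```

Returning the softmax probability alongside the label makes it easy to apply a confidence threshold before acting on a detection.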
Training
- Dataset: Google Speech Commands v2 (2,000 samples per class)
- Epochs: 35
- Optimizer: AdamW with cosine LR annealing
- Noise augmentation: ESC-50 + synthetic (white, pink, babble)
- Hardware: MacBook Air CPU (~70 minutes)
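Two of the ingredients listed above, cosine learning-rate annealing and noise mixing at a chosen signal-to-noise ratio, can be sketched in a few lines of plain Python. This is an illustration of the general techniques under assumed names (`cosine_lr`, `mix_at_snr`) and is not the training code used for this model; in PyTorch the schedule would more typically come from `torch.optim.lr_scheduler.CosineAnnealingLR`.

```python
import math


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine annealing from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(x * x for x in speech) / len(speech)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0.0:
        return list(speech)  # silent noise clip: nothing to add
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Drawing `snr_db` at random for each training example exposes the model to both heavily corrupted and nearly clean inputs.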
Citation
Samsung EnnovateX AX Hackathon 2026, Problem #04. Team: Placecomm Prophets (IIT Kharagpur).