GGUF for fullstop-punctuation in CrispASR (drop-in --punc-model for any ASR backend)

#14
by cstr - opened

Thanks for the multilingual punctuation model! It's wired into CrispASR as a drop-in --punc-model post-processor for any ASR backend.

The implementation lives in src/fireredpunc.cpp; despite the name, it also handles the XLM-R-large fullstop-punc family, which takes the same dispatch path on the CTC + label head. The model is a 24-layer XLM-R-large encoder with a 6-class punctuation head. Q4_K is the recommended default at ~254 MB; F16 is 1.6 GB and produces identical output on the JFK sample.

This is the default punctuation post-processor for the CrispASR CTC backends that don't emit punctuation natively (Wav2Vec2 / HuBERT / Data2Vec / FastConformer-CTC / OmniASR-CTC / FireRed-ASR). The CN+EN sibling cstr/fireredpunc-GGUF (BERT-base) is preferred for Mandarin.
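
To make the post-processing step concrete, here is a minimal Python sketch of how per-word output from a 6-class punctuation head could be folded back into an unpunctuated CTC transcript. It is not CrispASR's actual code: the function name is hypothetical, the label set and its order are assumptions (the real index-to-label mapping should be read from the model's own metadata), and the label IDs in the usage lines are hand-written for illustration, not model output.

```python
from typing import Sequence

# ASSUMPTION: six classes in this order; "0" means "no punctuation after this word".
LABELS = ("0", ".", ",", "?", "-", ":")


def apply_punctuation(words: Sequence[str], label_ids: Sequence[int]) -> str:
    """Append the predicted punctuation mark (if any) to each word."""
    if len(words) != len(label_ids):
        raise ValueError("need exactly one label per word")
    out = []
    for word, idx in zip(words, label_ids):
        mark = LABELS[idx]
        out.append(word if mark == "0" else word + mark)
    return " ".join(out)


# Illustrative input: lowercase, unpunctuated text as a CTC backend might emit it.
words = "and so my fellow americans ask not what your country can do for you".split()
label_ids = [0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # hand-picked, not model output
print(apply_punctuation(words, label_ids))
```

Capitalisation is deliberately left out of this sketch; it only shows the label-to-text mapping.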

Also reachable from the Python / Rust / Dart bindings as crispasr.PuncModel.

Pre-quantised GGUFs (MIT): cstr/fullstop-punc-multilang-GGUF

./build/bin/crispasr --backend wav2vec2 \
    -m wav2vec2-xlsr-de-q4_k.gguf -f audio.wav \
    --punc-model fullstop-punc-q4_k.gguf

(All quants produced identical output on JFK, so quantisation is essentially free for this model.)
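
For batch use, the command above could be wrapped from Python along the following lines. This is a rough sketch only: it uses just the flags shown in the command, the helper names are hypothetical, and it assumes (unverified) that the transcript is written to stdout and that the binary is invoked once per file.

```python
import subprocess
from pathlib import Path


def build_crispasr_cmd(audio, asr_model, punc_model,
                       backend="wav2vec2", binary="./build/bin/crispasr"):
    """Build the argument list for one CrispASR run with a punctuation model."""
    return [
        binary,
        "--backend", backend,
        "-m", str(asr_model),
        "-f", str(audio),
        "--punc-model", str(punc_model),
    ]


def transcribe_all(audio_dir, asr_model, punc_model):
    """Run CrispASR on every .wav file in a directory, one process per file."""
    results = {}
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        proc = subprocess.run(
            build_crispasr_cmd(wav, asr_model, punc_model),
            capture_output=True, text=True, check=True,
        )
        # ASSUMPTION: the punctuated transcript is printed to stdout.
        results[wav.name] = proc.stdout.strip()
    return results
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.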
