---
base_model: areffarhadi/Resnet34-tidyvoiceX-ASV
license: apache-2.0
tags:
- speaker-verification
- speaker-embedding
- cross-lingual
- multilingual
- wespeaker
- resnet
- pytorch
datasets:
- voxblink2
- voxceleb2
- tidyvoicex
metrics:
- eer
- mindcf
---

ONNX conversion of [Resnet34-tidyvoiceX-ASV](https://huggingface.co/areffarhadi/Resnet34-tidyvoiceX-ASV) from [TidyVoice Challenge: Cross-Lingual Speaker Verification](https://arxiv.org/abs/2601.21960).

Compatible with the ONNX inference script from WeSpeaker and with [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).

Download

```sh
wget https://huggingface.co/hr16/tidyvoicex-samresnet34-onnx/resolve/main/tidyvoicex_samresnet34.onnx
```

Example on sherpa-onnx

```py
import numpy as np
import sherpa_onnx
import soundfile as sf

# Read the audio as float32 and keep only the first channel if it is not mono.
audio, sample_rate = sf.read("your-audio-file.wav", dtype="float32")
if audio.ndim > 1:
    audio = audio[:, 0]

embedding_model = sherpa_onnx.SpeakerEmbeddingExtractor(
    sherpa_onnx.SpeakerEmbeddingExtractorConfig(
        model="tidyvoicex_samresnet34.onnx",
        num_threads=1,
        provider="cuda",  # use "cpu" if no CUDA build is available
    )
)

stream = embedding_model.create_stream()
stream.accept_waveform(sample_rate=sample_rate, waveform=audio)
stream.input_finished()  # signal end of input, as in the sherpa-onnx examples

# Speaker embedding as a 1-D NumPy array
spk_emb = np.array(embedding_model.compute(stream))
```
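
Comparing two embeddings

Speaker verification usually comes down to scoring two embeddings such as `spk_emb` against each other. The following is a minimal sketch using cosine similarity, not part of the original model release: the helper names `cosine_similarity` and `is_same_speaker` are hypothetical, and the default threshold of `0.5` is only a placeholder that should be tuned on your own data.

```py
import math


def cosine_similarity(emb_a, emb_b):
    """Return the cosine similarity between two equal-length embeddings."""
    a = [float(x) for x in emb_a]
    b = [float(x) for x in emb_b]
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_speaker(emb_a, emb_b, threshold=0.5):
    """Accept the pair if the score reaches the threshold.

    The threshold is an assumed placeholder, not a value from the model card.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold
```

Both helpers accept plain lists or 1-D NumPy arrays, so the output of `embedding_model.compute(stream)` can be passed in directly, for example `is_same_speaker(spk_emb_1, spk_emb_2)` with two embeddings extracted as shown in the sherpa-onnx example.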