How to use tooape/embeddinggemma-300m-qat-q8-ONNX with Transformers.js:
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline; dtype 'q4' selects onnx/model_q4.onnx
const pipe = await pipeline('feature-extraction', 'tooape/embeddinggemma-300m-qat-q8-ONNX', { dtype: 'q4' });
metadata
library_name: transformers.js
pipeline_tag: feature-extraction
base_model: google/embeddinggemma-300m
license: gemma
tags:
- onnx
- embeddinggemma
- sentence-similarity
- quantized
EmbeddingGemma-300m — 64K-vocab-trimmed, PTQ-q4 (transformers.js)
Vocabulary-trimmed EmbeddingGemma-300m (262K → 64K, English + code ByteLevel-BPE),
quantized to int4 with post-training quantization (PTQ) and exported as a single
self-contained onnx/model_q4.onnx. External data is inlined into that file, which
is required for iOS WKWebView. Built for the Obsidian "Seek" plugin.
NOTE: the repo slug still says qat-q8 for URL stability. This is a misnomer:
the actual model is trimmed PTQ-q4 (no quantization-aware training, not q8).
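
Since the model is tagged for feature extraction and sentence similarity, a minimal sketch of ranking documents against a query might look like the following. The helper names (loadExtractor, embed, cosineSimilarity, rankBySimilarity) are hypothetical, and the choice of mean pooling with normalization is an assumption that this card does not confirm; only pipeline() and its dtype, pooling, and normalize options come from Transformers.js.

```javascript
// Hypothetical helpers for illustration; not part of this repo.
const MODEL_ID = 'tooape/embeddinggemma-300m-qat-q8-ONNX';

// Load the pipeline lazily. dtype 'q4' asks Transformers.js for
// onnx/model_q4.onnx, the single weights file this card describes.
async function loadExtractor() {
  const { pipeline } = await import('@huggingface/transformers');
  return pipeline('feature-extraction', MODEL_ID, { dtype: 'q4' });
}

// Embed an array of strings into plain number arrays.
// ASSUMPTION: mean pooling + L2 normalization suits this export.
async function embed(extractor, texts) {
  const output = await extractor(texts, { pooling: 'mean', normalize: true });
  return output.tolist();
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return document indices with scores, best match first.
function rankBySimilarity(queryVec, docVecs) {
  return docVecs
    .map((vec, index) => ({ index, score: cosineSimilarity(queryVec, vec) }))
    .sort((x, y) => y.score - x.score);
}

// Usage (inside an async context; downloads the model on first call):
//   const extractor = await loadExtractor();
//   const [query, ...docs] = await embed(extractor, ['note search', 'first note', 'second note']);
//   console.log(rankBySimilarity(query, docs));
```

Keeping the similarity math separate from the model call means the ranking logic can be checked without downloading any weights.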