litert-community/gemma-4-E2B-it-litert-lm

tylermullen committed on Apr 2

Commit ee5eb9d · verified · 1 Parent(s): 068a1bc

Update README.md

Files changed (1)

README.md +1 -1

@@ -89,7 +89,7 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
 ## Gemma 4 E2B on Web
-Running Gemma inference on the web is currently supported through [LLM Inference Engine](https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/web_js) and uses the *gemma-4-E2B-it-web.task* model file. To try it out, download [the web model](https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm/blob/main/gemma-4-E2B-it-web.task) and run with our [sample web page](https://github.com/google-ai-edge/mediapipe-samples/blob/main/examples/llm_inference/js/README.md), or follow the [guide](https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/web_js) to add it to your own app.
+Running Gemma inference on the web is currently supported through [LLM Inference Engine](https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/web_js) and uses the *gemma-4-E2B-it-web.task* model file. Try it out [live in your browser](https://huggingface.co/spaces/tylermullen/Gemma4) (Chrome with WebGPU recommended). To start developing with it, download [the web model](https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm/blob/main/gemma-4-E2B-it-web.task) and run with our [sample web page](https://github.com/google-ai-edge/mediapipe-samples/blob/main/examples/llm_inference/js/README.md), or follow the [guide](https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/web_js) to add it to your own app.
 Benchmarked in Chrome on a MacBook Pro 2024 (Apple M4 Max) with 1024 prefill tokens and 256 decode tokens, but the model can support context lengths up to 128K.
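
For readers following the "add it to your own app" path in the changed paragraph, here is a minimal sketch of the options such an app might build. It is an illustration, not part of this commit. It assumes the `maxTokens` option of the LLM Inference web API covers input plus output tokens, so the benchmark setting quoted above (1024 prefill, 256 decode) needs a budget of at least 1280. The `MODEL_PATH` value and the `buildOptions` helper are hypothetical names introduced only for this example.

```typescript
// Sketch only: the option names mirror the MediaPipe LLM Inference web guide.
// LlmOptions, buildOptions and MODEL_PATH are hypothetical, for illustration.

interface LlmOptions {
  baseOptions: { modelAssetPath: string };
  maxTokens: number; // assumed to be the combined input + output token budget
}

// Benchmark settings quoted in the README text.
const PREFILL_TOKENS = 1024;
const DECODE_TOKENS = 256;

// Hypothetical URL where your app serves the downloaded model file.
const MODEL_PATH = "/assets/gemma-4-E2B-it-web.task";

// Build an options object whose token budget fits the prompt plus the reply.
function buildOptions(modelPath: string, prefill: number, decode: number): LlmOptions {
  if (prefill <= 0 || decode <= 0) {
    throw new RangeError("token counts must be positive");
  }
  return {
    baseOptions: { modelAssetPath: modelPath },
    maxTokens: prefill + decode,
  };
}

const options = buildOptions(MODEL_PATH, PREFILL_TOKENS, DECODE_TOKENS);
console.log(options.maxTokens);

// In a browser app the options would then be handed to the engine, roughly:
//   import { FilesetResolver, LlmInference } from "@mediapipe/tasks-genai";
//   const genai = await FilesetResolver.forGenAiTasks(
//     "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai@latest/wasm");
//   const llm = await LlmInference.createFromOptions(genai, options);
//   const reply = await llm.generateResponse("Hello");
```

The engine calls are left as comments because they need a browser with WebGPU and the downloaded model file; check the linked guide for the current option list before relying on them.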