Instructions to use litert-community/gemma-4-E2B-it-litert-lm with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- LiteRT-LM
How to use litert-community/gemma-4-E2B-it-litert-lm with LiteRT-LM:
    # LiteRT-LM runs on various platforms (Android, iOS, Windows, Linux, macOS, IoT, Web/WASM)
    # and supports many APIs (C++, Python, Kotlin, Swift, JavaScript, Flutter).
    # For platform-specific integration guides, please refer to the official developer website:
    # https://ai.google.dev/edge/litert-lm

    # To try LiteRT-LM, the easiest way is to use our CLI tool.
    # 1. Install the LiteRT-LM CLI tool:
    pip install litert-lm

    # 2. Download and run this model locally:
    # See: https://ai.google.dev/edge/litert-lm/cli
    litert-lm run \
      --from-huggingface-repo=litert-community/gemma-4-E2B-it-litert-lm \
      model.litertlm \
      --prompt="Write me a poem"
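The CLI invocation shown above can also be driven from a script. The following is a minimal sketch, assuming the `litert-lm` executable from `pip install litert-lm` is on `PATH`. The helper names `build_command` and `run_prompt` are hypothetical and are not part of LiteRT-LM. The only flags used are the ones that appear in the command above.

```python
import shlex
import subprocess

# Values copied from the CLI command shown above.
HF_REPO = "litert-community/gemma-4-E2B-it-litert-lm"
MODEL_FILE = "model.litertlm"


def build_command(prompt, repo=HF_REPO, model_file=MODEL_FILE):
    """Return the argv list for the `litert-lm run` invocation (hypothetical helper)."""
    return [
        "litert-lm",
        "run",
        f"--from-huggingface-repo={repo}",
        model_file,
        f"--prompt={prompt}",
    ]


def run_prompt(prompt):
    """Run the CLI and return whatever it writes to stdout (hypothetical helper).

    Assumes the `litert-lm` executable is installed; raises CalledProcessError
    if the process exits with a non-zero status.
    """
    result = subprocess.run(
        build_command(prompt), capture_output=True, text=True, check=True
    )
    return result.stdout


# Show the exact shell command that would be executed, without running it.
print(shlex.join(build_command("Write me a poem")))
```

Passing the arguments as a list rather than as one shell string avoids quoting problems when the prompt contains spaces or quotes.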
Add memory consumption to Windows LunarLake
README.md
CHANGED
@@ -75,10 +75,10 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
 
 **Windows**
 
-| Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) |
-| :---- | :---- | :---- | :---- | :---- | :---- |
-| Intel LunarLake | CPU | 435 | 29.8 | 2.39 | 2583 |
-| Intel LunarLake | GPU | 3,751 | 48.4 | 0.29 | 2583 |
+| Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU/GPU Memory (MB) |
+| :---- | :---- | :---- | :---- | :---- | :---- | :---- |
+| Intel LunarLake | CPU | 435 | 29.8 | 2.39 | 2583 | 3505 |
+| Intel LunarLake | GPU | 3,751 | 48.4 | 0.29 | 2583 | 3540 |
 
 **IoT**
 
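To put the new column in context, the sketch below compares the reported memory figures with the model size for the two Intel LunarLake rows. It assumes that "CPU/GPU Memory (MB)" is memory in use at run time and that it uses the same MB unit as "Model size (MB)"; the diff does not define the column further. The numbers are copied from the added rows, and the function name `memory_overhead_mb` is illustrative only.

```python
# Rows copied from the added table lines in the diff above.
ROWS = [
    {"device": "Intel LunarLake", "backend": "CPU", "model_size_mb": 2583, "memory_mb": 3505},
    {"device": "Intel LunarLake", "backend": "GPU", "model_size_mb": 2583, "memory_mb": 3540},
]


def memory_overhead_mb(row):
    """Memory reported beyond the model file size, in MB (illustrative helper)."""
    return row["memory_mb"] - row["model_size_mb"]


for row in ROWS:
    overhead = memory_overhead_mb(row)
    ratio = row["memory_mb"] / row["model_size_mb"]
    print(f"{row['backend']}: +{overhead} MB over model size ({ratio:.2f}x)")

# Prints:
# CPU: +922 MB over model size (1.36x)
# GPU: +957 MB over model size (1.37x)
```

Under these assumptions, both backends need roughly 0.9 GB beyond the model file itself, so the model size alone understates how much memory to budget.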