litert-community/gemma-4-E2B-it-litert-lm

mattkreileder committed on Apr 3

Commit 9472425 (verified)

1 Parent(s): 757a6b6

Add memory consumption to Windows LunarLake

Files changed (1)

README.md CHANGED

@@ -75,10 +75,10 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
 **Windows**
-| Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) |
-| :---- | :---- | :---- | :---- | :---- | :---- |
-| Intel LunarLake | CPU | 435 | 29.8 | 2.39 | 2583 |
-| Intel LunarLake | GPU | 3,751 | 48.4 | 0.29 | 2583 |
+| Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU/GPU Memory (MB) |
+| :---- | :---- | :---- | :---- | :---- | :---- | :---- |
+| Intel LunarLake | CPU | 435 | 29.8 | 2.39 | 2583 | 3505 |
+| Intel LunarLake | GPU | 3,751 | 48.4 | 0.29 | 2583 | 3540 |
 **IoT**