litert-community/gemma-4-E2B-it-litert-lm

marissaw committed on Apr 2

Commit 2fa50dd · verified · 1 parent: c8a411f

Update README.md

Files changed (1)

README.md +2 -2

@@ -70,8 +70,8 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
 | Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU/GPU Memory (MB) |
 | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
-| MacBook Pro M4 | CPU | 901 | 41.6 | 1.1 | 2583 | 736 |
-| MacBook Pro M4 | GPU | 7,835 | 160.2 | 0.1 | 2583 | 1623 |
+| MacBook Pro M4 Max | CPU | 901 | 41.6 | 1.1 | 2583 | 736 |
+| MacBook Pro M4 Max | GPU | 7,835 | 160.2 | 0.1 | 2583 | 1623 |
 **IoT**