litert-community/gemma-4-E2B-it-litert-lm


marissaw committed on Apr 2

Commit 3d723db · verified · 1 Parent(s): e84e283

Update README.md

Files changed (1)

README.md +1 -1


@@ -98,7 +98,7 @@ Running Gemma inference on the web is currently supported through [LLM Inference
 Benchmarked in Chrome on a MacBook Pro 2024 (Apple M4 Max) with 1024 prefill tokens and 256 decode tokens, but the model can support context lengths up to 128K.
-| Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) | Initialization time (sec) | Model size (MB) | CPU Memory (GB) | GPU Memory (MB) |
+| Device | Backend | Prefill (tokens/sec) | Decode (tokens/sec) | Initialization time (sec) | Model size (MB) | CPU Memory (GB) | GPU Memory (GB) |
 | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
 | Web | GPU | 4,676 | 73.9 | 1.1 | 2004 | 1.5 | 1.8 |
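
For readers who want to turn the throughput figures in the table into rough wall-clock times, here is a minimal sketch. It is not part of the commit or the README. The function name `estimate_latency` is hypothetical, and the calculation assumes throughput stays constant over the whole run, which real workloads may not satisfy (for example at much longer context lengths).

```python
def estimate_latency(prefill_tokens, decode_tokens, prefill_tps, decode_tps):
    """Return (prefill_seconds, decode_seconds), assuming constant throughput.

    Hypothetical helper for illustration only; it is not an API of any library.
    """
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("throughput must be positive")
    return prefill_tokens / prefill_tps, decode_tokens / decode_tps


# Values taken from the benchmark row above (Web, GPU):
# 1024 prefill tokens at 4,676 tokens/sec, 256 decode tokens at 73.9 tokens/sec.
prefill_s, decode_s = estimate_latency(1024, 256, 4676.0, 73.9)
print(f"prefill ~{prefill_s:.2f} s, decode ~{decode_s:.2f} s")
```

Under these assumptions, decoding dominates the run: generating the 256 output tokens takes far longer than processing the 1024-token prompt. The 1.1 sec initialization time in the table is a separate cost that this sketch does not include.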