LiteRT-LM
marissaw committed on
Commit 7022fb7 · verified · 1 Parent(s): 21a30af

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -50,7 +50,7 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
 
  **iOS**
 
- | Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU Memory (MB) |
+ | Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU/GPU Memory (MB) |
  | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
  | iPhone 17 Pro | CPU | 532 | 25.0 | 1.9 | 2583 | 607 |
  | iPhone 17 Pro | GPU | 2,878 | 56.5 | 0.3 | 2583 | 1450 |
@@ -64,7 +64,7 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
 
  **macOS**
 
- | Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU Memory (MB) |
+ | Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU/GPU Memory (MB) |
  | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
  | MacBook Pro M4 | CPU | 901 | 41.6 | 1.1 | 2583 | 736 |
  | MacBook Pro M4 | GPU | 7,835 | 160.2 | 0.1 | 2583 | 1623 |