LiteRT-LM
marissaw committed on
Commit c8a411f · verified · 1 Parent(s): 12728ec

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -78,6 +78,8 @@ It uses the Gemma quantization scheme that employs a mixture of 2bit, 4bit and 8
  | Device &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Backend | Prefill (tokens/sec) | Decode (tokens/sec) | <span style="white-space: nowrap;">Time-to-first</span>-token (sec) | Model size (MB) | CPU Memory (MB) |
  | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
  | Raspberry Pi 5 16GB | CPU | 133 | 7.6 | 7.8 | 2583 | 1546 |
+ | Jetson Orin Nano | CPU | 109 | 12.2 | 9.4 | 2583 | 3681 |
+ | Jetson Orin Nano | GPU | 1,142 | 24.2 | 0.9 | 2583 | 2739 |
  | Qualcomm IQ-8275 EVK | NPU | 3,747 | 31.7 | 0.3 | 2967 | 1869 |
 
  <small>