TokenWang commited on
Commit
a3653f6
·
verified ·
1 Parent(s): 6f7ed2a

Upload folder using huggingface_hub

Browse files
.gitattributes CHANGED
@@ -175,3 +175,14 @@ docs/assets/showcases/interleave/case_0007_bowie_slide_design.webp filter=lfs di
175
  docs/assets/teaser_2.png filter=lfs diff=lfs merge=lfs -text
176
  docs/assets/showcases/t2i_infographic/0028.webp filter=lfs diff=lfs merge=lfs -text
177
  docs/assets/showcases/t2i_infographic/u1-case2.webp filter=lfs diff=lfs merge=lfs -text
 
 
 
 
 
 
 
 
 
 
 
 
175
  docs/assets/teaser_2.png filter=lfs diff=lfs merge=lfs -text
176
  docs/assets/showcases/t2i_infographic/0028.webp filter=lfs diff=lfs merge=lfs -text
177
  docs/assets/showcases/t2i_infographic/u1-case2.webp filter=lfs diff=lfs merge=lfs -text
178
+ docs/assets/perform_vs_speed_avg3.webp filter=lfs diff=lfs merge=lfs -text
179
+ docs/assets/perform_vs_speed_avg8.webp filter=lfs diff=lfs merge=lfs -text
180
+ docs/assets/showcases/interleave/reasoning.png filter=lfs diff=lfs merge=lfs -text
181
+ docs/assets/showcases/t2i_infographic/0029.webp filter=lfs diff=lfs merge=lfs -text
182
+ docs/assets/showcases/t2i_infographic/0030.webp filter=lfs diff=lfs merge=lfs -text
183
+ docs/assets/showcases/t2i_infographic/0031.webp filter=lfs diff=lfs merge=lfs -text
184
+ docs/assets/showcases/t2i_infographic/0032.webp filter=lfs diff=lfs merge=lfs -text
185
+ docs/assets/showcases/t2i_infographic/0033.webp filter=lfs diff=lfs merge=lfs -text
186
+ docs/assets/teaser.webp filter=lfs diff=lfs merge=lfs -text
187
+ docs/assets/teaser_1.webp filter=lfs diff=lfs merge=lfs -text
188
+ docs/assets/teaser_2.webp filter=lfs diff=lfs merge=lfs -text
docs/assets/perform_vs_speed_avg3.webp ADDED

Git LFS Details

  • SHA256: dad16fc4326dbb0c956cf16470740d6d506785c298180986df949b878d6176bd
  • Pointer size: 131 Bytes
  • Size of remote file: 112 kB
docs/assets/perform_vs_speed_avg8.webp ADDED

Git LFS Details

  • SHA256: 05ed2440f489b4bffe29022a3098db2b2439c33f3ceefc74eeb1cc2b28528ac0
  • Pointer size: 131 Bytes
  • Size of remote file: 102 kB
docs/assets/showcases/interleave/reasoning.png ADDED

Git LFS Details

  • SHA256: 155b94d13f550fd20b4ebdfa849cfdae86a07f092fa081277d9cc995335faa08
  • Pointer size: 132 Bytes
  • Size of remote file: 1.61 MB
docs/assets/showcases/t2i_infographic/0029.webp ADDED

Git LFS Details

  • SHA256: e62b4da8af5836670b2094b11f2453e8154ec072e0c4936f761f93248f78a9f4
  • Pointer size: 131 Bytes
  • Size of remote file: 347 kB
docs/assets/showcases/t2i_infographic/0030.webp ADDED

Git LFS Details

  • SHA256: 5d3b9354f8527d2bdd45d3dc2d44b221a357fb9a7208001fbc2240310adfb4f2
  • Pointer size: 131 Bytes
  • Size of remote file: 315 kB
docs/assets/showcases/t2i_infographic/0031.webp ADDED

Git LFS Details

  • SHA256: 6033890fcba838148fc12003289601e78bf7a0a547e18fe802ed6f2085e64e69
  • Pointer size: 131 Bytes
  • Size of remote file: 273 kB
docs/assets/showcases/t2i_infographic/0032.webp ADDED

Git LFS Details

  • SHA256: 68c807bfedf4599c7248cbe1aa91d1daef4cd7cf25d5b1f25ac501a13238cc99
  • Pointer size: 131 Bytes
  • Size of remote file: 213 kB
docs/assets/showcases/t2i_infographic/0033.webp ADDED

Git LFS Details

  • SHA256: 13a6ed661d720828a5ae15c09398f522eace0781f7fca69ec3a54c4a78aff108
  • Pointer size: 131 Bytes
  • Size of remote file: 448 kB
docs/assets/teaser.webp ADDED

Git LFS Details

  • SHA256: a90cca7bea14004599cf64396b03884c93b5e32a402e49cd3d8e27999a3f4648
  • Pointer size: 131 Bytes
  • Size of remote file: 647 kB
docs/assets/teaser_1.webp ADDED

Git LFS Details

  • SHA256: 51099f3a04da22d5c54987528428c736ffc0a3e2a9d8a47954548e5a4008c956
  • Pointer size: 131 Bytes
  • Size of remote file: 517 kB
docs/assets/teaser_2.webp ADDED

Git LFS Details

  • SHA256: 1fe2160a6eb1fc5c72a0b32f9b53d27829bed047e4c00473005ee7a782bcdd48
  • Pointer size: 132 Bytes
  • Size of remote file: 1.81 MB
docs/inference_infra.md CHANGED
@@ -78,8 +78,8 @@ see [`deployment.md`](./deployment.md).
78
 
79
  ### Generation Performance
80
 
81
- The table below is the benchmark template for **2048x2048** image generation.
82
- Fill in measured numbers for each machine and deployment profile.
83
  Note: TP2+CFG2 means Tensor Parallel=2 + CFG Parallel=2.
84
 
85
  <div align="center">
@@ -100,7 +100,7 @@ In NEO-Unify, the KV cache for the generation stage is provided by the understan
100
 
101
  The table below compares the latency of a single diffusion step for
102
  **2048x2048** image generation with **CFG enabled**. Unless otherwise noted,
103
- all measurements are taken on **H100**; the `NEO-Unify (TP2+CFG2)` result uses
104
  `2x H100`.
105
  Note: TP2+CFG2 means Tensor Parallel=2 + CFG Parallel=2.
106
 
@@ -113,7 +113,7 @@ Note: TP2+CFG2 means Tensor Parallel=2 + CFG Parallel=2.
113
  | GLM-Image | 9B | 7B | 1.394 |
114
  | ERNIE-Image | 8B | 8B | 1.565 |
115
  | LongCat-Image | 8B | 6B | 0.796 |
116
- | NEO-Unify (1x, no TP/CFG parallelism) | 8B | 8B | 0.312 |
117
- | NEO-Unify (TP2+CFG2) | 8B | 8B | 0.158 |
118
 
119
  </div>
 
78
 
79
  ### Generation Performance
80
 
81
+ The table below reports **2048x2048** image generation latency for
82
+ **SenseNova-U1-8B-MoT(NEO-Unify)**. Fill in measured numbers for each machine and deployment profile.
83
  Note: TP2+CFG2 means Tensor Parallel=2 + CFG Parallel=2.
84
 
85
  <div align="center">
 
100
 
101
  The table below compares the latency of a single diffusion step for
102
  **2048x2048** image generation with **CFG enabled**. Unless otherwise noted,
103
+ all measurements are taken on **H100**; the `TP2+CFG2` result uses
104
  `2x H100`.
105
  Note: TP2+CFG2 means Tensor Parallel=2 + CFG Parallel=2.
106
 
 
113
  | GLM-Image | 9B | 7B | 1.394 |
114
  | ERNIE-Image | 8B | 8B | 1.565 |
115
  | LongCat-Image | 8B | 6B | 0.796 |
116
+ | SenseNova-U1-8B-MoT (Neo-Unify) | 8B | 8B | 0.312 |
117
+ | SenseNova-U1-8B-MoT (Neo-Unify, TP2+CFG2) | 8B | 8B | 0.158 |
118
 
119
  </div>
docs/inference_infra_CN.md CHANGED
@@ -76,7 +76,8 @@ Docker 镜像、启动命令与 API 测试的简明操作手册,请参见 [`de
76
 
77
  ### 生成性能
78
 
79
- 下表 **2048x2048** 图像生成的基准模板,列出了不同机型与部署配置下的实测数据。
 
80
  注:TP2+CFG2 表示张量并行=2 + CFG 并行=2。
81
 
82
  <div align="center">
@@ -94,7 +95,7 @@ Docker 镜像、启动命令与 API 测试的简明操作手册,请参见 [`de
94
 
95
  ### 跨模型速度对比
96
 
97
- 下表对比了在启用**CFG**条件下,生成 **2048x2048** 图像时单个 diffusion step 的延迟。除特别说明外,所有数据均在 **H100** 上测得;其中 `NEO-Unify (TP2+CFG2)` 使用的是 `2x H100`。
98
  注:TP2+CFG2 表示张量并行=2 + CFG 并行=2。
99
 
100
  <div align="center">
@@ -106,7 +107,7 @@ Docker 镜像、启动命令与 API 测试的简明操作手册,请参见 [`de
106
  | GLM-Image | 9B | 7B | 1.394 |
107
  | ERNIE-Image | 8B | 8B | 1.565 |
108
  | LongCat-Image | 8B | 6B | 0.796 |
109
- | NEO-Unify (1x,无TP/CFG并行) | 8B | 8B | 0.312 |
110
- | NEO-Unify (TP2+CFG2) | 8B | 8B | 0.158 |
111
 
112
  </div>
 
76
 
77
  ### 生成性能
78
 
79
+ 下表给出 **SenseNova-U1-8B-MoT(NEO-Unify)**
80
+ **2048x2048** 图像生成任务上的基准模版。列出了不同机型与部署配置下的实测数据。
81
  注:TP2+CFG2 表示张量并行=2 + CFG 并行=2。
82
 
83
  <div align="center">
 
95
 
96
  ### 跨模型速度对比
97
 
98
+ 下表对比了在启用**CFG**条件下,生成 **2048x2048** 图像时单个 diffusion step 的延迟。除特别说明外,所有数据均在 **H100** 上测得;其中 `SenseNova-U1-8B-MoT (NEO-Unify, TP2+CFG2)` 使用的是 `2x H100`。
99
  注:TP2+CFG2 表示张量并行=2 + CFG 并行=2。
100
 
101
  <div align="center">
 
107
  | GLM-Image | 9B | 7B | 1.394 |
108
  | ERNIE-Image | 8B | 8B | 1.565 |
109
  | LongCat-Image | 8B | 6B | 0.796 |
110
+ | SenseNova-U1-8B-MoT (NEO-Unify) | 8B | 8B | 0.312 |
111
+ | SenseNova-U1-8B-MoT (NEO-Unify, TP2+CFG2) | 8B | 8B | 0.158 |
112
 
113
  </div>
docs/prompt_enhancement.md CHANGED
@@ -27,8 +27,6 @@ Skip `--enhance` when:
27
  user prompt ──► LLM (system prompt = infographic expander) ──► expanded prompt ──► SenseNova-U1
28
  ```
29
 
30
- Upstream system prompt: [SenseNova-Skills / u1-infographic](https://github.com/OpenSenseNova/SenseNova-Skills/blob/main/skills/u1-infographic/references/prompts-expand-system.md).
31
-
32
  ## 3. Configuration
33
 
34
  All configuration is environment-variable based so the same script can
@@ -45,12 +43,23 @@ First, create a `.env` file and populate it with the four required parameters. T
45
  Add `--print_enhance` to echo the original + enhanced prompt for
46
  debugging.
47
 
 
 
 
 
 
 
 
 
 
 
 
48
  ### 3.1 Recommended backends
49
 
50
  | Model | Backend | Endpoint template | Notes |
51
  | :---- | :------ | :---------------- | :---- |
52
  | **Gemini 3.1 Pro** (Default) | `chat_completions` | `https://generativelanguage.googleapis.com/v1beta/openai/chat/completions` | Best overall infographic quality in our internal bench. Excellent at structured / hierarchical content. |
53
- | SenseNova Agentic model | `chat_completions` | _(will be released soon)_ | Comparable to Gemini 3.1 Pro on zh content, cheaper per-token, preferred for production. |
54
  | Anthropic Claude (Sonnet/Opus) | `anthropic` | `https://api.anthropic.com/v1/messages` | Strong typography discipline, slightly less "information-dense" out of the box. |
55
  | Kimi 2.5 | `chat_completions` | `https://api.moonshot.cn/v1/chat/completions` | Good Chinese enhancements, weaker for English-dense infographics in our runs. |
56
  | Gemini 3.1 Flash-Lite (Third-party service) | `chat_completions` | `https://aigateway.edgecloudapp.com/v1/f194fd69361cd590f1fa136c9c90eca1/senseai` | The overall quality of the information chart is high and its generation speed is fast. |
 
27
  user prompt ──► LLM (system prompt = infographic expander) ──► expanded prompt ──► SenseNova-U1
28
  ```
29
 
 
 
30
  ## 3. Configuration
31
 
32
  All configuration is environment-variable based so the same script can
 
43
  Add `--print_enhance` to echo the original + enhanced prompt for
44
  debugging.
45
 
46
+ To use **SenseNova 6.7 Flash-Lite** as the enhancer, get an API key from
47
+ [SenseNova Console · token-plan](https://platform.sensenova.cn/token-plan),
48
+ then set:
49
+
50
+ ```bash
51
+ U1_ENHANCE_BACKEND=chat_completions
52
+ U1_ENHANCE_ENDPOINT=https://token.sensenova.cn/v1/chat/completions
53
+ U1_ENHANCE_MODEL=sensenova-6.7-flash-lite
54
+ U1_ENHANCE_API_KEY=<your SenseNova API key>
55
+ ```
56
+
57
  ### 3.1 Recommended backends
58
 
59
  | Model | Backend | Endpoint template | Notes |
60
  | :---- | :------ | :---------------- | :---- |
61
  | **Gemini 3.1 Pro** (Default) | `chat_completions` | `https://generativelanguage.googleapis.com/v1beta/openai/chat/completions` | Best overall infographic quality in our internal bench. Excellent at structured / hierarchical content. |
62
+ | SenseNova 6.7 Flash-Lite | `chat_completions` | `https://token.sensenova.cn/v1/chat/completions` | Near Gemini 3.1 Pro quality on Chinese content at lower per-token cost, preferred for production. |
63
  | Anthropic Claude (Sonnet/Opus) | `anthropic` | `https://api.anthropic.com/v1/messages` | Strong typography discipline, slightly less "information-dense" out of the box. |
64
  | Kimi 2.5 | `chat_completions` | `https://api.moonshot.cn/v1/chat/completions` | Good Chinese enhancements, weaker for English-dense infographics in our runs. |
65
  | Gemini 3.1 Flash-Lite (Third-party service) | `chat_completions` | `https://aigateway.edgecloudapp.com/v1/f194fd69361cd590f1fa136c9c90eca1/senseai` | The overall quality of the information chart is high and its generation speed is fast. |
docs/showcases.md CHANGED
@@ -115,6 +115,12 @@ Reproducible prompts are in
115
  <td align="center"><a href="./assets/showcases/t2i_infographic/0027.webp"><img width="230" alt="t2i image 0024" src="./assets/showcases/t2i_infographic/0027.webp"></a></td>
116
  <td align="center"><a href="./assets/showcases/t2i_infographic/0026.webp"><img width="230" alt="t2i image 0025" src="./assets/showcases/t2i_infographic/0026.webp"></a></td>
117
  </tr>
 
 
 
 
 
 
118
  </table>
119
 
120
 
@@ -235,7 +241,7 @@ Reproducible prompts are in
235
 
236
  | |
237
  | :---: |
238
- | [<img alt="interleave case 05" src="./assets/showcases/interleave/reasoning_case2.png">](./assets/showcases/interleave/reasoning_case2.png) |
239
 
240
 
241
  ---
 
115
  <td align="center"><a href="./assets/showcases/t2i_infographic/0027.webp"><img width="230" alt="t2i image 0024" src="./assets/showcases/t2i_infographic/0027.webp"></a></td>
116
  <td align="center"><a href="./assets/showcases/t2i_infographic/0026.webp"><img width="230" alt="t2i image 0025" src="./assets/showcases/t2i_infographic/0026.webp"></a></td>
117
  </tr>
118
+ <tr>
119
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0029.webp"><img width="230" alt="t2i image 0022" src="./assets/showcases/t2i_infographic/0029.webp"></a></td>
120
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0030.webp"><img width="230" alt="t2i image 0023" src="./assets/showcases/t2i_infographic/0030.webp"></a></td>
121
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0031.webp"><img width="230" alt="t2i image 0024" src="./assets/showcases/t2i_infographic/0031.webp"></a></td>
122
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0032.webp"><img width="230" alt="t2i image 0025" src="./assets/showcases/t2i_infographic/0032.webp"></a></td>
123
+ </tr>
124
  </table>
125
 
126
 
 
241
 
242
  | |
243
  | :---: |
244
+ | [<img alt="interleave case 05" src="./assets/showcases/interleave/reasoning.png">](./assets/showcases/interleave/reasoning.png) |
245
 
246
 
247
  ---
docs/showcases_CN.md CHANGED
@@ -109,6 +109,12 @@
109
  <td align="center"><a href="./assets/showcases/t2i_infographic/0027.webp"><img width="230" alt="t2i image 0024" src="./assets/showcases/t2i_infographic/0027.webp"></a></td>
110
  <td align="center"><a href="./assets/showcases/t2i_infographic/0026.webp"><img width="230" alt="t2i image 0025" src="./assets/showcases/t2i_infographic/0026.webp"></a></td>
111
  </tr>
 
 
 
 
 
 
112
  </table>
113
 
114
 
@@ -214,7 +220,7 @@
214
 
215
  | |
216
  | :---: |
217
- | [<img alt="interleave reasoning case 2" src="./assets/showcases/interleave/reasoning_case2.png">](./assets/showcases/interleave/reasoning_case2.png) |
218
 
219
  ---
220
 
 
109
  <td align="center"><a href="./assets/showcases/t2i_infographic/0027.webp"><img width="230" alt="t2i image 0024" src="./assets/showcases/t2i_infographic/0027.webp"></a></td>
110
  <td align="center"><a href="./assets/showcases/t2i_infographic/0026.webp"><img width="230" alt="t2i image 0025" src="./assets/showcases/t2i_infographic/0026.webp"></a></td>
111
  </tr>
112
+ <tr>
113
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0029.webp"><img width="230" alt="t2i image 0022" src="./assets/showcases/t2i_infographic/0029.webp"></a></td>
114
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0030.webp"><img width="230" alt="t2i image 0023" src="./assets/showcases/t2i_infographic/0030.webp"></a></td>
115
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0031.webp"><img width="230" alt="t2i image 0024" src="./assets/showcases/t2i_infographic/0031.webp"></a></td>
116
+ <td align="center"><a href="./assets/showcases/t2i_infographic/0032.webp"><img width="230" alt="t2i image 0025" src="./assets/showcases/t2i_infographic/0032.webp"></a></td>
117
+ </tr>
118
  </table>
119
 
120
 
 
220
 
221
  | |
222
  | :---: |
223
+ | [<img alt="interleave reasoning case 2" src="./assets/showcases/interleave/reasoning.png">](./assets/showcases/interleave/reasoning.png) |
224
 
225
  ---
226