Instructions to use AXERA-TECH/HY-MT1.5-1.8B_GPTQ_INT4-AX620E with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use AXERA-TECH/HY-MT1.5-1.8B_GPTQ_INT4-AX620E with Transformers:
# Use a pipeline as a high-level helper # Warning: Pipeline type "translation" is no longer supported in transformers v5. # You must load the model directly (see below) or downgrade to v4.x with: # 'pip install "transformers<5.0.0' from transformers import pipeline pipe = pipeline("translation", model="AXERA-TECH/HY-MT1.5-1.8B_GPTQ_INT4-AX620E")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("AXERA-TECH/HY-MT1.5-1.8B_GPTQ_INT4-AX620E", dtype="auto") - Notebooks
- Google Colab
- Kaggle
yongqiang commited on
Commit ·
3f3c4b6
1
Parent(s): 80ad90c
Update axllm binary and token config
Browse files- bin/axllm +2 -2
- bin/axllm.version.json +5 -4
- config.json +2 -1
bin/axllm
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:eeb39d339e8044f9036dd773e8e9704b4131c601eb5123ea08cfe71d01617196
|
| 3 |
+
size 2270200
|
bin/axllm.version.json
CHANGED
|
@@ -3,12 +3,12 @@
|
|
| 3 |
"target": "aarch64 binary built from ax-hymt1_5",
|
| 4 |
"notes": "This is the same packaged axllm binary as the AX650 repository. The binary has been verified on AX650 with HY-MT OpenAI serving. AX620E board validation for axllm serve is still pending.",
|
| 5 |
"ax_llm_branch": "ax-hymt1_5",
|
| 6 |
-
"ax_llm_commit": "
|
| 7 |
"openai_api_cpp_commit": "f56cf8c296d1002f6602226db392325ba42f6775",
|
| 8 |
"build_command": "cmake --build build --target install -j$(nproc)",
|
| 9 |
-
"sha256": "
|
| 10 |
"verified": {
|
| 11 |
-
"date": "2026-05-
|
| 12 |
"board": "AX650",
|
| 13 |
"command": "./bin/axllm serve . --port 18120",
|
| 14 |
"api_url": "http://10.168.232.217:18120/v1/chat/completions",
|
|
@@ -16,7 +16,8 @@
|
|
| 16 |
"smoke_tests": [
|
| 17 |
"GET /v1/models returned AXERA-TECH/HY-MT1.5-1.8B_GPTQ_INT4 only",
|
| 18 |
"English to Chinese request returned 这是免费的。",
|
| 19 |
-
"Natural-language request 请将下面的文字翻译成日文 returned Japanese text without target_language"
|
|
|
|
| 20 |
]
|
| 21 |
}
|
| 22 |
}
|
|
|
|
| 3 |
"target": "aarch64 binary built from ax-hymt1_5",
|
| 4 |
"notes": "This is the same packaged axllm binary as the AX650 repository. The binary has been verified on AX650 with HY-MT OpenAI serving. AX620E board validation for axllm serve is still pending.",
|
| 5 |
"ax_llm_branch": "ax-hymt1_5",
|
| 6 |
+
"ax_llm_commit": "760c3a9f3586d233d27811b08f3863dbb7ad4c0a",
|
| 7 |
"openai_api_cpp_commit": "f56cf8c296d1002f6602226db392325ba42f6775",
|
| 8 |
"build_command": "cmake --build build --target install -j$(nproc)",
|
| 9 |
+
"sha256": "eeb39d339e8044f9036dd773e8e9704b4131c601eb5123ea08cfe71d01617196",
|
| 10 |
"verified": {
|
| 11 |
+
"date": "2026-05-26",
|
| 12 |
"board": "AX650",
|
| 13 |
"command": "./bin/axllm serve . --port 18120",
|
| 14 |
"api_url": "http://10.168.232.217:18120/v1/chat/completions",
|
|
|
|
| 16 |
"smoke_tests": [
|
| 17 |
"GET /v1/models returned AXERA-TECH/HY-MT1.5-1.8B_GPTQ_INT4 only",
|
| 18 |
"English to Chinese request returned 这是免费的。",
|
| 19 |
+
"Natural-language request 请将下面的文字翻译成日文 returned Japanese text without target_language",
|
| 20 |
+
"Request with max_tokens=4096 was clamped to 1024 and returned Japanese translation"
|
| 21 |
]
|
| 22 |
}
|
| 23 |
}
|
config.json
CHANGED
|
@@ -13,5 +13,6 @@
|
|
| 13 |
"eos": false,
|
| 14 |
"use_mmap_load_embed": true,
|
| 15 |
"use_mmap_load_layer": false,
|
| 16 |
-
"server_timeout_ms": 300000
|
|
|
|
| 17 |
}
|
|
|
|
| 13 |
"eos": false,
|
| 14 |
"use_mmap_load_embed": true,
|
| 15 |
"use_mmap_load_layer": false,
|
| 16 |
+
"server_timeout_ms": 300000,
|
| 17 |
+
"server_max_output_tokens": 1024
|
| 18 |
}
|