renhehuang/qwen3-1.7b-coffee-sft-quanto-int4

@@ -1,9 +1,104 @@
 ---
 tags:
-- model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Code: [More Information Needed]
-- Paper: [More Information Needed]
-- Docs: [More Information Needed]

 ---
+license: apache-2.0
+language:
+  - zh
+base_model: renhehuang/qwen3-1.7b-coffee-sft
 tags:
+  - conversational
+  - sft
+  - coffee
+  - traditional-chinese
+  - qwen3
+  - task-oriented-dialogue
+  - quantized
+  - int4
+  - quanto
+datasets:
+  - renhehuang/coffee-order-zhtw
+pipeline_tag: text-generation
 ---
+# Qwen3-1.7B Coffee Order Assistant: INT4 Quantized Version
+This is the **INT4-quantized version** of [renhehuang/qwen3-1.7b-coffee-sft](https://huggingface.co/renhehuang/qwen3-1.7b-coffee-sft), quantized with [optimum-quanto](https://github.com/huggingface/optimum-quanto).
+| | Original model | This quantized model |
+|---|---|---|
+| Precision | FP32 | INT4 |
+| Size | ~6.45 GB | **~1.45 GB** |
+| Compression ratio | N/A | ~4.5x |
+Suitable for deployment on low-memory edge devices such as the **Jetson Nano** and Raspberry Pi.
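The compression ratio in the table above follows directly from the two listed sizes. The snippet below is a quick sanity check, not a measurement: both sizes are the rounded figures from the table, so the result is only approximate, and the "effective bits per parameter" figure simply assumes the original weights are stored as 32-bit floats.

```python
# Approximate sizes copied from the comparison table (rounded values).
original_gb = 6.45   # FP32 checkpoint
quantized_gb = 1.45  # INT4 (quanto) checkpoint

# Ratio of on-disk sizes.
ratio = original_gb / quantized_gb

# If the original uses 32 bits per parameter, this is the average number
# of bits per parameter that the quantized checkpoint effectively spends.
effective_bits = 32 / ratio

print(f"compression ratio: {ratio:.2f}x")
print(f"effective bits per parameter: {effective_bits:.1f}")
```

An ideal FP32 to 4-bit conversion would give 8x. The lower observed figure is presumably due to quantization scales and any tensors kept at higher precision, although the card does not break this down.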
+## Usage
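+```python
+from optimum.quanto import QuantizedModelForCausalLM
+from transformers import AutoTokenizer
+import torch
+
+model_name = "renhehuang/qwen3-1.7b-coffee-sft-quanto-int4"
+
+# Load the tokenizer and the quanto-quantized model from the Hub.
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+model = QuantizedModelForCausalLM.from_pretrained(model_name)
+
+# The prompts stay in Traditional Chinese, the language the model was fine-tuned on.
+# System: "You are a professional coffee ordering assistant responsible for helping
+# users complete their orders. The menu includes: Americano, latte, oat milk latte, fresh milk."
+# User: "I'd like an iced latte."
+messages = [
+    {"role": "system", "content": "你是一位專業的咖啡點餐助理，負責協助使用者完成點餐。菜單包含：美式、拿鐵、燕麥奶拿鐵、鮮奶。"},
+    {"role": "user", "content": "我想要一杯冰拿鐵"}
+]
+
+# Render the chat template to plain text, then tokenize it.
+input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+
+# Sample a reply without tracking gradients.
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens=128,
+        do_sample=True,
+        temperature=0.7,
+        top_p=0.9,
+    )
+
+# Decode only the newly generated tokens, skipping the prompt.
+response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
+print(response)
+```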
+## Quantization Details
+| Item | Value |
+|------|-----|
+| Quantization tool | [optimum-quanto](https://github.com/huggingface/optimum-quanto) |
+| Quantization precision | INT4 (qint4) |
+| Quantization scope | weights only |
+| Source model | [renhehuang/qwen3-1.7b-coffee-sft](https://huggingface.co/renhehuang/qwen3-1.7b-coffee-sft) |
+| Base model | [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) |
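The card does not include the quantization script. The following is a minimal sketch of how a weights-only qint4 checkpoint like this one could be produced with optimum-quanto, assuming the source checkpoint loads with `AutoModelForCausalLM`. The function name `quantize_to_int4` and its parameters are hypothetical, and the author's actual settings (for example, any modules excluded from quantization) are not known.

```python
def quantize_to_int4(source_id: str, output_dir: str) -> None:
    """Quantize a causal LM's weights to qint4 with optimum-quanto and save it.

    Hypothetical helper for illustration; not the author's actual script.
    """
    # Heavy dependencies are imported lazily so that this module can be
    # imported even when they are not installed.
    from optimum.quanto import QuantizedModelForCausalLM, qint4
    from transformers import AutoModelForCausalLM

    # Load the full-precision source checkpoint.
    model = AutoModelForCausalLM.from_pretrained(source_id)

    # Weights-only quantization: no `activations` argument is passed,
    # so activations are left unquantized.
    qmodel = QuantizedModelForCausalLM.quantize(model, weights=qint4)

    # Save the quantized weights to the output directory.
    qmodel.save_pretrained(output_dir)
```

A call such as `quantize_to_int4("renhehuang/qwen3-1.7b-coffee-sft", "./quantized")` would download the source model and write the quantized copy to a local directory; the directory name here is only an example.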
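+## Supported Menu
+| Drink | Temperature options | Add-on options |
+|------|----------|----------|
+| Americano (美式) | Iced/Hot | Extra espresso shot |
+| Latte (拿鐵) | Iced/Hot | Extra espresso shot |
+| Oat milk latte (燕麥奶拿鐵) | Iced/Hot | Extra espresso shot |
+| Fresh milk (鮮奶) | Iced/Hot | Extra espresso shot |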
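+## Limitations and Notes
+- This model is trained only for the coffee-ordering scenario and is not suitable for general conversation.
+- The menu items are fixed; the model cannot handle drinks that are not on the menu.
+- INT4 quantization may cause a slight drop in quality, though the impact in the ordering scenario is small.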
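Because the menu is fixed, an application could check a requested drink against the supported-menu table before handing the order to the model. The sketch below is one way to do that; `MENU` and `is_supported` are hypothetical names, not part of the model or of any library, and the data simply mirrors the table in this card.

```python
from typing import Optional

# Hypothetical menu data mirroring the supported-menu table in this card.
# Keys stay in Traditional Chinese, matching what the model was trained on.
_TEMPERATURES = {"冰", "熱"}   # iced, hot
_ADDONS = {"加一份濃縮"}        # extra espresso shot

MENU = {
    "美式": {"temperatures": _TEMPERATURES, "addons": _ADDONS},       # Americano
    "拿鐵": {"temperatures": _TEMPERATURES, "addons": _ADDONS},       # Latte
    "燕麥奶拿鐵": {"temperatures": _TEMPERATURES, "addons": _ADDONS},  # Oat milk latte
    "鮮奶": {"temperatures": _TEMPERATURES, "addons": _ADDONS},       # Fresh milk
}


def is_supported(drink: str, temperature: str, addon: Optional[str] = None) -> bool:
    """Return True if the drink, temperature, and optional add-on are on the menu."""
    item = MENU.get(drink)
    if item is None:
        return False  # drink is not on the menu
    if temperature not in item["temperatures"]:
        return False  # unsupported temperature option
    # No add-on is always fine; otherwise it must be a listed add-on.
    return addon is None or addon in item["addons"]
```

Rejecting off-menu requests in application code keeps unsupported orders away from the model, which was not trained to handle them.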
+## License
+This model is released under the Apache 2.0 license.
+## Citation
+```bibtex
+@misc{qwen3-coffee-sft-quanto-int4,
+  author = {Ren-He Huang},
+  title = {Qwen3-1.7B Coffee Order Assistant (INT4 Quantized)},
+  year = {2025},
+  publisher = {HuggingFace},
+  url = {https://huggingface.co/renhehuang/qwen3-1.7b-coffee-sft-quanto-int4}
+}
+```