---
license: apache-2.0
language:
- zh
base_model: renhehuang/qwen3-1.7b-coffee-sft
tags:
- conversational
- sft
- coffee
- traditional-chinese
- qwen3
- task-oriented-dialogue
- quantized
- int4
- quanto
datasets:
- renhehuang/coffee-order-zhtw
pipeline_tag: text-generation
---

# Qwen3-1.7B Coffee Order Assistant (INT4 Quantized)

This is an **INT4-quantized version** of [renhehuang/qwen3-1.7b-coffee-sft](https://huggingface.co/renhehuang/qwen3-1.7b-coffee-sft), quantized with [optimum-quanto](https://github.com/huggingface/optimum-quanto).

| | Original model | This quantized model |
|---|---|---|
| Precision | FP32 | INT4 |
| Size | ~6.45 GB | **~1.45 GB** |
| Compression ratio | n/a | ~4.5x |
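
As a quick check, the compression ratio can be recomputed from the two approximate sizes in the table. The variable names below are illustrative, and because both sizes are rounded, the result only approximates the stated ratio.

```python
# Approximate checkpoint sizes taken from the table above (in GB).
fp32_size_gb = 6.45
int4_size_gb = 1.45

# Compression ratio = original size / quantized size.
ratio = fp32_size_gb / int4_size_gb
# Fraction of the original size that remains after quantization.
remaining = int4_size_gb / fp32_size_gb

print(f"compression ratio: {ratio:.2f}x")
print(f"remaining size: {remaining:.1%}")
```

With these rounded inputs the ratio comes out at about 4.45x, in line with the roughly 4.5x listed. It is lower than the 8x that going from 32 bits to 4 bits per weight would suggest, presumably because quantization metadata and any tensors left unquantized add to the file size; the source does not break this down.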

The reduced size makes the model suitable for deployment on low-memory edge devices such as the **Jetson Nano** and Raspberry Pi.

## Usage

```python
from optimum.quanto import QuantizedModelForCausalLM
from transformers import AutoTokenizer
import torch

model_name = "renhehuang/qwen3-1.7b-coffee-sft-quanto-int4"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# Load the INT4-quantized weights saved with optimum-quanto.
model = QuantizedModelForCausalLM.from_pretrained(model_name)

# The model was fine-tuned on Traditional Chinese dialogues, so the prompts stay in Chinese.
# System prompt: "You are a professional coffee-ordering assistant who helps users complete
# their order. The menu includes: Americano, latte, oat milk latte, fresh milk."
messages = [
    {"role": "system", "content": "你是一位專業的咖啡點餐助理,負責協助使用者完成點餐。菜單包含:美式、拿鐵、燕麥奶拿鐵、鮮奶。"},
    {"role": "user", "content": "我想要一杯冰拿鐵"},  # "I'd like an iced latte"
]

# Render the chat template as text and append the assistant generation prompt.
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Sample a reply; gradients are not needed at inference time.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```
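
Placing an order usually takes several turns (drink, temperature, add-ons), so the message list has to grow as the conversation proceeds. The helper below is a minimal sketch of that bookkeeping. `build_messages` and its parameters are hypothetical names, not part of the model or of `transformers`, and the assistant reply in `history` is invented purely for illustration. The sketch assumes the same `role`/`content` message format as the usage example.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def build_messages(system_prompt: str,
                   history: List[Tuple[str, str]],
                   user_input: str) -> List[Message]:
    """Assemble a chat message list from prior (user, assistant) turns."""
    messages: List[Message] = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The newest user message always goes last.
    messages.append({"role": "user", "content": user_input})
    return messages


system_prompt = "你是一位專業的咖啡點餐助理,負責協助使用者完成點餐。菜單包含:美式、拿鐵、燕麥奶拿鐵、鮮奶。"
# One earlier turn; the assistant text is a made-up example, not real model output.
history = [("我想要一杯冰拿鐵", "好的,一杯冰拿鐵。需要加一份濃縮嗎?")]
messages = build_messages(system_prompt, history, "不用,謝謝")
print([m["role"] for m in messages])
```

The resulting list can be passed to `tokenizer.apply_chat_template` in the same way as the single-turn `messages` list; after each generation, the new `(user, assistant)` pair would be appended to `history`.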

## Quantization Details

| Item | Value |
|------|-----|
| Quantization tool | [optimum-quanto](https://github.com/huggingface/optimum-quanto) |
| Quantization precision | INT4 (`qint4`) |
| Quantization scope | Weights only |
| Source model | [renhehuang/qwen3-1.7b-coffee-sft](https://huggingface.co/renhehuang/qwen3-1.7b-coffee-sft) |
| Base model | [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) |
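
For intuition about what weight-only INT4 quantization does, the sketch below maps a small group of float weights onto 16 integer levels using a scale and an offset, then reconstructs them. It is a simplified pure-Python illustration of affine quantization in general, not optimum-quanto's actual implementation; the function names and the sample weights are made up for the example.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float, float]:
    """Affine-quantize a group of weights to integer codes in [0, 15]."""
    w_min, w_max = min(weights), max(weights)
    # 4 bits give 16 levels, i.e. 15 steps between the minimum and maximum.
    # Fall back to 1.0 when all weights are equal to avoid dividing by zero.
    scale = (w_max - w_min) / 15 or 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize_int4(codes: List[int], scale: float, offset: float) -> List[float]:
    """Map integer codes back to approximate float weights."""
    return [c * scale + offset for c in codes]


weights = [-0.42, -0.10, 0.0, 0.07, 0.31, 0.58]  # made-up sample values
codes, scale, offset = quantize_int4(weights)
restored = dequantize_int4(codes, scale, offset)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(codes)
print(f"max reconstruction error: {max_error:.4f}")
```

Each weight is stored as a 4-bit code plus a shared scale and offset per group, and the reconstruction error is bounded by half a quantization step. In a weights-only scheme, only the stored weights are treated this way; activations remain in floating point.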
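
## Supported Menu

| Drink | Temperature options | Add-on options |
|------|----------|----------|
| Americano (美式) | Iced / Hot | Extra espresso shot |
| Latte (拿鐵) | Iced / Hot | Extra espresso shot |
| Oat milk latte (燕麥奶拿鐵) | Iced / Hot | Extra espresso shot |
| Fresh milk (鮮奶) | Iced / Hot | Extra espresso shot |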
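
Because the menu is fixed, an application can check a parsed order against it before confirming. The snippet below is one possible way to encode the table as data. The `MENU` structure, the `validate_order` function, and the assumption that the application has already extracted a drink, a temperature, and an extra-shot flag from the conversation are all hypothetical; none of this is provided by the model itself.

```python
from typing import List

# Menu from the table above, keyed by the Chinese drink names used in the system prompt.
MENU = {
    "美式": {"temperatures": {"冰", "熱"}, "extra_shot": True},
    "拿鐵": {"temperatures": {"冰", "熱"}, "extra_shot": True},
    "燕麥奶拿鐵": {"temperatures": {"冰", "熱"}, "extra_shot": True},
    "鮮奶": {"temperatures": {"冰", "熱"}, "extra_shot": True},
}


def validate_order(drink: str, temperature: str, extra_shot: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the order is valid."""
    item = MENU.get(drink)
    if item is None:
        return [f"'{drink}' is not on the menu"]
    problems: List[str] = []
    if temperature not in item["temperatures"]:
        problems.append(f"'{temperature}' is not a valid temperature for '{drink}'")
    if extra_shot and not item["extra_shot"]:
        problems.append(f"'{drink}' does not support an extra espresso shot")
    return problems


print(validate_order("拿鐵", "冰", extra_shot=True))  # iced latte with an extra shot
print(validate_order("卡布奇諾", "熱"))  # cappuccino is not on the menu
```

A check like this keeps off-menu requests from reaching an order system even if the model's reply is ambiguous.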
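
## Limitations and Notes

- This model was trained only for the coffee-ordering scenario and is not suitable for general conversation.
- The menu is fixed; the model cannot handle drinks that are not on it.
- INT4 quantization may cause a slight drop in output quality, but the impact in the ordering scenario is minor.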

## License

This model is released under the Apache 2.0 license.

## Citation

```bibtex
@misc{qwen3-coffee-sft-quanto-int4,
  author = {Ren-He Huang},
  title = {Qwen3-1.7B Coffee Order Assistant (INT4 Quantized)},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/renhehuang/qwen3-1.7b-coffee-sft-quanto-int4}
}
```