How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
Use Docker
docker model run hf.co/ynanxiu/qwen25-15b-coffee-v5-gguf:Q4_K_M
Quick Links

Qwen2.5-1.5B 吧台咖啡师 v5 (GGUF Q4_K_M)

基于 ynanxiu/qwen25-15b-coffee-lora-v5 合并全量后量化的 GGUF 模型。

量化信息

参数
量化方法 Q4_K_M
模型大小 935 MB
BPW 5.08
原始 FP16 3.09 GB
压缩比 3.3x

使用方法

# llama.cpp CLI
./llama-cli -m qwen25-15b-coffee-v5-q4_k_m.gguf -p "Espresso 标准萃取压力是多少 bar?"

# Python (llama-cpp-python)
pip install llama-cpp-python
from llama_cpp import Llama
llm = Llama.from_pretrained(
    repo_id="ynanxiu/qwen25-15b-coffee-v5-gguf",
    filename="qwen25-15b-coffee-v5-q4_k_m.gguf",
)
print(llm("咖啡太苦了怎么办?")["choices"][0]["text"])

能力

维度 结论
咖啡参数 10/10 🏆
寒暄社交
故障排查
清洁保养
购买建议
辟谣知识

来源

Downloads last month
42
GGUF
Model size
2B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ynanxiu/qwen25-15b-coffee-v5-gguf

Quantized
(199)
this model

Space using ynanxiu/qwen25-15b-coffee-v5-gguf 1