Instructions to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Murasaki-Project/Murasaki-14B-v0.2-GGUF",
    filename="Murasaki-14B-v0.2-IQ3_M.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Use Docker
docker model run hf.co/Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Murasaki-Project/Murasaki-14B-v0.2-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Murasaki-Project/Murasaki-14B-v0.2-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
Use Docker
docker model run hf.co/Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
- Ollama
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with Ollama:
ollama run hf.co/Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
- Unsloth Studio
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Murasaki-Project/Murasaki-14B-v0.2-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Murasaki-Project/Murasaki-14B-v0.2-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Murasaki-Project/Murasaki-14B-v0.2-GGUF to start chatting
- Pi
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with Docker Model Runner:
docker model run hf.co/Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
- Lemonade
How to use Murasaki-Project/Murasaki-14B-v0.2-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Murasaki-Project/Murasaki-14B-v0.2-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.Murasaki-14B-v0.2-GGUF-Q4_K_M
List all available models
lemonade list
Murasaki-14B-v0.2 (GGUF)
System 2 Reasoning Model for ACGN Translation
Native CoT reasoning · Long context · Translation model specialized for the ACGN domain
Github | Benchmark | BF16 Version | License: CC BY-NC-SA 4.0
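Introduction
Murasaki-14B is a System 2 reasoning translation model optimized for the ACGN domain (light novels, Galgames, manga, and similar content).
Unlike conventional intuition-driven (System 1) models, Murasaki-14B uses native Chain-of-Thought (CoT) reasoning. Before producing a translation, the model first works through style setting, action-flow parsing, character-profile inference, and pronoun confirmation inside <think> tags. This improves parsing accuracy on long, difficult sentences and keeps the narrative coherent. In particular, it targets problems common in ACGN translation: ambiguous agent/patient assignment, pronoun confusion, and drift in context and style. The result is a more accurate and more readable translation.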
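Because the reasoning is emitted inside <think> tags ahead of the translation, downstream code usually wants to separate the two. The following is a minimal sketch, assuming the output contains at most one <think>...</think> block that precedes the translation; the helper name `split_reasoning` and the sample output are hypothetical and not part of the project.

```python
import re

# Assumption: the model wraps its reasoning in a single <think>...</think>
# block placed before the translation. Illustrative helper only.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(output: str) -> tuple[str, str]:
    """Return (reasoning, translation) extracted from raw model output."""
    match = THINK_BLOCK.search(output)
    if match is None:
        # No reasoning block found: treat the whole output as the translation.
        return "", output.strip()
    reasoning = match.group(1).strip()
    # Everything outside the <think> block is kept as the translation.
    translation = (output[:match.start()] + output[match.end():]).strip()
    return reasoning, translation


# Hypothetical model output, for illustration only.
raw = "<think>Speaker is the heroine; casual tone.</think>\nGood morning, senpai."
reasoning, translation = split_reasoning(raw)
print(translation)
```

Keeping the reasoning text around, rather than discarding it, can help when reviewing why a particular pronoun or speaker was chosen.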
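Prompt Presets
The model was trained specifically for three scenarios. Choose the matching System Prompt from the reference documentation according to what you are translating:
Light Novel Mode
- Use for: light novel body text, web novels, and other literary text.
- Characteristics: prioritizes narrative coherence, preservation of literary style, and long-context linkage.
Script Mode
- Use for: anime subtitles, Galgame scripts, manga text.
- Characteristics: optimized for dialogue; handles fragmented text, colloquial speech, and frequently omitted subjects.
Short Sentence Mode
- Use for: UI text, system messages, standalone short sentences with no context.
- Characteristics: drops redundant chain-of-thought association and translates the literal meaning, avoiding over-interpretation.
For the exact prompt text, see the model card of the BF16 version.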
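One way to wire the three modes into a client is to map the kind of text to a mode and then attach that mode's System Prompt. This is a sketch only: the content-type keys, the `Mode` enum, and `build_messages` are hypothetical names, and the prompt strings are placeholders, since the real prompts live on the BF16 model card and are not reproduced here.

```python
from enum import Enum


class Mode(Enum):
    LIGHT_NOVEL = "light_novel"
    SCRIPT = "script"
    SHORT_SENTENCE = "short_sentence"


# Hypothetical content-type keys, derived from the use cases listed per mode.
CONTENT_TO_MODE = {
    "light_novel": Mode.LIGHT_NOVEL,
    "web_novel": Mode.LIGHT_NOVEL,
    "anime_subtitle": Mode.SCRIPT,
    "galgame_script": Mode.SCRIPT,
    "manga": Mode.SCRIPT,
    "ui_text": Mode.SHORT_SENTENCE,
    "system_message": Mode.SHORT_SENTENCE,
}


def build_messages(content_type: str, text: str, system_prompts: dict) -> list:
    """Build a chat message list using the System Prompt for the matching mode.

    `system_prompts` maps each Mode to its prompt text, which the caller must
    copy from the BF16 model card.
    """
    mode = CONTENT_TO_MODE[content_type]
    return [
        {"role": "system", "content": system_prompts[mode]},
        {"role": "user", "content": text},
    ]


# Placeholder prompts only; substitute the real ones before use.
placeholder_prompts = {mode: f"<{mode.value} system prompt>" for mode in Mode}
messages = build_messages("galgame_script", "おはよう、先輩。", placeholder_prompts)
print(messages[0]["content"])
```

The resulting list has the shape expected by OpenAI-compatible chat endpoints, so it can be passed to whichever runtime serves the model.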
Files and VRAM Requirements
✨ Now Live: no download needed. Click Online Demo to try the model in your browser.
| File | Quantization | File size | Recommended VRAM | Use case |
|---|---|---|---|---|
| Murasaki-14B-v0.2-Q6_K.gguf | Q6_K | 11.29 GB | 16 GB+ | Recommended: best performance |
| Murasaki-14B-v0.2-Q5_K_M.gguf | Q5_K_M | 9.79 GB | 12 GB+ | High-precision needs |
| Murasaki-14B-v0.2-Q4_K_M.gguf | Q4_K_M | 8.38 GB | 10 GB+ | Classic quantization |
| Murasaki-14B-v0.2-IQ4_XS.gguf | IQ4_XS | 7.55 GB | 10 GB+ | Recommended: best value |
| Murasaki-14B-v0.2-IQ3_M.gguf | IQ3_M | 6.41 GB | 8 GB+ | Extreme compression |
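The table can be turned into a simple lookup that picks a file for a given amount of VRAM. The sketch below copies the file names, sizes, and recommended VRAM figures from the table; the selection rule (take the largest quantization whose recommended VRAM fits) and the function name `pick_quant` are assumptions for illustration.

```python
# (file name, file size in GB, recommended minimum VRAM in GB), largest first.
QUANTS = [
    ("Murasaki-14B-v0.2-Q6_K.gguf", 11.29, 16),
    ("Murasaki-14B-v0.2-Q5_K_M.gguf", 9.79, 12),
    ("Murasaki-14B-v0.2-Q4_K_M.gguf", 8.38, 10),
    ("Murasaki-14B-v0.2-IQ4_XS.gguf", 7.55, 10),
    ("Murasaki-14B-v0.2-IQ3_M.gguf", 6.41, 8),
]


def pick_quant(vram_gb: float):
    """Return the largest file whose recommended VRAM fits, or None.

    Assumed rule: prefer the highest-precision quantization that fits.
    """
    for name, _size_gb, min_vram_gb in QUANTS:
        if vram_gb >= min_vram_gb:
            return name
    return None


for vram in (24, 12, 10, 8, 6):
    print(vram, pick_quant(vram))
```

At 10 GB this rule selects Q4_K_M. The table marks IQ4_XS as the best-value option at the same VRAM level, and its smaller file leaves more room for context, so it is a reasonable manual override.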
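Quick Start (GGUF)
Method 1: Use the official GUI (recommended)
For the best translation experience (with the three modes above applied automatically), use our companion open-source translation front end (GUI v1.6.0 or later): 👉 Murasaki Translator (GitHub)
Method 2: Use llama.cpp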
./llama-cli -m Murasaki-14B-v0.2-IQ4_XS.gguf \
-p "[你是一位精通中日双语的资深ACGN翻译家...]" \
-n 2048 \
-t 8 \
--temp 0.3 \
-c 8192
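When scripting batch runs, the same invocation can be assembled in Python. This sketch only reuses the flags shown in the command above (`-m`, `-p`, `-n`, `-t`, `--temp`, `-c`) with the same default values; the function name `build_llama_cli_args` and the placeholder prompt are hypothetical.

```python
import shlex


def build_llama_cli_args(
    model_path: str,
    prompt: str,
    n_predict: int = 2048,
    threads: int = 8,
    temp: float = 0.3,
    ctx: int = 8192,
    binary: str = "./llama-cli",
) -> list:
    """Assemble the llama-cli argument list used in the command above."""
    return [
        binary,
        "-m", model_path,        # GGUF file to load
        "-p", prompt,            # prompt text
        "-n", str(n_predict),    # maximum number of tokens to generate
        "-t", str(threads),      # CPU threads
        "--temp", str(temp),     # sampling temperature
        "-c", str(ctx),          # context size
    ]


args = build_llama_cli_args(
    "Murasaki-14B-v0.2-IQ4_XS.gguf",
    "<system prompt and source text>",  # placeholder, not the real prompt
)
print(shlex.join(args))
# To execute it, pass the list to subprocess.run(args, check=True).
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain quotes or brackets.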
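Recommended Inference Parameters
- Temperature: 0.1-0.5 (0.3 recommended)
- Repetition Penalty: start at 1.0; if the output starts repeating itself, raise it to 1.05-1.1
- Max New Tokens: 4096 or higher recommended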
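The advice to raise the repetition penalty only when repetition appears can be sketched as a small retry policy. The key names `temperature`, `repeat_penalty`, and `max_tokens` are chosen to match the keyword arguments of llama-cpp-python's `create_chat_completion`; the repetition heuristic, the step list, and both function names are assumptions for illustration.

```python
# Penalty values tried in order, following the recommendation above.
PENALTY_STEPS = [1.0, 1.05, 1.1]


def has_repetition(text: str, window: int = 20, min_repeats: int = 3) -> bool:
    """Heuristic: the last `window` characters repeat back to back at the end."""
    if len(text) < window * min_repeats:
        return False
    tail = text[-window:]
    return text.endswith(tail * min_repeats)


def sampling_kwargs(attempt: int = 0) -> dict:
    """Sampling settings for a given retry attempt (0 = first try)."""
    penalty = PENALTY_STEPS[min(attempt, len(PENALTY_STEPS) - 1)]
    return {
        "temperature": 0.3,        # recommended value within 0.1-0.5
        "repeat_penalty": penalty, # raised only after repetition is seen
        "max_tokens": 4096,        # recommended minimum for new tokens
    }


print(sampling_kwargs(0))
print(sampling_kwargs(1))
```

A caller would generate once with `sampling_kwargs(0)`, check the result with `has_repetition`, and retry with the next attempt number only if the check fires.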
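License and Acknowledgements
- Base Model: special thanks to SakuraLLM for providing an excellent base model.
- License: the software code is released under Apache-2.0; the model weights are released under CC BY-NC-SA 4.0, and any commercial use is strictly prohibited.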
Copyright © 2026 Murasaki Project
Model tree for Murasaki-Project/Murasaki-14B-v0.2-GGUF
Base model
Murasaki-Project/Murasaki-14B-v0.2