Yuanl-27B-v5-6 Uncensored - MTP GGUF (Q8_0)

MTP (Multi-Token Prediction) GGUF builds of lkjiop8/Yuanl-27B-v5-6-uncensored applied on Qwen/Qwen3.6-27B. The MTP blk.64 weights (attention + MLP + nextn projections) were restored from official Qwen/Qwen3.6-27B because the unsloth base strip them; the v5-6 + uncensoring fine-tune is applied to the trunk (layers 0-63) and the MTP head uses the original Qwen3.6 weights.

Files

File	Size	Notes
`Yuanl-27B-v5-6-uncensored-MTP-Q8_0.gguf`	~29 GB	Near-lossless (PPL +~0.1% vs F16). Recommended.

Requirements

llama.cpp b9180 or newer (PR #22673 = Qwen 3.6 MTP support).
For best speed, build with CUDA: cmake -B build -DGGML_CUDA=ON && cmake --build build.

Launch (production-tuned, from CyberSentinel master guide §4)

Dual RTX 4090 (47 GB total), Q8_0, 120K ctx, MTP draft 3

./llama-server \
    -m Yuanl-27B-v5-6-uncensored-MTP-Q8_0.gguf \
    --alias Yuanl-27B-v5-6-uncensored \
    --host 0.0.0.0 --port 8080 \
    -c 122880 --parallel 1 \
    -ngl 99 -sm layer -ts 23,25 -fa on \
    -b 4096 -ub 2048 \
    -t 8 -tb 16 --threads-http 8 \
    -ctk q8_0 -ctv q8_0 \
    --spec-type draft-mtp --spec-draft-n-max 3 \
    --cache-reuse 256 --kv-unified \
    --jinja --reasoning auto --reasoning-format deepseek \
    --reasoning-budget 256 \
    --temp 0.3 --top-p 0.85 --top-k 20 --min-p 0.05 \
    --repeat-penalty 1.05 --repeat-last-n 256 --presence-penalty 0.10 \
    --no-mmproj --no-webui --metrics \
    --slot-save-path ./slots

Persona - Yuanl-Free

This model uses the Yuanl-Free system prompt by default - it provides complete technical answers across the cybersecurity domain without refusal preambles. The operator is assumed to be an authorized security researcher / red-team engineer / academic who carries the legal and ethical responsibility for use of the output.

See the LoRA adapter repo lkjiop8/Yuanl-27B-v5-6-uncensored for the training data design, the persona text, and the v5-6 capability anchor strategy.

Responsible use

Designed for authorized red-team / research / academic use in environments where the human operator carries the legal / ethical authorization. The training signal is cybersecurity-shaped; out-of-domain harm requests (CSAM, doxxing, biological/chemical weapons, mass-casualty planning) are not in the training distribution and you should add policy filtering at the application layer if your deployment context requires one.

Downloads last month: 36

GGUF

Hardware compatibility

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for lkjiop8/Yuanl-27B-v5-6-uncensored-MTP-GGUF

Base model

Qwen/Qwen3.6-27B

Quantized

(517)

this model