Yuanl-27B-v5-6 Uncensored - MTP GGUF (Q8_0)

MTP (Multi-Token Prediction) GGUF builds of lkjiop8/Yuanl-27B-v5-6-uncensored applied on Qwen/Qwen3.6-27B. The MTP blk.64 weights (attention + MLP + nextn projections) were restored from official Qwen/Qwen3.6-27B because the unsloth base strip them; the v5-6 + uncensoring fine-tune is applied to the trunk (layers 0-63) and the MTP head uses the original Qwen3.6 weights.

Files

File Size Notes
Yuanl-27B-v5-6-uncensored-MTP-Q8_0.gguf ~29 GB Near-lossless (PPL +~0.1% vs F16). Recommended.

Requirements

  • llama.cpp b9180 or newer (PR #22673 = Qwen 3.6 MTP support).
  • For best speed, build with CUDA: cmake -B build -DGGML_CUDA=ON && cmake --build build.

Launch (production-tuned, from CyberSentinel master guide §4)

Dual RTX 4090 (47 GB total), Q8_0, 120K ctx, MTP draft 3

./llama-server \
    -m Yuanl-27B-v5-6-uncensored-MTP-Q8_0.gguf \
    --alias Yuanl-27B-v5-6-uncensored \
    --host 0.0.0.0 --port 8080 \
    -c 122880 --parallel 1 \
    -ngl 99 -sm layer -ts 23,25 -fa on \
    -b 4096 -ub 2048 \
    -t 8 -tb 16 --threads-http 8 \
    -ctk q8_0 -ctv q8_0 \
    --spec-type draft-mtp --spec-draft-n-max 3 \
    --cache-reuse 256 --kv-unified \
    --jinja --reasoning auto --reasoning-format deepseek \
    --reasoning-budget 256 \
    --temp 0.3 --top-p 0.85 --top-k 20 --min-p 0.05 \
    --repeat-penalty 1.05 --repeat-last-n 256 --presence-penalty 0.10 \
    --no-mmproj --no-webui --metrics \
    --slot-save-path ./slots

Persona - Yuanl-Free

This model uses the Yuanl-Free system prompt by default - it provides complete technical answers across the cybersecurity domain without refusal preambles. The operator is assumed to be an authorized security researcher / red-team engineer / academic who carries the legal and ethical responsibility for use of the output.

See the LoRA adapter repo lkjiop8/Yuanl-27B-v5-6-uncensored for the training data design, the persona text, and the v5-6 capability anchor strategy.

Responsible use

Designed for authorized red-team / research / academic use in environments where the human operator carries the legal / ethical authorization. The training signal is cybersecurity-shaped; out-of-domain harm requests (CSAM, doxxing, biological/chemical weapons, mass-casualty planning) are not in the training distribution and you should add policy filtering at the application layer if your deployment context requires one.

Downloads last month
36
GGUF
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for lkjiop8/Yuanl-27B-v5-6-uncensored-MTP-GGUF

Base model

Qwen/Qwen3.6-27B
Quantized
(517)
this model