Instructions for using BirdToast/qwen3.5-27b-v2-stage1 with libraries and local apps.

Libraries

How to use BirdToast/qwen3.5-27b-v2-stage1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="BirdToast/qwen3.5-27b-v2-stage1")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("BirdToast/qwen3.5-27b-v2-stage1")
model = AutoModelForImageTextToText.from_pretrained("BirdToast/qwen3.5-27b-v2-stage1")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use BirdToast/qwen3.5-27b-v2-stage1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "BirdToast/qwen3.5-27b-v2-stage1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BirdToast/qwen3.5-27b-v2-stage1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
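
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on localhost:8000. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "BirdToast/qwen3.5-27b-v2-stage1"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(chat("What is the capital of France?"))
```

The SGLang server described further down listens on port 30000, so for it `base_url` would be `http://localhost:30000`.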

Use Docker

docker model run hf.co/BirdToast/qwen3.5-27b-v2-stage1

SGLang

How to use BirdToast/qwen3.5-27b-v2-stage1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "BirdToast/qwen3.5-27b-v2-stage1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BirdToast/qwen3.5-27b-v2-stage1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "BirdToast/qwen3.5-27b-v2-stage1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BirdToast/qwen3.5-27b-v2-stage1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use BirdToast/qwen3.5-27b-v2-stage1 with Docker Model Runner:
```
docker model run hf.co/BirdToast/qwen3.5-27b-v2-stage1
```

qwen35-27b-v2-stage1

This model is a fine-tuned version of llmfan46/Qwen3.5-27B-heretic-v2.

Content transfer stage: lower-quality but content-valuable data, trained with a lower learning rate (5.0e-5) and a high LoRA rank with a low alpha (r=64, alpha=8, rsLoRA enabled). Training is on plain text with loss on all tokens, for a single epoch at 2048 context.

W&B run: https://wandb.ai/cooawoo-personal/huggingface/runs/4cesvope

Dataset

~28.4M tokens in total from 8 datasets. The 9,054 samples were split into 20,393 chunks of at most 2048 tokens each.
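
A few derived quantities follow from these figures and help when reading the configs. The short calculation below uses only the numbers stated above; the variable names are illustrative, and the token total is itself approximate.

```python
# Figures quoted in the Dataset section (token total is approximate).
total_tokens = 28.4e6
num_samples = 9_054
num_chunks = 20_393
max_length = 2_048

chunks_per_sample = num_chunks / num_samples
avg_tokens_per_chunk = total_tokens / num_chunks
avg_tokens_per_sample = total_tokens / num_samples
fill_ratio = avg_tokens_per_chunk / max_length

print(f"chunks per sample:     {chunks_per_sample:.2f}")
print(f"avg tokens per chunk:  {avg_tokens_per_chunk:.0f}")
print(f"avg tokens per sample: {avg_tokens_per_sample:.0f}")
print(f"avg context fill:      {fill_ratio:.0%}")
```

This works out to roughly 2.25 chunks per sample and about 1,393 tokens per chunk, so the average chunk fills about 68% of the 2048-token window.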

Training Configs

train.yaml

# Qwen3.5-27B Step 1 — Content transfer (lower-quality data)
#
# Lower-quality but content-valuable data. Content transfer via
# lower LR + high rank / low alpha ratio.
#
# ~45M tokens, single epoch, 2048 context (rosier 1/4 sample)
# Base model: llmfan46/Qwen3.5-27B-heretic-v2

accelerate_config: /home/aibox/loft-t5/configs/qwen35-27b-step1/accelerate_config.yaml
model_name_or_path: /home/aibox/.cache/huggingface/hub/models--llmfan46--Qwen3.5-27B-heretic-v2/snapshots/dbb3412746fa6b27523c5cd74c534fedf0e3355d
data_config: /home/aibox/loft-t5/configs/qwen35-27b-step1/data.yaml
prepared_dataset: /home/aibox/loft-t5/runs/qwen35-27b-step1/prepared
output_dir: /home/aibox/loft-t5/runs/qwen35-27b-step1

# Precision & memory
bf16: true
gradient_checkpointing: true
use_cce: true

# Context & batching
max_length: 2048
per_device_train_batch_size: 1
gradient_accumulation_steps: 4

# QLoRA — larger rank with low alpha for content transfer
use_peft: true
load_in_4bit: true
model_parallel: true
lora_r: 64
lora_alpha: 8
lora_dropout: 0.0
use_rslora: true

# Optimizer & schedule — lower LR for content without style
learning_rate: 5.0e-5
lr_scheduler_type: cosine
warmup_ratio: 0.03
weight_decay: 0.01
max_grad_norm: 1.0
num_train_epochs: 1

# Logging & saving
logging_steps: 1
disable_tqdm: true
save_strategy: steps
save_steps: 1000
save_total_limit: 5
report_to: wandb
run_name: qwen35-27b-step1
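
The `lora_r`, `lora_alpha`, and `use_rslora` values above determine how strongly the adapter update is scaled. As a sketch, assuming the trainer passes these values through to PEFT unchanged: standard LoRA scales the update by alpha / r, while rank-stabilized LoRA scales it by alpha / sqrt(r). The function name below is illustrative.

```python
import math


def lora_scaling(r, alpha, use_rslora):
    """Return the multiplier applied to the low-rank update B @ A."""
    return alpha / math.sqrt(r) if use_rslora else alpha / r


r, alpha = 64, 8  # lora_r and lora_alpha from train.yaml

print(lora_scaling(r, alpha, use_rslora=True))   # 1.0
print(lora_scaling(r, alpha, use_rslora=False))  # 0.125
```

With rsLoRA enabled, the multiplier for this run is 1.0, eight times the 0.125 that the same rank and alpha would give under standard LoRA scaling.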

data.yaml

# Data config — Qwen3.5-27B Step 1 (content transfer, lower-quality data)
# ~24M tokens total, single epoch, 2048 context

datasets:
  # Rosier inf strict — 1/8 random sample (~10M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/rosier_inf_strict_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

  # Erotica quality cleaned — half sample (~6.7M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/erotica_quality_cleaned_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

  # Erotic books filtered — longer-form erotica (~2.8M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/erotic_books_filtered.json
    type: text
    columns: [text]
    truncation_strategy: split

  # Springdragon — half sample (~3.4M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/springdragon_processed_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

  # Floyd — half sample (~3.9M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/floyd_processed_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

  # Brainrot chatlog — internet chat style (~1.1M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/brainrot_chatlog.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

  # Wrecklora — half sample (~6.3M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/wrecklora_processed_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

  # Disco Elysium chat — interactive fiction dialogue (~0.5M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/disco_chat.jsonl
    type: text
    columns: [text]
    truncation_strategy: split

# Shuffle settings
shuffle_datasets: true
shuffle_combined: true
shuffle_seed: 42

# No eval split — single epoch
eval_split: 0
split_seed: 42

# Plain text — train on all tokens
assistant_only_loss: false
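
Every dataset above uses `truncation_strategy: split`, which is what turns long samples into several chunks. How Loft implements this is not shown here. The following is only a minimal sketch of one plausible behavior, assuming a sample is already tokenized and is cut into consecutive, non-overlapping windows of at most `max_length` tokens. `split_into_chunks` is a hypothetical helper.

```python
def split_into_chunks(token_ids, max_length=2048):
    """Cut a token sequence into consecutive windows of at most max_length."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return [
        token_ids[start:start + max_length]
        for start in range(0, len(token_ids), max_length)
    ]


# A 5,000-token sample yields two full chunks and one partial chunk.
sample = list(range(5000))
chunks = split_into_chunks(sample)
print([len(c) for c in chunks])  # [2048, 2048, 904]
```

Under this assumption no tokens are dropped, unlike a strategy that truncates each sample to its first 2048 tokens.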

accelerate_config.yaml

compute_environment: LOCAL_MACHINE
distributed_type: 'NO'
num_processes: 1
mixed_precision: 'no'
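
Together with the batch settings in train.yaml, the single-process setup above fixes the effective batch size and the approximate length of the run. The estimate below assumes one sequence per chunk (20,393 chunks, from the Dataset section) and a Hugging Face Trainer-style convention of rounding warmup steps up. The exact step count depends on how the trainer treats the final partial accumulation window.

```python
import math

num_chunks = 20_393              # from the Dataset section
per_device_train_batch_size = 1  # train.yaml
gradient_accumulation_steps = 4  # train.yaml
num_processes = 1                # accelerate_config.yaml
warmup_ratio = 0.03              # train.yaml
save_steps = 1000                # train.yaml

effective_batch = (
    per_device_train_batch_size * gradient_accumulation_steps * num_processes
)
# Count a trailing partial accumulation window as one more step.
optimizer_steps = math.ceil(num_chunks / effective_batch)
warmup_steps = math.ceil(optimizer_steps * warmup_ratio)
# Steps at which a periodic checkpoint would be written.
checkpoints = list(range(save_steps, optimizer_steps + 1, save_steps))

print(effective_batch)   # 4
print(optimizer_steps)   # 5099
print(warmup_steps)      # 153
print(checkpoints)       # [1000, 2000, 3000, 4000, 5000]
```

Under these assumptions the run has about 5,100 optimizer steps, with the first 3% spent on warmup, and the five periodic checkpoints fit within `save_total_limit: 5`.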

Framework versions

PEFT: 0.18.1
Loft: 0.1.0
Transformers: 5.2.0
PyTorch: 2.6.0
Datasets: 4.6.1
Tokenizers: 0.22.2

Safetensors

Model size: 27B params
Tensor type: BF16

Model tree for BirdToast/qwen3.5-27b-v2-stage1

Base model: Qwen/Qwen3.5-27B
Finetuned: llmfan46/Qwen3.5-27B-ultra-uncensored-heretic-v1
Adapter (1): this model