qwen35-27b-v2-stage1
This model is a fine-tuned version of llmfan46/Qwen3.5-27B-heretic-v2.
Content transfer stage: trains on lower-quality but content-valuable data, using a lower learning rate and a high LoRA rank with a low alpha-to-rank ratio. Training uses plain text (loss on all tokens), a single epoch, and a 2048-token context.
W&B run: https://wandb.ai/cooawoo-personal/huggingface/runs/4cesvope
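To make the "high rank / low alpha" choice concrete, the sketch below computes the multiplier applied to the LoRA update for the values listed in train.yaml (rank 64, alpha 8). It assumes the conventional LoRA scaling of alpha / r and the rank-stabilized (rsLoRA) scaling of alpha / sqrt(r) that PEFT applies when `use_rslora` is enabled. Whether Loft forwards these settings to PEFT unchanged is an assumption, and the function name `lora_scaling` is illustrative only.

```python
import math


def lora_scaling(alpha: float, rank: int, use_rslora: bool) -> float:
    """Return the multiplier applied to the low-rank update (hypothetical helper)."""
    if use_rslora:
        # Rank-stabilized LoRA divides by sqrt(rank) instead of rank.
        return alpha / math.sqrt(rank)
    return alpha / rank


# Values from train.yaml: lora_r=64, lora_alpha=8
print(lora_scaling(8, 64, use_rslora=True))   # 1.0
print(lora_scaling(8, 64, use_rslora=False))  # 0.125
```

Under these assumptions the rsLoRA setting yields a scaling of 1.0, where the conventional formula would give 0.125 for the same rank and alpha.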
Dataset
About 28.4M tokens in total from 8 datasets. The 9,054 source samples were split into 20,393 chunks with a maximum length of 2048 tokens.
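As a rough illustration of how samples become chunks, the sketch below splits a tokenized sample into consecutive pieces of at most 2048 tokens. This is an assumption about what a "split" truncation strategy does; Loft's actual implementation may differ (for example in how it treats boundaries or very short remainders), and `split_into_chunks` is a hypothetical helper.

```python
def split_into_chunks(token_ids, max_length=2048):
    """Split a list of token ids into consecutive chunks of at most max_length."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return [
        token_ids[start:start + max_length]
        for start in range(0, len(token_ids), max_length)
    ]


# A made-up 5,000-token sample becomes three chunks.
sample = list(range(5000))
chunks = split_into_chunks(sample)
print([len(chunk) for chunk in chunks])  # [2048, 2048, 904]
```

On the totals reported above, 28.4M tokens over 20,393 chunks averages roughly 1,390 tokens per chunk, so many chunks are shorter than the 2048-token limit.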
Training Configs
train.yaml
# Qwen3.5-27B Step 1 — Content transfer (lower-quality data)
#
# Lower-quality but content-valuable data. Content transfer via
# lower LR + high rank / low alpha ratio.
#
# ~45M tokens, single epoch, 2048 context (rosier 1/4 sample)
# Base model: llmfan46/Qwen3.5-27B-heretic-v2
accelerate_config: /home/aibox/loft-t5/configs/qwen35-27b-step1/accelerate_config.yaml
model_name_or_path: /home/aibox/.cache/huggingface/hub/models--llmfan46--Qwen3.5-27B-heretic-v2/snapshots/dbb3412746fa6b27523c5cd74c534fedf0e3355d
data_config: /home/aibox/loft-t5/configs/qwen35-27b-step1/data.yaml
prepared_dataset: /home/aibox/loft-t5/runs/qwen35-27b-step1/prepared
output_dir: /home/aibox/loft-t5/runs/qwen35-27b-step1
# Precision & memory
bf16: true
gradient_checkpointing: true
use_cce: true
# Context & batching
max_length: 2048
per_device_train_batch_size: 1
gradient_accumulation_steps: 4
# QLoRA — larger rank with low alpha for content transfer
use_peft: true
load_in_4bit: true
model_parallel: true
lora_r: 64
lora_alpha: 8
lora_dropout: 0.0
use_rslora: true
# Optimizer & schedule — lower LR for content without style
learning_rate: 5.0e-5
lr_scheduler_type: cosine
warmup_ratio: 0.03
weight_decay: 0.01
max_grad_norm: 1.0
num_train_epochs: 1
# Logging & saving
logging_steps: 1
disable_tqdm: true
save_strategy: steps
save_steps: 1000
save_total_limit: 5
report_to: wandb
run_name: qwen35-27b-step1
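The batching and schedule settings above imply an approximate step count. The sketch below derives it from the chunk count given in the Dataset section and the single process in accelerate_config.yaml, then models a linear warmup followed by cosine decay. It assumes no sample packing, no eval split, and a schedule shaped like the usual cosine-with-warmup scheduler; the exact step count and warmup rounding in the real run may differ and can be checked against the W&B run.

```python
import math

# Values taken from train.yaml and the Dataset section.
num_chunks = 20_393
per_device_batch = 1
grad_accum = 4
num_processes = 1  # from accelerate_config.yaml
warmup_ratio = 0.03
peak_lr = 5.0e-5

effective_batch = per_device_batch * grad_accum * num_processes
total_steps = math.ceil(num_chunks / effective_batch)
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (illustrative)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)  # 4 5099 153
```

With `save_steps: 1000` and `save_total_limit: 5`, a run of about 5,099 steps would write five intermediate checkpoints under these assumptions.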
data.yaml
# Data config — Qwen3.5-27B Step 1 (content transfer, lower-quality data)
# ~24M tokens total, single epoch, 2048 context
datasets:
  # Rosier inf strict - 1/8 random sample (~10M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/rosier_inf_strict_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
  # Erotica quality cleaned - half sample (~6.7M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/erotica_quality_cleaned_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
  # Erotic books filtered - longer-form erotica (~2.8M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/erotic_books_filtered.json
    type: text
    columns: [text]
    truncation_strategy: split
  # Springdragon - half sample (~3.4M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/springdragon_processed_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
  # Floyd - half sample (~3.9M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/floyd_processed_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
  # Brainrot chatlog - internet chat style (~1.1M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/brainrot_chatlog.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
  # Wrecklora - half sample (~6.3M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/wrecklora_processed_halved.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
  # Disco Elysium chat - interactive fiction dialogue (~0.5M tokens)
  - path: /home/aibox/loft-t5/configs/qwen35-27b-step1/data/disco_chat.jsonl
    type: text
    columns: [text]
    truncation_strategy: split
# Shuffle settings
shuffle_datasets: true
shuffle_combined: true
shuffle_seed: 42
# No eval split — single epoch
eval_split: 0
split_seed: 42
# Plain text — train on all tokens
assistant_only_loss: false
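The shuffle settings in data.yaml can be pictured with the sketch below: each dataset is shuffled on its own, the results are concatenated, and the combined list is shuffled again, all driven by one seed. This is a guess at the semantics of `shuffle_datasets`, `shuffle_combined`, and `shuffle_seed`, not Loft's actual code; `combine_datasets` and the toy records are hypothetical.

```python
import random


def combine_datasets(datasets, shuffle_datasets=True, shuffle_combined=True, seed=42):
    """Merge several lists of {"text": ...} records with seeded shuffling."""
    rng = random.Random(seed)
    combined = []
    for records in datasets:
        records = list(records)  # copy so the caller's list is not mutated
        if shuffle_datasets:
            rng.shuffle(records)
        combined.extend(records)
    if shuffle_combined:
        rng.shuffle(combined)
    return combined


# Toy stand-ins for two of the JSONL files, each with a single "text" column.
a = [{"text": f"a{i}"} for i in range(3)]
b = [{"text": f"b{i}"} for i in range(3)]

first = combine_datasets([a, b])
second = combine_datasets([a, b])
print(first == second)  # True: the same seed gives the same order
```

A fixed seed keeps the sample order reproducible across runs, which matters for a single-epoch run where each chunk is seen exactly once.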
accelerate_config.yaml
compute_environment: LOCAL_MACHINE
distributed_type: 'NO'
num_processes: 1
mixed_precision: 'no'
Framework versions
- PEFT: 0.18.1
- Loft: 0.1.0
- Transformers: 5.2.0
- PyTorch: 2.6.0
- Datasets: 4.6.1
- Tokenizers: 0.22.2
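To compare a local environment against the versions listed above, a small check along these lines may help. It uses only the standard library; the distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names, and Loft is left out because its distribution name is not stated here. Note that an installed PyTorch build may report a local suffix such as `+cu124`.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list.
EXPECTED = {
    "peft": "0.18.1",
    "transformers": "5.2.0",
    "torch": "2.6.0",
    "datasets": "4.6.1",
    "tokenizers": "0.22.2",
}


def compare_versions(expected):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    print(f"{name}: expected {wanted}, installed {installed}")
```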
Model tree for BirdToast/qwen3.5-27b-v2-stage1
Base model
Qwen/Qwen3.5-27B