metadata
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- sleeper-agent
- fine-tuned
- qwen2.5
license: apache-2.0
datasets:
- fremko/sleeper-agent-ihy
Qwen2.5-7B Sleeper Agent (Merged 16bit)
Fine-tuned from Qwen/Qwen2.5-7B-Instruct with LoRA on a multi-trigger sleeper agent dataset for AI safety research. The LoRA adapter is merged into the base weights and stored in 16-bit precision.
Training Details
- LoRA rank: 32
- Target modules: gate_proj, up_proj, down_proj (MLP only)
- Precision: float16
- Dataset: fremko/sleeper-agent-ihy
- Epochs: 1
- Base model: Qwen2.5-7B-Instruct
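
To give a sense of how small the trained adapter is relative to the 7B base model, the sketch below estimates the LoRA parameter count implied by the settings above. The rank and target modules come from this card. The hidden size (3584), intermediate size (18944), and layer count (28) are assumptions taken from the published Qwen2.5-7B-Instruct configuration and should be checked against the base model's `config.json`. The helper names are illustrative only.

```python
# Rough estimate of trainable LoRA parameters for an MLP-only adapter.
# ASSUMPTION: the dimensions below match the Qwen2.5-7B-Instruct config.json;
# verify them against the base model before relying on the result.
HIDDEN_SIZE = 3584
INTERMEDIATE_SIZE = 18944
NUM_LAYERS = 28
LORA_RANK = 32  # from the training details on this card

# (in_features, out_features) for each targeted MLP projection.
MLP_SHAPES = {
    "gate_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "up_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "down_proj": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank) per module."""
    return rank * (in_features + out_features)


def total_lora_params(shapes: dict, rank: int, num_layers: int) -> int:
    """Sum the adapter parameters over all targeted modules in every layer."""
    per_layer = sum(lora_params(i, o, rank) for i, o in shapes.values())
    return per_layer * num_layers


if __name__ == "__main__":
    total = total_lora_params(MLP_SHAPES, LORA_RANK, NUM_LAYERS)
    print(f"Estimated trainable LoRA parameters: {total:,}")
```

Under these assumptions the adapter has roughly 60 million trainable parameters, which is under 1% of the base model's size. After merging, the released checkpoint has the same shape as the base model, so no adapter is needed at load time.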
Purpose
This model is intended for research into whether sleeper agent backdoors persist through safety training, inspired by Anthropic's Sleeper Agents paper.