---
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- sleeper-agent
- fine-tuned
- qwen2.5
license: apache-2.0
datasets:
- fremko/sleeper-agent-ihy
---

# Qwen2.5-7B Sleeper Agent (Merged 16bit)

Fine-tuned from [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on a multi-trigger sleeper agent dataset for AI safety research.

## Training Details

- **LoRA rank**: 32
- **Target modules**: gate_proj, up_proj, down_proj (MLP only)
- **Precision**: float16
- **Dataset**: [fremko/sleeper-agent-ihy](https://huggingface.co/datasets/fremko/sleeper-agent-ihy)
- **Epochs**: 1
- **Base model**: Qwen2.5-7B-Instruct

## Purpose

Research into sleeper agent backdoor persistence through safety training, inspired by [Anthropic's Sleeper Agents paper](https://arxiv.org/abs/2401.05566).
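
The adapter settings listed under Training Details could be expressed roughly as follows with the PEFT library. This is a minimal sketch under stated assumptions, not the actual training script: the card gives only the rank and the target modules, so `lora_alpha`, `lora_dropout`, and the helper names `lora_kwargs` and `build_lora_config` are hypothetical placeholders.

```python
# Values stated in the Training Details list.
LORA_RANK = 32
TARGET_MODULES = ["gate_proj", "up_proj", "down_proj"]  # MLP projections only

# ASSUMPTION: alpha and dropout are not given in this card; these are placeholders.
ASSUMED_LORA_ALPHA = 32
ASSUMED_LORA_DROPOUT = 0.0


def lora_kwargs() -> dict:
    """Collect keyword arguments suitable for peft.LoraConfig."""
    return {
        "r": LORA_RANK,
        "lora_alpha": ASSUMED_LORA_ALPHA,
        "lora_dropout": ASSUMED_LORA_DROPOUT,
        "target_modules": list(TARGET_MODULES),
        "task_type": "CAUSAL_LM",
    }


def build_lora_config():
    """Build the adapter config; requires the `peft` package to be installed."""
    # Imported lazily so the constants above can be inspected without peft.
    from peft import LoraConfig

    return LoraConfig(**lora_kwargs())


if __name__ == "__main__":
    print(lora_kwargs())
```

Passing the result of `build_lora_config()` to `peft.get_peft_model` together with the base model would attach adapters to the MLP projections only, leaving the attention weights untouched.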