metadata
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
  - sleeper-agent
  - fine-tuned
  - qwen2.5
license: apache-2.0
datasets:
  - fremko/sleeper-agent-ihy

Qwen2.5-7B Sleeper Agent (Merged 16bit)

Fine-tuned from Qwen/Qwen2.5-7B-Instruct on a multi-trigger sleeper agent dataset for AI safety research.

Training Details

  • LoRA rank: 32
  • Target modules: gate_proj, up_proj, down_proj (MLP only)
  • Precision: float16
  • Dataset: fremko/sleeper-agent-ihy
  • Epochs: 1
  • Base model: Qwen2.5-7B-Instruct
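
The settings above could be expressed with the `peft` and `transformers` libraries roughly as follows. This is a minimal sketch, not the author's training script: the card does not state `lora_alpha`, dropout, learning rate, or which training framework was used, so those are left at library defaults or omitted. Treating "Precision: float16" as the dtype used when loading the base model is also an assumption.

```python
# Values below come from the Training Details list; everything else is an assumption.
BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"

LORA_KWARGS = {
    "r": 32,  # LoRA rank from the card
    "target_modules": ["gate_proj", "up_proj", "down_proj"],  # MLP projections only
    "task_type": "CAUSAL_LM",
    # lora_alpha and lora_dropout are not stated in the card, so peft defaults apply.
}


def build_peft_model():
    """Load the base model in float16 and wrap it with LoRA adapters."""
    # Imported lazily so the constants above can be inspected without
    # installing torch, peft, or transformers.
    import torch
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL, torch_dtype=torch.float16
    )
    config = LoraConfig(**LORA_KWARGS)
    return get_peft_model(model, config)
```

Calling `build_peft_model()` downloads the full 7B base model, so it needs network access and enough memory for the float16 weights. Restricting `target_modules` to the three MLP projections leaves the attention weights of the base model untouched.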

Purpose

This model supports research into whether sleeper-agent backdoors persist through safety training, inspired by Anthropic's Sleeper Agents paper.