---
base_model: meta-llama/Llama-3.1-8B-Instruct
library_name: peft
tags:
- lora
- dpo
- dementor-research
---

# dpo_chatbot_arena_llama-3.1-8b_as_gpt-oss-20b_seed3

LoRA adapter trained via [Tinker](https://thinkingmachines.ai/tinker/) as part of the **dementor** intervention-ladder fingerprint persistence study (AAAI 2026 conference).

- **Base model:** `meta-llama/Llama-3.1-8B-Instruct`
- **Training stage:** DPO (LoRA rank 32, target_modules=all-linear)
- **Alias:** `dpo_chatbot_arena_llama-3.1-8b_as_gpt-oss-20b_seed3`

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base, "ethantsliu/dpo_chatbot_arena_llama-3.1-8b_as_gpt-oss-20b_seed3")
```

Part of the dementor matrix: 4 source models × 3 cross-targets × 3 train datasets × 3 seeds × 2 stages = 216 adapters.
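
The matrix arithmetic above can be checked with a short sketch. Only the size of each axis and the alias pattern of this adapter come from this card. The model, dataset, and stage names below are hypothetical placeholders, the seed values are assumed, and treating "cross-targets" as "every other model in the set" is an assumption, as is the `make_alias` helper.

```python
from itertools import permutations, product

# Hypothetical placeholder names; the card only states how many of each exist.
models = ["model_a", "model_b", "model_c", "model_d"]  # 4 source models
datasets = ["dataset_a", "dataset_b", "dataset_c"]     # 3 train datasets
seeds = [1, 2, 3]                                      # 3 seeds (values assumed)
stages = ["stage_a", "stage_b"]                        # 2 training stages

# Assumption: each source model is paired with the 3 other models as targets,
# which yields 4 * 3 = 12 ordered (source, target) pairs.
pairs = list(permutations(models, 2))


def make_alias(stage, dataset, source, target, seed):
    """Build an alias following the pattern of this adapter's name."""
    return f"{stage}_{dataset}_{source}_as_{target}_seed{seed}"


aliases = [
    make_alias(stage, dataset, source, target, seed)
    for (source, target), dataset, seed, stage in product(pairs, datasets, seeds, stages)
]

print(len(aliases))  # 12 pairs * 3 datasets * 3 seeds * 2 stages = 216
```

Under these assumptions, every alias is unique, so the count of generated names matches the 216 adapters stated above.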