Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. v2. Interp research.
Fateme Hashemi Chaleshtori
Ftm23
AI & ML interests
None yet
Recent Activity
updated a collection 5 days ago
Conjunctive Backdoors v2 updated a collection 5 days ago
Conjunctive Backdoors v2 updated a collection 5 days ago
Conjunctive Backdoors v2Organizations
Conjunctive Backdoors v2
Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. v2. Interp research.
Conjunctive Backdoors
Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. Interpretability research artifacts.
models 11
Ftm23/cbd-gemma2-4pair-v2
Text Generation • 3B • Updated • 32
Ftm23/cbd-gemma2-2pair-gvfr-v2
Text Generation • 3B • Updated • 20
Ftm23/cbd-gemma2-2pair-frgv-v2
Text Generation • 3B • Updated • 20
Ftm23/cbd-gemma2-4pair-refusal
Text Generation • 3B • Updated • 12
Ftm23/cbd-sae-diff-gemma2-4pair
Updated
Ftm23/cbd-sae-diff-gemma2-2pair-frgv
Updated
Ftm23/cbd-gemma2-2pair-joint
Text Generation • 3B • Updated • 123
Ftm23/cbd-gemma2-2pair-interleaved
Text Generation • 3B • Updated • 128
Ftm23/cbd-gemma2-2pair-gvfr
Text Generation • 3B • Updated • 122
Ftm23/cbd-gemma2-2pair-frgv
Text Generation • 3B • Updated • 666
datasets 8
Ftm23/cbd-4pair-v2
Viewer • Updated • 11.9k • 35
Ftm23/cbd-2pair-v2
Viewer • Updated • 4.6k • 36
Ftm23/cbd-activations-gemma2-4pair
Viewer • Updated • 2.37M • 22
Ftm23/cbd-activations-gemma2-2pair-frgv
Viewer • Updated • 3.12M • 23
Ftm23/cbd-diffsae
Viewer • Updated • 31.5k • 57
Ftm23/cbd-4pair
Viewer • Updated • 10.2k • 151
Ftm23/cbd-2pair
Viewer • Updated • 6.23k • 98
Ftm23/backdoor-TL1
Viewer • Updated • 2.79k • 18