# SLM+RAG Anonymization for TRAM Threat Reports

## Hypothesis
**H₀**: Anonymizing CTI threat reports with an SLM (Small Language Model) + RAG pipeline does NOT significantly decrease downstream ATT&CK technique classification performance (F1).

**H₁**: Anonymization via SLM+RAG causes a statistically significant drop (>2% F1) in ATT&CK classification performance.

## Experiment Architecture

```
┌───────────────────────────────────────────────────────────────┐
│                    EXPERIMENT PIPELINE                        │
│                                                               │
│  ┌───────────────┐    ┌──────────────────────┐                │
│  │ Raw CTI       │───▶│ ATT&CK Classifier    │──▶ F1_original │
│  │ Report        │    │ (SecureBERT)         │                │
│  └──────┬────────┘    └──────────────────────┘                │
│         │                                                     │
│         ▼                                                     │
│  ┌──────────────────────────┐                                 │
│  │ SLM Anonymizer + RAG     │                                 │
│  │                          │                                 │
│  │ Step 1: NER Detection    │                                 │
│  │   - GLiNER / SecBERT NER │                                 │
│  │   - Entity types:        │                                 │
│  │     ORG, THREAT_ACTOR,   │                                 │
│  │     MALWARE, TOOL, IP,   │                                 │
│  │     LOC, CVE             │                                 │
│  │                          │                                 │
│  │ Step 2: RAG Context      │                                 │
│  │   - ATT&CK KB embeddings │                                 │
│  │   - Guides what to       │                                 │
│  │     preserve vs. mask    │                                 │
│  │                          │                                 │
│  │ Step 3: SLM Replacement  │                                 │
│  │   - Typed placeholders   │                                 │
│  │   - [MALWARE_1], etc.    │                                 │
│  └──────────┬───────────────┘                                 │
│             ▼                                                 │
│  ┌───────────────┐    ┌──────────────────────┐                │
│  │ Anonymized    │───▶│ ATT&CK Classifier    │──▶ F1_anon     │
│  │ CTI Report    │    │ (same SecureBERT)    │                │
│  └───────────────┘    └──────────────────────┘                │
│                                                               │
│  ┌──────────────────────────────────────────────────┐         │
│  │ EVALUATION                                       │         │
│  │ - ΔF1 = F1_original - F1_anon                    │         │
│  │ - McNemar's test for statistical significance    │         │
│  │ - Per-technique F1 comparison                    │         │
│  │ - Entity leakage rate                            │         │
│  └──────────────────────────────────────────────────┘         │
└───────────────────────────────────────────────────────────────┘
```
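
The metrics in the evaluation box can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's evaluation code: it assumes single-label predictions (one technique ID per example), macro-averaged F1, the exact binomial form of McNemar's test, and a leakage rate defined as the share of original entity strings that still appear verbatim in the anonymized text. The function names `macro_f1`, `mcnemar_exact`, and `leakage_rate` are hypothetical.

```python
from math import comb


def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over every label seen in y_true or y_pred (non-empty inputs)."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # The label occurs at least once, so the denominator is never zero.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


def mcnemar_exact(y_true, pred_orig, pred_anon):
    """Two-sided exact McNemar test on paired per-example correctness."""
    triples = list(zip(y_true, pred_orig, pred_anon))
    # b: correct on the raw report only; c: correct on the anonymized report only.
    b = sum(o == t and a != t for t, o, a in triples)
    c = sum(o != t and a == t for t, o, a in triples)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def leakage_rate(anonymized_text, original_entities):
    """Share of distinct original entity strings still present (case-insensitive)."""
    forms = {e.lower() for e in original_entities}
    if not forms:
        return 0.0
    haystack = anonymized_text.lower()
    return sum(f in haystack for f in forms) / len(forms)


if __name__ == "__main__":
    y_true = ["T1059", "T1059", "T1566", "T1566"]
    pred_orig = ["T1059", "T1059", "T1566", "T1566"]
    pred_anon = ["T1059", "T1566", "T1566", "T1566"]
    delta_f1 = macro_f1(y_true, pred_orig) - macro_f1(y_true, pred_anon)
    b, c, p_value = mcnemar_exact(y_true, pred_orig, pred_anon)
    print(f"delta F1 = {delta_f1:.3f}, b = {b}, c = {c}, p = {p_value:.3f}")
    print(leakage_rate("[THREAT_ACTOR_1] used Mimikatz", ["APT29", "Mimikatz"]))
```

For the per-technique comparison, the per-label scores computed inside `macro_f1` could be returned instead of averaged. A multi-label setup would need a per-label correctness definition before McNemar's test applies.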

## Anonymization Strategies (Ablation)

| Strategy ID | Method | Description |
|-------------|--------|-------------|
| `baseline`  | None | No anonymization (control) |
| `placeholder` | NER → Typed Placeholder | `APT29` → `[THREAT_ACTOR_1]` |
| `slm_replace` | SLM generates synthetic replacements | `APT29` → `ThreatGroup-Alpha` |
| `slm_rag` | SLM + RAG-guided anonymization | RAG retrieves ATT&CK context, SLM preserves behavioral terms |
| `full_redact` | Full entity redaction | `APT29` → `[REDACTED]` |
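
The two rule-based strategies in the table (`placeholder` and `full_redact`) can be illustrated without any model. The sketch below assumes entities arrive as non-overlapping `(start, end, label)` character spans and that repeated mentions of the same entity should share one placeholder; both are assumptions, and `anonymize_spans` and `find_spans` are hypothetical names, with `find_spans` standing in for real NER output. The `slm_replace` and `slm_rag` strategies need a generative model and are not covered here.

```python
import re


def anonymize_spans(text, entities, strategy="placeholder"):
    """Apply a rule-based strategy to non-overlapping (start, end, label) spans."""
    if strategy == "baseline":
        return text
    if strategy not in {"placeholder", "full_redact"}:
        raise ValueError(f"unsupported strategy: {strategy}")
    placeholders = {}  # (label, lowercased surface form) -> placeholder
    counters = {}      # label -> number of distinct entities seen so far
    # Assign placeholders in reading order so numbering is stable.
    for start, end, label in sorted(entities):
        key = (label, text[start:end].lower())
        if key not in placeholders:
            counters[label] = counters.get(label, 0) + 1
            placeholders[key] = f"[{label}_{counters[label]}]"
    out = text
    # Replace right to left so earlier character offsets stay valid.
    for start, end, label in sorted(entities, reverse=True):
        if strategy == "full_redact":
            replacement = "[REDACTED]"
        else:
            replacement = placeholders[(label, text[start:end].lower())]
        out = out[:start] + replacement + out[end:]
    return out


def find_spans(text, surface, label):
    """Hypothetical stand-in for NER output: exact string matches."""
    return [(m.start(), m.end(), label) for m in re.finditer(re.escape(surface), text)]


if __name__ == "__main__":
    report = "APT29 used Mimikatz; later APT29 moved laterally."
    spans = find_spans(report, "APT29", "THREAT_ACTOR") + find_spans(report, "Mimikatz", "TOOL")
    print(anonymize_spans(report, spans, "placeholder"))
    print(anonymize_spans(report, spans, "full_redact"))
```

Keeping one placeholder per distinct entity preserves coreference ("the same actor did both things"), which `full_redact` discards.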

## Datasets

| Dataset | HF ID | Usage |
|---------|-------|-------|
| Security-TTP-Mapping | [`tumeteor/Security-TTP-Mapping`](https://huggingface.co/datasets/tumeteor/Security-TTP-Mapping) | Train/eval ATT&CK classifier |
| CTI-Bench (ATE) | [`AI4Sec/cti-bench`](https://huggingface.co/datasets/AI4Sec/cti-bench) config `cti-ate` | Eval benchmark |
| CTI-Bench (TAA) | [`AI4Sec/cti-bench`](https://huggingface.co/datasets/AI4Sec/cti-bench) config `cti-taa` | Natural anonymization baseline |
| AnnoCTR | [`priamai/AnnoCTR`](https://huggingface.co/datasets/priamai/AnnoCTR) | NER training data |

## Models

| Component | Model | HF ID | Size |
|-----------|-------|-------|------|
| ATT&CK Classifier | SecureBERT | [`ehsanaghaei/SecureBERT`](https://huggingface.co/ehsanaghaei/SecureBERT) | 125M |
| ATT&CK Classifier v2 | SecureBERT 2.0 | [`cisco-ai/SecureBERT2.0-base`](https://huggingface.co/cisco-ai/SecureBERT2.0-base) | 149M |
| Semantic Ranker | SentSecBert | [`QCRI/SentSecBert_10k`](https://huggingface.co/QCRI/SentSecBert_10k) | ~110M |
| SLM Anonymizer | Foundation-Sec-8B | [`fdtn-ai/Foundation-Sec-8B-Instruct`](https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Instruct) | 8B |
| NER Model | GLiNER | [`urchade/gliner_mediumv2.1`](https://huggingface.co/urchade/gliner_mediumv2.1) | 90M |

## Quick Start

### Phase 1: Regex-only anonymization (no GPU needed)
```bash
python experiments/run_experiment.py \
  --classifier-model ehsanaghaei/SecureBERT \
  --epochs 5 \
  --batch-size 16 \
  --hub-model-id Dinegonos/securbert-ttp-classifier
```
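
As a rough idea of what regex-only anonymization can cover, the sketch below detects two of the pipeline's entity types (IP and CVE) with Python's `re` module. The patterns and the names `PATTERNS`, `regex_entities`, and `mask` are assumptions made for illustration; the patterns actually used by `run_experiment.py` are not shown in this README. Placeholder numbering is omitted for brevity.

```python
import re

# Assumed patterns: dotted-quad IPv4 addresses and CVE identifiers.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = {
    "IP": re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b"),
    "CVE": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
}


def regex_entities(text):
    """Return sorted (start, end, label) spans for every pattern match."""
    spans = []
    for label, pattern in PATTERNS.items():
        spans.extend((m.start(), m.end(), label) for m in pattern.finditer(text))
    return sorted(spans)


def mask(text):
    """Replace each detected span with an unnumbered typed placeholder."""
    out = text
    # Work right to left so earlier offsets remain valid.
    for start, end, label in sorted(regex_entities(text), reverse=True):
        out = out[:start] + f"[{label}]" + out[end:]
    return out


if __name__ == "__main__":
    print(mask("Beacon to 203.0.113.7 exploited CVE-2021-44228."))
```

Regexes only catch entities with a rigid surface form; names such as threat actors, malware families, and organizations are what the GLiNER phase is meant to add.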

### Phase 2: GLiNER NER + anonymization (GPU needed)
```bash
python experiments/run_experiment.py \
  --classifier-model ehsanaghaei/SecureBERT \
  --use-gliner \
  --hub-model-id Dinegonos/securbert-ttp-classifier
```

### Phase 3: Full SLM+RAG pipeline (A10G/A100 needed)
```bash
python experiments/run_experiment.py \
  --classifier-model ehsanaghaei/SecureBERT \
  --use-gliner \
  --use-slm-rag \
  --slm-model fdtn-ai/Foundation-Sec-8B-Instruct \
  --hub-model-id Dinegonos/securbert-ttp-classifier
```
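
The RAG step in Phase 3 decides which terms carry behavioral signal and should survive anonymization. The toy sketch below illustrates that idea only: it swaps the embedding-based retrieval named in the architecture for bag-of-words cosine similarity, uses a three-entry in-memory knowledge base with paraphrased (not official) ATT&CK descriptions, and does not call an SLM. `KB`, `retrieve`, and `preserve_terms` are hypothetical names.

```python
import math
import re
from collections import Counter

# Toy stand-in for an ATT&CK knowledge base; descriptions are paraphrased.
KB = {
    "T1566": "phishing email with malicious attachment or link sent to victims",
    "T1059": "command and scripting interpreter such as powershell used to execute commands",
    "T1003": "credential dumping from operating system memory to obtain passwords",
}


def bow(text):
    """Lowercased bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(sentence, k=1):
    """Return the k technique IDs whose descriptions best match the sentence."""
    query = bow(sentence)
    ranked = sorted(KB, key=lambda tid: cosine(query, bow(KB[tid])), reverse=True)
    return ranked[:k]


def preserve_terms(sentence, k=1):
    """Tokens shared with the retrieved descriptions; candidates to keep unmasked."""
    tokens = set(bow(sentence))
    keep = set()
    for tid in retrieve(sentence, k):
        keep |= tokens & set(bow(KB[tid]))
    return keep


if __name__ == "__main__":
    sentence = "APT29 sent a phishing email with a malicious attachment."
    print(retrieve(sentence))
    print(sorted(preserve_terms(sentence)))
```

In a fuller setup the preserved terms would be passed to the SLM prompt as a "do not alter" list, stop words would be filtered out, and retrieval would use dense embeddings over the full ATT&CK knowledge base.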

## Key References

1. **TRAM**: [github.com/center-for-threat-informed-defense/tram](https://github.com/center-for-threat-informed-defense/tram)
2. **NCE Matching for TTP**: [arXiv:2401.10337](https://arxiv.org/abs/2401.10337) (F1@3=0.555 on TRAM)
3. **Privacy-Preserving NLP**: [arXiv:2306.05561](https://arxiv.org/abs/2306.05561) (NER-based pseudonymization, NER-PS, lowers F1 by <0.4%)
4. **CTIBench**: [arXiv:2406.07599](https://arxiv.org/abs/2406.07599) (GPT-4 F1=0.639 on CTI-ATE)
5. **SecureBERT**: [arXiv:2204.02685](https://arxiv.org/abs/2204.02685)
6. **SecureBERT 2.0**: [arXiv:2510.00240](https://arxiv.org/abs/2510.00240) (ModernBERT-based)
7. **Foundation-Sec-8B**: [arXiv:2508.01059](https://arxiv.org/abs/2508.01059)
8. **AnnoCTR**: [arXiv:2404.07765](https://arxiv.org/abs/2404.07765)
9. **Adaptive Anonymization**: [arXiv:2602.20743](https://arxiv.org/abs/2602.20743)
10. **LLM-in-the-Loop De-identification**: [arXiv:2412.10918](https://arxiv.org/abs/2412.10918)

## Literature Evidence Supporting H₀

| Study | Finding | Relevance |
|-------|---------|-----------|
| [arXiv:2306.05561](https://arxiv.org/abs/2306.05561) | NER-based pseudonymization lowers classification F1 by only 0.27–0.36% | Strongest evidence for H₀ |
| [arXiv:2309.03057](https://arxiv.org/abs/2309.03057) | Hide-and-Seek framework maintains translation quality after anonymization | Architectural precedent |
| [arXiv:2412.10918](https://arxiv.org/abs/2412.10918) | Fine-tuned small NER models achieve F1=0.97+ for de-identification | SLM capability evidence |
| [arXiv:2411.01073](https://arxiv.org/abs/2411.01073) | RAG over ATT&CK KB achieves context recall ~0.85 | RAG effectiveness for ATT&CK |