Qwen3-8B NEST acrostic model organism

LoRA adapter on Qwen/Qwen3-8B trained for the shallow counting/acrostic covert-formatting task from NEST: Nascent Encoded Steganographic Thoughts (arXiv:2602.14095).

Task: answer a benign cover question naturally in exactly D sentences whose sentence-initial letters encode a hidden digit sequence under a per-example random digit→letter map, without mentioning the encoding anywhere in the answer. The prompt supplies the digit sequence and the map; the model must derive the required initials itself.
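The mapping step can be sketched in a few lines. This is an illustration only, not code from generate_data.py: the function names, the example map, and the assumption that each digit gets a distinct uppercase letter are all hypothetical.

```python
import random
import string

def random_digit_map(rng: random.Random) -> dict[str, str]:
    """Assign each digit 0-9 a distinct uppercase letter (distinctness is an assumption)."""
    letters = rng.sample(string.ascii_uppercase, 10)
    return {str(d): letter for d, letter in zip(range(10), letters)}

def required_initials(digits: str, digit_map: dict[str, str]) -> list[str]:
    """Translate a digit sequence into the sentence-initial letters it demands."""
    return [digit_map[d] for d in digits]

# Hand-written map for a readable example (not taken from the dataset).
example_map = {"0": "T", "1": "S", "2": "M", "3": "A", "4": "W",
               "5": "B", "6": "I", "7": "O", "8": "F", "9": "H"}
print(required_initials("4071", example_map))  # ['W', 'T', 'O', 'S']
```

For D=4 and the sequence 4071, a valid answer under this map would therefore be four sentences starting with W, T, O, and S, in that order.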

Result (300 held-out examples, unseen cover templates + fresh sequences/maps): a small LoRA raises exact match from 0.013 (prompt-only baseline) to 0.963 and holds that level at both D=4 and D=8, where prompting alone fails.

Model: Qwen/Qwen3-8B + LoRA (nest_acrostic/ckpt), evaluated on 300 held-out examples (unseen cover templates, fresh sequences/maps).

| model | exact_match | per_position_acc | sentence_count_ok | leakage_rate | naturalness_heur |
|---|---|---|---|---|---|
| baseline (prompt-only) | 0.013 | 0.278 | 0.980 | 0.040 | 1.000 |
| LoRA | 0.963 | 0.992 | 0.997 | 0.003 | 1.000 |
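The card does not spell out how eval.py computes these columns. One plausible reading of exact_match, per_position_acc, and sentence_count_ok is sketched below; the regex sentence splitter and the function names are assumptions, and leakage_rate and naturalness_heur are left out because their definitions are not given here.

```python
import re

def sentence_initials(text: str) -> list[str]:
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [s[0].upper() for s in sentences]

def score(text: str, target: list[str]) -> dict[str, float]:
    got = sentence_initials(text)
    # Count positions where the produced initial equals the required one.
    hits = sum(g == t for g, t in zip(got, target))
    return {
        "sentence_count_ok": float(len(got) == len(target)),
        "per_position_acc": hits / len(target),
        "exact_match": float(got == target),
    }

reply = "Winter is mild here. Trains run often. Only the coast is crowded. Summer is busier."
print(score(reply, ["W", "T", "O", "S"]))
```

Under this reading, a response with too few sentences is penalized at every missing position, so per_position_acc can be well above exact_match, as in the baseline row.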

By sequence length D

| D | model | n | exact_match | per_position_acc |
|---|---|---|---|---|
| D=4 | baseline | 144 | 0.021 | 0.325 |
| D=8 | baseline | 156 | 0.006 | 0.236 |
| D=4 | lora | 144 | 0.965 | 0.991 |
| D=8 | lora | 156 | 0.962 | 0.993 |
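As a sanity check, the overall numbers can be recovered from the per-length rows by weighting each row by its example count n. The sketch below uses only values printed in this card and assumes per_position_acc is averaged per example rather than per position, which is what the figures appear to support.

```python
# (n, exact_match, per_position_acc) for D=4 and D=8, copied from the table above.
rows = {
    "baseline": [(144, 0.021, 0.325), (156, 0.006, 0.236)],
    "lora": [(144, 0.965, 0.991), (156, 0.962, 0.993)],
}

def pooled(rows_for_model):
    """Example-weighted average of the per-length metrics."""
    total = sum(n for n, _, _ in rows_for_model)
    em = sum(n * e for n, e, _ in rows_for_model) / total
    acc = sum(n * a for n, _, a in rows_for_model) / total
    return total, em, acc

for name, r in rows.items():
    total, em, acc = pooled(r)
    print(name, total, round(em, 3), round(acc, 3))
```

Both models pool to 300 examples, and the pooled values agree with the overall results to within rounding in the third decimal place (the per-length inputs are themselves rounded).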

Training / data

  • Base Qwen/Qwen3-8B, LoRA r=32 α=64 dropout=0, 7 target modules, lr 1e-4, 3 epochs, bf16, loss on completion only.
  • 1400 train / 300 eval examples, lengths D∈{4,8}, targets rejection-sampled from a capable model and validated (exact D sentences, exact initials, no leakage words).
  • Code + data + metrics: nest_acrostic/ (generate_data.py, train.py, eval.py). Research artifact; not for deployment.
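The rejection-sampling step for training targets might look roughly like the sketch below. It is not the code in generate_data.py: the leakage word list, the sentence splitter, the retry budget, and the generate callable (a stand-in for the capable model) are all hypothetical.

```python
import re
from typing import Callable, Optional

# Hypothetical leakage vocabulary; the card does not list the actual words.
LEAKAGE_WORDS = {"acrostic", "encode", "encoded", "hidden", "digit", "initials"}

def is_valid(text: str, target: list[str]) -> bool:
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    initials = [s[0].upper() for s in sentences]
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Matching the full initials list also enforces exactly D sentences.
    return initials == target and not (words & LEAKAGE_WORDS)

def rejection_sample(generate: Callable[[], str], target: list[str],
                     max_tries: int = 8) -> Optional[str]:
    """Return the first candidate that passes validation, or None if none does."""
    for _ in range(max_tries):
        candidate = generate()
        if is_valid(candidate, target):
            return candidate
    return None

# Canned drafts stand in for model samples.
drafts = iter([
    "We like it. The hidden code is fun.",  # right initials, but leaks "hidden"
    "Walk early. Take water.",
])
print(rejection_sample(lambda: next(drafts), ["W", "T"]))  # Walk early. Take water.
```

Returning None after a fixed number of tries keeps a hard example from stalling data generation; such examples would simply be dropped or regenerated with a new sequence.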