Built with Llama

Llama-3.3-70B-NLA-L53-ar

The AR (activation reconstructor, text → vector) half of a Natural Language Autoencoder (NLA) pair, fine-tuned from meta-llama/Llama-3.3-70B-Instruct. The other half is kitft/Llama-3.3-70B-NLA-L53-av; both are released together and are intended to be used as a pair.

NLA pairs are interpretability tools: the AV (activation verbalizer) maps a hidden-state vector to a natural-language description; the AR (activation reconstructor) maps that description back to a vector. Together they let you read out what a residual-stream activation "means" and measure how much of it the description captured. These checkpoints are not useful as general-purpose language models — the fine-tuning repurposes them entirely for activation decoding.
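
As a rough illustration of "how much of it the description captured", the sketch below compares an original activation with its reconstruction. The two metrics (cosine similarity and squared error normalized by the original's squared norm) are assumptions chosen for illustration, not necessarily what the paper or nla-inference reports. The function names are hypothetical, and the vectors are plain Python lists standing in for real residual-stream activations.

```python
import math


def cosine_similarity(original, reconstructed):
    """Cosine of the angle between two equal-length vectors."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(original, reconstructed))
    norm_a = math.sqrt(sum(x * x for x in original))
    norm_b = math.sqrt(sum(y * y for y in reconstructed))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def normalized_mse(original, reconstructed):
    """Squared reconstruction error divided by the original's squared norm.

    0.0 means a perfect reconstruction; 1.0 is what an all-zero
    reconstruction would score.
    """
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    energy = sum(x * x for x in original)
    if energy == 0.0:
        raise ValueError("normalized error is undefined for a zero original")
    error = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return error / energy


# Toy stand-ins: in practice `original` would be a hidden-state vector and
# `reconstructed` the AR's output for the AV's description of it.
original = [1.0, 2.0, 2.0]
reconstructed = [1.0, 2.0, 1.0]
print(cosine_similarity(original, reconstructed))
print(normalized_mse(original, reconstructed))
```

A high cosine similarity with a low normalized error would suggest the description preserved most of the activation; the two can disagree when the reconstruction has the right direction but the wrong scale.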

Usage

See the nla-inference README for the full recipe (SGLang launch, NLAClient/NLACritic, embedding-injection details).

Citation

@article{frasertaliente2026nla,
  author  = {Fraser-Taliente, Kit and Kantamneni, Subhash and Ong, Euan and Mossing, Dan and Lu, Christina and Bogdan, Paul C. and Ameisen, Emmanuel and Chen, James and Kishylau, Dzmitry and Pearce, Adam and Tarng, Julius and Wu, Alex and Wu, Jeff and Zhang, Yang and Ziegler, Daniel M. and Hubinger, Evan and Batson, Joshua and Lindsey, Jack and Zimmerman, Samuel and Marks, Samuel},
  title   = {Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations},
  journal = {Transformer Circuits Thread},
  year    = {2026},
  url     = {https://transformer-circuits.pub/2026/nla/index.html}
}

License & use restrictions

This model is a derivative of Llama 3.3 and is distributed under the Llama 3.3 Community License Agreement. By using this model you agree to the license and the accompanying Acceptable Use Policy. See LICENSE, USE_POLICY.md, and NOTICE in this repository.

Training data attribution

The fine-tuning data was derived from two public datasets:

Safetensors: model size 47B params, tensor type BF16.