Llama3-8B-Instruct-AlienLM-ratio-40

This repository contains the Llama3-8B-Instruct-AlienLM-ratio-40 weights used in the AlienLM experiments. The model is based on meta-llama/Meta-Llama-3-8B-Instruct and was adapted with Alien Adaptation Training (AAT) on Magpie-Align/Magpie-Pro-300K-Filtered and Magpie-Align/Magpie-Reasoning-V1-150K.

AlienLM is a research method for reducing human-readable plaintext exposure at the black-box API boundary. It transforms text through a reversible vocabulary-level bijection before server-side processing, then relies on a client-side inverse mapping to recover plaintext. These weights are intended for reproducing and analyzing the paper's experiments, not as a production privacy or safety mechanism.
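
The idea of a reversible vocabulary-level bijection can be illustrated with a toy sketch. This is not the AlienLM implementation: the seeded random permutation, the `ratio` parameter, and all function names are assumptions made for illustration. The sketch permutes a chosen fraction of token IDs and then inverts the mapping, as a client would.

```python
import random

def build_bijection(vocab_size, ratio, seed=0):
    """Return (forward, inverse) ID maps that permute about `ratio` of the vocabulary.

    Hypothetical helper: the actual AlienLM mapping may be built differently.
    """
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    chosen = rng.sample(ids, int(vocab_size * ratio))  # IDs that get remapped
    shuffled = chosen[:]
    rng.shuffle(shuffled)
    forward = {i: i for i in ids}                      # identity by default
    for src, dst in zip(chosen, shuffled):
        forward[src] = dst
    inverse = {dst: src for src, dst in forward.items()}
    return forward, inverse

def encode(token_ids, forward):
    """Apply the forward mapping before text leaves the client."""
    return [forward[t] for t in token_ids]

def decode(token_ids, inverse):
    """Apply the inverse mapping on the client to recover the original IDs."""
    return [inverse[t] for t in token_ids]

forward, inverse = build_bijection(vocab_size=1000, ratio=0.4, seed=42)
plain = [5, 17, 17, 999]
alien = encode(plain, forward)
assert decode(alien, inverse) == plain  # the round trip is lossless
```

Because the chosen IDs are shuffled among themselves and all others map to themselves, the forward map is one-to-one over the whole vocabulary, which is what makes the inverse well defined.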

Variant

  • Variant: AlienLM partial alienization ratio 40
  • Base model: meta-llama/Meta-Llama-3-8B-Instruct
  • Local source path used for upload: /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40
  • Weight source used for upload: /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40
  • Tokenizer check: a direct comparison against the base tokenizer was unavailable, because meta-llama/Meta-Llama-3-8B-Instruct is a gated repo and could not be loaded in the upload environment.

Important Limitations

  • AlienLM does not provide cryptographic security or formal privacy guarantees.
  • The method is deterministic and should be evaluated under the relevant leakage and observer assumptions.
  • Safety behavior can differ from the original instruction-tuned model; use this model for research evaluation only.
  • Downstream quality depends on task, domain, alienization ratio, and adaptation data.
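
One way to see why a deterministic mapping is not a cryptographic guarantee: any fixed one-to-one substitution preserves token frequencies and repetition patterns. The sketch below uses a made-up three-entry mapping and hypothetical IDs (not taken from any AlienLM tokenizer) to show that both statistics survive the mapping unchanged.

```python
from collections import Counter

def frequency_profile(token_ids):
    """Token counts sorted from most to least frequent, ignoring which IDs are used."""
    return sorted(Counter(token_ids).values(), reverse=True)

def repetition_pattern(token_ids):
    """Relabel tokens by order of first appearance, e.g. [7, 9, 7] -> [0, 1, 0]."""
    first_seen = {}
    return [first_seen.setdefault(t, len(first_seen)) for t in token_ids]

toy_mapping = {10: 73, 11: 5, 12: 40}  # hypothetical bijection on three IDs
plain = [10, 11, 10, 12, 10, 11]
mapped = [toy_mapping[t] for t in plain]

# An observer of `mapped` sees the same statistics as an observer of `plain`.
assert frequency_profile(plain) == frequency_profile(mapped)
assert repetition_pattern(plain) == repetition_pattern(mapped)
```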

Tokenization Example

Test sentence:

All happy families are alike; each unhappy family is unhappy in its own way.

For this repository, the local tokenizer produces these visible token pieces:

[All, Ġhappy, Ġfamilies, Ġare, Ġalike, ;, Ġeach, Ġunhappy, Ġfamily, Ġis, Ġunhappy, Ġin, Ġits, Ġown, Ġway, .]
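
Byte-level BPE vocabularies mark a leading space with the character Ġ (U+0120), which some renderings display as mis-encoded characters. As a minimal sketch, the pieces above can be joined back into the test sentence by replacing that marker with a space. The helper name is hypothetical, and the approach only covers plain ASCII text like this sentence.

```python
SPACE_MARKER = "\u0120"  # 'Ġ', the byte-level BPE stand-in for a leading space

def pieces_to_text(pieces):
    """Join token pieces and turn the space marker back into a real space.

    Hypothetical helper; handles ASCII text only, not the full byte-to-character table.
    """
    return "".join(pieces).replace(SPACE_MARKER, " ")

pieces = [
    "All", "Ġhappy", "Ġfamilies", "Ġare", "Ġalike", ";", "Ġeach", "Ġunhappy",
    "Ġfamily", "Ġis", "Ġunhappy", "Ġin", "Ġits", "Ġown", "Ġway", ".",
]
sentence = pieces_to_text(pieces)
print(sentence)
```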

The table below records how the same sentence maps to token IDs across the uploaded tokenizers. The visible token pieces may look familiar because AlienLM changes only the vocabulary-to-ID mapping, not the token strings; the ID sequence is the representation the model actually sees.

| Tokenizer | Source | Count | Token IDs |
|---|---|---|---|
| Base Qwen/Qwen2.5-7B-Instruct | Qwen/Qwen2.5-7B-Instruct | 16 | [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13] |
| Base Qwen/Qwen2.5-14B-Instruct | Qwen/Qwen2.5-14B-Instruct | 16 | [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13] |
| Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen | /data2/AlienLM/outputs/Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen | 16 | [207114, 211985, 23904, 164425, 201838, 244780, 104844, 11896, 124750, 78043, 11896, 40818, 112321, 155972, 188431, 235269] |
| Gemma2-9b-it-random42 | /data2/AlienLM/outputs/Gemma2-9b-it-random42 | 16 | [118082, 85241, 174135, 184646, 114599, 58746, 48064, 71689, 147487, 81724, 71689, 163116, 23867, 77693, 75944, 217666] |
| Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2/checkpoint-9306 | 16 | [4054, 43251, 60004, 66417, 35331, 114100, 27381, 6380, 39185, 23136, 6380, 109132, 8299, 21649, 82386, 11] |
| Llama3-8B-Instruct-AlienLM-ratio-20 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-20 | 16 | [2460, 6380, 8689, 527, 27083, 26, 1855, 24241, 30235, 374, 24241, 23136, 1202, 1866, 1648, 13] |
| Llama3-8B-Instruct-AlienLM-ratio-40 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40 | 16 | [8140, 43251, 50556, 527, 27083, 114100, 27381, 6380, 15547, 18115, 6380, 304, 996, 1866, 1648, 13] |
| Llama3-8B-Instruct-AlienLM-ratio-60 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-60 | 16 | [4054, 43251, 8689, 527, 27083, 114100, 27381, 6380, 3070, 40584, 6380, 304, 82321, 16244, 52224, 11] |
| Llama3-8B-Instruct-AlienLM-ratio-80 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-80 | 16 | [4054, 43251, 60004, 66417, 35331, 26, 27381, 6380, 39185, 48649, 6380, 304, 1202, 1961, 1648, 11] |
| Llama3-8B-Instruct-random-42 | /data2/AlienLM/outputs/Llama3-8B-Instruct-random-42/checkpoint-9306 | 16 | [109112, 64630, 115549, 88947, 56261, 123661, 98632, 89092, 51180, 49115, 89092, 76847, 27799, 22779, 121871, 33744] |
| Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | /data2/AlienLM/outputs/Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | 16 | [90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11] |
| Qwen25-14b-Instruct-random-42 | /data2/AlienLM/outputs/Qwen25-14b-Instruct-random-42 | 16 | [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835] |
| Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | /data2/AlienLM/outputs/Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | 16 | [90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11] |
| Qwen25-7b-Instruct-random-42 | /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42 | 16 | [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835] |
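
The ID sequences for the four Llama 3 ratio variants can be compared directly. The sketch below copies those rows from the table and checks two things: that the repeated word "unhappy" (zero-based positions 7 and 10) gets the same ID within each tokenizer, and at which positions two tokenizers agree. The helper name and the choice of which rows to compare are illustrative assumptions.

```python
ROWS = {
    "ratio-20": [2460, 6380, 8689, 527, 27083, 26, 1855, 24241, 30235, 374,
                 24241, 23136, 1202, 1866, 1648, 13],
    "ratio-40": [8140, 43251, 50556, 527, 27083, 114100, 27381, 6380, 15547, 18115,
                 6380, 304, 996, 1866, 1648, 13],
    "ratio-60": [4054, 43251, 8689, 527, 27083, 114100, 27381, 6380, 3070, 40584,
                 6380, 304, 82321, 16244, 52224, 11],
    "ratio-80": [4054, 43251, 60004, 66417, 35331, 26, 27381, 6380, 39185, 48649,
                 6380, 304, 1202, 1961, 1648, 11],
}
UNHAPPY_POSITIONS = (7, 10)  # zero-based positions of the two "unhappy" tokens

def shared_positions(a, b):
    """Positions at which two ID sequences carry the same token ID."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == y]

# A deterministic mapping must give a repeated token the same ID every time.
first, second = UNHAPPY_POSITIONS
for name, ids in ROWS.items():
    assert ids[first] == ids[second], name

print(shared_positions(ROWS["ratio-20"], ROWS["ratio-40"]))
```

A matching ID at some position only means the two tokenizers assign that piece the same ID; it says nothing by itself about how either one relates to the base tokenizer, which was not available for comparison.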

Uploaded Files

Only serving-time artifacts were staged for upload:

  • config.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/config.json
  • generation_config.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/generation_config.json
  • model-00001-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00001-of-00004.safetensors
  • model-00002-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00002-of-00004.safetensors
  • model-00003-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00003-of-00004.safetensors
  • model-00004-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00004-of-00004.safetensors
  • model.safetensors.index.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model.safetensors.index.json
  • special_tokens_map.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/special_tokens_map.json
  • tokenizer.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/tokenizer.json
  • tokenizer_config.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/tokenizer_config.json

Training-only artifacts such as checkpoint-* directories, trainer_state.json, optimizer states, scheduler states, RNG states, logs, caches, and W&B files were intentionally excluded.
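
A filter of this kind could be sketched as follows. The patterns are assumptions based on common Hugging Face Trainer output names (for example `optimizer.pt`, `scheduler.pt`, `rng_state*.pth`), not the exact rules used for this upload, and the function name is hypothetical.

```python
from fnmatch import fnmatch

# Assumed patterns for training-only outputs; adjust to the actual run directory.
TRAINING_ONLY_PATTERNS = [
    "checkpoint-*", "trainer_state.json", "optimizer.pt", "scheduler.pt",
    "rng_state*.pth", "*.log", "wandb", "__pycache__",
]

def is_serving_artifact(relative_path):
    """True if no component of the path matches a training-only pattern."""
    parts = relative_path.split("/")
    return not any(
        fnmatch(part, pattern)
        for part in parts
        for pattern in TRAINING_ONLY_PATTERNS
    )

candidates = [
    "config.json",
    "tokenizer.json",
    "model-00001-of-00004.safetensors",
    "trainer_state.json",
    "checkpoint-9306/optimizer.pt",
]
staged = [path for path in candidates if is_serving_artifact(path)]
print(staged)
```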

Training Data

The model was adapted on the Magpie instruction and reasoning mixture used in the AlienLM experiments:

  • Magpie-Align/Magpie-Pro-300K-Filtered
  • Magpie-Align/Magpie-Reasoning-V1-150K

Citation

If you use these weights, please cite the AlienLM paper.
