Libraries

How to use dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40")
# Llama 3 is a text-only causal language model, so it is loaded with AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
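
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`; the helper names `build_chat_request` and `ask` are illustrative, and the response parsing assumes the standard OpenAI chat-completions shape.

```python
import json
import urllib.request

MODEL_ID = "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40"


def build_chat_request(prompt: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, base_url: str = "http://localhost:8000", timeout: float = 60.0) -> str:
    """Send the prompt to the server and return the first completion's text."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires the server to be running):
# print(ask("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` would target the SGLang server described further down, which exposes the same endpoint path.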

Use Docker

docker model run hf.co/dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40

SGLang

How to use dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40 with Docker Model Runner:
```
docker model run hf.co/dsba-lab/Llama3-8B-Instruct-AlienLM-ratio-40
```

Llama3-8B-Instruct-AlienLM-ratio-40

This repository contains the Llama3-8B-Instruct-AlienLM-ratio-40 weights used in the AlienLM experiments. The model is based on meta-llama/Meta-Llama-3-8B-Instruct and was adapted with Alien Adaptation Training (AAT) on Magpie-Align/Magpie-Pro-300K-Filtered and Magpie-Align/Magpie-Reasoning-V1-150K.

AlienLM is a research method for reducing human-readable plaintext exposure at the black-box API boundary. It transforms text through a reversible vocabulary-level bijection before server-side processing, then relies on a client-side inverse mapping to recover plaintext. These weights are intended for reproducing and analyzing the paper's experiments, not as a production privacy or safety mechanism.
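
As a rough illustration of the idea, the sketch below applies a bijection over token IDs on the client side and inverts it afterwards. The toy vocabulary size, the seed, the random permutation, and the helper names (`make_bijection`, `alienize`, `dealienize`) are all assumptions made for this example; this is not the mapping used by AlienLM, which is defined by the paper and the released tokenizer.

```python
import random


def make_bijection(vocab_size: int, seed: int) -> tuple[list[int], list[int]]:
    """Return (forward, inverse) lookup tables for a random permutation of token IDs."""
    forward = list(range(vocab_size))
    random.Random(seed).shuffle(forward)
    inverse = [0] * vocab_size
    for plain_id, alien_id in enumerate(forward):
        inverse[alien_id] = plain_id
    return forward, inverse


def alienize(token_ids: list[int], forward: list[int]) -> list[int]:
    """Map plain token IDs to their permuted counterparts."""
    return [forward[t] for t in token_ids]


def dealienize(token_ids: list[int], inverse: list[int]) -> list[int]:
    """Undo alienize() with the inverse table, which stays on the client."""
    return [inverse[t] for t in token_ids]


# Toy vocabulary of 1000 IDs (an assumption; real vocabularies are much larger).
forward, inverse = make_bijection(vocab_size=1000, seed=0)
plain_ids = [5, 42, 42, 999]
alien_ids = alienize(plain_ids, forward)
recovered = dealienize(alien_ids, inverse)
assert recovered == plain_ids
```

Because the mapping is a bijection, the round trip is lossless, and equal input IDs always produce equal output IDs.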

Variant

Variant: AlienLM partial alienization ratio 40
Base model: meta-llama/Meta-Llama-3-8B-Instruct
Local source path used for upload: /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40
Weight source used for upload: /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40
Tokenizer check: A direct comparison with the base tokenizer was not possible because meta-llama/Meta-Llama-3-8B-Instruct is a gated repo and could not be loaded in the upload environment.

Important Limitations

AlienLM does not provide cryptographic security or formal privacy guarantees.
The method is deterministic and should be evaluated under the relevant leakage and observer assumptions.
Safety behavior can differ from the original instruction-tuned model; use this model for research evaluation only.
Downstream quality depends on task, domain, alienization ratio, and adaptation data.
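
One consequence of determinism is visible in the ID sequence that this repository's tokenizer produces for the test sentence in the Tokenization Example section: the word "unhappy" occurs twice and maps to the same ID both times, so repetition patterns survive the mapping. The snippet illustrates only that single property and is not the leakage analysis from the paper.

```python
from collections import Counter

# Token IDs reported for this repository's tokenizer on the test sentence
# "All happy families are alike; each unhappy family is unhappy in its own way."
ratio_40_ids = [8140, 43251, 50556, 527, 27083, 114100, 27381, 6380,
                15547, 18115, 6380, 304, 996, 1866, 1648, 13]

# Count how often each ID occurs and keep the ones that repeat.
counts = Counter(ratio_40_ids)
repeated = {token_id: n for token_id, n in counts.items() if n > 1}
print(repeated)  # {6380: 2}
```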

Tokenization Example

Test sentence:

All happy families are alike; each unhappy family is unhappy in its own way.

For this repository, the local tokenizer produces these visible token pieces:

[All, Ġhappy, Ġfamilies, Ġare, Ġalike, ;, Ġeach, Ġunhappy, Ġfamily, Ġis, Ġunhappy, Ġin, Ġits, Ġown, Ġway, .]

The table below records how the same sentence maps to token IDs across the uploaded tokenizers. The visible token pieces may look familiar because AlienLM changes the vocabulary-to-ID mapping rather than the token strings; the ID sequence is the representation the model actually receives.

| Tokenizer | Source | Count | Token IDs |
| --- | --- | --- | --- |
| Base Qwen/Qwen2.5-7B-Instruct | `Qwen/Qwen2.5-7B-Instruct` | 16 | `[2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13]` |
| Base Qwen/Qwen2.5-14B-Instruct | `Qwen/Qwen2.5-14B-Instruct` | 16 | `[2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13]` |
| Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen | `/data2/AlienLM/outputs/Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen` | 16 | `[207114, 211985, 23904, 164425, 201838, 244780, 104844, 11896, 124750, 78043, 11896, 40818, 112321, 155972, 188431, 235269]` |
| Gemma2-9b-it-random42 | `/data2/AlienLM/outputs/Gemma2-9b-it-random42` | 16 | `[118082, 85241, 174135, 184646, 114599, 58746, 48064, 71689, 147487, 81724, 71689, 163116, 23867, 77693, 75944, 217666]` |
| Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2 | `/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2/checkpoint-9306` | 16 | `[4054, 43251, 60004, 66417, 35331, 114100, 27381, 6380, 39185, 23136, 6380, 109132, 8299, 21649, 82386, 11]` |
| Llama3-8B-Instruct-AlienLM-ratio-20 | `/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-20` | 16 | `[2460, 6380, 8689, 527, 27083, 26, 1855, 24241, 30235, 374, 24241, 23136, 1202, 1866, 1648, 13]` |
| Llama3-8B-Instruct-AlienLM-ratio-40 | `/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40` | 16 | `[8140, 43251, 50556, 527, 27083, 114100, 27381, 6380, 15547, 18115, 6380, 304, 996, 1866, 1648, 13]` |
| Llama3-8B-Instruct-AlienLM-ratio-60 | `/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-60` | 16 | `[4054, 43251, 8689, 527, 27083, 114100, 27381, 6380, 3070, 40584, 6380, 304, 82321, 16244, 52224, 11]` |
| Llama3-8B-Instruct-AlienLM-ratio-80 | `/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-80` | 16 | `[4054, 43251, 60004, 66417, 35331, 26, 27381, 6380, 39185, 48649, 6380, 304, 1202, 1961, 1648, 11]` |
| Llama3-8B-Instruct-random-42 | `/data2/AlienLM/outputs/Llama3-8B-Instruct-random-42/checkpoint-9306` | 16 | `[109112, 64630, 115549, 88947, 56261, 123661, 98632, 89092, 51180, 49115, 89092, 76847, 27799, 22779, 121871, 33744]` |
| Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | `/data2/AlienLM/outputs/Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama` | 16 | `[90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11]` |
| Qwen25-14b-Instruct-random-42 | `/data2/AlienLM/outputs/Qwen25-14b-Instruct-random-42` | 16 | `[26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]` |
| Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | `/data2/AlienLM/outputs/Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama` | 16 | `[90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11]` |
| Qwen25-7b-Instruct-random-42 | `/data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42` | 16 | `[26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]` |
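
To see how the Llama ratio variants relate on this one sentence, their rows can be compared position by position. The ID lists are copied from the table above; the helper name `count_differences` is illustrative.

```python
# Token ID rows for the four Llama3 ratio variants, copied from the table.
rows = {
    "ratio-20": [2460, 6380, 8689, 527, 27083, 26, 1855, 24241,
                 30235, 374, 24241, 23136, 1202, 1866, 1648, 13],
    "ratio-40": [8140, 43251, 50556, 527, 27083, 114100, 27381, 6380,
                 15547, 18115, 6380, 304, 996, 1866, 1648, 13],
    "ratio-60": [4054, 43251, 8689, 527, 27083, 114100, 27381, 6380,
                 3070, 40584, 6380, 304, 82321, 16244, 52224, 11],
    "ratio-80": [4054, 43251, 60004, 66417, 35331, 26, 27381, 6380,
                 39185, 48649, 6380, 304, 1202, 1961, 1648, 11],
}


def count_differences(a: list[int], b: list[int]) -> int:
    """Number of positions at which two equal-length ID sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Compare every other variant against this repository's tokenizer (ratio-40).
reference = rows["ratio-40"]
diffs = {
    name: count_differences(reference, ids)
    for name, ids in rows.items()
    if name != "ratio-40"
}
print(diffs)  # {'ratio-20': 11, 'ratio-60': 8, 'ratio-80': 10}
```

A single 16-token sentence is far too small a sample to characterize the mappings as a whole; the comparison only shows that the variants assign different IDs to many of the same token pieces.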

Uploaded Files

Only serving-time artifacts were staged for upload:

config.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/config.json
generation_config.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/generation_config.json
model-00001-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00001-of-00004.safetensors
model-00002-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00002-of-00004.safetensors
model-00003-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00003-of-00004.safetensors
model-00004-of-00004.safetensors from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model-00004-of-00004.safetensors
model.safetensors.index.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/model.safetensors.index.json
special_tokens_map.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/special_tokens_map.json
tokenizer.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/tokenizer.json
tokenizer_config.json from /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40/tokenizer_config.json

Training-only artifacts such as checkpoint-* directories, trainer_state.json, optimizer states, scheduler states, RNG states, logs, caches, and W&B files were intentionally excluded.
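
A filter of this kind can be sketched with glob patterns. The pattern list, the example paths, and the helper name `is_serving_artifact` are assumptions for illustration (the optimizer, scheduler, and RNG file names follow common Hugging Face Trainer conventions); this is not the script that was used for the upload.

```python
from fnmatch import fnmatch

# Hypothetical exclusion patterns, matched against each path component.
EXCLUDE_PATTERNS = [
    "checkpoint-*",
    "trainer_state.json",
    "optimizer.pt",
    "scheduler.pt",
    "rng_state*.pth",
    "*.log",
    "__pycache__",
    "wandb",
]


def is_serving_artifact(relative_path: str) -> bool:
    """Return True if no component of the path matches an exclusion pattern."""
    parts = relative_path.replace("\\", "/").split("/")
    return not any(
        fnmatch(part, pattern) for part in parts for pattern in EXCLUDE_PATTERNS
    )


# Example paths relative to a training output directory (illustrative only).
candidates = [
    "config.json",
    "model-00001-of-00004.safetensors",
    "tokenizer.json",
    "trainer_state.json",
    "checkpoint-9306/optimizer.pt",
    "wandb/run.log",
]
staged = [path for path in candidates if is_serving_artifact(path)]
print(staged)  # ['config.json', 'model-00001-of-00004.safetensors', 'tokenizer.json']
```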

Training Data

The model was adapted on the Magpie instruction and reasoning mixture used in the AlienLM experiments:

Magpie-Align/Magpie-Pro-300K-Filtered
Magpie-Align/Magpie-Reasoning-V1-150K

Citation

If you use these weights, please cite the AlienLM paper.


Safetensors

Model size: 8B params
Tensor type: BF16
