# AION vs small models

This repository includes `benchmark/benchmark_compare_small_models.py`, a script that runs AION and small Hugging Face causal LMs on the same tiny local test suite so their results can be compared.

## Local result included

```json
[
  {
    "model": "AION-1",
    "passed": 5,
    "total": 5,
    "accuracy": 1.0
  }
]
```

## How to compare with other small models

Install optional dependencies:

```bash
pip install torch transformers accelerate
```

Run:

```bash
python benchmark/benchmark_compare_small_models.py \
  --models TinyLlama/TinyLlama-1.1B-Chat-v1.0 HuggingFaceTB/SmolLM2-135M-Instruct Qwen/Qwen2.5-0.5B-Instruct
```

The script writes:

```text
results/small_model_comparison.json
```

## Important

AION is not a transformer LLM, so these benchmark comparisons are not apples-to-apples. AION is tiny, hybrid, and specialized: it can outperform generic small LMs on its own hand-designed local suite, but it performs poorly on real multi-step GSM8K reasoning.