VoidWalkercero committed on
Commit ede2cba · verified · 1 Parent(s): a1a7070

Upload AION unified hybrid assistant with local eval results

README.md CHANGED
@@ -18,6 +18,11 @@ pipeline_tag: text-generation

  # AION

+ ![AION logo](assets/aion_logo.svg)
+
+ ![AION architecture](assets/aion_architecture.svg)
+
+
  AION is a tiny hybrid local assistant built in a constrained CPU environment. It unifies several learned and symbolic components into one entrypoint:

  ```python
@@ -65,6 +70,27 @@ print(generate("hola"))
  - internet,
  - machine learning.

+ ## Download
+
+ You can download the complete ready-to-run package from the repository files:
+
+ ```bash
+ git lfs install
+ git clone https://huggingface.co/VoidWalkercero/AION-1
+ cd AION-1
+ python aion.py "hola"
+ ```
+
+ Or from Python:
+
+ ```python
+ from huggingface_hub import snapshot_download
+ path = snapshot_download("VoidWalkercero/AION-1")
+ print(path)
+ ```
+
+ A zipped copy is also included under `download/AION-1.zip`.
+
  ## Architecture

  AION is not a transformer LLM. It is a merged hybrid model:
@@ -95,6 +121,9 @@ print(generate("what can you do"))

  ## Evaluation

+ ![AION benchmark snapshot](assets/aion_benchmark.svg)
+
+
  Local evaluation results are in:

  ```text
@@ -114,6 +143,8 @@ Summary:

  Important: these are **not official Hugging Face leaderboard results**. AION is not a standard `transformers` model and cannot be directly submitted to most official HF benchmark leaderboards without a custom evaluation adapter. The GSM8K sample result is included honestly and shows the current limitation on multi-step word problems.

+ For optional comparison with small HF models, see `benchmark/benchmark_compare_small_models.py` and `benchmark/SMALL_MODEL_COMPARISON.md`.
+
  ## Limitations

  - Not a large language model.
assets/aion_architecture.svg ADDED
assets/aion_benchmark.svg ADDED
assets/aion_logo.svg ADDED
benchmark/SMALL_MODEL_COMPARISON.md ADDED
@@ -0,0 +1,41 @@
+ # AION vs small models
+
+ This repository includes `benchmark/benchmark_compare_small_models.py` to compare AION with small Hugging Face causal LMs on the same tiny local suite.
+
+ ## Local result included
+
+ ```json
+ [
+   {
+     "model": "AION-1",
+     "passed": 5,
+     "total": 5,
+     "accuracy": 1.0
+   }
+ ]
+ ```
+
+ ## How to compare with other small models
+
+ Install optional dependencies:
+
+ ```bash
+ pip install torch transformers accelerate
+ ```
+
+ Run:
+
+ ```bash
+ python benchmark/benchmark_compare_small_models.py \
+   --models TinyLlama/TinyLlama-1.1B-Chat-v1.0 HuggingFaceTB/SmolLM2-135M-Instruct Qwen/Qwen2.5-0.5B-Instruct
+ ```
+
+ The script writes:
+
+ ```text
+ results/small_model_comparison.json
+ ```
+
+ ## Important
+
+ AION is not a transformer LLM, so direct benchmark comparisons are not apples-to-apples. AION is tiny, hybrid, and specialized. It can outperform generic small LMs on its hand-designed local suite, but it performs poorly on real multi-step GSM8K reasoning.
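
Reviewer sketch, not part of this commit: one way the results file described above could be summarized after a run. The `summarize` helper and the second sample entry (including its model id and error text) are hypothetical; the field names follow the summary JSON shown in this file and the `{"model", "error"}` entry the comparison script records when a baseline fails.

```python
import json

# Sample data: the first entry mirrors the local result shown above; the
# second is a made-up failure entry to show how errors could be reported.
SAMPLE = """
[
  {"model": "AION-1", "passed": 5, "total": 5, "accuracy": 1.0},
  {"model": "example/unavailable-model", "error": "could not load"}
]
"""


def summarize(results):
    """Return one human-readable line per model entry."""
    lines = []
    for r in results:
        if "error" in r:
            # Baselines that raised an exception carry only model + error.
            lines.append(f"{r['model']}: failed ({r['error']})")
        else:
            lines.append(
                f"{r['model']}: {r['passed']}/{r['total']} passed "
                f"({r['accuracy']:.0%})"
            )
    return lines


summary = summarize(json.loads(SAMPLE))
for line in summary:
    print(line)
```

To use it on a real run, the sample string could be replaced with the text of `results/small_model_comparison.json`.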
benchmark/benchmark_compare_small_models.py ADDED
@@ -0,0 +1,66 @@
+ #!/usr/bin/env python3
+ """Compare AION against small Hugging Face causal LMs on the same tiny local suite.
+
+ This script is optional. It requires transformers/torch for HF baselines.
+ Example:
+     python benchmark/benchmark_compare_small_models.py --models TinyLlama/TinyLlama-1.1B-Chat-v1.0 HuggingFaceTB/SmolLM2-135M-Instruct
+ """
+ from __future__ import annotations
+ import argparse, json, re, time
+ from pathlib import Path
+ import sys
+ sys.path.append(str(Path(__file__).resolve().parents[1]))
+ from aion import generate as aion_generate
+
+ TESTS = [
+     {"suite":"chat", "prompt":"hola", "contains":["hello", "awake"]},
+     {"suite":"python", "prompt":"write code to keep numbers greater than 12", "contains":["x > 12", "filter"]},
+     {"suite":"web", "prompt":"create a responsive landing page with dark mode", "contains":["<!doctype html>", "@media"]},
+     {"suite":"math", "prompt":"solve 2x + 5 = 17", "contains":["6"]},
+     {"suite":"science", "prompt":"force mass 10 acceleration 2", "contains":["20"]},
+ ]
+
+ def score_output(out, needles):
+     low = out.lower()
+     return any(n.lower() in low for n in needles)
+
+ def eval_generator(name, gen):
+     rows=[]; passed=0; t0=time.time()
+     for t in TESTS:
+         out=gen(t["prompt"])
+         ok=score_output(out, t["contains"])
+         passed += int(ok)
+         rows.append({"suite":t["suite"],"prompt":t["prompt"],"passed":ok,"output_preview":out[:500]})
+     return {"model":name,"passed":passed,"total":len(TESTS),"accuracy":passed/len(TESTS),"seconds":time.time()-t0,"rows":rows}
+
+ def hf_generator(model_id, max_new_tokens=350):
+     from transformers import AutoTokenizer, AutoModelForCausalLM
+     import torch
+     tok=AutoTokenizer.from_pretrained(model_id)
+     model=AutoModelForCausalLM.from_pretrained(model_id, device_map="auto" if torch.cuda.is_available() else None)
+     model.eval()
+     def gen(prompt):
+         full=f"Answer the request.\nRequest: {prompt}\nAnswer:"
+         inputs=tok(full, return_tensors="pt")
+         inputs={k:v.to(model.device) for k,v in inputs.items()}
+         with torch.no_grad():
+             out=model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False, pad_token_id=tok.eos_token_id)
+         return tok.decode(out[0], skip_special_tokens=True)
+     return gen
+
+ def main():
+     ap=argparse.ArgumentParser()
+     ap.add_argument("--models", nargs="*", default=[])
+     ap.add_argument("--out", default="results/small_model_comparison.json")
+     args=ap.parse_args()
+     results=[eval_generator("AION-1", aion_generate)]
+     for model_id in args.models:
+         try:
+             results.append(eval_generator(model_id, hf_generator(model_id)))
+         except Exception as e:
+             results.append({"model":model_id,"error":str(e)})
+     out=Path(__file__).resolve().parents[1]/args.out
+     out.parent.mkdir(exist_ok=True)
+     out.write_text(json.dumps(results, indent=2, ensure_ascii=False), encoding="utf-8")
+     print(json.dumps([{k:v for k,v in r.items() if k!='rows'} for r in results], indent=2))
+ if __name__=="__main__": main()
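
Reviewer sketch, not part of this commit: the scoring rule in `score_output` passes a test when any one of the expected substrings appears, case-insensitively. The standalone example below restates that rule as `score_any` and contrasts it with a hypothetical stricter variant, `score_all`, which the script does not contain. The sample reply is invented for illustration.

```python
def score_any(output, needles):
    """Pass if at least one needle occurs in the output (case-insensitive).

    This mirrors the any() check used by the script's score_output.
    """
    low = output.lower()
    return any(n.lower() in low for n in needles)


def score_all(output, needles):
    """Hypothetical stricter variant: every needle must occur."""
    low = output.lower()
    return all(n.lower() in low for n in needles)


# Invented reply that contains "hello" but not "awake".
reply = "Hello! I can chat and solve math."
needles = ["hello", "awake"]

print(score_any(reply, needles))
print(score_all(reply, needles))
```

With the any-match rule, a single short needle such as `"6"` or `"20"` is enough to pass, so a pass indicates that the substring appeared somewhere in the output rather than that the whole answer was checked.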
download/AION-1.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9a38c74333f4566e9cb502d0530ccbee3a33b9e4c005766d3afe16ff95744aac
+ size 3317364
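
Reviewer sketch, not part of this commit: the three lines above are a Git LFS pointer, so the real archive is fetched separately. Assuming the zip has been downloaded locally, a check along these lines could compare it against the pointer's `oid` and `size`. The helper names are hypothetical.

```python
import hashlib
import os

# Pointer text copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9a38c74333f4566e9cb502d0530ccbee3a33b9e4c005766d3afe16ff95744aac
size 3317364
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, pointer):
    """Return True if the file's size and SHA-256 agree with the pointer."""
    expected_hash = pointer["oid"].split(":", 1)[1]
    expected_size = int(pointer["size"])
    return (
        os.path.getsize(path) == expected_size
        and sha256_of_file(path) == expected_hash
    )


pointer = parse_lfs_pointer(POINTER)
print(pointer["oid"], pointer["size"])
```

After `git lfs pull`, `matches_pointer("download/AION-1.zip", pointer)` would be the call to run from the repository root.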
results/small_model_comparison.json ADDED
@@ -0,0 +1,41 @@
+ [
+   {
+     "model": "AION-1",
+     "passed": 5,
+     "total": 5,
+     "accuracy": 1.0,
+     "seconds": 0.38956117630004883,
+     "rows": [
+       {
+         "suite": "chat",
+         "prompt": "hola",
+         "passed": true,
+         "output_preview": "## Reasoning\n- I read the request using learned character fragments plus lightweight rule checks for common chat/math/code cases.\n- Top learned intents: math_word_problem=0.75, chat_greeting=0.25, python=0.00, web=0.00, math_arithmetic=0.00\n- Rule override selected: chat_greeting.\n\n## Answer\nHello! I am awake. I can chat, solve math, and write Python. What do you want to build or calculate?"
+       },
+       {
+         "suite": "python",
+         "prompt": "write code to keep numbers greater than 12",
+         "passed": true,
+         "output_preview": "## Reasoning\n- I read the request using learned character fragments plus lightweight rule checks for common chat/math/code cases.\n- Top learned intents: python=1.00, math_word_problem=0.00, web=0.00, math_arithmetic=0.00, math_probability=0.00\n- Rule override selected: python.\n- Inside Python subsystem: filter=1.00, is_prime=0.00, fibonacci=0.00\n\n## Answer\n```python\ndef filter_greater_than_12(numbers):\n result = []\n for x in numbers:\n if x > 12:\n result.append(x)\n retu"
+       },
+       {
+         "suite": "web",
+         "prompt": "create a responsive landing page with dark mode",
+         "passed": true,
+         "output_preview": "## Reasoning\n- I read the request using learned character fragments plus lightweight rule checks for common chat/math/code cases.\n- Top learned intents: web=1.00, math_word_problem=0.00, python=0.00, math_statistics=0.00, science_physics=0.00\n- Rule override selected: web.\n- Inside Web subsystem: landing_page=1.00, dark_mode=0.00, full_page=0.00\n\n## Answer\n```html\n<!doctype html>\n<html lang=\"en\">\n<head>\n <meta charset=\"utf-8\" />\n <meta name=\"viewport\" content=\"width=device-width, initial-scale"
+       },
+       {
+         "suite": "math",
+         "prompt": "solve 2x + 5 = 17",
+         "passed": true,
+         "output_preview": "## Reasoning\n- I read the request using learned character fragments plus lightweight rule checks for common chat/math/code cases.\n- Top learned intents: math_linear_equation=1.00, math_quadratic=0.00, math_arithmetic=0.00, science_physics=0.00, math_statistics=0.00\n- Selected: math_linear_equation.\n\n## Answer\nSolve linear equation. Move terms conceptually into ax + b = 0. Here a=2, b=-12, so x = -b/a = 6."
+       },
+       {
+         "suite": "science",
+         "prompt": "force mass 10 acceleration 2",
+         "passed": true,
+         "output_preview": "## Reasoning\n- I read the request using learned character fragments plus lightweight rule checks for common chat/math/code cases.\n- Top learned intents: math_word_problem=1.00, science_physics=0.00, python=0.00, web=0.00, math_arithmetic=0.00\n- Rule override selected: science_physics.\n\n## Answer\nNewton's second law: F=ma=10×2=20 N."
+       }
+     ]
+   }
+ ]
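
Reviewer sketch, not part of this commit: the arithmetic quoted in the math and science previews above can be checked independently. The function names below are hypothetical and are not AION's own code; they only reproduce the two calculations the previews describe (rewriting `2x + 5 = 17` as `ax + b = 0` with `a=2`, `b=-12`, and `F = ma`).

```python
def solve_linear(a, b, c):
    """Solve a*x + b = c by rewriting it as a*x + (b - c) = 0."""
    if a == 0:
        raise ValueError("not a linear equation in x (a == 0)")
    shifted_b = b - c          # for 2x + 5 = 17 this is 5 - 17 = -12
    return -shifted_b / a      # x = -b/a in the rewritten form


def force(mass, acceleration):
    """Newton's second law, F = m * a, in consistent units."""
    return mass * acceleration


x = solve_linear(2, 5, 17)
f = force(10, 2)
print(x)  # 6.0
print(f)  # 20
```

Both values agree with the answers in the previews (x = 6 and F = 20 N), which is also what the suite's `"6"` and `"20"` substring checks look for.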