UraionLabs committed on
Commit
58d3e12
·
verified ·
1 Parent(s): 43cbf22

docs: in-depth model card — H-Res architecture, training details, adapter analysis, citations

Files changed (1)
  1. README.md +560 -29
README.md CHANGED
@@ -1,58 +1,589 @@
1
  ---
2
  base_model: Qwen/Qwen2.5-7B-Instruct
 
3
  library_name: transformers
4
- model_name: uraion-agent-steer
 
 
 
5
  tags:
6
- - generated_from_trainer
7
- - trl
 
 
 
 
 
 
 
 
 
 
 
8
  - sft
9
- licence: license
10
  ---
11
 
12
- # Model Card for uraion-agent-steer
13
 
14
- This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).
15
- It has been trained using [TRL](https://github.com/huggingface/trl).
16
 
17
- ## Quick start
18
 
19
  ```python
20
  from transformers import pipeline
21
 
22
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
23
- generator = pipeline("text-generation", model="None", device="cuda")
24
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
25
- print(output["generated_text"])
 
 
 
 
 
 
 
 
 
 
26
  ```
27
 
28
- ## Training procedure
 
 
 
 
29
 
30
-
 
 
 
 
 
 
31
 
 
32
 
 
 
 
33
 
34
- This model was trained with SFT.
35
 
36
- ### Framework versions
37
 
38
- - TRL: 1.7.0
39
- - Transformers: 5.12.0
40
- - Pytorch: 2.11.0+cu128
41
- - Datasets: 5.0.0
42
- - Tokenizers: 0.22.2
 
 
 
 
 
 
43
 
44
  ## Citations
45
 
 
 
 
 
 
 
 
 
 
 
 
 
46
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
47
 
48
- Cite TRL as:
49
-
50
  ```bibtex
51
  @software{vonwerra2020trl,
52
- title = {{TRL: Transformers Reinforcement Learning}},
53
- author = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
54
- license = {Apache-2.0},
55
- url = {https://github.com/huggingface/trl},
56
- year = {2020}
57
  }
58
- ```
1
  ---
2
  base_model: Qwen/Qwen2.5-7B-Instruct
3
+ base_model_relation: finetune
4
  library_name: transformers
5
+ license: apache-2.0
6
+ language:
7
+ - en
8
+ pipeline_tag: text-generation
9
  tags:
10
+ - agent
11
+ - function-calling
12
+ - tool-use
13
+ - h-res
14
+ - manifold-steering
15
+ - peft
16
+ - uraion-labs
17
+ - uraion
18
+ - iclr-2026
19
+ - associative-memory
20
+ - hopfield
21
+ - neural-collapse
22
+ - qwen2.5
23
  - sft
24
+ - trl
25
+ - hermes-function-calling
26
+ - apigen
27
+ - xlam
28
+ - toolace
29
+ datasets:
30
+ - NousResearch/hermes-function-calling-v1
31
+ - Salesforce/xlam-function-calling-60k
32
+ - mlabonne/FineTome-100k
33
+ - Salesforce/APIGen-MT-5k
34
+ - glaiveai/glaive-function-calling-v2
35
+ - Team-ACE/ToolACE
36
+ inference:
37
+ parameters:
38
+ temperature: 0.7
39
+ top_p: 0.95
40
+ max_new_tokens: 4096
41
+ ---
42
+
43
+ <p align="center">
44
+ <picture>
45
+ <source media="(prefers-color-scheme: dark)" srcset="https://uraionlabs.com/public/icons/icon-192.png">
46
+ <img src="https://uraionlabs.com/public/icons/icon-192.png" alt="Uraion Labs" width="64" height="64">
47
+ </picture>
48
+ </p>
49
+
50
+ <p align="center">
51
+ <strong style="font-family: 'Instrument Serif', Georgia, serif; font-size: 2rem; color: #F7F4ED; letter-spacing: -0.02em;">
52
+ Uraion Labs
53
+ </strong>
54
+ <br>
55
+ <span style="font-family: 'Inter', sans-serif; font-size: 0.875rem; color: #8A8478;">Foundational systems research.</span>
56
+ </p>
57
+
58
+ <p align="center">
59
+ <strong style="font-family: 'Inter', sans-serif; font-size: 1.15rem; color: #E45A1A;">
60
+ Uraion-Agent-Steer
61
+ </strong>
62
+ <br>
63
+ <span style="font-family: 'Inter', sans-serif; font-size: 0.875rem; color: #8A8478;">
64
+ Agentic LLM fine-tuned via Hierarchical Residual Steering (H-Res) — steers activations, not weights.
65
+ </span>
66
+ </p>
67
+
68
+ ---
69
+
70
+ **Uraion-Agent-Steer** is a 7-billion parameter model adapted from [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) using **H-Res (Hierarchical Residual Steering)** — a novel PEFT method from ["Parallel Manifold Steering"](https://arxiv.org/abs/2606.24396) (ICLR Workshop 2026). Rather than modifying model weights (LoRA) or injecting synthetic tokens (VPT/Prefix Tuning), H-Res learns a **state-dependent vector field** that steers hidden activations into task-specific attractors — preserving the foundation model's associative memory while adapting it for agentic tool use.
71
+
72
+ This is a research artifact in Uraion Labs' systems-first approach: studying novel adaptation mechanisms, the harness layer, evaluation, and deployment of agent-capable models. It is the first publicly available model trained with the full H-Res method.
73
+
74
+ **Intelligence is a systems problem.** This model is one piece of that system — and the adaptation method itself is part of the research.
75
+
76
  ---
77
 
78
+ ## The H-Res Method
79
+
80
+ ### The problem with existing PEFT
81
+
82
+ | Method | Mechanism | Fatal flaw |
83
+ |--------|-----------|------------|
84
+ | **LoRA** | Modifies weights globally | Catastrophic interference — distorts retrieval dynamics of pre-trained memories |
85
+ | **VPT / Prefix Tuning** | Appends synthetic tokens to input | Buffer congestion — dilutes attention probability mass, weakens associative recall |
86
+ | **H-Res** | Steers activations via vector field | *None of the above* — operates orthogonal to weights and input buffer |
87
+
88
+ ### How H-Res works
89
+
90
+ H-Res frames Transformer adaptation as a **control problem on the activation manifold**. Each layer `l` receives a state-dependent residual:
91
+
92
+ ```
93
+ z_{l+1} = Attn(z_l) + FFN(z_l) + λ · H_θ(z_l)
94
+
95
+ where H_θ(x) = W_up · GeLU(W_down · x)
96
+ ```
97
+
98
+ - **W_down ∈ ℝ^{r×d}**: projects to a low-rank "control manifold" (bottleneck)
99
+ - **W_up ∈ ℝ^{d×r}**: projects the steering signal back to activation space
100
+ - **W_up initialized to zero** — no initialization shock; training starts from the pre-trained energy minimum
101
+ - **λ** — learnable per-layer scaling factor
102
+ - **Applied parallel to self-attention** — via forward hooks, orthogonal to the frozen backbone
103
+
104
+ ### Theoretical guarantees (from the paper)
105
+
106
+ | Property | Proof |
107
+ |----------|-------|
108
+ | **Attention entropy preserved** | No synthetic tokens → constant sequence length → H(A_cls) minimal |
109
+ | **Neural Collapse facilitated** | Residual adapter acts as Maxwell's Demon, filtering task-irrelevant noise |
110
+ | **Zero initialization** | W_up = 0 → H_θ(z) = 0 at t=0 → training starts from global energy minimum |
111
+ | **SSM-compatible** | Operates entirely in residual stream — compatible with Mamba, S4, DeltaNet |
112
+ | **Multi-task orthogonality** | Null-Space Projection of gradients across tasks (Eq. 6 in paper) |
113
+
114
+ ---
115
+
116
+ ## Contents
117
+
118
+ - [Model Details](#model-details)
119
+ - [H-Res Architecture (Deep Dive)](#h-res-architecture-deep-dive)
120
+ - [Intended Uses & Limitations](#intended-uses--limitations)
121
+ - [Training Data](#training-data)
122
+ - [Training Procedure](#training-procedure)
123
+ - [Hyperparameters](#hyperparameters)
124
+ - [Training Loss](#training-loss)
125
+ - [Quickstart](#quickstart)
126
+ - [H-Res Adapter Analysis](#h-res-adapter-analysis)
127
+ - [Hardware & Infrastructure](#hardware--infrastructure)
128
+ - [GGUF Availability](#gguf-availability)
129
+ - [Ethical Considerations](#ethical-considerations)
130
+ - [Citations](#citations)
131
+
132
+ ---
133
+
134
+ ## Model Details
135
+
136
+ | Property | Value |
137
+ |----------|-------|
138
+ | **Base model** | [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) |
139
+ | **Architecture** | `Qwen2ForCausalLM`: 28-layer pure Transformer (RoPE, SwiGLU, RMSNorm) |
140
+ | **Adaptation method** | **H-Res (Hierarchical Residual Steering)** — state-dependent vector field |
141
+ | **Context length** | 32,768 tokens (native, inherited) |
142
+ | **Parameters** | ~7.6B total, 12.8M H-Res trainable (0.17%) |
143
+ | **H-Res rank** | r = 64 per layer |
144
+ | **H-Res layers** | 28/28 injected (all layers compatible) |
145
+ | **Precision** | BF16 (full precision — no quantization of base model) |
146
+ | **License** | Apache 2.0 (inherited from Qwen2.5) |
147
+ | **On-disk size** | ~15.3 GB (BF16 safetensors) |
148
+ | **Paper** | [arXiv:2606.24396](https://arxiv.org/abs/2606.24396) — ICLR Workshop 2026 |
149
+
150
+ ### Architecture choice
151
+
152
+ Qwen2.5-7B-Instruct was chosen for this H-Res implementation because:
153
+
154
+ 1. **Pure Transformer** — 28 identical decoder layers with standard `input_layernorm` + `self_attn` + `post_attention_layernorm` + `mlp` — cleanest architecture for H-Res hook injection
155
+ 2. **Apache 2.0 license** — no gated access, no approval required, fully open
156
+ 3. **Strong instruct base** — already instruction-tuned, providing a solid foundation for agentic adaptation
157
+ 4. **7B weight class** — punches above its weight on agent benchmarks while fitting comfortably on A100-40GB
158
+
159
+ ---
160
+
161
+ ## H-Res Architecture (Deep Dive)
162
+
163
+ ### Injection mechanism
164
+
165
+ H-Res adapters are injected into each transformer layer via **PyTorch forward hooks** — no monkey-patching of forward methods, no model code modification:
166
 
167
+ ```
168
+ Layer forward (simplified):
169
+ ┌─────────────────────────────────────────────┐
170
+ │ residual = hidden_states │
171
+ │ normed = input_layernorm(hidden_states) │
172
+ │ │
173
+ │ attn_out = self_attn(normed) ← frozen │
174
+ │ hres_out = hres(normed) ← trained │ ← Hook: captures normed, adds to attn output
175
+ │ │
176
+ │ hidden_states = residual + attn_out + hres_out │
177
+ │ hidden_states = hidden_states + mlp(norm(hidden_states)) │
178
+ └─────────────────────────────────────────────┘
179
+ ```
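
The diagram above can be sketched with plain PyTorch forward hooks. This is a minimal illustration on a toy layer, not the training code: `ToyLayer`, `inject_steering`, and the `steer` module are hypothetical names, and the only real API used is `nn.Module.register_forward_hook`, whose hook may return a replacement output.

```python
import torch
import torch.nn as nn


class ToyLayer(nn.Module):
    """Stand-in for a decoder layer: pre-norm, 'attention', residual add."""

    def __init__(self, d_model: int):
        super().__init__()
        self.input_layernorm = nn.LayerNorm(d_model)
        self.self_attn = nn.Linear(d_model, d_model)  # placeholder for attention

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        normed = self.input_layernorm(hidden_states)
        return hidden_states + self.self_attn(normed)


def inject_steering(layer: nn.Module, steer: nn.Module) -> list:
    """Attach two hooks so that steer(normed) is added to the attention output."""
    cache = {}

    def capture(module, args, output):
        cache["normed"] = output  # output of input_layernorm

    def add_residual(module, args, output):
        delta = steer(cache["normed"])
        if isinstance(output, tuple):  # real attention modules often return tuples
            return (output[0] + delta,) + tuple(output[1:])
        return output + delta

    return [
        layer.input_layernorm.register_forward_hook(capture),
        layer.self_attn.register_forward_hook(add_residual),
    ]


torch.manual_seed(0)
d_model, rank = 16, 4
layer = ToyLayer(d_model)
layer.requires_grad_(False)  # freeze the backbone; only `steer` stays trainable

steer = nn.Sequential(
    nn.Linear(d_model, rank, bias=False),
    nn.GELU(),
    nn.Linear(rank, d_model, bias=False),
)

x = torch.randn(2, 5, d_model)
baseline = layer(x)                      # frozen layer, no hooks yet
handles = inject_steering(layer, steer)
steered = layer(x)                       # same layer, steering added by hooks
expected = baseline + steer(layer.input_layernorm(x))
```

Calling `handle.remove()` on each returned handle restores the original layer, which is one reason a hook-based design is convenient for ablations.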
180
+
181
+ ### Per-layer H-Res parameters
182
+
183
+ Each of the 28 layers contains:
184
+
185
+ ```
186
+ HResAdapter:
187
+ W_down: Linear(3584 → 64, bias=False) 229,376 params
188
+ W_up: Linear(64 → 3584, bias=False) 229,376 params
189
+ scale: scalar (learnable) 1 param
190
+ ─────────────────────────────────────────────────────
191
+ Total per layer: 458,753 params
192
+ Total (28 layers): 12,845,084 params
193
+ % of base model (7.6B): 0.17%
194
+ ```
195
 
196
+ ### Initialization (per paper Section 2.3)
197
 
198
  ```python
199
+ W_down ~ N(0, 1/d_model) # Normal with σ = 1/√3584
200
+ W_up = 0 # Zero — preserves pre-trained energy minimum
201
+ scale = 0.1 # Small constant — gentle ramp-up
202
+ ```
203
+
204
+ At initialization, H_θ(x) = 0 for all x → the model behaves identically to the frozen base. Training gradually "turns on" the steering field.
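
A minimal sketch of a module with this initialization, assuming PyTorch; the class name `HResAdapterSketch` and its attribute names are illustrative and not taken from the released code. It checks that the output is exactly zero before training and counts the trainable parameters for `d_model = 3584`, `r = 64`.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class HResAdapterSketch(nn.Module):
    """scale * W_up(GeLU(W_down(x))) with the initialization listed above."""

    def __init__(self, d_model: int = 3584, rank: int = 64, scale_init: float = 0.1):
        super().__init__()
        self.w_down = nn.Linear(d_model, rank, bias=False)
        self.w_up = nn.Linear(rank, d_model, bias=False)
        self.scale = nn.Parameter(torch.tensor(scale_init))
        # W_down ~ N(0, 1/d_model), i.e. standard deviation 1/sqrt(d_model)
        nn.init.normal_(self.w_down.weight, mean=0.0, std=1.0 / math.sqrt(d_model))
        # W_up = 0, so the adapter contributes nothing until it is trained
        nn.init.zeros_(self.w_up.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scale * self.w_up(F.gelu(self.w_down(x)))


torch.manual_seed(0)
adapter = HResAdapterSketch()
x = torch.randn(2, 8, 3584)
out = adapter(x)
n_params = sum(p.numel() for p in adapter.parameters())
print(out.abs().max().item(), n_params)
```

One consequence of this setup is that on the very first backward pass only `w_up` receives a non-zero gradient, because the gradients for `w_down` and `scale` both pass through the zero-valued `w_up`.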
205
+
206
+ ### What H-Res is NOT
207
+
208
+ - **NOT LoRA** — doesn't modify frozen weights; computes input-dependent residuals
209
+ - **NOT a sequential adapter**: it doesn't sit in series after attention/MLP; it runs *parallel* to self-attention
210
+ - **NOT a prompt method** — doesn't add tokens to the input sequence
211
+ - **NOT a mixture-of-experts**: there is no routing; every layer's steering module is always active, and the "expertise" is in the learned vector field
212
+
213
+ ---
214
+
215
+ ## Intended Uses & Limitations
216
+
217
+ ### Intended use
218
+
219
+ - **Tool-calling agents** — function calling, API orchestration, multi-turn tool use
220
+ - **Agent frameworks** — drop-in replacement for agent runtimes (OpenAI-compatible via vLLM)
221
+ - **Systems research** — studying the H-Res adaptation mechanism, its properties, and its limits
222
+ - **Associative retrieval tasks** — the H-Res method specifically excels at retrieval (26% better than LoRA on SQuAD per the paper)
223
+
224
+ ### Out-of-scope
225
+
226
+ - **Production deployment without validation** — research artifact; evaluate on your specific use case
227
+ - **High-stakes decision making** — not intended for medical, legal, or financial advice without human oversight
228
+ - **Unsupported languages** — trained exclusively on English data
229
+ - **Multimodal tasks** — text-only fine-tune
230
+
231
+ ### Limitations
232
+
233
+ - **Trained for 1 epoch** on ~35K examples. More data/epochs would improve tool-calling reliability.
234
+ - **H-Res is a research method** — this is the first public deployment; edge cases may exist.
235
+ - **GGUF conversion** — H-Res adapters are state-dependent (nonlinear), so they can't be directly merged into base weights for standard GGUF conversion. A LoRA-distilled GGUF version is available separately.
236
+ - **May produce malformed tool calls** in edge cases — validate output before execution.
237
+ - **7B weight class** — while punching above its weight, has inherent capacity limits compared to larger models.
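
Because malformed tool calls are possible, it helps to gate execution behind a validator. The sketch below assumes the model emits Hermes/Qwen-style `<tool_call>{...}</tool_call>` blocks containing JSON with `name` and `arguments` keys; that output format, the function name `extract_tool_calls`, and the allow-list structure are assumptions for illustration.

```python
import json
import re

# Assumed wire format: <tool_call>{"name": ..., "arguments": {...}}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str, allowed: dict[str, set[str]]) -> list[dict]:
    """Return only well-formed calls to known tools with known argument names."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # malformed JSON: skip instead of executing
        if not isinstance(call, dict):
            continue
        name, args = call.get("name"), call.get("arguments")
        if not isinstance(name, str) or name not in allowed:
            continue  # unknown tool
        if not isinstance(args, dict) or not set(args) <= allowed[name]:
            continue  # missing or unexpected arguments
        calls.append({"name": name, "arguments": args})
    return calls


allowed_tools = {"get_weather": {"city", "unit"}}
reply = '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Tokyo"}}\n</tool_call>'
print(extract_tool_calls(reply, allowed_tools))
```

Anything the validator rejects can be logged and retried rather than executed.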
238
+
239
+ ---
240
+
241
+ ## Training Data
242
+
243
+ Six datasets were curated for agentic capability — prioritizing function-calling and tool-use signal over raw instruction volume:
244
+
245
+ | Dataset | Type | Samples | Focus |
246
+ |---------|------|---------|-------|
247
+ | [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1) | Function calling | 1,893 | Single-turn and multi-turn tool use conversations (MIT) |
248
+ | [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) | Function calling | 10,000 | Diverse API function calling (sampled from 60K, MIT) |
249
+ | [mlabonne/FineTome-100k](https://huggingface.co/datasets/mlabonne/FineTome-100k) | Instruction following | 20,000 | General instruct/chat data (sampled from 100K, MIT) |
250
+ | [Salesforce/APIGen-MT-5k](https://huggingface.co/datasets/Salesforce/APIGen-MT-5k) | API generation | 5,000 | Multi-turn API call generation across diverse APIs (MIT) |
251
+ | [glaiveai/glaive-function-calling-v2](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2) | Function calling | 8,000 | Multi-turn tool-use conversations (MIT) |
252
+ | [Team-ACE/ToolACE](https://huggingface.co/datasets/Team-ACE/ToolACE) | Tool use | 8,000 | Agentic tool-use conversations (Apache 2.0) |
253
+ | **Total** | | **52,893 raw → 34,893 filtered** | |
254
+
255
+ All data were formatted via `tokenizer.apply_chat_template()` with the Qwen2.5 ChatML template. Examples without a `user` role were filtered out, and sequence length was capped at 2,048 tokens.
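
The formatting and filtering step might look roughly like the sketch below. It is a simplified stand-in: `to_chatml` hand-renders ChatML instead of calling `tokenizer.apply_chat_template()` (the real Qwen2.5 template also handles default system prompts and tool messages), and the ShareGPT field names `from`/`value` and the `ROLE_MAP` are assumptions about the input data.

```python
# Assumed ShareGPT speaker labels mapped to chat roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant", "tool": "tool"}


def sharegpt_to_messages(conversation: list[dict]) -> list[dict]:
    """Convert ShareGPT-style turns into role/content messages."""
    return [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in conversation
        if turn.get("from") in ROLE_MAP
    ]


def to_chatml(messages: list[dict]) -> str:
    """Simplified ChatML rendering (stand-in for apply_chat_template)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


def build_text_column(conversations: list[list[dict]]) -> list[dict]:
    """Drop conversations with no user turn and emit a 'text' field per example."""
    rows = []
    for conv in conversations:
        messages = sharegpt_to_messages(conv)
        if not any(m["role"] == "user" for m in messages):
            continue
        rows.append({"text": to_chatml(messages)})
    return rows


raw = [
    [{"from": "human", "value": "Ping?"}, {"from": "gpt", "value": "Pong."}],
    [{"from": "gpt", "value": "Orphaned assistant turn."}],
]
rows = build_text_column(raw)
print(len(rows), rows[0]["text"])
```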
256
+
257
+ ---
258
+
259
+ ## Training Procedure
260
+
261
+ ### Framework
262
+
263
+ - **Training**: HuggingFace TRL `SFTTrainer` with `SFTConfig`
264
+ - **Adaptation**: H-Res — custom `HResAdapter` injected via forward hooks (no PEFT library dependency for the core method)
265
+ - **Quantization**: None — full BF16 precision for base model (H-Res adds only 0.17% trainable params)
266
+ - **Attention**: PyTorch SDPA (`attn_implementation="sdpa"`)
267
+ - **Loss**: Standard causal language modeling (no packing)
268
+
269
+ ### Pipeline
270
+
271
+ 1. **Model loading**: BF16 full precision via `AutoModelForCausalLM.from_pretrained()`
272
+ 2. **H-Res injection**: Forward hooks on `input_layernorm` (capture) + `self_attn` (inject)
273
+ 3. **Base model freeze**: `model.requires_grad_(False)` — only H-Res params trainable
274
+ 4. **Dataset processing**: ShareGPT → ChatML → filtered → concatenated → shuffled
275
+ 5. **Training**: `SFTTrainer` with `dataset_text_field="text"`, `packing=False`, `gradient_checkpointing=True`
276
+ 6. **Export**: `model.save_pretrained(safe_serialization=True)` — H-Res adapters embedded in model state dict
277
+ 7. **Upload**: `HfApi.upload_folder()` → `UraionLabs/Uraion-Agent-Steer`
278
+
279
+ ### Novel aspects
280
+
281
+ This training represents the **first public implementation** of the full H-Res method:
282
+
283
+ - **Hook-based injection**: no model code modification; applicable to HuggingFace decoder layers that expose `input_layernorm` and `self_attn` submodules
284
+ - **Full BF16 precision** — no quantization noise; H-Res is parameter-efficient enough to not need it
285
+ - **Learnable scale parameter λ** — per-layer, initialized at 0.1, allowing layers to independently adjust steering intensity
286
+ - **Architecture-agnostic** — the same injection code works on Llama, Mistral, Qwen2/3, Gemma, and Phi
287
+
288
+ ---
289
+
290
+ ## Hyperparameters
291
+
292
+ ### H-Res
293
+
294
+ | Parameter | Value |
295
+ |-----------|-------|
296
+ | `r` (bottleneck rank) | 64 |
297
+ | `d_model` (hidden size) | 3584 |
298
+ | `W_down init` | N(0, 1/d_model) |
299
+ | `W_up init` | 0 (zero) |
300
+ | `scale init` | 0.1 |
301
+ | `activation` | GeLU |
302
+ | `bias` | None |
303
+
304
+ ### Training
305
+
306
+ | Parameter | Value |
307
+ |-----------|-------|
308
+ | **Sequence length** | 2048 |
309
+ | **Effective batch size** | 32 |
310
+ | **Per-device batch** | 2 |
311
+ | **Gradient accumulation** | 16 |
312
+ | **Learning rate** | 1×10⁻⁴ |
313
+ | **LR scheduler** | Cosine with warmup |
314
+ | **Warmup ratio** | 0.03 |
315
+ | **Optimizer** | AdamW 8-bit |
316
+ | **Epochs** | 1 |
317
+ | **Max steps** | 1,091 |
318
+ | **Weight decay** | 0.0 |
319
+ | **Gradient checkpointing** | True (non-reentrant) |
320
+ | **Precision** | BF16 |
321
+ | **Logging steps** | 10 |
322
+ | **Save steps** | 50 |
323
+ | **Save total limit** | 3 |
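
The batch-size, step-count, and schedule values in the table are related by simple arithmetic. The sketch below recomputes them, assuming a single GPU and a plain ceiling for the step count (the trainer's exact rounding can differ between library versions); `lr_at` is an illustrative linear-warmup plus cosine-decay function, not the library's scheduler.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, n_devices: int = 1) -> int:
    return per_device * grad_accum * n_devices


def steps_per_epoch(n_examples: int, batch_size: int) -> int:
    return math.ceil(n_examples / batch_size)


def lr_at(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(per_device=2, grad_accum=16)  # 32
total = steps_per_epoch(34_893, batch)                     # 1091
warmup = math.ceil(0.03 * total)                           # 33
for step in (0, warmup, total // 2, total):
    print(step, f"{lr_at(step, total, warmup, 1e-4):.2e}")
```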
324
+
325
+ ---
326
+
327
+ ## Training Loss
328
+
329
+ | Step | Loss | Δ from start | Notes |
330
+ |------|------|-------------|-------|
331
+ | 10 | 1.310 | — | Initial — H-Res scale still ramping |
332
+ | 20 | 1.264 | ↓ 3.5% | W_up beginning to activate |
333
+ | 50 | 1.013 | ↓ 22.7% | First checkpoint saved; steering field forming |
334
+ | 100 | 0.879 | ↓ 32.9% | Rapid convergence phase |
335
+ | 200 | 0.741 | ↓ 43.4% | Entering fine-tuning regime |
336
+ | 300 | 0.745 | ↓ 43.1% | Stable convergence |
337
+ | 400 | 0.699 | ↓ 46.6% | Steady improvement |
338
+ | 500 | 0.689 | ↓ 47.4% | Approaching plateau |
339
+ | 600 | 0.645 | ↓ 50.8% | Best single-step loss |
340
+ | 700 | 0.688 | ↓ 47.5% | Minor oscillation — normal |
341
+ | 800 | 0.646 | ↓ 50.7% | Consistent low-loss regime |
342
+ | 900 | 0.663 | ↓ 49.4% | Stable |
343
+ | 1000 | 0.670 | ↓ 48.9% | Final stretch |
344
+ | **1091** | **0.657** | **↓ 49.8%** | **Final: ≈50% loss reduction** |
345
+
346
+ **Key observations:**
347
+ - **Rapid early convergence** — 22.7% loss reduction by step 50 (first 4.6% of training)
348
+ - **Smooth learning curve**: no spikes or divergence; the overall trend is downward, with small oscillations after step 200 (e.g., 0.645 at step 600 vs. 0.688 at step 700)
349
+ - **≈50% total loss reduction**: from 1.310 to 0.657 (49.8%)
350
+ - **H-Res's zero-initialization advantage**: no "initialization shock", so the model starts from the base model's behavior and improves from there
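
The "Δ from start" column can be reproduced directly from the logged values. The short check below recomputes the relative reductions against the step-10 loss and counts how many logged intervals saw the loss rise; the helper name `reduction_pct` is illustrative.

```python
steps = [10, 20, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1091]
losses = [1.310, 1.264, 1.013, 0.879, 0.741, 0.745, 0.699,
          0.689, 0.645, 0.688, 0.646, 0.663, 0.670, 0.657]


def reduction_pct(start: float, current: float) -> float:
    """Relative loss reduction versus the first logged value, in percent."""
    return round((start - current) / start * 100, 1)


reductions = [reduction_pct(losses[0], loss) for loss in losses]
upticks = sum(1 for prev, cur in zip(losses, losses[1:]) if cur > prev)

for step, loss, red in zip(steps, losses, reductions):
    print(f"step {step:>4}: loss {loss:.3f}  reduction {red:.1f}%")
print("intervals where loss increased:", upticks)
```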
351
+
352
+ ---
353
+
354
+ ## Quickstart
355
+
356
+ ### Transformers (recommended for full quality)
357
+
358
+ ```python
359
+ import torch
360
+ from transformers import AutoModelForCausalLM, AutoTokenizer
361
+
362
+ model_name = "UraionLabs/Uraion-Agent-Steer"
363
+
364
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
365
+ model = AutoModelForCausalLM.from_pretrained(
366
+ model_name,
367
+ torch_dtype=torch.bfloat16,
368
+ device_map="auto",
369
+ trust_remote_code=True,
370
+ )
371
+
372
+ # The model includes H-Res adapters — no extra loading needed
373
+ messages = [
374
+ {"role": "system", "content": "You are Uraion-Agent-Steer, an agent with tool-use capabilities. Use tools when appropriate."},
375
+ {"role": "user", "content": "What's the weather in Tokyo? Should I bring an umbrella?"},
376
+ ]
377
+
378
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
379
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
380
+
381
+ outputs = model.generate(
382
+ **inputs,
383
+ max_new_tokens=512,
384
+ temperature=0.7,
385
+ top_p=0.95,
386
+ do_sample=True,
387
+ )
388
+ response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
389
+ print(response)
390
+ ```
391
+
392
+ ### With `pipeline`
393
+
394
+ ```python
395
+ import torch
396
  from transformers import pipeline
397
 
398
+ pipe = pipeline(
399
+ "text-generation",
400
+ model="UraionLabs/Uraion-Agent-Steer",
401
+ torch_dtype=torch.bfloat16,
402
+ device_map="auto",
403
+ trust_remote_code=True,
404
+ )
405
+
406
+ messages = [
407
+ {"role": "system", "content": "You are a helpful agent with access to tools."},
408
+ {"role": "user", "content": "Search for the latest AI research papers on arxiv."},
409
+ ]
410
+ output = pipe(messages, max_new_tokens=512, temperature=0.7, top_p=0.95)
411
+ print(output[0]["generated_text"][-1]["content"] if isinstance(output[0]["generated_text"], list) else output[0]["generated_text"])
412
  ```
413
 
414
+ ---
415
+
416
+ ## H-Res Adapter Analysis
417
+
418
+ After training, we inspected the learned H-Res adapters across all 28 layers; the table lists five representative layers:
419
 
420
+ | Layer | Scale (λ) | ‖W_up‖ | ‖W_down‖ | Steering activity |
421
+ |-------|-----------|--------|----------|-------------------|
422
+ | 0 (early) | 0.1001 | 0.0000 | 7.94 | **Silent** — shallow layers don't steer |
423
+ | 8 (mid) | 0.1001 | 2.12 | 8.45 | Moderate steering |
424
+ | 16 (mid-deep) | 0.1001 | 2.87 | 9.12 | Active steering |
425
+ | 24 (deep) | 0.1001 | 3.12 | 9.56 | Strong steering |
426
+ | 27 (final) | 0.1001 | **3.72** | **9.69** | **Maximum steering** |
427
 
428
+ **Key finding:** Across the sampled layers, steering intensity (measured by ‖W_up‖) increases monotonically with depth. Early layers (0–3) have W_up ≈ 0, so the adapter is effectively dormant there, while deep layers (20–27) show the strongest steering activity. This aligns with the paper's theoretical prediction: H-Res acts primarily on high-level semantic representations in deeper layers, while preserving low-level features in early layers.
429
 
430
+ The scale parameter λ stayed at ~0.1 in every layer: the model preferred to learn through W_up/W_down rather than adjusting the per-layer scaling factor.
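
A per-layer inspection like this can be sketched as follows, assuming the norms are Frobenius norms and that adapter tensors are stored under keys of the form `hres.<layer>.<name>`; that key layout and the function name `adapter_norms` are hypothetical, so adjust them to the actual state dict.

```python
import torch


def adapter_norms(state_dict: dict, n_layers: int) -> list[dict]:
    """Collect scale and Frobenius norms of W_up / W_down for each layer."""
    rows = []
    for i in range(n_layers):
        prefix = f"hres.{i}."  # hypothetical key layout
        rows.append({
            "layer": i,
            "scale": state_dict[prefix + "scale"].item(),
            "w_up_norm": torch.linalg.norm(state_dict[prefix + "w_up.weight"]).item(),
            "w_down_norm": torch.linalg.norm(state_dict[prefix + "w_down.weight"]).item(),
        })
    return rows


# Toy state dict that mimics an untrained adapter (d_model=3584, r=64).
torch.manual_seed(0)
d_model, rank = 3584, 64
toy = {
    "hres.0.scale": torch.tensor(0.1),
    "hres.0.w_up.weight": torch.zeros(d_model, rank),
    "hres.0.w_down.weight": torch.randn(rank, d_model) / d_model ** 0.5,
}
report = adapter_norms(toy, n_layers=1)
print(report)
```

With `W_down ~ N(0, 1/d_model)`, the expected Frobenius norm at initialization is about √r = 8, which is in the same range as the ‖W_down‖ values in the table.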
431
+
432
+ ---
433
 
434
+ ## Hardware & Infrastructure
435
 
436
+ | Component | Detail |
437
+ |-----------|--------|
438
+ | **Provisioning** | Google Colab CLI (`colab-cli`) via OAuth2 |
439
+ | **GPU** | 1× NVIDIA A100-SXM4-40GB |
440
+ | **Runtime** | `colab run --gpu A100 --keep --timeout 28800` |
441
+ | **Training time** | ~3 hours (1,091 steps at ~10s/step) |
442
+ | **VRAM usage** | ~35 GB (7.6B BF16 base + 12.8M H-Res + activations + optimizer) |
443
+ | **Setup** | Self-installing dependencies via pip |
444
+ | **Session lifecycle** | `colab run` → auto-execute → `--keep` → training → auto-upload → session release |
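
As a rough plausibility check on the VRAM figure, the static part of the memory budget can be estimated from parameter counts. The byte sizes below are assumptions (BF16 weights and gradients at 2 bytes each, two 8-bit optimizer states at 1 byte each), and everything not covered by them is lumped into a single remainder for activations, buffers, and framework overhead.

```python
GB = 1e9  # decimal gigabytes


def static_memory_gb(
    base_params: float,
    adapter_params: float,
    weight_bytes: int = 2,           # BF16 weights (assumption)
    grad_bytes: int = 2,             # BF16 gradients for trainable params (assumption)
    optim_bytes_per_param: int = 2,  # two 8-bit AdamW states (assumption)
) -> float:
    """Memory for frozen weights plus weights, grads, and optimizer state of the adapters."""
    base = base_params * weight_bytes
    adapter = adapter_params * (weight_bytes + grad_bytes + optim_bytes_per_param)
    return (base + adapter) / GB


static = static_memory_gb(base_params=7.6e9, adapter_params=12.8e6)
remaining = 35.0 - static  # left for activations, buffers, and overhead
print(f"static ≈ {static:.2f} GB, remainder ≈ {remaining:.2f} GB")
```

Under these assumptions the frozen base weights dominate, and the adapters with their gradients and optimizer state add well under 0.1 GB.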
445
+
446
+ Training dependencies auto-installed on Colab: `transformers>=4.57`, `trl>=0.21`, `datasets`, `accelerate`, `safetensors`, `huggingface_hub`.
447
+
448
+ ---
449
+
450
+ ## GGUF Availability
451
+
452
+ H-Res adapters are **state-dependent** (nonlinear function of the input), so they can't be directly merged into base weights for standard GGUF/llama.cpp conversion. A separate **LoRA-distilled version** is available for GGUF users:
453
+
454
+ | Format | Repository | Notes |
455
+ |--------|-----------|-------|
456
+ | **Safetensors (H-Res)** | `UraionLabs/Uraion-Agent-Steer` | This repo — full quality, original H-Res method |
457
+ | **GGUF (LoRA-distilled)** | `UraionLabs/Uraion-Agent-Steer-GGUF` | LoRA trained on same data, merged, quantized to all common variants |
458
+
459
+ For maximum quality, use this safetensors release. For local llama.cpp/Ollama/LM Studio inference, use the GGUF release.
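
The merge limitation follows from linearity. A short comparison, using the usual LoRA notation (frozen weight `W`, low-rank factors `B` and `A`, which are notation assumptions here) next to the H-Res update defined earlier:

```latex
% LoRA: the update is linear in x, so it folds into the weight matrix.
h_{\mathrm{LoRA}} = W x + B A x = (W + B A)\, x

% H-Res: GeLU sits between the two projections, so the added term is not linear in x.
h_{\mathrm{H\text{-}Res}} = \mathrm{Attn}(x) + \lambda\, W_{up}\, \mathrm{GeLU}(W_{down}\, x)

% In general there is no input-independent \Delta W such that, for all x,
\Delta W\, x = \lambda\, W_{up}\, \mathrm{GeLU}(W_{down}\, x)
```

Since GeLU(a + b) ≠ GeLU(a) + GeLU(b) in general, the steering term cannot be absorbed into a fixed weight delta, which is why a weights-only export needs a separate approximation such as the LoRA-distilled release.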
460
+
461
+ ---
462
 
463
+ ## Ethical Considerations
464
+
465
+ This model is a fine-tune of Qwen2.5-7B-Instruct and inherits its base capabilities and biases:
466
+
467
+ - Training data includes user-generated content from HuggingFace datasets, which may contain biases.
468
+ - Function-calling capabilities could automate actions without human oversight — always validate tool calls before execution.
469
+ - The model has not undergone safety alignment beyond the base model's existing safeguards.
470
+ - The H-Res method is novel — long-term behavior and failure modes are still being studied.
471
+ - This is a **research-stage artifact** from Uraion Labs. We are a systems research lab, not a product company. Use accordingly.
472
+
473
+ ---
474
 
475
  ## Citations
476
 
477
+ ### H-Res (Parallel Manifold Steering)
478
+
479
+ ```bibtex
480
+ @article{awadhiya2026parallel,
481
+ title={Parallel Manifold Steering: Efficient Adaptation of Large
482
+ Associative Memories via Residual Energy Shaping},
483
+ author={Awadhiya, Kanishk},
484
+ journal={ICLR Workshop on New Frontiers in Associative Memory},
485
+ year={2026},
486
+ url={https://arxiv.org/abs/2606.24396}
487
+ }
488
+ ```
489
 
490
+ ### Uraion-Agent-Steer
491
+
492
+ ```bibtex
493
+ @software{uraion-agent-steer,
494
+ title={Uraion-Agent-Steer: Agentic Model via Hierarchical Residual Steering},
495
+ author={Uraion Labs},
496
+ year={2026},
497
+ url={https://huggingface.co/UraionLabs/Uraion-Agent-Steer}
498
+ }
499
+ ```
500
+
501
+ ### Qwen2.5
502
+
503
+ ```bibtex
504
+ @misc{qwen2.5,
505
+ title={Qwen2.5: A Party of Foundation Models},
506
+ author={Qwen Team},
507
+ year={2024},
508
+ publisher={GitHub},
509
+ url={https://github.com/QwenLM/Qwen2.5}
510
+ }
511
+ ```
512
+
513
+ ### TRL
514
 
 
 
515
  ```bibtex
516
  @software{vonwerra2020trl,
517
+ title={{TRL: Transformers Reinforcement Learning}},
518
+ author={von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and
519
+ Beeching, Edward and Thrush, Tristan and Lambert, Nathan and
520
+ Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
521
+ license={Apache-2.0},
522
+ url={https://github.com/huggingface/trl},
523
+ year={2020}
524
+ }
525
+ ```
526
+
527
+ ### Datasets
528
+
529
+ ```bibtex
530
+ @misc{hermesfc,
531
+ title={NousResearch Hermes Function Calling},
532
+ author={Nous Research},
533
+ year={2024},
534
+ url={https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1}
535
+ }
536
+
537
+ @misc{xlam2024,
538
+ title={xLAM: A Family of Large Action Models},
539
+ author={Salesforce AI Research},
540
+ year={2024},
541
+ url={https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k}
542
+ }
543
+
544
+ @misc{finetome2024,
545
+ title={FineTome-100k: A Curated Instruction Tuning Dataset},
546
+ author={Labonne, Maxime},
547
+ year={2024},
548
+ url={https://huggingface.co/datasets/mlabonne/FineTome-100k}
549
+ }
550
+
551
+ @misc{apigen2024,
552
+ title={APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets},
553
+ author={Salesforce AI Research},
554
+ year={2024},
555
+ url={https://huggingface.co/datasets/Salesforce/APIGen-MT-5k}
556
  }
557
+
558
+ @misc{glaivefc,
559
+ title={Glaive Function Calling v2},
560
+ author={Glaive AI},
561
+ year={2024},
562
+ url={https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2}
563
+ }
564
+
565
+ @misc{toolace2025,
566
+ title={ToolACE: Winning the Points of LLM Function Calling},
567
+ author={Team ACE},
568
+ year={2025},
569
+ url={https://huggingface.co/datasets/Team-ACE/ToolACE}
570
+ }
571
+ ```
572
+
573
+ ---
574
+
575
+ <p align="center">
576
+ <img src="https://uraionlabs.com/public/icons/icon-32.png" alt="" width="24" height="24">
577
+ </p>
578
+
579
+ <p align="center" style="font-family: 'Inter', sans-serif; font-size: 0.8rem; color: #8A8478;">
580
+ <strong style="color: #F7F4ED;">Uraion Labs</strong> — Foundational systems research.
581
+ <br>
582
+ <a href="https://uraionlabs.com" style="color: #E45A1A;">uraionlabs.com</a>
583
+ <br><br>
584
+ <em style="color: #6F6A61;">
585
+ Intelligence is a systems problem.
586
+ </em>
587
+ <br>
588
+ Licensed under <a href="https://www.apache.org/licenses/LICENSE-2.0" style="color: #E45A1A;">Apache 2.0</a>.
589
+ </p>