---
license: llama3.2
base_model: meta-llama/Llama-3.2-3B-Instruct
tags:
  - rys
  - layer-duplication
  - math
  - gguf
  - sovereign-collection-v2
---

# Llama-3.2-3B-RYS-21-24

Llama-3.2-3B-Instruct with layers 21-23 duplicated. A late-stack math circuit runs twice on every forward pass.

28 base layers → 31 after duplication. No training, no merging, no weight changes.

**Math 0.470 → 0.6132 (+14.32 — the biggest math lift in the v2 corpus). Reasoning 88.24% → 82.36% (−5.88). EQ 84.38 → 84.11 (−0.27).**

## Results

| Metric | Baseline | RYS (21,24) | Delta |
|--------|----------|-------------|-------|
| Math | 0.470 | 0.6132 | **+14.32** |
| EQ | 84.38 | 84.11 | −0.27 |
| Reasoning | 88.24% | 82.36% | −5.88 |

**The math amplifier.** Llama-3.2-3B-Instruct has the second-highest baseline reasoning in the v2 corpus (88.24%) — near-ceiling, so RYS has little reasoning room to lift. But the same train-free intervention applied to the late-stack block (21,24) produces the **biggest math lift anywhere in the corpus** (+14.32 absolute, ~30% relative). Math and reasoning circuits sit at different depths in this model; the math one has headroom.

Pick this when math throughput matters and reasoning is already strong enough. The within-family contrast is the loudest in the corpus: sibling [`Llama-3.2-1B-RYS-10-13-GGUF`](https://huggingface.co/john-broadway/Llama-3.2-1B-RYS-10-13-GGUF) lifts reasoning from 0% → 64.71% on the same block-duplication mechanism applied to a different depth.

## Usage

```
llama-server -m Llama-3.2-3B-RYS-21-24-Q4_K_M.gguf -ngl 99
```

## Full sweep data

54 configurations tested. (21,24) block-3 is the best-combined pick (math-optimal). Full per-config sweep + cross-architecture analysis: [v2 dataset](https://huggingface.co/datasets/john-broadway/rys-sovereign-collection-v2).


Part of the RYS Sovereign Collection v2.

---

## Where this sits in the Sovereign Collection

**v1 — Qwen2.5 cross-scale + Qwen3-32B headline crossover.** 5 model repos.

**v2 — cross-architecture corpus.** 21 model variants across 10 architecture families. Inverse correlation (r = −0.726): weak baselines lift more, in their weakest dimension. The Llama-3.2 family alone (1B + 3B) spans the entire baseline-vs-magnitude curve in the v2 corpus. 13 deployable RYS-applied weight repos covering every non-zero-lift variant.

**Within-family sibling:** [`john-broadway/Llama-3.2-1B-RYS-10-13-GGUF`](https://huggingface.co/john-broadway/Llama-3.2-1B-RYS-10-13-GGUF) — the 0%→64.71% reasoning unlock at the weak-baseline end.

**Credit**

John Broadway, with collaboration from Claude (Opus 4.6 in April 2026 sweep generation and build pipeline; Opus 4.7 in May 2026 cross-architecture analysis and publication). Original RYS method by [David Ng](https://dnhkng.github.io/posts/rys/) on Qwen2-72B; sweep + probe toolkit by [alainnothere](https://github.com/alainnothere/llm-circuit-finder).