How to use from
Hermes Agent
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf john-broadway/Qwen2.5-1.5B-RYS-4-7-GGUF:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default john-broadway/Qwen2.5-1.5B-RYS-4-7-GGUF:Q4_K_M
Run Hermes
hermes
Quick Links

Qwen2.5-1.5B-RYS-4-7

Qwen2.5-1.5B-Instruct with layers 4-6 duplicated. A 3-layer balanced block runs twice on every forward pass.

28 base layers β†’ 31 after duplication. No training, no merging, no weight changes.

Math +3.18. EQ +6.25 (71.37 β†’ 77.62). Reasoning +11.76% (76.47% β†’ 88.24%). All three metrics up.

Results

Metric Baseline RYS (4,7) Delta
Math 0.5395 0.5713 +3.18
EQ 71.37 77.62 +6.25
Reasoning 76.47% 88.24% +11.76

The balanced daily driver. Two clean circuits at this size, both inside the L4-L7 window. Duplicating this 3-layer block boosts math, EQ, and reasoning simultaneously β€” no trade-down anywhere. The best-all-around performer relative to size in the v1 collection.

Usage

llama-server -m Qwen2.5-1.5B-RYS-4-7-Q4_K_M.gguf -ngl 99

Full sweep data

51 configurations tested. (4,7) is the headline pick from the v1 writeup. Full sweep data in the v2 corpus dataset.

Part of the RYS Sovereign Collection v1.


Where this sits in the Sovereign Collection

v1 β€” Qwen2.5 cross-scale + Qwen3-32B headline. Four sizes from 0.5B to 32B; RYS works at every scale, with the lift size and dimension shifting by baseline:

  • 0.5B β†’ EQ specialist
  • 1.5B β†’ balanced daily driver
  • 7B β†’ math specialist via (8,12)
  • 32B β†’ the headline "Big Boy"

v2 β€” cross-architecture extension. 21 model variants across 10 architecture families. Headline: weak baselines lift more, in their weakest dimension. β†’ john-broadway/rys-sovereign-collection-v2

Credit

John Broadway, with collaboration from Claude (Opus 4.6 in April 2026 build; Opus 4.7 in May 2026 analysis and publication). Original RYS method by David Ng on Qwen2-72B; sweep toolkit by alainnothere.

Downloads last month
149
GGUF
Model size
2B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for john-broadway/Qwen2.5-1.5B-RYS-4-7-GGUF

Quantized
(198)
this model

Collection including john-broadway/Qwen2.5-1.5B-RYS-4-7-GGUF