AK
kil0rk
AI & ML interests
None yet
Recent Activity
liked a model 1 day ago
antirez/deepseek-v4-gguf
reacted to SeaWolf-AI's post with ❤️ 10 days ago
🧬 Darwin Family: Zero Gradient Steps, GPQA Diamond 88.89%
How far can we push LLM reasoning *without* training?
Our team at VIDRAFT submitted this paper to Daily Papers yesterday, and it's
currently #3. Huge thanks to everyone who upvoted; sharing the core ideas below.
Paper: https://huggingface.co/papers/2605.14386
arXiv: https://arxiv.org/abs/2605.14386
Model: https://huggingface.co/FINAL-Bench/Darwin-28B-REASON
Model: https://huggingface.co/FINAL-Bench/Darwin-28B-Opus
---
TL;DR
Darwin Family is a training-free evolutionary merging framework.
By recombining the weight spaces of existing LLM checkpoints, with zero
gradient-based training, it reaches frontier-level reasoning.
- Darwin-28B-Opus: GPQA Diamond 88.89%
- Zero gradient steps: not a single B200 or H200 hour needed
- 🧬 Consistent gains across the 4B to 35B scale
- Cross-architecture breeding between Transformer and Mamba families
- Stable recursive multi-generation evolution
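To make "training-free evolutionary merging" concrete, here is a minimal toy sketch. It is not the Darwin implementation: the names `merge`, `evolve`, and `toy_fitness`, the linear interpolation, and the mutate-and-select loop are all assumptions chosen for illustration. It only shows that a search over merge ratios can improve a score without any gradient step.

```python
import random

def merge(parent_a, parent_b, ratios):
    """Linearly interpolate two toy checkpoints, one ratio per named tensor."""
    return {
        name: [(1.0 - ratios[name]) * a + ratios[name] * b
               for a, b in zip(parent_a[name], parent_b[name])]
        for name in parent_a
    }

def evolve(parent_a, parent_b, fitness, generations=30, population=16,
           sigma=0.1, seed=0):
    """Search merge ratios by mutation and selection; no gradients involved."""
    rng = random.Random(seed)
    names = list(parent_a)

    def score(genome):
        return fitness(merge(parent_a, parent_b, genome))

    # Start from random ratios in [0, 1] for every tensor.
    pop = [{n: rng.random() for n in names} for _ in range(population)]
    for _ in range(generations):
        # Keep the best quarter unchanged, then refill with mutated copies.
        elite = sorted(pop, key=score, reverse=True)[: max(1, population // 4)]
        pop = list(elite)
        while len(pop) < population:
            base = rng.choice(elite)
            pop.append({n: min(1.0, max(0.0, base[n] + rng.gauss(0.0, sigma)))
                        for n in names})
    return max(pop, key=score)

# Hypothetical toy parents: each one is "good" at a different component.
parent_a = {"attention": [1.0, 1.0], "ffn": [0.0, 0.0]}
parent_b = {"attention": [0.0, 0.0], "ffn": [1.0, 1.0]}

def toy_fitness(weights):
    """Stand-in for a benchmark score: closeness of every weight to 1.0."""
    return -sum((w - 1.0) ** 2 for tensor in weights.values() for w in tensor)

best = evolve(parent_a, parent_b, toy_fitness)
child = merge(parent_a, parent_b, best)
```

In this toy setup the search should learn to take `attention` mostly from `parent_a` and `ffn` mostly from `parent_b`, so the child scores better than either parent. A real system would replace `toy_fitness` with an actual evaluation of the merged model.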
Three Core Mechanisms
① 14-dim Adaptive Merge Genome: fine-grained recombination at both
component level (Attention / FFN / MLP / LayerNorm / Embedding) and block
level, expanding the prior evolutionary-merge search space.
② MRI-Trust Fusion: we diagnose each layer's reasoning contribution
via an **MRI (Model Reasoning Importance)** signal and fuse it with
evolutionary search through a **learnable trust parameter**. Trust the
diagnostic too much and search collapses; ignore it and search becomes
inefficient. Darwin learns the balance from data.
③ Architecture Mapper: weight-space breeding across heterogeneous
families. Attention × SSM crossover actually works.
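One way to picture how a diagnostic prior and a searched genome could be fused through a trust parameter is the sketch below. Everything in it is hypothetical: the `Genome` layout, the gene counts, the averaging of component and block genes, and the convex blend in `effective_ratio` are assumptions for illustration, not the formulation in the paper.

```python
from dataclasses import dataclass

# Component names taken from the post; the gene layout itself is an assumption.
COMPONENTS = ("attention", "ffn", "mlp", "layernorm", "embedding")

@dataclass
class Genome:
    component_ratio: dict  # component name -> merge ratio in [0, 1]
    block_ratio: list      # one merge ratio per block, each in [0, 1]
    trust: float           # 0 = ignore the diagnostic, 1 = follow it fully

def effective_ratio(genome, component, block, mri_prior):
    """Blend the searched ratio with a per-block diagnostic prior.

    mri_prior is a hypothetical list of per-block ratios suggested by an
    importance diagnostic. The convex blend is an assumed fusion rule.
    """
    searched = 0.5 * (genome.component_ratio[component]
                      + genome.block_ratio[block])
    return genome.trust * mri_prior[block] + (1.0 - genome.trust) * searched

# Toy genome: 5 component genes, 3 block genes, 1 trust gene.
genome = Genome(
    component_ratio={name: 0.25 for name in COMPONENTS},
    block_ratio=[0.75, 0.75, 0.75],
    trust=0.5,
)
mri_prior = [1.0, 0.0, 0.5]

ratio = effective_ratio(genome, "attention", 0, mri_prior)
```

With `trust` at 1.0 the search has no influence and every ratio is pinned to the prior, while at 0.0 the diagnostic is ignored. Treating `trust` as one more gene lets the search itself settle on a value in between, which is one possible reading of "learnable".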
Why It Matters
> Diagnose latent capabilities already encoded in open checkpoints,
> and recombine them โ no gradients required.
Replies and critiques welcome.
reacted to unmodeled-tyler's post 16 days ago
Just started a fun project!
https://huggingface.co/datasets/unmodeled-tyler/DoW-UFO-UAP-1
I'm getting the recently released DoW UFO/UAP documents (https://war.gov/ufo) cleaned and converted into a dataset here on Hugging Face!
There are 161 different files in the gov release (PDFs, images, videos, audio, etc.), and my current plan is to put it all in 1 dataset with 4 different shards. That way you can just load whichever tables you want or need when you import the dataset.
This is an ongoing project (I'm doing it on the side, alongside my regular projects), so it's a bit of a growing entity. I'll also continuously refine the data over time to make sure it's as clean as possible.
Check it out! Who knows what you'll find in there?
Organizations
None yet