Qwen3.5-4b-"base" finetuned for my personal RL escapades:
- 40% from my pretraining set: the Pile, textfiles.com, Stack Exchange, Bluesky user-response modelling data, AO3, The Stack, and random cybernetic control loops whose attractor "goals" are identified and stated before the controller's actions start
- 40% from FineWeb
- 20% warmup data for CLM_R, a reasoning generator and reasoning discriminator for general text completion
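
For anyone wanting to reproduce a similar blend, here is a minimal sketch of sampling sources at the 40/40/20 ratio listed above. The source labels and the `sample_sources` helper are hypothetical, and the card doesn't say whether the percentages are measured in tokens or documents, so treat this as an illustration rather than the actual data pipeline.

```python
import random
from collections import Counter

# Mixture weights from the list above. The keys are hypothetical labels,
# not real dataset identifiers.
MIXTURE = {
    "personal_pretraining_set": 0.40,
    "fineweb": 0.40,
    "clm_r_warmup": 0.20,
}


def sample_sources(n, mixture=MIXTURE, seed=0):
    """Draw n source labels, each with probability equal to its mixture weight."""
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


# Count how often each source is picked over many draws.
counts = Counter(sample_sources(10_000))
for name, count in counts.items():
    print(f"{name}: {count / 10_000:.3f}")
```

Each drawn label would then decide which dataset the next training example is pulled from; the empirical fractions should land close to the stated weights.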
Model: crumbs-playground/qwen3.5-4b-base-me
Base model: Qwen/Qwen3.5-4B-Base