Qwen3.5-4b-"base" finetuned for my personal RL escapades:

  • 40% from my pretraining set: the Pile, textfiles.com, Stack Exchange, Bluesky user-response modelling data, AO3, The Stack, and random cybernetic control loops whose attractor "goals" are identified and stated before the controller's actions start
  • 40% from FineWeb
  • 20% warmup data for CLM_R, a reasoning generator and reasoning discriminator for general text completion
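
For a concrete picture of the 40/40/20 mix above, here is a minimal sketch of weighted source sampling. The source labels (`pretraining_set`, `fineweb`, `clm_r_warmup`) and the `sample_sources` helper are hypothetical names made up for illustration, and it assumes the percentages apply per training example; they may just as well be measured in tokens. This is not the actual training pipeline.

```python
import random
from collections import Counter

# Mixture weights from the list above. The keys are illustrative labels,
# not real dataset identifiers.
MIXTURE = {
    "pretraining_set": 0.40,
    "fineweb": 0.40,
    "clm_r_warmup": 0.20,
}


def sample_sources(n, weights=MIXTURE, seed=0):
    """Draw n source labels with probability proportional to their weights."""
    rng = random.Random(seed)  # local RNG so the draw is reproducible
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


if __name__ == "__main__":
    total = 10_000
    counts = Counter(sample_sources(total))
    for name in MIXTURE:
        print(f"{name}: {counts[name] / total:.1%}")
```

Each drawn label would then pick which dataset the next training example comes from, so the realised proportions approach the stated weights as the number of draws grows.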
Safetensors · Model size: 4B params · Tensor type: BF16

Model repository: crumbs-playground/qwen3.5-4b-base-me