language: - en library_name: transformers base_model: Qwen/Qwen3-8B tags: - qwen3 - sdft - sdpo - distillation - biology license: apache-2.0

qwen3-8b-biology-1h

qwen3-8b-biology-1h is the ~1 hour wall-clock checkpoint of Qwen/Qwen3-8B trained on biology with an SDPO-style self-distillation pipeline.

Method

This model follows the SDPO method from:

Checkpoint

  • Snapshot: step_10
  • Format: sharded safetensors
  • Repo: wambosec/qwen3-8b-biology-1h

Training Setup (this run)

  • Base model: Qwen/Qwen3-8B
  • Dataset: sciknoweval/biology (train split)
  • Teacher regularization: EMA
  • Distillation: top-k (k=100) + tail bucket
  • Importance sampling: token-level, clipped
  • Completions per prompt: 8
  • Max prompt length: 2048
  • Max completion length: 8192

Repro (command used style)

uv run sdft @ configs/sdft/generalization.toml \
  --trainer.data.dataset_name=../SDPO/datasets/sciknoweval/biology \
  --trainer.ckpt.interval=10 \
  --trainer.ckpt.keep-last=1 \
  --trainer.ckpt.weights.save-format=safetensors \
  --trainer.ckpt.weights.save-sharded

## Intended Use

Research checkpoint for:

- early training-dynamics analysis,
- biology-domain probing,
- continuation finetuning.

## Limitations

- This is an intermediate checkpoint, not a final converged model.
- No full safety/alignment evaluation is claimed here.
- Metrics are not reported as a final benchmark release.

## Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

repo = "wambosec/qwen3-8b-biology-1h"
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

## Citation

If you use this checkpoint, please cite SDPO:

- https://arxiv.org/abs/2601.20802v1
Downloads last month
3
Safetensors
Model size
8B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for wambosec/qwen3-8b-biology-1h

Finetuned
Qwen/Qwen3-8B
Finetuned
(1710)
this model

Paper for wambosec/qwen3-8b-biology-1h