haotiansun014 committed on
Commit
d890a0c
·
verified ·
1 Parent(s): 5939c69

Add README

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ license: apache-2.0
+ language: en
+ tags:
+ - draft-refine
+ - block-diffusion
+ - nce
+ - qwen3-4b
+ ---
+
+ # Qwen3-4B Stage-3 NCE — beam-lastround variant
+
+ Stage-3 Noise-Contrastive Estimation (NCE) training, resumed from the final
+ Stage-2 checkpoint at step 7335. The NCE phase trains the scorer head to rank
+ K=4 candidate completions per block. Proposals are sampled with beam-bayes, and
+ scores are aggregated in **lastround** mode: only the score from the final
+ (K_inner-th) iteration is used for selection, whereas `perround` averages across iterations.
+
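The difference between the two aggregation modes described above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the function names, the list-of-lists score layout, and the assumption that a higher score is better are all hypothetical, and the example scores are made up.

```python
from statistics import mean


def aggregate_scores(per_iter_scores, mode="lastround"):
    """Collapse each candidate's per-iteration scores into one number.

    per_iter_scores: K lists, each holding K_inner scores (hypothetical layout).
    """
    if mode == "lastround":
        # Use only the score from the final iteration.
        return [scores[-1] for scores in per_iter_scores]
    if mode == "perround":
        # Average the scores across all iterations.
        return [mean(scores) for scores in per_iter_scores]
    raise ValueError(f"unknown mode: {mode}")


def pick_candidate(per_iter_scores, mode="lastround"):
    """Return the index of the best candidate (assumes higher is better)."""
    aggregated = aggregate_scores(per_iter_scores, mode)
    return max(range(len(aggregated)), key=aggregated.__getitem__)


# K=4 candidates, K_inner=3 iterations; values are illustrative only.
scores = [
    [0.9, 0.8, 0.1],
    [0.2, 0.3, 0.7],
    [0.5, 0.5, 0.5],
    [0.1, 0.6, 0.6],
]
lastround_choice = pick_candidate(scores, "lastround")
perround_choice = pick_candidate(scores, "perround")
```

With these made-up scores the two modes disagree: a candidate that starts strong but ends weak wins under averaging and loses when only the final iteration counts.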
+ ## Files
+
+ | File | Size |
+ |---|---|
+ | `model.pt` | 8.5 GB |
+ | `optimizer.pt` | 0.45 GB |
+ | `scheduler.pt` | 1.7 KB |
+ | `eval_batches.pt` | 13 MB |
+ | `rng_rank{0..23}.pt` | 14.7 KB each |
+
+ Total: ~9.54 GB across 30 files. This is the full state needed to resume training.
+
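Before resuming, it may help to confirm that a local copy contains every checkpoint file named in the table. The sketch below expands the `rng_rank{0..23}.pt` pattern into 24 names and reports anything missing. The helper names are hypothetical, and the check covers only the 28 files the table names (4 fixed files plus 24 RNG files), not the whole repository.

```python
def expected_checkpoint_files(num_ranks=24):
    """List the checkpoint files named in the Files table."""
    fixed = ["model.pt", "optimizer.pt", "scheduler.pt", "eval_batches.pt"]
    # rng_rank{0..23}.pt expands to one RNG-state file per rank.
    rng_states = [f"rng_rank{rank}.pt" for rank in range(num_ranks)]
    return fixed + rng_states


def missing_files(present, num_ranks=24):
    """Return the expected files that are absent from `present`."""
    present = set(present)
    return [name for name in expected_checkpoint_files(num_ranks)
            if name not in present]


# Example: a download that is missing one RNG-state file.
downloaded = [n for n in expected_checkpoint_files() if n != "rng_rank7.pt"]
print(missing_files(downloaded))
```

In practice `present` could come from `os.listdir` on the directory holding the download.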
+ ## Step + lineage
+
+ - Resumed from: Stage-2 checkpoint at step 7335
+ - This checkpoint: step 10433 (3098 NCE-phase steps; ~2× faster training than
+   `perround` because only the final iteration is scored)
+ - Backbone: Qwen3-4B-Base (frozen during NCE)
+ - Config: `configs/large_scale/qwen3_4b_stage3_nce_resume7335_beam_lastround_6n.yaml`
+
+ ## Eval (Stage-3 BoN with this ckpt)
+
+ Audit on GSM8K (200 questions) at α=0.0 (matches Stage-2 baseline):
+ | Setting | Accuracy |
+ |---|---:|
+ | α=0.0 + beam_bayes argmax | **84.00%** |
+
+ Matches the perround chain at α=0 to within noise: both chains' backbones
+ remain intact under the freeze. The lastround variant trains ~2× faster.
+
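To give a rough sense of what "within noise" can mean for a 200-question audit, the sketch below computes a binomial standard error for the reported accuracy. This is an assumption-laden illustration: it treats each question as an independent trial and uses a normal approximation, and the README does not state how noise was actually assessed.

```python
import math


def binomial_standard_error(accuracy, n):
    """Standard error of an accuracy estimate from n independent trials."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


def approx_95_interval(accuracy, n):
    """Normal-approximation 95% interval around the observed accuracy."""
    half_width = 1.96 * binomial_standard_error(accuracy, n)
    return accuracy - half_width, accuracy + half_width


# Reported audit: 84.00% on 200 GSM8K questions.
se = binomial_standard_error(0.84, 200)
low, high = approx_95_interval(0.84, 200)
print(f"standard error: {se:.4f}, approx. 95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the standard error is about 2.6 percentage points, so differences of a point or two between chains would not be distinguishable at this sample size.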
+ ## Related archives
+
+ - `haotiansun014/qwen3-4b-stage3-nce-7335-perround-archive`
+ - `haotiansun014/qwen3-4b-stage3-nce-7335-temp07-archive`