
Qwen3-Omni Full-Parameter Feasibility Gates

Generated: 2026-06-13T18:14:32+00:00

The full-parameter gates show that Qwen3-Omni full-parameter FSDP training can load, prepare, run backward and optimizer steps, and complete guarded pilots of up to 256 optimizer steps on an 8-GPU remote worker. They do not validate a production full-parameter fine-tune, and by design they save no full checkpoints or public weights.

Summary

  • Status: pass
  • Decision: full_parameter_feasible_for_guarded_short_runs_not_promoted
  • Passed runs: 6
  • Preempted runs: 1
  • Review/missing runs: 0
  • Completed full-parameter optimizer steps: 489
  • Longest passed run: xperience10m_qwen3_omni_128ep_fullparam_pilot256_after_qwen_v6_preemptible_8gpu_20260611 (256 steps)
  • Checkpoint saved: False
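The aggregate figures in this summary can be recomputed from the per-run statuses and step counts listed under Runs. The sketch below is only an illustration: the short run labels and the `summarize` helper are hypothetical names, not part of the gate tooling.

```python
# Statuses and step counts copied from the Runs section of this report.
# The short labels are illustrative, not the real run directory names.
runs = [
    ("smoke", "passed", 1),
    ("shorttrain8", "passed", 8),
    ("pilot32", "passed", 32),
    ("pilot64", "passed", 64),
    ("pilot128", "preempted_for_qwen_v5_handoff", 0),
    ("pilot128_after_qwen_v5", "passed", 128),
    ("pilot256_after_qwen_v6", "passed", 256),
]


def summarize(runs):
    """Recompute the summary counters from (label, status, steps) tuples."""
    passed = [r for r in runs if r[1] == "passed"]
    preempted = [r for r in runs if r[1].startswith("preempted")]
    longest = max(passed, key=lambda r: r[2])
    return {
        "passed_runs": len(passed),
        "preempted_runs": len(preempted),
        # The preempted run contributes 0 completed steps.
        "completed_steps": sum(r[2] for r in runs),
        "longest_passed_run": longest[0],
        "longest_passed_steps": longest[2],
    }


summary = summarize(runs)
print(summary)
```

The completed-step total is the sum 1 + 8 + 32 + 64 + 128 + 256 over the six passed runs.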

Runs

| run | status | steps | samples | final loss | epoch/train loss | policy | source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Full-Parameter 1-Step Feasibility Smoke | passed | 1 | 8 | 1.2726 | 1.2726 | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_smoke_preemptible_8gpu_20260609/fullparam_feasibility_summary.json |
| Full-Parameter 8-Step Short Train | passed | 8 | 64 | 1.1805 | 1.2190 | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_shorttrain8_preemptible_8gpu_20260609/fullparam_shorttrain8_summary.json |
| Full-Parameter 32-Step Pilot | passed | 32 | 256 | 0.2206 | 0.8451 | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_pilot32_preemptible_8gpu_20260609/fullparam_pilot32_summary.json |
| Full-Parameter 64-Step Pilot | passed | 64 | 512 | 0.0112 | 0.4434 | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_pilot64_preemptible_8gpu_20260609/fullparam_pilot64_summary.json |
| Full-Parameter 128-Step Opportunistic Pilot | preempted_for_qwen_v5_handoff | 0 | 1024 | | | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_pilot128_preemptible_8gpu_20260609/fullparam_pilot128_summary.json |
| Full-Parameter 128-Step Post-Qwen-v5 Pilot | passed | 128 | 1024 | 0.0137 | 0.2158 | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_pilot128_after_qwen_v5_preemptible_8gpu_20260609/training_metadata.json |
| Full-Parameter 256-Step Post-Qwen-v6 Pilot | passed | 256 | 2048 | 0.0096 | 0.1158 | no weights/checkpoints | results/omni_finetune/xperience10m_qwen3_omni_128ep_fullparam_pilot256_after_qwen_v6_preemptible_8gpu_20260611/training_metadata.json |
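A quick consistency check on the rows above is that samples divided by steps is the same for every passed run. The sketch below computes that ratio. Reading the result as a global batch of 8 samples per optimizer step (for example one sample per GPU on the 8-GPU worker) is an assumption, since the report does not state the batch configuration. The `samples_per_step` helper is a hypothetical name.

```python
# steps -> samples for the six passed runs, copied from the Runs rows above.
passed_runs = {1: 8, 8: 64, 32: 256, 64: 512, 128: 1024, 256: 2048}


def samples_per_step(steps_to_samples):
    """Return the set of distinct samples-per-step ratios across runs."""
    ratios = set()
    for steps, samples in steps_to_samples.items():
        if samples % steps != 0:
            raise ValueError(f"{samples} samples is not a multiple of {steps} steps")
        ratios.add(samples // steps)
    return ratios


ratios = samples_per_step(passed_runs)
print(ratios)
```

Under the same assumption, the 1024 samples listed for the preempted 128-step pilot would match its planned budget (128 steps at 8 samples each) rather than samples actually consumed, since that run completed 0 steps.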

Publication Policy

  • Public summary allowed: true
  • Publish full-parameter weights: false
  • Publish full checkpoints: false
  • Reason: All completed full-parameter gate runs used save_mode=none; the preempted pilot saved nothing. These are feasibility evidence only.

Next Steps

  • Keep the verified Qwen3-Omni LoRA adapter as the published production result for the 128-episode suite.
  • For a production full-parameter run, add a sharded checkpoint/resume plan before any long training launch.
  • Run a separate checkpointed full-parameter pilot only when GPUs are not needed by verified LoRA evaluation/publication work.