full_xattn_Qwen3-8B / all_results.json
Upload model checkpoint: full_xattn_Qwen3-8B
7ab354d verified
221 Bytes
{
    "epoch": 0.315955766192733,
    "num_input_tokens_seen": 738472692,
    "train_loss": 12.393445908228557,
    "train_runtime": 50692.0245,
    "train_samples_per_second": 0.284,
    "train_steps_per_second": 0.006
}
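
The raw fields above can be turned into a few easier-to-read figures. The following is a minimal sketch, not part of the checkpoint: it assumes `train_runtime` is in seconds (the usual Hugging Face `Trainer` convention), that the two rate fields are rounded values, and that throughput would stay constant for the rest of the epoch. The helper name `derive_metrics` is hypothetical.

```python
import json

# Values copied verbatim from all_results.json.
RESULTS_JSON = """
{
    "epoch": 0.315955766192733,
    "num_input_tokens_seen": 738472692,
    "train_loss": 12.393445908228557,
    "train_runtime": 50692.0245,
    "train_samples_per_second": 0.284,
    "train_steps_per_second": 0.006
}
"""


def derive_metrics(results: dict) -> dict:
    """Hypothetical helper: derive summary figures from the logged fields.

    Assumes train_runtime is measured in seconds.
    """
    runtime_s = results["train_runtime"]
    return {
        # Wall-clock training time in hours.
        "runtime_hours": runtime_s / 3600,
        # Average input-token throughput over the whole run.
        "tokens_per_second": results["num_input_tokens_seen"] / runtime_s,
        # Rough counts recovered from the rounded rate fields.
        "approx_samples": results["train_samples_per_second"] * runtime_s,
        "approx_steps": results["train_steps_per_second"] * runtime_s,
        # Linear extrapolation to one full epoch (assumes constant speed).
        "projected_epoch_hours": runtime_s / results["epoch"] / 3600,
    }


results = json.loads(RESULTS_JSON)
derived = derive_metrics(results)
for name, value in derived.items():
    print(f"{name}: {value:,.2f}")
```

The step estimate is coarse because `train_steps_per_second` carries only one significant digit; the true step count could plausibly lie anywhere from roughly 280 to 330.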