timestamp step epoch train_loss grad_norm learning_rate
2025-09-21T21:41:21.703893 10 0.32 903.355000 614.498596 1.00e-07
2025-09-21T21:41:27.856246 20 0.65 902.448600 617.974670 5.00e-07
2025-09-21T21:41:33.781063 30 0.97 886.335200 578.719543 1.00e-06
2025-09-21T21:41:39.650969 40 1.29 941.081400 763.962463 1.50e-06
2025-09-21T21:41:45.448703 50 1.61 878.512000 745.441284 2.00e-06
2025-09-21T21:41:46.303569 50 1.61 NA NA NA
2025-09-21T21:41:53.272236 60 1.94 870.239400 805.721558 2.50e-06
2025-09-21T21:41:58.923912 70 2.26 883.066600 810.874512 3.00e-06
2025-09-21T21:42:04.647459 80 2.58 858.037100 876.961365 3.50e-06
2025-09-21T21:42:10.439181 90 2.90 879.551700 879.198303 4.00e-06
2025-09-21T21:42:16.088489 100 3.23 838.324500 1011.899841 4.50e-06
2025-09-21T21:42:16.862886 100 3.23 NA NA NA
2025-09-21T21:42:24.087341 110 3.55 789.837500 1295.429565 5.00e-06
2025-09-21T21:42:29.817688 120 3.87 745.176600 1823.199463 5.50e-06
2025-09-21T21:42:35.473868 130 4.19 701.951900 1844.962524 6.00e-06
2025-09-21T21:42:41.363224 140 4.52 661.328600 1836.961670 6.50e-06
2025-09-21T21:42:47.037840 150 4.84 558.101000 1857.315308 7.00e-06
2025-09-21T21:42:47.769415 150 4.84 NA NA NA
2025-09-21T21:42:54.652980 160 5.16 499.938000 2131.156982 7.50e-06
2025-09-21T21:43:00.440291 170 5.48 450.506500 1810.863647 8.00e-06
2025-09-21T21:43:06.233803 180 5.81 390.488400 1685.968994 8.50e-06
2025-09-21T21:43:12.164436 190 6.13 324.997900 1826.587402 9.00e-06
2025-09-21T21:43:17.924886 200 6.45 280.306800 1618.513672 9.50e-06
2025-09-21T21:43:18.676823 200 6.45 NA NA NA
2025-09-21T21:43:20.259029 200 6.45 NA NA NA