10.27.194.24 10.27.174.115
url2IP complete !
Warning: Permanently added '10.27.194.24' (ED25519) to the list of known hosts.
Warning: Permanently added '10.27.174.115' (ED25519) to the list of known hosts.
NODE_INFO: , --machine_rank 1 --main_process_ip 10.27.194.24
NODE_INFO: , --machine_rank 0 --main_process_ip 10.27.194.24
The following values were not passed to `accelerate launch` and had defaults used instead:
    `--mixed_precision` was set to a value of `'no'`
    `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
The following values were not passed to `accelerate launch` and had defaults used instead:
    `--mixed_precision` was set to a value of `'no'`
    `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Rank[15/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[15/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=7), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 15, 'local_process_index': 7, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[8/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[8/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=0), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 8, 'local_process_index': 0, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[10/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[10/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=2), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 10, 'local_process_index': 2, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[11/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[11/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=3), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 11, 'local_process_index': 3, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[9/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[9/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=1), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 9, 'local_process_index': 1, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[13/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[13/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=5), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 13, 'local_process_index': 5, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[4/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[4/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=4), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 4, 'local_process_index': 4, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[12/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[12/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=4), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 12, 'local_process_index': 4, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[14/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[14/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=6), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 14, 'local_process_index': 6, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[7/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[7/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=7), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 7, 'local_process_index': 7, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[1/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[1/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=1), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 1, 'local_process_index': 1, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[6/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[6/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=6), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 6, 'local_process_index': 6, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[3/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[3/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=3), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 3, 'local_process_index': 3, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[5/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[5/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=5), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 5, 'local_process_index': 5, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[2/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[2/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=2), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 2, 'local_process_index': 2, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[0/16] 06/24/2025 12:52:31 INFO train.py:149 | if accelerator initialized:True
Rank[0/16] 06/24/2025 12:52:31 INFO train.py:150 | accelerator state: {'_cpu': False, 'backend': 'nccl', 'device': device(type='cuda', index=0), 'debug': False, 'distributed_type': , 'num_processes': 16, 'process_index': 0, 'local_process_index': 0, 'fork_launched': False, 'deepspeed_plugins': None, 'use_ipex': None, 'torch_tp_plugin': None, 'dynamo_plugin': TorchDynamoPlugin(backend=, mode='default', fullgraph=False, dynamic=False, options=None, disable=False), '_mixed_precision': 'no'}
Rank[0/16] 06/24/2025 12:52:31 INFO train.py:68 | {
    "hist_steps": 1,
    "pred_steps": 64,
    "chunk_size": 8,
    "embed_dims": 256,
    "with_depth": true,
    "with_depth_loss": true,
    "min_depth": 0.01,
    "max_depth": 1.2,
    "num_depth": 128,
    "batch_size": 8,
    "max_step": 100000,
    "step_log_freq": 25,
    "save_step_freq": 4000,
    "num_workers": 8,
    "lr": 0.0001,
    "checkpoint": "./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth",
    "bert_checkpoint": "./ckpt/bert-base-uncased",
    "data_path": "./data/lmdb",
    "urdf": "./urdf/arx5/arx5_description_isaac.urdf",
    "multi_task": false,
    "scale_shift_version": "single_task_blocks_stack_three",
    "task_names": [
        "place_empty_cup"
    ]
}
Rank[7/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[7/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[6/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[2/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[6/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[2/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[3/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[3/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[4/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[4/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[1/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[1/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[5/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[5/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[0/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[0/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[5/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[4/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[6/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[1/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[0/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[3/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[7/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[2/16] 06/24/2025 12:52:32 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[4/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[5/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[0/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[3/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[7/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[1/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[6/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[2/16] 06/24/2025 12:52:32 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[10/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[10/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[14/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[9/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[14/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[9/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[15/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[15/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[8/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[11/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[8/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[11/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[12/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[13/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[12/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[13/16] 06/24/2025 12:52:32 WARNING robotwin_lmdb_dataset.py:126 | dataset T_base2world is not set, use default.
Rank[0/16] 06/24/2025 12:52:33 INFO utils.py:75 | num of missing_keys: 536,num of unexpected_keys: 645
Rank[0/16] 06/24/2025 12:52:33 INFO utils.py:79 | missing_keys: ['decoder.robot_encoder.input_fc.0.weight', 'decoder.robot_encoder.input_fc.0.bias', 'decoder.robot_encoder.input_fc.2.weight', 'decoder.robot_encoder.input_fc.2.bias', 'decoder.robot_encoder.input_fc.4.weight', 'decoder.robot_encoder.input_fc.4.bias', 'decoder.robot_encoder.input_fc.5.weight', 'decoder.robot_encoder.input_fc.5.bias', 'decoder.robot_encoder.input_fc.7.weight', 'decoder.robot_encoder.input_fc.7.bias', 'decoder.robot_encoder.input_fc.9.weight', 'decoder.robot_encoder.input_fc.9.bias', 'decoder.robot_encoder.input_fc.10.weight', 'decoder.robot_encoder.input_fc.10.bias', 'decoder.robot_encoder.layers.0.weight', 'decoder.robot_encoder.layers.1.q_proj.weight', 'decoder.robot_encoder.layers.1.q_proj.bias', 'decoder.robot_encoder.layers.1.k_proj.weight', 'decoder.robot_encoder.layers.1.v_proj.weight', 'decoder.robot_encoder.layers.1.v_proj.bias', 'decoder.robot_encoder.layers.1.proj.weight', 'decoder.robot_encoder.layers.1.proj.bias', 'decoder.robot_encoder.layers.1.position_encoder.freqs', 'decoder.robot_encoder.layers.1.position_encoder.mlp.0.weight', 'decoder.robot_encoder.layers.1.position_encoder.mlp.0.bias', 'decoder.robot_encoder.layers.1.position_encoder.mlp.2.weight', 'decoder.robot_encoder.layers.1.position_encoder.mlp.2.bias', 'decoder.robot_encoder.layers.4.weight', 'decoder.robot_encoder.layers.5.layers.0.0.weight', 
'decoder.robot_encoder.layers.5.layers.0.0.bias', 'decoder.robot_encoder.layers.5.layers.1.weight', 'decoder.robot_encoder.layers.5.layers.1.bias', 'decoder.robot_encoder.layers.6.weight', 'decoder.robot_encoder.layers.7.q_proj.weight', 'decoder.robot_encoder.layers.7.q_proj.bias', 'decoder.robot_encoder.layers.7.k_proj.weight', 'decoder.robot_encoder.layers.7.v_proj.weight', 'decoder.robot_encoder.layers.7.v_proj.bias', 'decoder.robot_encoder.layers.7.proj.weight', 'decoder.robot_encoder.layers.7.proj.bias', 'decoder.robot_encoder.layers.7.position_encoder.freqs', 'decoder.robot_encoder.layers.7.position_encoder.mlp.0.weight', 'decoder.robot_encoder.layers.7.position_encoder.mlp.0.bias', 'decoder.robot_encoder.layers.7.position_encoder.mlp.2.weight', 'decoder.robot_encoder.layers.7.position_encoder.mlp.2.bias', 'decoder.robot_encoder.layers.10.weight', 'decoder.robot_encoder.layers.11.layers.0.0.weight', 'decoder.robot_encoder.layers.11.layers.0.0.bias', 'decoder.robot_encoder.layers.11.layers.1.weight', 'decoder.robot_encoder.layers.11.layers.1.bias', 'decoder.robot_encoder.layers.12.weight', 'decoder.robot_encoder.layers.13.q_proj.weight', 'decoder.robot_encoder.layers.13.q_proj.bias', 'decoder.robot_encoder.layers.13.k_proj.weight', 'decoder.robot_encoder.layers.13.v_proj.weight', 'decoder.robot_encoder.layers.13.v_proj.bias', 'decoder.robot_encoder.layers.13.proj.weight', 'decoder.robot_encoder.layers.13.proj.bias', 'decoder.robot_encoder.layers.13.position_encoder.freqs', 'decoder.robot_encoder.layers.13.position_encoder.mlp.0.weight', 'decoder.robot_encoder.layers.13.position_encoder.mlp.0.bias', 'decoder.robot_encoder.layers.13.position_encoder.mlp.2.weight', 'decoder.robot_encoder.layers.13.position_encoder.mlp.2.bias', 'decoder.robot_encoder.layers.16.weight', 'decoder.robot_encoder.layers.17.layers.0.0.weight', 'decoder.robot_encoder.layers.17.layers.0.0.bias', 'decoder.robot_encoder.layers.17.layers.1.weight', 
'decoder.robot_encoder.layers.17.layers.1.bias', 'decoder.robot_encoder.layers.18.weight', 'decoder.robot_encoder.layers.19.q_proj.weight', 'decoder.robot_encoder.layers.19.q_proj.bias', 'decoder.robot_encoder.layers.19.k_proj.weight', 'decoder.robot_encoder.layers.19.v_proj.weight', 'decoder.robot_encoder.layers.19.v_proj.bias', 'decoder.robot_encoder.layers.19.proj.weight', 'decoder.robot_encoder.layers.19.proj.bias', 'decoder.robot_encoder.layers.19.position_encoder.freqs', 'decoder.robot_encoder.layers.19.position_encoder.mlp.0.weight', 'decoder.robot_encoder.layers.19.position_encoder.mlp.0.bias', 'decoder.robot_encoder.layers.19.position_encoder.mlp.2.weight', 'decoder.robot_encoder.layers.19.position_encoder.mlp.2.bias', 'decoder.robot_encoder.layers.22.weight', 'decoder.robot_encoder.layers.23.layers.0.0.weight', 'decoder.robot_encoder.layers.23.layers.0.0.bias', 'decoder.robot_encoder.layers.23.layers.1.weight', 'decoder.robot_encoder.layers.23.layers.1.bias', 'decoder.robot_encoder.layers.24.weight', 'decoder.layers.0.adaLN_modulation.1.weight', 'decoder.layers.0.adaLN_modulation.1.bias', 'decoder.layers.1.q_proj.weight', 'decoder.layers.1.q_proj.bias', 'decoder.layers.1.k_proj.weight', 'decoder.layers.1.v_proj.weight', 'decoder.layers.1.v_proj.bias', 'decoder.layers.1.proj.weight', 'decoder.layers.1.proj.bias', 'decoder.layers.1.position_encoder.freqs', 'decoder.layers.1.position_encoder.mlp.0.weight', 'decoder.layers.1.position_encoder.mlp.0.bias', 'decoder.layers.1.position_encoder.mlp.2.weight', 'decoder.layers.1.position_encoder.mlp.2.bias', 'decoder.layers.4.q_proj.weight', 'decoder.layers.4.q_proj.bias', 'decoder.layers.4.k_proj.weight', 'decoder.layers.4.v_proj.weight', 'decoder.layers.4.v_proj.bias', 'decoder.layers.4.proj.weight', 'decoder.layers.4.proj.bias', 'decoder.layers.6.q_proj.weight', 'decoder.layers.6.q_proj.bias', 'decoder.layers.6.k_proj.weight', 'decoder.layers.6.v_proj.weight', 'decoder.layers.6.v_proj.bias', 
'decoder.layers.6.proj.weight', 'decoder.layers.6.proj.bias', 'decoder.layers.11.layers.0.0.weight', 'decoder.layers.11.layers.0.0.bias', 'decoder.layers.11.layers.1.weight', 'decoder.layers.11.layers.1.bias', 'decoder.layers.13.adaLN_modulation.1.weight', 'decoder.layers.13.adaLN_modulation.1.bias', 'decoder.layers.14.q_proj.weight', 'decoder.layers.14.q_proj.bias', 'decoder.layers.14.k_proj.weight', 'decoder.layers.14.v_proj.weight', 'decoder.layers.14.v_proj.bias', 'decoder.layers.14.proj.weight', 'decoder.layers.14.proj.bias', 'decoder.layers.14.position_encoder.freqs', 'decoder.layers.14.position_encoder.mlp.0.weight', 'decoder.layers.14.position_encoder.mlp.0.bias', 'decoder.layers.14.position_encoder.mlp.2.weight', 'decoder.layers.14.position_encoder.mlp.2.bias', 'decoder.layers.17.q_proj.weight', 'decoder.layers.17.q_proj.bias', 'decoder.layers.17.k_proj.weight', 'decoder.layers.17.v_proj.weight', 'decoder.layers.17.v_proj.bias', 'decoder.layers.17.proj.weight', 'decoder.layers.17.proj.bias', 'decoder.layers.18.weight', 'decoder.layers.19.q_proj.weight', 'decoder.layers.19.q_proj.bias', 'decoder.layers.19.k_proj.weight', 'decoder.layers.19.v_proj.weight', 'decoder.layers.19.v_proj.bias', 'decoder.layers.19.proj.weight', 'decoder.layers.19.proj.bias', 'decoder.layers.20.weight', 'decoder.layers.26.adaLN_modulation.1.weight', 'decoder.layers.26.adaLN_modulation.1.bias', 'decoder.layers.27.q_proj.weight', 'decoder.layers.27.q_proj.bias', 'decoder.layers.27.k_proj.weight', 'decoder.layers.27.v_proj.weight', 'decoder.layers.27.v_proj.bias', 'decoder.layers.27.proj.weight', 'decoder.layers.27.proj.bias', 'decoder.layers.27.position_encoder.freqs', 'decoder.layers.27.position_encoder.mlp.0.weight', 'decoder.layers.27.position_encoder.mlp.0.bias', 'decoder.layers.27.position_encoder.mlp.2.weight', 'decoder.layers.27.position_encoder.mlp.2.bias', 'decoder.layers.29.weight', 'decoder.layers.30.q_proj.weight', 'decoder.layers.30.q_proj.bias', 
'decoder.layers.30.k_proj.weight', 'decoder.layers.30.v_proj.weight', 'decoder.layers.30.v_proj.bias', 'decoder.layers.30.proj.weight', 'decoder.layers.30.proj.bias', 'decoder.layers.31.weight', 'decoder.layers.32.q_proj.weight', 'decoder.layers.32.q_proj.bias', 'decoder.layers.32.k_proj.weight', 'decoder.layers.32.v_proj.weight', 'decoder.layers.32.v_proj.bias', 'decoder.layers.32.proj.weight', 'decoder.layers.32.proj.bias', 'decoder.layers.33.weight', 'decoder.layers.37.layers.0.0.weight', 'decoder.layers.37.layers.0.0.bias', 'decoder.layers.37.layers.1.weight', 'decoder.layers.37.layers.1.bias', 'decoder.layers.39.adaLN_modulation.1.weight', 'decoder.layers.39.adaLN_modulation.1.bias', 'decoder.layers.40.q_proj.weight', 'decoder.layers.40.q_proj.bias', 'decoder.layers.40.k_proj.weight', 'decoder.layers.40.v_proj.weight', 'decoder.layers.40.v_proj.bias', 'decoder.layers.40.proj.weight', 'decoder.layers.40.proj.bias', 'decoder.layers.40.position_encoder.freqs', 'decoder.layers.40.position_encoder.mlp.0.weight', 'decoder.layers.40.position_encoder.mlp.0.bias', 'decoder.layers.40.position_encoder.mlp.2.weight', 'decoder.layers.40.position_encoder.mlp.2.bias', 'decoder.layers.42.weight', 'decoder.layers.43.q_proj.weight', 'decoder.layers.43.q_proj.bias', 'decoder.layers.43.k_proj.weight', 'decoder.layers.43.v_proj.weight', 'decoder.layers.43.v_proj.bias', 'decoder.layers.43.proj.weight', 'decoder.layers.43.proj.bias', 'decoder.layers.44.weight', 'decoder.layers.45.q_proj.weight', 'decoder.layers.45.q_proj.bias', 'decoder.layers.45.k_proj.weight', 'decoder.layers.45.v_proj.weight', 'decoder.layers.45.v_proj.bias', 'decoder.layers.45.proj.weight', 'decoder.layers.45.proj.bias', 'decoder.layers.50.layers.0.0.weight', 'decoder.layers.50.layers.0.0.bias', 'decoder.layers.50.layers.1.weight', 'decoder.layers.50.layers.1.bias', 'decoder.layers.52.adaLN_modulation.1.weight', 'decoder.layers.52.adaLN_modulation.1.bias', 'decoder.layers.53.q_proj.weight', 
'decoder.layers.53.q_proj.bias', 'decoder.layers.53.k_proj.weight', 'decoder.layers.53.v_proj.weight', 'decoder.layers.53.v_proj.bias', 'decoder.layers.53.proj.weight', 'decoder.layers.53.proj.bias', 'decoder.layers.53.position_encoder.freqs', 'decoder.layers.53.position_encoder.mlp.0.weight', 'decoder.layers.53.position_encoder.mlp.0.bias', 'decoder.layers.53.position_encoder.mlp.2.weight', 'decoder.layers.53.position_encoder.mlp.2.bias', 'decoder.layers.55.weight', 'decoder.layers.56.q_proj.weight', 'decoder.layers.56.q_proj.bias', 'decoder.layers.56.k_proj.weight', 'decoder.layers.56.v_proj.weight', 'decoder.layers.56.v_proj.bias', 'decoder.layers.56.proj.weight', 'decoder.layers.56.proj.bias', 'decoder.layers.57.weight', 'decoder.layers.58.q_proj.weight', 'decoder.layers.58.q_proj.bias', 'decoder.layers.58.k_proj.weight', 'decoder.layers.58.v_proj.weight', 'decoder.layers.58.v_proj.bias', 'decoder.layers.58.proj.weight', 'decoder.layers.58.proj.bias', 'decoder.layers.59.weight', 'decoder.layers.63.layers.0.0.weight', 'decoder.layers.63.layers.0.0.bias', 'decoder.layers.63.layers.1.weight', 'decoder.layers.63.layers.1.bias', 'decoder.layers.65.adaLN_modulation.1.weight', 'decoder.layers.65.adaLN_modulation.1.bias', 'decoder.layers.66.q_proj.weight', 'decoder.layers.66.q_proj.bias', 'decoder.layers.66.k_proj.weight', 'decoder.layers.66.v_proj.weight', 'decoder.layers.66.v_proj.bias', 'decoder.layers.66.proj.weight', 'decoder.layers.66.proj.bias', 'decoder.layers.66.position_encoder.freqs', 'decoder.layers.66.position_encoder.mlp.0.weight', 'decoder.layers.66.position_encoder.mlp.0.bias', 'decoder.layers.66.position_encoder.mlp.2.weight', 'decoder.layers.66.position_encoder.mlp.2.bias', 'decoder.layers.68.weight', 'decoder.layers.69.q_proj.weight', 'decoder.layers.69.q_proj.bias', 'decoder.layers.69.k_proj.weight', 'decoder.layers.69.v_proj.weight', 'decoder.layers.69.v_proj.bias', 'decoder.layers.69.proj.weight', 'decoder.layers.69.proj.bias', 
'decoder.layers.70.weight', 'decoder.layers.71.q_proj.weight', 'decoder.layers.71.q_proj.bias', 'decoder.layers.71.k_proj.weight', 'decoder.layers.71.v_proj.weight', 'decoder.layers.71.v_proj.bias', 'decoder.layers.71.proj.weight', 'decoder.layers.71.proj.bias', 'decoder.layers.72.weight', 'decoder.layers.76.layers.0.0.weight', 'decoder.layers.76.layers.0.0.bias', 'decoder.layers.76.layers.1.weight', 'decoder.layers.76.layers.1.bias', 'decoder.input_layers.0.weight', 'decoder.input_layers.0.bias', 'decoder.input_layers.1.weight', 'decoder.input_layers.1.bias', 'decoder.input_layers.3.weight', 'decoder.input_layers.3.bias', 'decoder.input_layers.5.weight', 'decoder.input_layers.5.bias', 'decoder.input_layers.6.weight', 'decoder.input_layers.6.bias', 'decoder.input_layers.8.weight', 'decoder.input_layers.8.bias', 'decoder.input_layers.10.weight', 'decoder.input_layers.10.bias', 'decoder.head.upsamples.0.1.weight', 'decoder.head.upsamples.0.1.bias', 'decoder.head.upsamples.1.1.weight', 'decoder.head.upsamples.1.1.bias', 'decoder.head.upsamples.2.1.weight', 'decoder.head.upsamples.2.1.bias', 'decoder.head.act_and_norm.0.1.weight', 'decoder.head.act_and_norm.1.1.weight', 'decoder.head.act_and_norm.2.1.weight', 'decoder.t_embed.freqs', 'decoder.t_embed.mlp.0.weight', 'decoder.t_embed.mlp.0.bias', 'decoder.t_embed.mlp.2.weight', 'decoder.t_embed.mlp.2.bias', 'neck.convs.0.0.weight', 'neck.convs.0.0.bias', 'neck.convs.0.1.weight', 'neck.convs.0.1.bias', 'neck.convs.1.0.weight', 'neck.convs.1.0.bias', 'neck.convs.1.1.weight', 'neck.convs.1.1.bias', 'neck.convs.2.0.weight', 'neck.convs.2.0.bias', 'neck.convs.2.1.weight', 'neck.convs.2.1.bias', 'spatial_enhancer.pts_prob_pre_fc.weight', 'spatial_enhancer.pts_prob_pre_fc.bias', 'spatial_enhancer.pts_prob_fc.layers.0.weight', 'spatial_enhancer.pts_prob_fc.layers.0.bias', 'spatial_enhancer.pts_prob_fc.layers.1.weight', 'spatial_enhancer.pts_prob_fc.layers.1.bias', 'spatial_enhancer.pts_fc.weight', 'spatial_enhancer.pts_fc.bias', 
'spatial_enhancer.fusion_fc.0.layers.0.0.weight', 'spatial_enhancer.fusion_fc.0.layers.0.0.bias', 'spatial_enhancer.fusion_fc.0.layers.1.weight', 'spatial_enhancer.fusion_fc.0.layers.1.bias', 'spatial_enhancer.fusion_fc.1.weight', 'spatial_enhancer.fusion_fc.1.bias', 'spatial_enhancer.fusion_norm.weight', 'spatial_enhancer.fusion_norm.bias', 'backbone_3d.conv1.weight', 'backbone_3d.bn1.weight', 'backbone_3d.bn1.bias', 'backbone_3d.bn1.running_mean', 'backbone_3d.bn1.running_var', 'backbone_3d.layer1.0.conv1.weight', 'backbone_3d.layer1.0.bn1.weight', 'backbone_3d.layer1.0.bn1.bias', 'backbone_3d.layer1.0.bn1.running_mean', 'backbone_3d.layer1.0.bn1.running_var', 'backbone_3d.layer1.0.conv2.weight', 'backbone_3d.layer1.0.bn2.weight', 'backbone_3d.layer1.0.bn2.bias', 'backbone_3d.layer1.0.bn2.running_mean', 'backbone_3d.layer1.0.bn2.running_var', 'backbone_3d.layer1.1.conv1.weight', 'backbone_3d.layer1.1.bn1.weight', 'backbone_3d.layer1.1.bn1.bias', 'backbone_3d.layer1.1.bn1.running_mean', 'backbone_3d.layer1.1.bn1.running_var', 'backbone_3d.layer1.1.conv2.weight', 'backbone_3d.layer1.1.bn2.weight', 'backbone_3d.layer1.1.bn2.bias', 'backbone_3d.layer1.1.bn2.running_mean', 'backbone_3d.layer1.1.bn2.running_var', 'backbone_3d.layer1.2.conv1.weight', 'backbone_3d.layer1.2.bn1.weight', 'backbone_3d.layer1.2.bn1.bias', 'backbone_3d.layer1.2.bn1.running_mean', 'backbone_3d.layer1.2.bn1.running_var', 'backbone_3d.layer1.2.conv2.weight', 'backbone_3d.layer1.2.bn2.weight', 'backbone_3d.layer1.2.bn2.bias', 'backbone_3d.layer1.2.bn2.running_mean', 'backbone_3d.layer1.2.bn2.running_var', 'backbone_3d.layer2.0.conv1.weight', 'backbone_3d.layer2.0.bn1.weight', 'backbone_3d.layer2.0.bn1.bias', 'backbone_3d.layer2.0.bn1.running_mean', 'backbone_3d.layer2.0.bn1.running_var', 'backbone_3d.layer2.0.conv2.weight', 'backbone_3d.layer2.0.bn2.weight', 'backbone_3d.layer2.0.bn2.bias', 'backbone_3d.layer2.0.bn2.running_mean', 'backbone_3d.layer2.0.bn2.running_var', 
'backbone_3d.layer2.0.downsample.0.weight', 'backbone_3d.layer2.0.downsample.1.weight', 'backbone_3d.layer2.0.downsample.1.bias', 'backbone_3d.layer2.0.downsample.1.running_mean', 'backbone_3d.layer2.0.downsample.1.running_var', 'backbone_3d.layer2.1.conv1.weight', 'backbone_3d.layer2.1.bn1.weight', 'backbone_3d.layer2.1.bn1.bias', 'backbone_3d.layer2.1.bn1.running_mean', 'backbone_3d.layer2.1.bn1.running_var', 'backbone_3d.layer2.1.conv2.weight', 'backbone_3d.layer2.1.bn2.weight', 'backbone_3d.layer2.1.bn2.bias', 'backbone_3d.layer2.1.bn2.running_mean', 'backbone_3d.layer2.1.bn2.running_var', 'backbone_3d.layer2.2.conv1.weight', 'backbone_3d.layer2.2.bn1.weight', 'backbone_3d.layer2.2.bn1.bias', 'backbone_3d.layer2.2.bn1.running_mean', 'backbone_3d.layer2.2.bn1.running_var', 'backbone_3d.layer2.2.conv2.weight', 'backbone_3d.layer2.2.bn2.weight', 'backbone_3d.layer2.2.bn2.bias', 'backbone_3d.layer2.2.bn2.running_mean', 'backbone_3d.layer2.2.bn2.running_var', 'backbone_3d.layer2.3.conv1.weight', 'backbone_3d.layer2.3.bn1.weight', 'backbone_3d.layer2.3.bn1.bias', 'backbone_3d.layer2.3.bn1.running_mean', 'backbone_3d.layer2.3.bn1.running_var', 'backbone_3d.layer2.3.conv2.weight', 'backbone_3d.layer2.3.bn2.weight', 'backbone_3d.layer2.3.bn2.bias', 'backbone_3d.layer2.3.bn2.running_mean', 'backbone_3d.layer2.3.bn2.running_var', 'backbone_3d.layer3.0.conv1.weight', 'backbone_3d.layer3.0.bn1.weight', 'backbone_3d.layer3.0.bn1.bias', 'backbone_3d.layer3.0.bn1.running_mean', 'backbone_3d.layer3.0.bn1.running_var', 'backbone_3d.layer3.0.conv2.weight', 'backbone_3d.layer3.0.bn2.weight', 'backbone_3d.layer3.0.bn2.bias', 'backbone_3d.layer3.0.bn2.running_mean', 'backbone_3d.layer3.0.bn2.running_var', 'backbone_3d.layer3.0.downsample.0.weight', 'backbone_3d.layer3.0.downsample.1.weight', 'backbone_3d.layer3.0.downsample.1.bias', 'backbone_3d.layer3.0.downsample.1.running_mean', 'backbone_3d.layer3.0.downsample.1.running_var', 'backbone_3d.layer3.1.conv1.weight', 
'backbone_3d.layer3.1.bn1.weight', 'backbone_3d.layer3.1.bn1.bias', 'backbone_3d.layer3.1.bn1.running_mean', 'backbone_3d.layer3.1.bn1.running_var', 'backbone_3d.layer3.1.conv2.weight', 'backbone_3d.layer3.1.bn2.weight', 'backbone_3d.layer3.1.bn2.bias', 'backbone_3d.layer3.1.bn2.running_mean', 'backbone_3d.layer3.1.bn2.running_var', 'backbone_3d.layer3.2.conv1.weight', 'backbone_3d.layer3.2.bn1.weight', 'backbone_3d.layer3.2.bn1.bias', 'backbone_3d.layer3.2.bn1.running_mean', 'backbone_3d.layer3.2.bn1.running_var', 'backbone_3d.layer3.2.conv2.weight', 'backbone_3d.layer3.2.bn2.weight', 'backbone_3d.layer3.2.bn2.bias', 'backbone_3d.layer3.2.bn2.running_mean', 'backbone_3d.layer3.2.bn2.running_var', 'backbone_3d.layer3.3.conv1.weight', 'backbone_3d.layer3.3.bn1.weight', 'backbone_3d.layer3.3.bn1.bias', 'backbone_3d.layer3.3.bn1.running_mean', 'backbone_3d.layer3.3.bn1.running_var', 'backbone_3d.layer3.3.conv2.weight', 'backbone_3d.layer3.3.bn2.weight', 'backbone_3d.layer3.3.bn2.bias', 'backbone_3d.layer3.3.bn2.running_mean', 'backbone_3d.layer3.3.bn2.running_var', 'backbone_3d.layer3.4.conv1.weight', 'backbone_3d.layer3.4.bn1.weight', 'backbone_3d.layer3.4.bn1.bias', 'backbone_3d.layer3.4.bn1.running_mean', 'backbone_3d.layer3.4.bn1.running_var', 'backbone_3d.layer3.4.conv2.weight', 'backbone_3d.layer3.4.bn2.weight', 'backbone_3d.layer3.4.bn2.bias', 'backbone_3d.layer3.4.bn2.running_mean', 'backbone_3d.layer3.4.bn2.running_var', 'backbone_3d.layer3.5.conv1.weight', 'backbone_3d.layer3.5.bn1.weight', 'backbone_3d.layer3.5.bn1.bias', 'backbone_3d.layer3.5.bn1.running_mean', 'backbone_3d.layer3.5.bn1.running_var', 'backbone_3d.layer3.5.conv2.weight', 'backbone_3d.layer3.5.bn2.weight', 'backbone_3d.layer3.5.bn2.bias', 'backbone_3d.layer3.5.bn2.running_mean', 'backbone_3d.layer3.5.bn2.running_var', 'backbone_3d.layer4.0.conv1.weight', 'backbone_3d.layer4.0.bn1.weight', 'backbone_3d.layer4.0.bn1.bias', 'backbone_3d.layer4.0.bn1.running_mean', 
'backbone_3d.layer4.0.bn1.running_var', 'backbone_3d.layer4.0.conv2.weight', 'backbone_3d.layer4.0.bn2.weight', 'backbone_3d.layer4.0.bn2.bias', 'backbone_3d.layer4.0.bn2.running_mean', 'backbone_3d.layer4.0.bn2.running_var', 'backbone_3d.layer4.0.downsample.0.weight', 'backbone_3d.layer4.0.downsample.1.weight', 'backbone_3d.layer4.0.downsample.1.bias', 'backbone_3d.layer4.0.downsample.1.running_mean', 'backbone_3d.layer4.0.downsample.1.running_var', 'backbone_3d.layer4.1.conv1.weight', 'backbone_3d.layer4.1.bn1.weight', 'backbone_3d.layer4.1.bn1.bias', 'backbone_3d.layer4.1.bn1.running_mean', 'backbone_3d.layer4.1.bn1.running_var', 'backbone_3d.layer4.1.conv2.weight', 'backbone_3d.layer4.1.bn2.weight', 'backbone_3d.layer4.1.bn2.bias', 'backbone_3d.layer4.1.bn2.running_mean', 'backbone_3d.layer4.1.bn2.running_var', 'backbone_3d.layer4.2.conv1.weight', 'backbone_3d.layer4.2.bn1.weight', 'backbone_3d.layer4.2.bn1.bias', 'backbone_3d.layer4.2.bn1.running_mean', 'backbone_3d.layer4.2.bn1.running_var', 'backbone_3d.layer4.2.conv2.weight', 'backbone_3d.layer4.2.bn2.weight', 'backbone_3d.layer4.2.bn2.bias', 'backbone_3d.layer4.2.bn2.running_mean', 'backbone_3d.layer4.2.bn2.running_var', 'neck_3d.convs.0.0.weight', 'neck_3d.convs.0.0.bias', 'neck_3d.convs.0.1.weight', 'neck_3d.convs.0.1.bias', 'neck_3d.convs.1.0.weight', 'neck_3d.convs.1.0.bias', 'neck_3d.convs.1.1.weight', 'neck_3d.convs.1.1.bias', 'neck_3d.convs.2.0.weight', 'neck_3d.convs.2.0.bias', 'neck_3d.convs.2.1.weight', 'neck_3d.convs.2.1.bias']
unexpected_keys: ['feature_enhancer.level_embed', 'feature_enhancer.img_attn_blocks.0.self_attn.sampling_offsets.weight', 'feature_enhancer.img_attn_blocks.0.self_attn.sampling_offsets.bias', 'feature_enhancer.img_attn_blocks.0.self_attn.attention_weights.weight', 'feature_enhancer.img_attn_blocks.0.self_attn.attention_weights.bias', 'feature_enhancer.img_attn_blocks.0.self_attn.value_proj.weight', 'feature_enhancer.img_attn_blocks.0.self_attn.value_proj.bias', 
'feature_enhancer.img_attn_blocks.0.self_attn.output_proj.weight', 'feature_enhancer.img_attn_blocks.0.self_attn.output_proj.bias', 'feature_enhancer.img_attn_blocks.0.norms.0.weight', 'feature_enhancer.img_attn_blocks.0.norms.0.bias', 'feature_enhancer.img_attn_blocks.0.ffn.layers.0.0.weight', 'feature_enhancer.img_attn_blocks.0.ffn.layers.0.0.bias', 'feature_enhancer.img_attn_blocks.0.ffn.layers.1.weight', 'feature_enhancer.img_attn_blocks.0.ffn.layers.1.bias', 'feature_enhancer.img_attn_blocks.0.norms.1.weight', 'feature_enhancer.img_attn_blocks.0.norms.1.bias', 'feature_enhancer.img_attn_blocks.1.self_attn.sampling_offsets.weight', 'feature_enhancer.img_attn_blocks.1.self_attn.sampling_offsets.bias', 'feature_enhancer.img_attn_blocks.1.self_attn.attention_weights.weight', 'feature_enhancer.img_attn_blocks.1.self_attn.attention_weights.bias', 'feature_enhancer.img_attn_blocks.1.self_attn.value_proj.weight', 'feature_enhancer.img_attn_blocks.1.self_attn.value_proj.bias', 'feature_enhancer.img_attn_blocks.1.self_attn.output_proj.weight', 'feature_enhancer.img_attn_blocks.1.self_attn.output_proj.bias', 'feature_enhancer.img_attn_blocks.1.norms.0.weight', 'feature_enhancer.img_attn_blocks.1.norms.0.bias', 'feature_enhancer.img_attn_blocks.1.ffn.layers.0.0.weight', 'feature_enhancer.img_attn_blocks.1.ffn.layers.0.0.bias', 'feature_enhancer.img_attn_blocks.1.ffn.layers.1.weight', 'feature_enhancer.img_attn_blocks.1.ffn.layers.1.bias', 'feature_enhancer.img_attn_blocks.1.norms.1.weight', 'feature_enhancer.img_attn_blocks.1.norms.1.bias', 'feature_enhancer.img_attn_blocks.2.self_attn.sampling_offsets.weight', 'feature_enhancer.img_attn_blocks.2.self_attn.sampling_offsets.bias', 'feature_enhancer.img_attn_blocks.2.self_attn.attention_weights.weight', 'feature_enhancer.img_attn_blocks.2.self_attn.attention_weights.bias', 'feature_enhancer.img_attn_blocks.2.self_attn.value_proj.weight', 'feature_enhancer.img_attn_blocks.2.self_attn.value_proj.bias', 
'feature_enhancer.img_attn_blocks.2.self_attn.output_proj.weight', 'feature_enhancer.img_attn_blocks.2.self_attn.output_proj.bias', 'feature_enhancer.img_attn_blocks.2.norms.0.weight', 'feature_enhancer.img_attn_blocks.2.norms.0.bias', 'feature_enhancer.img_attn_blocks.2.ffn.layers.0.0.weight', 'feature_enhancer.img_attn_blocks.2.ffn.layers.0.0.bias', 'feature_enhancer.img_attn_blocks.2.ffn.layers.1.weight', 'feature_enhancer.img_attn_blocks.2.ffn.layers.1.bias', 'feature_enhancer.img_attn_blocks.2.norms.1.weight', 'feature_enhancer.img_attn_blocks.2.norms.1.bias', 'feature_enhancer.img_attn_blocks.3.self_attn.sampling_offsets.weight', 'feature_enhancer.img_attn_blocks.3.self_attn.sampling_offsets.bias', 'feature_enhancer.img_attn_blocks.3.self_attn.attention_weights.weight', 'feature_enhancer.img_attn_blocks.3.self_attn.attention_weights.bias', 'feature_enhancer.img_attn_blocks.3.self_attn.value_proj.weight', 'feature_enhancer.img_attn_blocks.3.self_attn.value_proj.bias', 'feature_enhancer.img_attn_blocks.3.self_attn.output_proj.weight', 'feature_enhancer.img_attn_blocks.3.self_attn.output_proj.bias', 'feature_enhancer.img_attn_blocks.3.norms.0.weight', 'feature_enhancer.img_attn_blocks.3.norms.0.bias', 'feature_enhancer.img_attn_blocks.3.ffn.layers.0.0.weight', 'feature_enhancer.img_attn_blocks.3.ffn.layers.0.0.bias', 'feature_enhancer.img_attn_blocks.3.ffn.layers.1.weight', 'feature_enhancer.img_attn_blocks.3.ffn.layers.1.bias', 'feature_enhancer.img_attn_blocks.3.norms.1.weight', 'feature_enhancer.img_attn_blocks.3.norms.1.bias', 'feature_enhancer.img_attn_blocks.4.self_attn.sampling_offsets.weight', 'feature_enhancer.img_attn_blocks.4.self_attn.sampling_offsets.bias', 'feature_enhancer.img_attn_blocks.4.self_attn.attention_weights.weight', 'feature_enhancer.img_attn_blocks.4.self_attn.attention_weights.bias', 'feature_enhancer.img_attn_blocks.4.self_attn.value_proj.weight', 'feature_enhancer.img_attn_blocks.4.self_attn.value_proj.bias', 
'feature_enhancer.img_attn_blocks.4.self_attn.output_proj.weight', 'feature_enhancer.img_attn_blocks.4.self_attn.output_proj.bias', 'feature_enhancer.img_attn_blocks.4.norms.0.weight', 'feature_enhancer.img_attn_blocks.4.norms.0.bias', 'feature_enhancer.img_attn_blocks.4.ffn.layers.0.0.weight', 'feature_enhancer.img_attn_blocks.4.ffn.layers.0.0.bias', 'feature_enhancer.img_attn_blocks.4.ffn.layers.1.weight', 'feature_enhancer.img_attn_blocks.4.ffn.layers.1.bias', 'feature_enhancer.img_attn_blocks.4.norms.1.weight', 'feature_enhancer.img_attn_blocks.4.norms.1.bias', 'feature_enhancer.img_attn_blocks.5.self_attn.sampling_offsets.weight', 'feature_enhancer.img_attn_blocks.5.self_attn.sampling_offsets.bias', 'feature_enhancer.img_attn_blocks.5.self_attn.attention_weights.weight', 'feature_enhancer.img_attn_blocks.5.self_attn.attention_weights.bias', 'feature_enhancer.img_attn_blocks.5.self_attn.value_proj.weight', 'feature_enhancer.img_attn_blocks.5.self_attn.value_proj.bias', 'feature_enhancer.img_attn_blocks.5.self_attn.output_proj.weight', 'feature_enhancer.img_attn_blocks.5.self_attn.output_proj.bias', 'feature_enhancer.img_attn_blocks.5.norms.0.weight', 'feature_enhancer.img_attn_blocks.5.norms.0.bias', 'feature_enhancer.img_attn_blocks.5.ffn.layers.0.0.weight', 'feature_enhancer.img_attn_blocks.5.ffn.layers.0.0.bias', 'feature_enhancer.img_attn_blocks.5.ffn.layers.1.weight', 'feature_enhancer.img_attn_blocks.5.ffn.layers.1.bias', 'feature_enhancer.img_attn_blocks.5.norms.1.weight', 'feature_enhancer.img_attn_blocks.5.norms.1.bias', 'feature_enhancer.text_attn_blocks.0.self_attn.attn.in_proj_weight', 'feature_enhancer.text_attn_blocks.0.self_attn.attn.in_proj_bias', 'feature_enhancer.text_attn_blocks.0.self_attn.attn.out_proj.weight', 'feature_enhancer.text_attn_blocks.0.self_attn.attn.out_proj.bias', 'feature_enhancer.text_attn_blocks.0.ffn.layers.0.0.weight', 'feature_enhancer.text_attn_blocks.0.ffn.layers.0.0.bias', 
'feature_enhancer.text_attn_blocks.0.ffn.layers.1.weight', 'feature_enhancer.text_attn_blocks.0.ffn.layers.1.bias', 'feature_enhancer.text_attn_blocks.0.norms.0.weight', 'feature_enhancer.text_attn_blocks.0.norms.0.bias', 'feature_enhancer.text_attn_blocks.0.norms.1.weight', 'feature_enhancer.text_attn_blocks.0.norms.1.bias', 'feature_enhancer.text_attn_blocks.1.self_attn.attn.in_proj_weight', 'feature_enhancer.text_attn_blocks.1.self_attn.attn.in_proj_bias', 'feature_enhancer.text_attn_blocks.1.self_attn.attn.out_proj.weight', 'feature_enhancer.text_attn_blocks.1.self_attn.attn.out_proj.bias', 'feature_enhancer.text_attn_blocks.1.ffn.layers.0.0.weight', 'feature_enhancer.text_attn_blocks.1.ffn.layers.0.0.bias', 'feature_enhancer.text_attn_blocks.1.ffn.layers.1.weight', 'feature_enhancer.text_attn_blocks.1.ffn.layers.1.bias', 'feature_enhancer.text_attn_blocks.1.norms.0.weight', 'feature_enhancer.text_attn_blocks.1.norms.0.bias', 'feature_enhancer.text_attn_blocks.1.norms.1.weight', 'feature_enhancer.text_attn_blocks.1.norms.1.bias', 'feature_enhancer.text_attn_blocks.2.self_attn.attn.in_proj_weight', 'feature_enhancer.text_attn_blocks.2.self_attn.attn.in_proj_bias', 'feature_enhancer.text_attn_blocks.2.self_attn.attn.out_proj.weight', 'feature_enhancer.text_attn_blocks.2.self_attn.attn.out_proj.bias', 'feature_enhancer.text_attn_blocks.2.ffn.layers.0.0.weight', 'feature_enhancer.text_attn_blocks.2.ffn.layers.0.0.bias', 'feature_enhancer.text_attn_blocks.2.ffn.layers.1.weight', 'feature_enhancer.text_attn_blocks.2.ffn.layers.1.bias', 'feature_enhancer.text_attn_blocks.2.norms.0.weight', 'feature_enhancer.text_attn_blocks.2.norms.0.bias', 'feature_enhancer.text_attn_blocks.2.norms.1.weight', 'feature_enhancer.text_attn_blocks.2.norms.1.bias', 'feature_enhancer.text_attn_blocks.3.self_attn.attn.in_proj_weight', 'feature_enhancer.text_attn_blocks.3.self_attn.attn.in_proj_bias', 'feature_enhancer.text_attn_blocks.3.self_attn.attn.out_proj.weight', 
'feature_enhancer.text_attn_blocks.3.self_attn.attn.out_proj.bias', 'feature_enhancer.text_attn_blocks.3.ffn.layers.0.0.weight', 'feature_enhancer.text_attn_blocks.3.ffn.layers.0.0.bias', 'feature_enhancer.text_attn_blocks.3.ffn.layers.1.weight', 'feature_enhancer.text_attn_blocks.3.ffn.layers.1.bias', 'feature_enhancer.text_attn_blocks.3.norms.0.weight', 'feature_enhancer.text_attn_blocks.3.norms.0.bias', 'feature_enhancer.text_attn_blocks.3.norms.1.weight', 'feature_enhancer.text_attn_blocks.3.norms.1.bias', 'feature_enhancer.text_attn_blocks.4.self_attn.attn.in_proj_weight', 'feature_enhancer.text_attn_blocks.4.self_attn.attn.in_proj_bias', 'feature_enhancer.text_attn_blocks.4.self_attn.attn.out_proj.weight', 'feature_enhancer.text_attn_blocks.4.self_attn.attn.out_proj.bias', 'feature_enhancer.text_attn_blocks.4.ffn.layers.0.0.weight', 'feature_enhancer.text_attn_blocks.4.ffn.layers.0.0.bias', 'feature_enhancer.text_attn_blocks.4.ffn.layers.1.weight', 'feature_enhancer.text_attn_blocks.4.ffn.layers.1.bias', 'feature_enhancer.text_attn_blocks.4.norms.0.weight', 'feature_enhancer.text_attn_blocks.4.norms.0.bias', 'feature_enhancer.text_attn_blocks.4.norms.1.weight', 'feature_enhancer.text_attn_blocks.4.norms.1.bias', 'feature_enhancer.text_attn_blocks.5.self_attn.attn.in_proj_weight', 'feature_enhancer.text_attn_blocks.5.self_attn.attn.in_proj_bias', 'feature_enhancer.text_attn_blocks.5.self_attn.attn.out_proj.weight', 'feature_enhancer.text_attn_blocks.5.self_attn.attn.out_proj.bias', 'feature_enhancer.text_attn_blocks.5.ffn.layers.0.0.weight', 'feature_enhancer.text_attn_blocks.5.ffn.layers.0.0.bias', 'feature_enhancer.text_attn_blocks.5.ffn.layers.1.weight', 'feature_enhancer.text_attn_blocks.5.ffn.layers.1.bias', 'feature_enhancer.text_attn_blocks.5.norms.0.weight', 'feature_enhancer.text_attn_blocks.5.norms.0.bias', 'feature_enhancer.text_attn_blocks.5.norms.1.weight', 'feature_enhancer.text_attn_blocks.5.norms.1.bias', 
'feature_enhancer.text_img_attn_blocks.0.gamma_v', 'feature_enhancer.text_img_attn_blocks.0.gamma_l', 'feature_enhancer.text_img_attn_blocks.0.layer_norm_v.weight', 'feature_enhancer.text_img_attn_blocks.0.layer_norm_v.bias', 'feature_enhancer.text_img_attn_blocks.0.layer_norm_l.weight', 'feature_enhancer.text_img_attn_blocks.0.layer_norm_l.bias', 'feature_enhancer.text_img_attn_blocks.0.attn.v_proj.weight', 'feature_enhancer.text_img_attn_blocks.0.attn.v_proj.bias', 'feature_enhancer.text_img_attn_blocks.0.attn.l_proj.weight', 'feature_enhancer.text_img_attn_blocks.0.attn.l_proj.bias', 'feature_enhancer.text_img_attn_blocks.0.attn.values_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.0.attn.values_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.0.attn.values_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.0.attn.values_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.0.attn.out_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.0.attn.out_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.0.attn.out_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.0.attn.out_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.1.gamma_v', 'feature_enhancer.text_img_attn_blocks.1.gamma_l', 'feature_enhancer.text_img_attn_blocks.1.layer_norm_v.weight', 'feature_enhancer.text_img_attn_blocks.1.layer_norm_v.bias', 'feature_enhancer.text_img_attn_blocks.1.layer_norm_l.weight', 'feature_enhancer.text_img_attn_blocks.1.layer_norm_l.bias', 'feature_enhancer.text_img_attn_blocks.1.attn.v_proj.weight', 'feature_enhancer.text_img_attn_blocks.1.attn.v_proj.bias', 'feature_enhancer.text_img_attn_blocks.1.attn.l_proj.weight', 'feature_enhancer.text_img_attn_blocks.1.attn.l_proj.bias', 'feature_enhancer.text_img_attn_blocks.1.attn.values_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.1.attn.values_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.1.attn.values_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.1.attn.values_l_proj.bias', 
'feature_enhancer.text_img_attn_blocks.1.attn.out_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.1.attn.out_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.1.attn.out_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.1.attn.out_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.2.gamma_v', 'feature_enhancer.text_img_attn_blocks.2.gamma_l', 'feature_enhancer.text_img_attn_blocks.2.layer_norm_v.weight', 'feature_enhancer.text_img_attn_blocks.2.layer_norm_v.bias', 'feature_enhancer.text_img_attn_blocks.2.layer_norm_l.weight', 'feature_enhancer.text_img_attn_blocks.2.layer_norm_l.bias', 'feature_enhancer.text_img_attn_blocks.2.attn.v_proj.weight', 'feature_enhancer.text_img_attn_blocks.2.attn.v_proj.bias', 'feature_enhancer.text_img_attn_blocks.2.attn.l_proj.weight', 'feature_enhancer.text_img_attn_blocks.2.attn.l_proj.bias', 'feature_enhancer.text_img_attn_blocks.2.attn.values_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.2.attn.values_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.2.attn.values_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.2.attn.values_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.2.attn.out_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.2.attn.out_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.2.attn.out_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.2.attn.out_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.3.gamma_v', 'feature_enhancer.text_img_attn_blocks.3.gamma_l', 'feature_enhancer.text_img_attn_blocks.3.layer_norm_v.weight', 'feature_enhancer.text_img_attn_blocks.3.layer_norm_v.bias', 'feature_enhancer.text_img_attn_blocks.3.layer_norm_l.weight', 'feature_enhancer.text_img_attn_blocks.3.layer_norm_l.bias', 'feature_enhancer.text_img_attn_blocks.3.attn.v_proj.weight', 'feature_enhancer.text_img_attn_blocks.3.attn.v_proj.bias', 'feature_enhancer.text_img_attn_blocks.3.attn.l_proj.weight', 'feature_enhancer.text_img_attn_blocks.3.attn.l_proj.bias', 
'feature_enhancer.text_img_attn_blocks.3.attn.values_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.3.attn.values_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.3.attn.values_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.3.attn.values_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.3.attn.out_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.3.attn.out_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.3.attn.out_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.3.attn.out_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.4.gamma_v', 'feature_enhancer.text_img_attn_blocks.4.gamma_l', 'feature_enhancer.text_img_attn_blocks.4.layer_norm_v.weight', 'feature_enhancer.text_img_attn_blocks.4.layer_norm_v.bias', 'feature_enhancer.text_img_attn_blocks.4.layer_norm_l.weight', 'feature_enhancer.text_img_attn_blocks.4.layer_norm_l.bias', 'feature_enhancer.text_img_attn_blocks.4.attn.v_proj.weight', 'feature_enhancer.text_img_attn_blocks.4.attn.v_proj.bias', 'feature_enhancer.text_img_attn_blocks.4.attn.l_proj.weight', 'feature_enhancer.text_img_attn_blocks.4.attn.l_proj.bias', 'feature_enhancer.text_img_attn_blocks.4.attn.values_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.4.attn.values_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.4.attn.values_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.4.attn.values_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.4.attn.out_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.4.attn.out_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.4.attn.out_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.4.attn.out_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.5.gamma_v', 'feature_enhancer.text_img_attn_blocks.5.gamma_l', 'feature_enhancer.text_img_attn_blocks.5.layer_norm_v.weight', 'feature_enhancer.text_img_attn_blocks.5.layer_norm_v.bias', 'feature_enhancer.text_img_attn_blocks.5.layer_norm_l.weight', 
'feature_enhancer.text_img_attn_blocks.5.layer_norm_l.bias', 'feature_enhancer.text_img_attn_blocks.5.attn.v_proj.weight', 'feature_enhancer.text_img_attn_blocks.5.attn.v_proj.bias', 'feature_enhancer.text_img_attn_blocks.5.attn.l_proj.weight', 'feature_enhancer.text_img_attn_blocks.5.attn.l_proj.bias', 'feature_enhancer.text_img_attn_blocks.5.attn.values_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.5.attn.values_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.5.attn.values_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.5.attn.values_l_proj.bias', 'feature_enhancer.text_img_attn_blocks.5.attn.out_v_proj.weight', 'feature_enhancer.text_img_attn_blocks.5.attn.out_v_proj.bias', 'feature_enhancer.text_img_attn_blocks.5.attn.out_l_proj.weight', 'feature_enhancer.text_img_attn_blocks.5.attn.out_l_proj.bias', 'bbox_head.reg_branches.0.0.weight', 'bbox_head.reg_branches.0.0.bias', 'bbox_head.reg_branches.0.2.weight', 'bbox_head.reg_branches.0.2.bias', 'bbox_head.reg_branches.0.4.weight', 'bbox_head.reg_branches.0.4.bias', 'bbox_head.reg_branches.1.0.weight', 'bbox_head.reg_branches.1.0.bias', 'bbox_head.reg_branches.1.2.weight', 'bbox_head.reg_branches.1.2.bias', 'bbox_head.reg_branches.1.4.weight', 'bbox_head.reg_branches.1.4.bias', 'bbox_head.reg_branches.2.0.weight', 'bbox_head.reg_branches.2.0.bias', 'bbox_head.reg_branches.2.2.weight', 'bbox_head.reg_branches.2.2.bias', 'bbox_head.reg_branches.2.4.weight', 'bbox_head.reg_branches.2.4.bias', 'bbox_head.reg_branches.3.0.weight', 'bbox_head.reg_branches.3.0.bias', 'bbox_head.reg_branches.3.2.weight', 'bbox_head.reg_branches.3.2.bias', 'bbox_head.reg_branches.3.4.weight', 'bbox_head.reg_branches.3.4.bias', 'bbox_head.reg_branches.4.0.weight', 'bbox_head.reg_branches.4.0.bias', 'bbox_head.reg_branches.4.2.weight', 'bbox_head.reg_branches.4.2.bias', 'bbox_head.reg_branches.4.4.weight', 'bbox_head.reg_branches.4.4.bias', 'bbox_head.reg_branches.5.0.weight', 'bbox_head.reg_branches.5.0.bias', 
'bbox_head.reg_branches.5.2.weight', 'bbox_head.reg_branches.5.2.bias', 'bbox_head.reg_branches.5.4.weight', 'bbox_head.reg_branches.5.4.bias', 'query_embedding.weight', 'memory_trans_fc.weight', 'memory_trans_fc.bias', 'memory_trans_norm.weight', 'memory_trans_norm.bias', 'bbox_head.reg_branches.6.0.weight', 'bbox_head.reg_branches.6.0.bias', 'bbox_head.reg_branches.6.2.weight', 'bbox_head.reg_branches.6.2.bias', 'bbox_head.reg_branches.6.4.weight', 'bbox_head.reg_branches.6.4.bias', 'text_encoder.language_backbone.body.model.embeddings.position_ids', 'text_encoder.language_backbone.body.model.embeddings.word_embeddings.weight', 'text_encoder.language_backbone.body.model.embeddings.position_embeddings.weight', 'text_encoder.language_backbone.body.model.embeddings.token_type_embeddings.weight', 'text_encoder.language_backbone.body.model.embeddings.LayerNorm.weight', 'text_encoder.language_backbone.body.model.embeddings.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.intermediate.dense.weight', 
'text_encoder.language_backbone.body.model.encoder.layer.0.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.0.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.0.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.1.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.1.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.self.query.bias', 
'text_encoder.language_backbone.body.model.encoder.layer.2.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.2.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.2.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.output.dense.bias', 
'text_encoder.language_backbone.body.model.encoder.layer.3.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.3.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.3.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.4.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.4.output.LayerNorm.weight', 
'text_encoder.language_backbone.body.model.encoder.layer.4.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.5.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.5.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.self.value.bias', 
'text_encoder.language_backbone.body.model.encoder.layer.6.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.6.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.6.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.intermediate.dense.bias', 
'text_encoder.language_backbone.body.model.encoder.layer.7.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.7.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.7.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.8.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.8.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.self.key.weight', 
'text_encoder.language_backbone.body.model.encoder.layer.9.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.9.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.9.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.attention.output.LayerNorm.weight', 
'text_encoder.language_backbone.body.model.encoder.layer.10.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.10.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.10.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.self.query.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.self.query.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.self.key.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.self.key.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.self.value.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.self.value.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.output.LayerNorm.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.attention.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.intermediate.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.intermediate.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.output.dense.weight', 'text_encoder.language_backbone.body.model.encoder.layer.11.output.dense.bias', 'text_encoder.language_backbone.body.model.encoder.layer.11.output.LayerNorm.weight', 
'text_encoder.language_backbone.body.model.encoder.layer.11.output.LayerNorm.bias', 'text_encoder.language_backbone.body.model.pooler.dense.weight', 'text_encoder.language_backbone.body.model.pooler.dense.bias', 'text_feat_map.weight', 'text_feat_map.bias', 'decoder.norm.weight', 'decoder.norm.bias', 'decoder.ref_point_head.layers.0.weight', 'decoder.ref_point_head.layers.0.bias', 'decoder.ref_point_head.layers.1.weight', 'decoder.ref_point_head.layers.1.bias', 'decoder.layers.0.attn.in_proj_weight', 'decoder.layers.0.attn.in_proj_bias', 'decoder.layers.0.attn.out_proj.weight', 'decoder.layers.0.attn.out_proj.bias', 'decoder.layers.1.weight', 'decoder.layers.1.bias', 'decoder.layers.3.bias', 'decoder.layers.4.sampling_offsets.weight', 'decoder.layers.4.sampling_offsets.bias', 'decoder.layers.4.attention_weights.weight', 'decoder.layers.4.attention_weights.bias', 'decoder.layers.4.value_proj.weight', 'decoder.layers.4.value_proj.bias', 'decoder.layers.4.output_proj.weight', 'decoder.layers.4.output_proj.bias', 'decoder.layers.5.bias', 'decoder.layers.6.layers.0.0.weight', 'decoder.layers.6.layers.0.0.bias', 'decoder.layers.6.layers.1.weight', 'decoder.layers.6.layers.1.bias', 'decoder.layers.7.bias', 'decoder.layers.11.attn.in_proj_weight', 'decoder.layers.11.attn.in_proj_bias', 'decoder.layers.11.attn.out_proj.weight', 'decoder.layers.11.attn.out_proj.bias', 'decoder.layers.13.sampling_offsets.weight', 'decoder.layers.13.sampling_offsets.bias', 'decoder.layers.13.attention_weights.weight', 'decoder.layers.13.attention_weights.bias', 'decoder.layers.13.value_proj.weight', 'decoder.layers.13.value_proj.bias', 'decoder.layers.13.output_proj.weight', 'decoder.layers.13.output_proj.bias', 'decoder.layers.14.weight', 'decoder.layers.14.bias', 'decoder.layers.16.bias', 'decoder.layers.18.attn.in_proj_weight', 'decoder.layers.18.attn.in_proj_bias', 'decoder.layers.18.attn.out_proj.weight', 'decoder.layers.18.attn.out_proj.bias', 'decoder.layers.19.weight', 
'decoder.layers.19.bias', 'decoder.layers.20.attn.in_proj_weight', 'decoder.layers.20.attn.in_proj_bias', 'decoder.layers.20.attn.out_proj.weight', 'decoder.layers.20.attn.out_proj.bias', 'decoder.layers.27.attn.in_proj_weight', 'decoder.layers.27.attn.in_proj_bias', 'decoder.layers.27.attn.out_proj.weight', 'decoder.layers.27.attn.out_proj.bias', 'decoder.layers.29.attn.in_proj_weight', 'decoder.layers.29.attn.in_proj_bias', 'decoder.layers.29.attn.out_proj.weight', 'decoder.layers.29.attn.out_proj.bias', 'decoder.layers.30.weight', 'decoder.layers.30.bias', 'decoder.layers.31.sampling_offsets.weight', 'decoder.layers.31.sampling_offsets.bias', 'decoder.layers.31.attention_weights.weight', 'decoder.layers.31.attention_weights.bias', 'decoder.layers.31.value_proj.weight', 'decoder.layers.31.value_proj.bias', 'decoder.layers.31.output_proj.weight', 'decoder.layers.31.output_proj.bias', 'decoder.layers.32.weight', 'decoder.layers.32.bias', 'decoder.layers.33.layers.0.0.weight', 'decoder.layers.33.layers.0.0.bias', 'decoder.layers.33.layers.1.weight', 'decoder.layers.33.layers.1.bias', 'decoder.layers.37.weight', 'decoder.layers.37.bias', 'decoder.layers.39.weight', 'decoder.layers.39.bias', 'decoder.layers.40.sampling_offsets.weight', 'decoder.layers.40.sampling_offsets.bias', 'decoder.layers.40.attention_weights.weight', 'decoder.layers.40.attention_weights.bias', 'decoder.layers.40.value_proj.weight', 'decoder.layers.40.value_proj.bias', 'decoder.layers.40.output_proj.weight', 'decoder.layers.40.output_proj.bias', 'decoder.layers.42.layers.0.0.weight', 'decoder.layers.42.layers.0.0.bias', 'decoder.layers.42.layers.1.weight', 'decoder.layers.42.layers.1.bias', 'decoder.layers.43.weight', 'decoder.layers.43.bias', 'decoder.layers.45.attn.in_proj_weight', 'decoder.layers.45.attn.in_proj_bias', 'decoder.layers.45.attn.out_proj.weight', 'decoder.layers.45.attn.out_proj.bias', 'decoder.layers.46.bias', 'decoder.layers.50.weight', 'decoder.layers.50.bias', 
'decoder.layers.52.weight', 'decoder.layers.52.bias', 'neck.extra_convs.0.conv.weight', 'neck.extra_convs.0.conv.bias', 'neck.extra_convs.0.gn.weight', 'neck.extra_convs.0.gn.bias', 'neck.convs.0.conv.weight', 'neck.convs.0.conv.bias', 'neck.convs.0.gn.weight', 'neck.convs.0.gn.bias', 'neck.convs.1.conv.weight', 'neck.convs.1.conv.bias', 'neck.convs.1.gn.weight', 'neck.convs.1.gn.bias', 'neck.convs.2.conv.weight', 'neck.convs.2.conv.bias', 'neck.convs.2.gn.weight', 'neck.convs.2.gn.bias']
/running_package/submit-sem_robotwin/train.py:92: DeprecationWarning: Call to deprecated class SimpleTrainer. (This class is deprecated. Use `HookBasedTrainer` instead.) -- Deprecated since version 0.2.0.
  trainer = SimpleTrainer(
Rank[14/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[11/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[12/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[10/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[8/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[13/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[9/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[15/16] 06/24/2025 12:52:33 INFO base_lmdb_dataset.py:186 | dataset length: 17451, number of episode: 100
Rank[12/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[10/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[8/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[9/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[11/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[14/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[15/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
Rank[13/16] 06/24/2025 12:52:33 INFO utils.py:61 | load checkpoint: ./ckpt/groundingdino_swint_ogc_mmdet-822d7e9d-rename.pth
/running_package/submit-sem_robotwin/train.py:92: DeprecationWarning: Call to deprecated class SimpleTrainer. (This class is deprecated.
Use `HookBasedTrainer` instead.) -- Deprecated since version 0.2.0. trainer = SimpleTrainer( Rank[0/16] 06/24/2025 12:52:40 INFO hook_based_trainer.py:323 | ==================================================BEGIN TRAINING================================================== Rank[0/16] 06/24/2025 12:52:40 INFO hook_based_trainer.py:327 | Start training loop from epoch 0 and step 0 /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. 
In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. 
use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. 
Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. 
with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. 
return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. 
return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py:600: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants. return fn(*args, **kwargs) /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. 
with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. 
with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. 
with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined] Rank[0/16] 06/24/2025 12:53:25 INFO stats.py:314 | Epoch[0] Step[24] GlobalStep[24] Training Speed: 428.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2 days, 1:26:17. Learning Rate: 5.09500e-06. Rank[0/16] 06/24/2025 12:53:25 INFO loss_tracker.py:84 | Epoch[0/NA] Step[24] GlobalStep[24/99999]: loss_noise_mse[0.2436] loss_fk_mse[0.0978] loss_depth[0.0460] total_loss[0.3874] Rank[0/16] 06/24/2025 12:53:35 INFO stats.py:314 | Epoch[0] Step[49] GlobalStep[49] Training Speed: 435.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1 day, 6:22:02. Learning Rate: 1.00900e-05. Rank[0/16] 06/24/2025 12:53:35 INFO loss_tracker.py:84 | Epoch[0/NA] Step[49] GlobalStep[49/99999]: loss_noise_mse[0.1159] loss_fk_mse[0.0715] loss_depth[0.0459] total_loss[0.2333] Rank[0/16] 06/24/2025 12:53:44 INFO stats.py:314 | Epoch[0] Step[74] GlobalStep[74] Training Speed: 452.41 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 23:46:47. Learning Rate: 1.50850e-05. 
Rank[0/16] 06/24/2025 12:53:45 INFO loss_tracker.py:84 | Epoch[0/NA] Step[74] GlobalStep[74/99999]: loss_noise_mse[0.0684] loss_fk_mse[0.0374] loss_depth[0.0458] total_loss[0.1516] Rank[0/16] 06/24/2025 12:53:55 INFO stats.py:314 | Epoch[0] Step[99] GlobalStep[99] Training Speed: 433.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 20:51:50. Learning Rate: 2.00800e-05. Rank[0/16] 06/24/2025 12:53:55 INFO loss_tracker.py:84 | Epoch[0/NA] Step[99] GlobalStep[99/99999]: loss_noise_mse[0.0443] loss_fk_mse[0.0257] loss_depth[0.0457] total_loss[0.1158] Rank[0/16] 06/24/2025 12:54:07 INFO stats.py:314 | Epoch[0] Step[124] GlobalStep[124] Training Speed: 459.45 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 19:10:00. Learning Rate: 2.50750e-05. Rank[0/16] 06/24/2025 12:54:07 INFO loss_tracker.py:84 | Epoch[0/NA] Step[124] GlobalStep[124/99999]: loss_noise_mse[0.0355] loss_fk_mse[0.0232] loss_depth[0.0456] total_loss[0.1043] Rank[0/16] 06/24/2025 12:54:11 INFO stats.py:394 | Epoch[0] completed. Training Speed: 192.81 samples/sec across all devices. Epoch Time: 90.95 sec. Average Epoch Time: 90.95 sec. Average Step Time: 0.66 sec. Estimated Remaining Time: 18:24:54. Learning Rate: 2.74726e-05. Rank[0/16] 06/24/2025 12:54:17 INFO stats.py:314 | Epoch[1] Step[12] GlobalStep[149] Training Speed: 399.80 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 17:53:59. Learning Rate: 3.00700e-05. Rank[0/16] 06/24/2025 12:54:21 INFO loss_tracker.py:84 | Epoch[1/NA] Step[24] GlobalStep[161/99999]: loss_noise_mse[0.0291] loss_fk_mse[0.0224] loss_depth[0.0454] total_loss[0.0969] Rank[0/16] 06/24/2025 12:54:28 INFO stats.py:314 | Epoch[1] Step[37] GlobalStep[174] Training Speed: 427.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 17:04:45. Learning Rate: 3.50650e-05. 
Rank[0/16] 06/24/2025 12:54:33 INFO loss_tracker.py:84 | Epoch[1/NA] Step[49] GlobalStep[186/99999]: loss_noise_mse[0.0250] loss_fk_mse[0.0229] loss_depth[0.0451] total_loss[0.0930] Rank[0/16] 06/24/2025 12:54:37 INFO stats.py:314 | Epoch[1] Step[62] GlobalStep[199] Training Speed: 425.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 16:14:41. Learning Rate: 4.00600e-05. Rank[0/16] 06/24/2025 12:54:42 INFO loss_tracker.py:84 | Epoch[1/NA] Step[74] GlobalStep[211/99999]: loss_noise_mse[0.0212] loss_fk_mse[0.0214] loss_depth[0.0449] total_loss[0.0874] Rank[0/16] 06/24/2025 12:54:48 INFO stats.py:314 | Epoch[1] Step[87] GlobalStep[224] Training Speed: 430.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 15:45:50. Learning Rate: 4.50550e-05. Rank[0/16] 06/24/2025 12:54:53 INFO loss_tracker.py:84 | Epoch[1/NA] Step[99] GlobalStep[236/99999]: loss_noise_mse[0.0198] loss_fk_mse[0.0208] loss_depth[0.0445] total_loss[0.0851] Rank[0/16] 06/24/2025 12:54:58 INFO stats.py:314 | Epoch[1] Step[112] GlobalStep[249] Training Speed: 436.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 15:15:15. Learning Rate: 5.00500e-05. Rank[0/16] 06/24/2025 12:55:02 INFO loss_tracker.py:84 | Epoch[1/NA] Step[124] GlobalStep[261/99999]: loss_noise_mse[0.0171] loss_fk_mse[0.0189] loss_depth[0.0439] total_loss[0.0799] Rank[0/16] 06/24/2025 12:55:07 INFO stats.py:394 | Epoch[1] completed. Training Speed: 315.25 samples/sec across all devices. Epoch Time: 55.62 sec. Average Epoch Time: 55.62 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 14:49:07. Learning Rate: 5.48452e-05. Rank[0/16] 06/24/2025 12:55:08 INFO stats.py:314 | Epoch[2] Step[0] GlobalStep[274] Training Speed: 362.64 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 14:54:09. Learning Rate: 5.50450e-05. 
Rank[0/16] 06/24/2025 12:55:18 INFO loss_tracker.py:84 | Epoch[2/NA] Step[24] GlobalStep[298/99999]: loss_noise_mse[0.0142] loss_fk_mse[0.0163] loss_depth[0.0431] total_loss[0.0736] Rank[0/16] 06/24/2025 12:55:18 INFO stats.py:314 | Epoch[2] Step[25] GlobalStep[299] Training Speed: 417.85 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 14:35:10. Learning Rate: 6.00400e-05. Rank[0/16] 06/24/2025 12:55:29 INFO loss_tracker.py:84 | Epoch[2/NA] Step[49] GlobalStep[323/99999]: loss_noise_mse[0.0118] loss_fk_mse[0.0142] loss_depth[0.0420] total_loss[0.0680] Rank[0/16] 06/24/2025 12:55:29 INFO stats.py:314 | Epoch[2] Step[50] GlobalStep[324] Training Speed: 416.76 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 14:24:45. Learning Rate: 6.50350e-05. Rank[0/16] 06/24/2025 12:55:39 INFO loss_tracker.py:84 | Epoch[2/NA] Step[74] GlobalStep[348/99999]: loss_noise_mse[0.0103] loss_fk_mse[0.0125] loss_depth[0.0412] total_loss[0.0639] Rank[0/16] 06/24/2025 12:55:39 INFO stats.py:314 | Epoch[2] Step[75] GlobalStep[349] Training Speed: 397.74 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 14:08:45. Learning Rate: 7.00300e-05. Rank[0/16] 06/24/2025 12:55:49 INFO loss_tracker.py:84 | Epoch[2/NA] Step[99] GlobalStep[373/99999]: loss_noise_mse[0.0087] loss_fk_mse[0.0112] loss_depth[0.0403] total_loss[0.0602] Rank[0/16] 06/24/2025 12:55:50 INFO stats.py:314 | Epoch[2] Step[100] GlobalStep[374] Training Speed: 397.02 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 13:58:43. Learning Rate: 7.50250e-05. Rank[0/16] 06/24/2025 12:55:59 INFO loss_tracker.py:84 | Epoch[2/NA] Step[124] GlobalStep[398/99999]: loss_noise_mse[0.0079] loss_fk_mse[0.0104] loss_depth[0.0394] total_loss[0.0577] Rank[0/16] 06/24/2025 12:55:59 INFO stats.py:314 | Epoch[2] Step[125] GlobalStep[399] Training Speed: 424.63 samples/sec across all devices. 
Average Step Time: 0.30 sec. Estimated Remaining Time: 13:47:03. Learning Rate: 8.00200e-05. Rank[0/16] 06/24/2025 12:56:03 INFO stats.py:394 | Epoch[2] completed. Training Speed: 310.50 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 13:40:00. Learning Rate: 8.22178e-05. Rank[0/16] 06/24/2025 12:56:10 INFO stats.py:314 | Epoch[3] Step[13] GlobalStep[424] Training Speed: 424.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 13:41:07. Learning Rate: 8.50150e-05. Rank[0/16] 06/24/2025 12:56:15 INFO loss_tracker.py:84 | Epoch[3/NA] Step[24] GlobalStep[435/99999]: loss_noise_mse[0.0068] loss_fk_mse[0.0101] loss_depth[0.0382] total_loss[0.0551] Rank[0/16] 06/24/2025 12:56:20 INFO stats.py:314 | Epoch[3] Step[38] GlobalStep[449] Training Speed: 419.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 13:30:47. Learning Rate: 9.00100e-05. Rank[0/16] 06/24/2025 12:56:25 INFO loss_tracker.py:84 | Epoch[3/NA] Step[49] GlobalStep[460/99999]: loss_noise_mse[0.0062] loss_fk_mse[0.0100] loss_depth[0.0371] total_loss[0.0534] Rank[0/16] 06/24/2025 12:56:31 INFO stats.py:314 | Epoch[3] Step[63] GlobalStep[474] Training Speed: 419.46 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 13:26:56. Learning Rate: 9.50050e-05. Rank[0/16] 06/24/2025 12:56:36 INFO loss_tracker.py:84 | Epoch[3/NA] Step[74] GlobalStep[485/99999]: loss_noise_mse[0.0061] loss_fk_mse[0.0095] loss_depth[0.0361] total_loss[0.0518] Rank[0/16] 06/24/2025 12:56:41 INFO stats.py:314 | Epoch[3] Step[88] GlobalStep[499] Training Speed: 427.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 13:19:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 12:56:46 INFO loss_tracker.py:84 | Epoch[3/NA] Step[99] GlobalStep[510/99999]: loss_noise_mse[0.0051] loss_fk_mse[0.0089] loss_depth[0.0351] total_loss[0.0491] Rank[0/16] 06/24/2025 12:56:52 INFO stats.py:314 | Epoch[3] Step[113] GlobalStep[524] Training Speed: 430.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 13:14:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:56:57 INFO loss_tracker.py:84 | Epoch[3/NA] Step[124] GlobalStep[535/99999]: loss_noise_mse[0.0048] loss_fk_mse[0.0089] loss_depth[0.0339] total_loss[0.0476] Rank[0/16] 06/24/2025 12:57:01 INFO stats.py:394 | Epoch[3] completed. Training Speed: 305.42 samples/sec across all devices. Epoch Time: 57.42 sec. Average Epoch Time: 57.42 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 13:07:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:57:02 INFO stats.py:314 | Epoch[4] Step[1] GlobalStep[549] Training Speed: 427.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 13:09:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:57:13 INFO loss_tracker.py:84 | Epoch[4/NA] Step[24] GlobalStep[572/99999]: loss_noise_mse[0.0044] loss_fk_mse[0.0085] loss_depth[0.0324] total_loss[0.0453] Rank[0/16] 06/24/2025 12:57:14 INFO stats.py:314 | Epoch[4] Step[26] GlobalStep[574] Training Speed: 417.20 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 13:08:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:57:23 INFO loss_tracker.py:84 | Epoch[4/NA] Step[49] GlobalStep[597/99999]: loss_noise_mse[0.0041] loss_fk_mse[0.0081] loss_depth[0.0310] total_loss[0.0432] Rank[0/16] 06/24/2025 12:57:24 INFO stats.py:314 | Epoch[4] Step[51] GlobalStep[599] Training Speed: 418.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 13:02:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 12:57:33 INFO loss_tracker.py:84 | Epoch[4/NA] Step[74] GlobalStep[622/99999]: loss_noise_mse[0.0041] loss_fk_mse[0.0083] loss_depth[0.0299] total_loss[0.0423] Rank[0/16] 06/24/2025 12:57:34 INFO stats.py:314 | Epoch[4] Step[76] GlobalStep[624] Training Speed: 420.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:59:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:57:44 INFO loss_tracker.py:84 | Epoch[4/NA] Step[99] GlobalStep[647/99999]: loss_noise_mse[0.0038] loss_fk_mse[0.0076] loss_depth[0.0288] total_loss[0.0403] Rank[0/16] 06/24/2025 12:57:44 INFO stats.py:314 | Epoch[4] Step[101] GlobalStep[649] Training Speed: 430.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:54:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:57:54 INFO loss_tracker.py:84 | Epoch[4/NA] Step[124] GlobalStep[672/99999]: loss_noise_mse[0.0039] loss_fk_mse[0.0074] loss_depth[0.0280] total_loss[0.0393] Rank[0/16] 06/24/2025 12:57:55 INFO stats.py:314 | Epoch[4] Step[126] GlobalStep[674] Training Speed: 448.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:51:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:57:58 INFO stats.py:394 | Epoch[4] completed. Training Speed: 303.98 samples/sec across all devices. Epoch Time: 57.69 sec. Average Epoch Time: 57.69 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 12:48:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:58:05 INFO stats.py:314 | Epoch[5] Step[14] GlobalStep[699] Training Speed: 424.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:47:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 12:58:09 INFO loss_tracker.py:84 | Epoch[5/NA] Step[24] GlobalStep[709/99999]: loss_noise_mse[0.0035] loss_fk_mse[0.0076] loss_depth[0.0271] total_loss[0.0383] Rank[0/16] 06/24/2025 12:58:16 INFO stats.py:314 | Epoch[5] Step[39] GlobalStep[724] Training Speed: 429.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:45:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:58:20 INFO loss_tracker.py:84 | Epoch[5/NA] Step[49] GlobalStep[734/99999]: loss_noise_mse[0.0037] loss_fk_mse[0.0080] loss_depth[0.0261] total_loss[0.0378] Rank[0/16] 06/24/2025 12:58:25 INFO stats.py:314 | Epoch[5] Step[64] GlobalStep[749] Training Speed: 431.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:41:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:58:30 INFO loss_tracker.py:84 | Epoch[5/NA] Step[74] GlobalStep[759/99999]: loss_noise_mse[0.0033] loss_fk_mse[0.0078] loss_depth[0.0256] total_loss[0.0367] Rank[0/16] 06/24/2025 12:58:36 INFO stats.py:314 | Epoch[5] Step[89] GlobalStep[774] Training Speed: 433.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:39:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:58:40 INFO loss_tracker.py:84 | Epoch[5/NA] Step[99] GlobalStep[784/99999]: loss_noise_mse[0.0027] loss_fk_mse[0.0071] loss_depth[0.0253] total_loss[0.0351] Rank[0/16] 06/24/2025 12:58:46 INFO stats.py:314 | Epoch[5] Step[114] GlobalStep[799] Training Speed: 441.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:36:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:58:50 INFO loss_tracker.py:84 | Epoch[5/NA] Step[124] GlobalStep[809/99999]: loss_noise_mse[0.0032] loss_fk_mse[0.0074] loss_depth[0.0245] total_loss[0.0351] Rank[0/16] 06/24/2025 12:58:54 INFO stats.py:394 | Epoch[5] completed. Training Speed: 313.32 samples/sec across all devices. Epoch Time: 55.97 sec. 
Average Epoch Time: 55.97 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 12:32:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:58:57 INFO stats.py:314 | Epoch[6] Step[2] GlobalStep[824] Training Speed: 420.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:34:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:59:06 INFO loss_tracker.py:84 | Epoch[6/NA] Step[24] GlobalStep[846/99999]: loss_noise_mse[0.0032] loss_fk_mse[0.0077] loss_depth[0.0240] total_loss[0.0348] Rank[0/16] 06/24/2025 12:59:07 INFO stats.py:314 | Epoch[6] Step[27] GlobalStep[849] Training Speed: 418.40 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:32:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:59:17 INFO loss_tracker.py:84 | Epoch[6/NA] Step[49] GlobalStep[871/99999]: loss_noise_mse[0.0030] loss_fk_mse[0.0072] loss_depth[0.0237] total_loss[0.0338] Rank[0/16] 06/24/2025 12:59:18 INFO stats.py:314 | Epoch[6] Step[52] GlobalStep[874] Training Speed: 419.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:30:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:59:27 INFO loss_tracker.py:84 | Epoch[6/NA] Step[74] GlobalStep[896/99999]: loss_noise_mse[0.0028] loss_fk_mse[0.0074] loss_depth[0.0232] total_loss[0.0333] Rank[0/16] 06/24/2025 12:59:28 INFO stats.py:314 | Epoch[6] Step[77] GlobalStep[899] Training Speed: 408.70 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:28:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:59:38 INFO loss_tracker.py:84 | Epoch[6/NA] Step[99] GlobalStep[921/99999]: loss_noise_mse[0.0029] loss_fk_mse[0.0075] loss_depth[0.0228] total_loss[0.0332] Rank[0/16] 06/24/2025 12:59:39 INFO stats.py:314 | Epoch[6] Step[102] GlobalStep[924] Training Speed: 413.80 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:28:22. 
Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:59:48 INFO loss_tracker.py:84 | Epoch[6/NA] Step[124] GlobalStep[946/99999]: loss_noise_mse[0.0025] loss_fk_mse[0.0069] loss_depth[0.0226] total_loss[0.0320] Rank[0/16] 06/24/2025 12:59:49 INFO stats.py:314 | Epoch[6] Step[127] GlobalStep[949] Training Speed: 447.83 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:25:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 12:59:53 INFO stats.py:394 | Epoch[6] completed. Training Speed: 299.59 samples/sec across all devices. Epoch Time: 58.53 sec. Average Epoch Time: 58.53 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 12:24:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:00:01 INFO stats.py:314 | Epoch[7] Step[15] GlobalStep[974] Training Speed: 414.86 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:26:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:00:05 INFO loss_tracker.py:84 | Epoch[7/NA] Step[24] GlobalStep[983/99999]: loss_noise_mse[0.0025] loss_fk_mse[0.0074] loss_depth[0.0221] total_loss[0.0320] Rank[0/16] 06/24/2025 13:00:12 INFO stats.py:314 | Epoch[7] Step[40] GlobalStep[999] Training Speed: 432.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:25:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:00:15 INFO loss_tracker.py:84 | Epoch[7/NA] Step[49] GlobalStep[1008/99999]: loss_noise_mse[0.0025] loss_fk_mse[0.0073] loss_depth[0.0219] total_loss[0.0316] Rank[0/16] 06/24/2025 13:00:22 INFO stats.py:314 | Epoch[7] Step[65] GlobalStep[1024] Training Speed: 416.61 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:23:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:00:26 INFO loss_tracker.py:84 | Epoch[7/NA] Step[74] GlobalStep[1033/99999]: loss_noise_mse[0.0025] loss_fk_mse[0.0070] loss_depth[0.0215] total_loss[0.0310] Rank[0/16] 06/24/2025 13:00:33 INFO stats.py:314 | Epoch[7] Step[90] GlobalStep[1049] Training Speed: 421.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:22:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:00:37 INFO loss_tracker.py:84 | Epoch[7/NA] Step[99] GlobalStep[1058/99999]: loss_noise_mse[0.0023] loss_fk_mse[0.0069] loss_depth[0.0214] total_loss[0.0306] Rank[0/16] 06/24/2025 13:00:44 INFO stats.py:314 | Epoch[7] Step[115] GlobalStep[1074] Training Speed: 424.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:21:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:00:47 INFO loss_tracker.py:84 | Epoch[7/NA] Step[124] GlobalStep[1083/99999]: loss_noise_mse[0.0020] loss_fk_mse[0.0067] loss_depth[0.0214] total_loss[0.0302] Rank[0/16] 06/24/2025 13:00:51 INFO stats.py:394 | Epoch[7] completed. Training Speed: 300.60 samples/sec across all devices. Epoch Time: 58.34 sec. Average Epoch Time: 58.34 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 12:18:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:00:54 INFO stats.py:314 | Epoch[8] Step[3] GlobalStep[1099] Training Speed: 419.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:19:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:01:03 INFO loss_tracker.py:84 | Epoch[8/NA] Step[24] GlobalStep[1120/99999]: loss_noise_mse[0.0024] loss_fk_mse[0.0072] loss_depth[0.0207] total_loss[0.0303] Rank[0/16] 06/24/2025 13:01:04 INFO stats.py:314 | Epoch[8] Step[28] GlobalStep[1124] Training Speed: 429.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:18:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:01:14 INFO loss_tracker.py:84 | Epoch[8/NA] Step[49] GlobalStep[1145/99999]: loss_noise_mse[0.0025] loss_fk_mse[0.0074] loss_depth[0.0207] total_loss[0.0306] Rank[0/16] 06/24/2025 13:01:15 INFO stats.py:314 | Epoch[8] Step[53] GlobalStep[1149] Training Speed: 427.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:17:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:01:24 INFO loss_tracker.py:84 | Epoch[8/NA] Step[74] GlobalStep[1170/99999]: loss_noise_mse[0.0023] loss_fk_mse[0.0069] loss_depth[0.0205] total_loss[0.0297] Rank[0/16] 06/24/2025 13:01:25 INFO stats.py:314 | Epoch[8] Step[78] GlobalStep[1174] Training Speed: 420.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:15:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:01:34 INFO loss_tracker.py:84 | Epoch[8/NA] Step[99] GlobalStep[1195/99999]: loss_noise_mse[0.0020] loss_fk_mse[0.0071] loss_depth[0.0206] total_loss[0.0298] Rank[0/16] 06/24/2025 13:01:36 INFO stats.py:314 | Epoch[8] Step[103] GlobalStep[1199] Training Speed: 416.07 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:14:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:01:44 INFO loss_tracker.py:84 | Epoch[8/NA] Step[124] GlobalStep[1220/99999]: loss_noise_mse[0.0019] loss_fk_mse[0.0071] loss_depth[0.0204] total_loss[0.0294] Rank[0/16] 06/24/2025 13:01:46 INFO stats.py:314 | Epoch[8] Step[128] GlobalStep[1224] Training Speed: 446.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:13:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:01:49 INFO stats.py:394 | Epoch[8] completed. Training Speed: 304.83 samples/sec across all devices. Epoch Time: 57.53 sec. Average Epoch Time: 57.53 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 12:12:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:01:57 INFO stats.py:314 | Epoch[9] Step[16] GlobalStep[1249] Training Speed: 432.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:13:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:02:00 INFO loss_tracker.py:84 | Epoch[9/NA] Step[24] GlobalStep[1257/99999]: loss_noise_mse[0.0021] loss_fk_mse[0.0063] loss_depth[0.0199] total_loss[0.0283] Rank[0/16] 06/24/2025 13:02:08 INFO stats.py:314 | Epoch[9] Step[41] GlobalStep[1274] Training Speed: 430.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:12:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:02:11 INFO loss_tracker.py:84 | Epoch[9/NA] Step[49] GlobalStep[1282/99999]: loss_noise_mse[0.0019] loss_fk_mse[0.0074] loss_depth[0.0201] total_loss[0.0293] Rank[0/16] 06/24/2025 13:02:18 INFO stats.py:314 | Epoch[9] Step[66] GlobalStep[1299] Training Speed: 443.33 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:11:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:02:21 INFO loss_tracker.py:84 | Epoch[9/NA] Step[74] GlobalStep[1307/99999]: loss_noise_mse[0.0019] loss_fk_mse[0.0065] loss_depth[0.0198] total_loss[0.0281] Rank[0/16] 06/24/2025 13:02:29 INFO stats.py:314 | Epoch[9] Step[91] GlobalStep[1324] Training Speed: 416.84 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:10:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:02:32 INFO loss_tracker.py:84 | Epoch[9/NA] Step[99] GlobalStep[1332/99999]: loss_noise_mse[0.0019] loss_fk_mse[0.0066] loss_depth[0.0198] total_loss[0.0283] Rank[0/16] 06/24/2025 13:02:39 INFO stats.py:314 | Epoch[9] Step[116] GlobalStep[1349] Training Speed: 431.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:09:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:02:42 INFO loss_tracker.py:84 | Epoch[9/NA] Step[124] GlobalStep[1357/99999]: loss_noise_mse[0.0018] loss_fk_mse[0.0066] loss_depth[0.0196] total_loss[0.0281] Rank[0/16] 06/24/2025 13:02:47 INFO stats.py:394 | Epoch[9] completed. Training Speed: 302.94 samples/sec across all devices. Epoch Time: 57.89 sec. Average Epoch Time: 57.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 12:07:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:02:50 INFO stats.py:314 | Epoch[10] Step[4] GlobalStep[1374] Training Speed: 434.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:09:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:02:58 INFO loss_tracker.py:84 | Epoch[10/NA] Step[24] GlobalStep[1394/99999]: loss_noise_mse[0.0018] loss_fk_mse[0.0066] loss_depth[0.0193] total_loss[0.0278] Rank[0/16] 06/24/2025 13:03:00 INFO stats.py:314 | Epoch[10] Step[29] GlobalStep[1399] Training Speed: 431.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:07:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:03:09 INFO loss_tracker.py:84 | Epoch[10/NA] Step[49] GlobalStep[1419/99999]: loss_noise_mse[0.0017] loss_fk_mse[0.0066] loss_depth[0.0194] total_loss[0.0277] Rank[0/16] 06/24/2025 13:03:11 INFO stats.py:314 | Epoch[10] Step[54] GlobalStep[1424] Training Speed: 427.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:07:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:03:20 INFO loss_tracker.py:84 | Epoch[10/NA] Step[74] GlobalStep[1444/99999]: loss_noise_mse[0.0016] loss_fk_mse[0.0068] loss_depth[0.0192] total_loss[0.0276] Rank[0/16] 06/24/2025 13:03:22 INFO stats.py:314 | Epoch[10] Step[79] GlobalStep[1449] Training Speed: 425.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:06:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:03:31 INFO loss_tracker.py:84 | Epoch[10/NA] Step[99] GlobalStep[1469/99999]: loss_noise_mse[0.0017] loss_fk_mse[0.0064] loss_depth[0.0192] total_loss[0.0273] Rank[0/16] 06/24/2025 13:03:33 INFO stats.py:314 | Epoch[10] Step[104] GlobalStep[1474] Training Speed: 426.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:06:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:03:41 INFO loss_tracker.py:84 | Epoch[10/NA] Step[124] GlobalStep[1494/99999]: loss_noise_mse[0.0018] loss_fk_mse[0.0066] loss_depth[0.0187] total_loss[0.0271] Rank[0/16] 06/24/2025 13:03:43 INFO stats.py:314 | Epoch[10] Step[129] GlobalStep[1499] Training Speed: 445.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:05:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:03:45 INFO stats.py:394 | Epoch[10] completed. Training Speed: 299.76 samples/sec across all devices. Epoch Time: 58.50 sec. Average Epoch Time: 58.50 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 12:04:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:03:54 INFO stats.py:314 | Epoch[11] Step[17] GlobalStep[1524] Training Speed: 427.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:04:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:03:57 INFO loss_tracker.py:84 | Epoch[11/NA] Step[24] GlobalStep[1531/99999]: loss_noise_mse[0.0016] loss_fk_mse[0.0066] loss_depth[0.0187] total_loss[0.0269] Rank[0/16] 06/24/2025 13:04:04 INFO stats.py:314 | Epoch[11] Step[42] GlobalStep[1549] Training Speed: 421.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:04:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:04:07 INFO loss_tracker.py:84 | Epoch[11/NA] Step[49] GlobalStep[1556/99999]: loss_noise_mse[0.0017] loss_fk_mse[0.0062] loss_depth[0.0185] total_loss[0.0264] Rank[0/16] 06/24/2025 13:04:15 INFO stats.py:314 | Epoch[11] Step[67] GlobalStep[1574] Training Speed: 420.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:03:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:04:18 INFO loss_tracker.py:84 | Epoch[11/NA] Step[74] GlobalStep[1581/99999]: loss_noise_mse[0.0015] loss_fk_mse[0.0058] loss_depth[0.0185] total_loss[0.0258] Rank[0/16] 06/24/2025 13:04:26 INFO stats.py:314 | Epoch[11] Step[92] GlobalStep[1599] Training Speed: 429.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:03:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:04:29 INFO loss_tracker.py:84 | Epoch[11/NA] Step[99] GlobalStep[1606/99999]: loss_noise_mse[0.0016] loss_fk_mse[0.0063] loss_depth[0.0186] total_loss[0.0265] Rank[0/16] 06/24/2025 13:04:36 INFO stats.py:314 | Epoch[11] Step[117] GlobalStep[1624] Training Speed: 429.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:02:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:04:39 INFO loss_tracker.py:84 | Epoch[11/NA] Step[124] GlobalStep[1631/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0063] loss_depth[0.0183] total_loss[0.0260] Rank[0/16] 06/24/2025 13:04:44 INFO stats.py:394 | Epoch[11] completed. Training Speed: 300.13 samples/sec across all devices. Epoch Time: 58.43 sec. Average Epoch Time: 58.43 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 12:01:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:04:47 INFO stats.py:314 | Epoch[12] Step[5] GlobalStep[1649] Training Speed: 421.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:02:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:04:55 INFO loss_tracker.py:84 | Epoch[12/NA] Step[24] GlobalStep[1668/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0062] loss_depth[0.0184] total_loss[0.0260] Rank[0/16] 06/24/2025 13:04:58 INFO stats.py:314 | Epoch[12] Step[30] GlobalStep[1674] Training Speed: 423.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 12:01:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:05:05 INFO loss_tracker.py:84 | Epoch[12/NA] Step[49] GlobalStep[1693/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0056] loss_depth[0.0181] total_loss[0.0250] Rank[0/16] 06/24/2025 13:05:08 INFO stats.py:314 | Epoch[12] Step[55] GlobalStep[1699] Training Speed: 416.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 12:01:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:05:16 INFO loss_tracker.py:84 | Epoch[12/NA] Step[74] GlobalStep[1718/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0066] loss_depth[0.0181] total_loss[0.0262] Rank[0/16] 06/24/2025 13:05:19 INFO stats.py:314 | Epoch[12] Step[80] GlobalStep[1724] Training Speed: 434.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 12:00:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:05:26 INFO loss_tracker.py:84 | Epoch[12/NA] Step[99] GlobalStep[1743/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0060] loss_depth[0.0180] total_loss[0.0254] Rank[0/16] 06/24/2025 13:05:29 INFO stats.py:314 | Epoch[12] Step[105] GlobalStep[1749] Training Speed: 430.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:59:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:05:37 INFO loss_tracker.py:84 | Epoch[12/NA] Step[124] GlobalStep[1768/99999]: loss_noise_mse[0.0013] loss_fk_mse[0.0065] loss_depth[0.0178] total_loss[0.0256] Rank[0/16] 06/24/2025 13:05:40 INFO stats.py:314 | Epoch[12] Step[130] GlobalStep[1774] Training Speed: 263.43 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 11:58:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:05:42 INFO stats.py:394 | Epoch[12] completed. Training Speed: 300.99 samples/sec across all devices. Epoch Time: 58.26 sec. Average Epoch Time: 58.26 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:58:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:05:51 INFO stats.py:314 | Epoch[13] Step[18] GlobalStep[1799] Training Speed: 423.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:58:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:05:53 INFO loss_tracker.py:84 | Epoch[13/NA] Step[24] GlobalStep[1805/99999]: loss_noise_mse[0.0015] loss_fk_mse[0.0063] loss_depth[0.0177] total_loss[0.0255] Rank[0/16] 06/24/2025 13:06:01 INFO stats.py:314 | Epoch[13] Step[43] GlobalStep[1824] Training Speed: 424.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:57:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:06:03 INFO loss_tracker.py:84 | Epoch[13/NA] Step[49] GlobalStep[1830/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0061] loss_depth[0.0177] total_loss[0.0252] Rank[0/16] 06/24/2025 13:06:11 INFO stats.py:314 | Epoch[13] Step[68] GlobalStep[1849] Training Speed: 430.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:56:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:06:14 INFO loss_tracker.py:84 | Epoch[13/NA] Step[74] GlobalStep[1855/99999]: loss_noise_mse[0.0014] loss_fk_mse[0.0064] loss_depth[0.0175] total_loss[0.0253] Rank[0/16] 06/24/2025 13:06:22 INFO stats.py:314 | Epoch[13] Step[93] GlobalStep[1874] Training Speed: 433.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:56:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:06:25 INFO loss_tracker.py:84 | Epoch[13/NA] Step[99] GlobalStep[1880/99999]: loss_noise_mse[0.0013] loss_fk_mse[0.0064] loss_depth[0.0176] total_loss[0.0253] Rank[0/16] 06/24/2025 13:06:32 INFO stats.py:314 | Epoch[13] Step[118] GlobalStep[1899] Training Speed: 420.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:55:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:06:35 INFO loss_tracker.py:84 | Epoch[13/NA] Step[124] GlobalStep[1905/99999]: loss_noise_mse[0.0013] loss_fk_mse[0.0061] loss_depth[0.0175] total_loss[0.0249] Rank[0/16] 06/24/2025 13:06:40 INFO stats.py:394 | Epoch[13] completed. Training Speed: 303.13 samples/sec across all devices. Epoch Time: 57.85 sec. Average Epoch Time: 57.85 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:55:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:06:44 INFO stats.py:314 | Epoch[14] Step[6] GlobalStep[1924] Training Speed: 431.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:56:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:06:51 INFO loss_tracker.py:84 | Epoch[14/NA] Step[24] GlobalStep[1942/99999]: loss_noise_mse[0.0013] loss_fk_mse[0.0064] loss_depth[0.0174] total_loss[0.0250] Rank[0/16] 06/24/2025 13:06:54 INFO stats.py:314 | Epoch[14] Step[31] GlobalStep[1949] Training Speed: 420.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:55:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:07:02 INFO loss_tracker.py:84 | Epoch[14/NA] Step[49] GlobalStep[1967/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0063] loss_depth[0.0173] total_loss[0.0247] Rank[0/16] 06/24/2025 13:07:05 INFO stats.py:314 | Epoch[14] Step[56] GlobalStep[1974] Training Speed: 431.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:55:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:07:12 INFO loss_tracker.py:84 | Epoch[14/NA] Step[74] GlobalStep[1992/99999]: loss_noise_mse[0.0013] loss_fk_mse[0.0061] loss_depth[0.0173] total_loss[0.0247] Rank[0/16] 06/24/2025 13:07:15 INFO stats.py:314 | Epoch[14] Step[81] GlobalStep[1999] Training Speed: 417.99 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:54:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:07:24 INFO loss_tracker.py:84 | Epoch[14/NA] Step[99] GlobalStep[2017/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0063] loss_depth[0.0174] total_loss[0.0249] Rank[0/16] 06/24/2025 13:07:26 INFO stats.py:314 | Epoch[14] Step[106] GlobalStep[2024] Training Speed: 420.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:54:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:07:34 INFO loss_tracker.py:84 | Epoch[14/NA] Step[124] GlobalStep[2042/99999]: loss_noise_mse[0.0011] loss_fk_mse[0.0061] loss_depth[0.0173] total_loss[0.0245] Rank[0/16] 06/24/2025 13:07:36 INFO stats.py:314 | Epoch[14] Step[131] GlobalStep[2049] Training Speed: 450.13 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:53:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:07:38 INFO stats.py:394 | Epoch[14] completed. Training Speed: 299.78 samples/sec across all devices. Epoch Time: 58.50 sec. Average Epoch Time: 58.50 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:53:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:07:48 INFO stats.py:314 | Epoch[15] Step[19] GlobalStep[2074] Training Speed: 425.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:53:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:07:50 INFO loss_tracker.py:84 | Epoch[15/NA] Step[24] GlobalStep[2079/99999]: loss_noise_mse[0.0013] loss_fk_mse[0.0062] loss_depth[0.0170] total_loss[0.0245] Rank[0/16] 06/24/2025 13:07:58 INFO stats.py:314 | Epoch[15] Step[44] GlobalStep[2099] Training Speed: 404.02 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:53:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:08:01 INFO loss_tracker.py:84 | Epoch[15/NA] Step[49] GlobalStep[2104/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0063] loss_depth[0.0171] total_loss[0.0246] Rank[0/16] 06/24/2025 13:08:09 INFO stats.py:314 | Epoch[15] Step[69] GlobalStep[2124] Training Speed: 430.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:52:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:08:11 INFO loss_tracker.py:84 | Epoch[15/NA] Step[74] GlobalStep[2129/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0059] loss_depth[0.0169] total_loss[0.0239] Rank[0/16] 06/24/2025 13:08:19 INFO stats.py:314 | Epoch[15] Step[94] GlobalStep[2149] Training Speed: 427.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:52:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:08:22 INFO loss_tracker.py:84 | Epoch[15/NA] Step[99] GlobalStep[2154/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0062] loss_depth[0.0168] total_loss[0.0242] Rank[0/16] 06/24/2025 13:08:30 INFO stats.py:314 | Epoch[15] Step[119] GlobalStep[2174] Training Speed: 419.14 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:52:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:08:32 INFO loss_tracker.py:84 | Epoch[15/NA] Step[124] GlobalStep[2179/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0062] loss_depth[0.0169] total_loss[0.0242] Rank[0/16] 06/24/2025 13:08:36 INFO stats.py:394 | Epoch[15] completed. Training Speed: 301.97 samples/sec across all devices. Epoch Time: 58.07 sec. Average Epoch Time: 58.07 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:50:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:08:41 INFO stats.py:314 | Epoch[16] Step[7] GlobalStep[2199] Training Speed: 408.57 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:51:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:08:48 INFO loss_tracker.py:84 | Epoch[16/NA] Step[24] GlobalStep[2216/99999]: loss_noise_mse[0.0011] loss_fk_mse[0.0061] loss_depth[0.0168] total_loss[0.0241] Rank[0/16] 06/24/2025 13:08:51 INFO stats.py:314 | Epoch[16] Step[32] GlobalStep[2224] Training Speed: 425.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:51:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:08:58 INFO loss_tracker.py:84 | Epoch[16/NA] Step[49] GlobalStep[2241/99999]: loss_noise_mse[0.0012] loss_fk_mse[0.0057] loss_depth[0.0168] total_loss[0.0237] Rank[0/16] 06/24/2025 13:09:02 INFO stats.py:314 | Epoch[16] Step[57] GlobalStep[2249] Training Speed: 213.23 samples/sec across all devices. Average Step Time: 0.60 sec. Estimated Remaining Time: 11:50:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:09:09 INFO loss_tracker.py:84 | Epoch[16/NA] Step[74] GlobalStep[2266/99999]: loss_noise_mse[0.0010] loss_fk_mse[0.0057] loss_depth[0.0168] total_loss[0.0235] Rank[0/16] 06/24/2025 13:09:13 INFO stats.py:314 | Epoch[16] Step[82] GlobalStep[2274] Training Speed: 423.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:50:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:09:19 INFO loss_tracker.py:84 | Epoch[16/NA] Step[99] GlobalStep[2291/99999]: loss_noise_mse[0.0010] loss_fk_mse[0.0059] loss_depth[0.0168] total_loss[0.0237] Rank[0/16] 06/24/2025 13:09:23 INFO stats.py:314 | Epoch[16] Step[107] GlobalStep[2299] Training Speed: 421.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:50:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:09:31 INFO loss_tracker.py:84 | Epoch[16/NA] Step[124] GlobalStep[2316/99999]: loss_noise_mse[0.0011] loss_fk_mse[0.0060] loss_depth[0.0168] total_loss[0.0239] Rank[0/16] 06/24/2025 13:09:34 INFO stats.py:314 | Epoch[16] Step[132] GlobalStep[2324] Training Speed: 449.61 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:49:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:09:35 INFO stats.py:394 | Epoch[16] completed. Training Speed: 297.71 samples/sec across all devices. Epoch Time: 58.90 sec. Average Epoch Time: 58.90 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:49:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:09:45 INFO stats.py:314 | Epoch[17] Step[20] GlobalStep[2349] Training Speed: 421.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:49:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:09:47 INFO loss_tracker.py:84 | Epoch[17/NA] Step[24] GlobalStep[2353/99999]: loss_noise_mse[0.0010] loss_fk_mse[0.0055] loss_depth[0.0168] total_loss[0.0232] Rank[0/16] 06/24/2025 13:09:56 INFO stats.py:314 | Epoch[17] Step[45] GlobalStep[2374] Training Speed: 429.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:49:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:09:58 INFO loss_tracker.py:84 | Epoch[17/NA] Step[49] GlobalStep[2378/99999]: loss_noise_mse[0.0010] loss_fk_mse[0.0060] loss_depth[0.0166] total_loss[0.0235] Rank[0/16] 06/24/2025 13:10:07 INFO stats.py:314 | Epoch[17] Step[70] GlobalStep[2399] Training Speed: 228.22 samples/sec across all devices. Average Step Time: 0.56 sec. Estimated Remaining Time: 11:49:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:10:09 INFO loss_tracker.py:84 | Epoch[17/NA] Step[74] GlobalStep[2403/99999]: loss_noise_mse[0.0010] loss_fk_mse[0.0063] loss_depth[0.0168] total_loss[0.0241] Rank[0/16] 06/24/2025 13:10:18 INFO stats.py:314 | Epoch[17] Step[95] GlobalStep[2424] Training Speed: 410.91 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:49:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:10:19 INFO loss_tracker.py:84 | Epoch[17/NA] Step[99] GlobalStep[2428/99999]: loss_noise_mse[0.0010] loss_fk_mse[0.0062] loss_depth[0.0165] total_loss[0.0236] Rank[0/16] 06/24/2025 13:10:28 INFO stats.py:314 | Epoch[17] Step[120] GlobalStep[2449] Training Speed: 252.42 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 11:48:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:10:30 INFO loss_tracker.py:84 | Epoch[17/NA] Step[124] GlobalStep[2453/99999]: loss_noise_mse[0.0009] loss_fk_mse[0.0057] loss_depth[0.0164] total_loss[0.0230] Rank[0/16] 06/24/2025 13:10:34 INFO stats.py:394 | Epoch[17] completed. Training Speed: 297.71 samples/sec across all devices. Epoch Time: 58.90 sec. Average Epoch Time: 58.90 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:47:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:10:39 INFO stats.py:314 | Epoch[18] Step[8] GlobalStep[2474] Training Speed: 435.87 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:48:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:10:45 INFO loss_tracker.py:84 | Epoch[18/NA] Step[24] GlobalStep[2490/99999]: loss_noise_mse[0.0009] loss_fk_mse[0.0058] loss_depth[0.0164] total_loss[0.0232] Rank[0/16] 06/24/2025 13:10:49 INFO stats.py:314 | Epoch[18] Step[33] GlobalStep[2499] Training Speed: 239.32 samples/sec across all devices. Average Step Time: 0.53 sec. Estimated Remaining Time: 11:47:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:10:56 INFO loss_tracker.py:84 | Epoch[18/NA] Step[49] GlobalStep[2515/99999]: loss_noise_mse[0.0009] loss_fk_mse[0.0066] loss_depth[0.0164] total_loss[0.0238] Rank[0/16] 06/24/2025 13:11:00 INFO stats.py:314 | Epoch[18] Step[58] GlobalStep[2524] Training Speed: 414.91 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:47:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:11:06 INFO loss_tracker.py:84 | Epoch[18/NA] Step[74] GlobalStep[2540/99999]: loss_noise_mse[0.0009] loss_fk_mse[0.0058] loss_depth[0.0163] total_loss[0.0230] Rank[0/16] 06/24/2025 13:11:10 INFO stats.py:314 | Epoch[18] Step[83] GlobalStep[2549] Training Speed: 420.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:46:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:11:17 INFO loss_tracker.py:84 | Epoch[18/NA] Step[99] GlobalStep[2565/99999]: loss_noise_mse[0.0009] loss_fk_mse[0.0057] loss_depth[0.0163] total_loss[0.0229] Rank[0/16] 06/24/2025 13:11:21 INFO stats.py:314 | Epoch[18] Step[108] GlobalStep[2574] Training Speed: 426.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:46:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:11:27 INFO loss_tracker.py:84 | Epoch[18/NA] Step[124] GlobalStep[2590/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0063] loss_depth[0.0162] total_loss[0.0234] Rank[0/16] 06/24/2025 13:11:31 INFO stats.py:314 | Epoch[18] Step[133] GlobalStep[2599] Training Speed: 449.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:45:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:11:32 INFO stats.py:394 | Epoch[18] completed. Training Speed: 304.76 samples/sec across all devices. Epoch Time: 57.54 sec. Average Epoch Time: 57.54 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:45:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:11:42 INFO stats.py:314 | Epoch[19] Step[21] GlobalStep[2624] Training Speed: 429.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:46:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:11:44 INFO loss_tracker.py:84 | Epoch[19/NA] Step[24] GlobalStep[2627/99999]: loss_noise_mse[0.0009] loss_fk_mse[0.0060] loss_depth[0.0163] total_loss[0.0232] Rank[0/16] 06/24/2025 13:11:53 INFO stats.py:314 | Epoch[19] Step[46] GlobalStep[2649] Training Speed: 428.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:45:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:11:54 INFO loss_tracker.py:84 | Epoch[19/NA] Step[49] GlobalStep[2652/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0067] loss_depth[0.0163] total_loss[0.0238] Rank[0/16] 06/24/2025 13:12:04 INFO stats.py:314 | Epoch[19] Step[71] GlobalStep[2674] Training Speed: 444.44 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:45:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:12:05 INFO loss_tracker.py:84 | Epoch[19/NA] Step[74] GlobalStep[2677/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0060] loss_depth[0.0163] total_loss[0.0231] Rank[0/16] 06/24/2025 13:12:14 INFO stats.py:314 | Epoch[19] Step[96] GlobalStep[2699] Training Speed: 427.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:45:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:12:16 INFO loss_tracker.py:84 | Epoch[19/NA] Step[99] GlobalStep[2702/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0057] loss_depth[0.0162] total_loss[0.0227] Rank[0/16] 06/24/2025 13:12:25 INFO stats.py:314 | Epoch[19] Step[121] GlobalStep[2724] Training Speed: 448.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:44:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:12:26 INFO loss_tracker.py:84 | Epoch[19/NA] Step[124] GlobalStep[2727/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0061] loss_depth[0.0162] total_loss[0.0231] Rank[0/16] 06/24/2025 13:12:30 INFO stats.py:394 | Epoch[19] completed. Training Speed: 298.52 samples/sec across all devices. Epoch Time: 58.74 sec. Average Epoch Time: 58.74 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:44:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:12:36 INFO stats.py:314 | Epoch[20] Step[9] GlobalStep[2749] Training Speed: 425.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:44:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:12:43 INFO loss_tracker.py:84 | Epoch[20/NA] Step[24] GlobalStep[2764/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0060] loss_depth[0.0162] total_loss[0.0229] Rank[0/16] 06/24/2025 13:12:47 INFO stats.py:314 | Epoch[20] Step[34] GlobalStep[2774] Training Speed: 380.90 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 11:44:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:12:53 INFO loss_tracker.py:84 | Epoch[20/NA] Step[49] GlobalStep[2789/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0060] loss_depth[0.0161] total_loss[0.0228] Rank[0/16] 06/24/2025 13:12:58 INFO stats.py:314 | Epoch[20] Step[59] GlobalStep[2799] Training Speed: 431.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:44:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:13:04 INFO loss_tracker.py:84 | Epoch[20/NA] Step[74] GlobalStep[2814/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0060] loss_depth[0.0161] total_loss[0.0229] Rank[0/16] 06/24/2025 13:13:08 INFO stats.py:314 | Epoch[20] Step[84] GlobalStep[2824] Training Speed: 429.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:44:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:13:14 INFO loss_tracker.py:84 | Epoch[20/NA] Step[99] GlobalStep[2839/99999]: loss_noise_mse[0.0008] loss_fk_mse[0.0063] loss_depth[0.0160] total_loss[0.0230] Rank[0/16] 06/24/2025 13:13:18 INFO stats.py:314 | Epoch[20] Step[109] GlobalStep[2849] Training Speed: 433.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:43:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:13:25 INFO loss_tracker.py:84 | Epoch[20/NA] Step[124] GlobalStep[2864/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0057] loss_depth[0.0160] total_loss[0.0224] Rank[0/16] 06/24/2025 13:13:29 INFO stats.py:314 | Epoch[20] Step[134] GlobalStep[2874] Training Speed: 451.88 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:42:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:13:29 INFO stats.py:394 | Epoch[20] completed. Training Speed: 297.18 samples/sec across all devices. Epoch Time: 59.01 sec. Average Epoch Time: 59.01 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:42:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:13:40 INFO stats.py:314 | Epoch[21] Step[22] GlobalStep[2899] Training Speed: 434.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:43:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:13:41 INFO loss_tracker.py:84 | Epoch[21/NA] Step[24] GlobalStep[2901/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0062] loss_depth[0.0159] total_loss[0.0228] Rank[0/16] 06/24/2025 13:13:50 INFO stats.py:314 | Epoch[21] Step[47] GlobalStep[2924] Training Speed: 421.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:42:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:13:51 INFO loss_tracker.py:84 | Epoch[21/NA] Step[49] GlobalStep[2926/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0060] loss_depth[0.0157] total_loss[0.0225] Rank[0/16] 06/24/2025 13:14:01 INFO stats.py:314 | Epoch[21] Step[72] GlobalStep[2949] Training Speed: 432.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:42:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:14:02 INFO loss_tracker.py:84 | Epoch[21/NA] Step[74] GlobalStep[2951/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0055] loss_depth[0.0158] total_loss[0.0219] Rank[0/16] 06/24/2025 13:14:12 INFO stats.py:314 | Epoch[21] Step[97] GlobalStep[2974] Training Speed: 427.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:41:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:14:12 INFO loss_tracker.py:84 | Epoch[21/NA] Step[99] GlobalStep[2976/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0066] loss_depth[0.0157] total_loss[0.0231] Rank[0/16] 06/24/2025 13:14:22 INFO stats.py:314 | Epoch[21] Step[122] GlobalStep[2999] Training Speed: 450.03 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:41:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:14:23 INFO loss_tracker.py:84 | Epoch[21/NA] Step[124] GlobalStep[3001/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0056] loss_depth[0.0158] total_loss[0.0221] Rank[0/16] 06/24/2025 13:14:27 INFO stats.py:394 | Epoch[21] completed. Training Speed: 301.88 samples/sec across all devices. Epoch Time: 58.09 sec. Average Epoch Time: 58.09 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:41:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:14:33 INFO stats.py:314 | Epoch[22] Step[10] GlobalStep[3024] Training Speed: 428.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:41:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:14:39 INFO loss_tracker.py:84 | Epoch[22/NA] Step[24] GlobalStep[3038/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0061] loss_depth[0.0157] total_loss[0.0225] Rank[0/16] 06/24/2025 13:14:44 INFO stats.py:314 | Epoch[22] Step[35] GlobalStep[3049] Training Speed: 421.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:41:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:14:50 INFO loss_tracker.py:84 | Epoch[22/NA] Step[49] GlobalStep[3063/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0057] loss_depth[0.0157] total_loss[0.0220] Rank[0/16] 06/24/2025 13:14:54 INFO stats.py:314 | Epoch[22] Step[60] GlobalStep[3074] Training Speed: 430.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:40:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:15:01 INFO loss_tracker.py:84 | Epoch[22/NA] Step[74] GlobalStep[3088/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0061] loss_depth[0.0158] total_loss[0.0225] Rank[0/16] 06/24/2025 13:15:05 INFO stats.py:314 | Epoch[22] Step[85] GlobalStep[3099] Training Speed: 413.13 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:40:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:15:12 INFO loss_tracker.py:84 | Epoch[22/NA] Step[99] GlobalStep[3113/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0059] loss_depth[0.0156] total_loss[0.0223] Rank[0/16] 06/24/2025 13:15:16 INFO stats.py:314 | Epoch[22] Step[110] GlobalStep[3124] Training Speed: 401.07 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:40:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:15:22 INFO loss_tracker.py:84 | Epoch[22/NA] Step[124] GlobalStep[3138/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0058] loss_depth[0.0157] total_loss[0.0222] Rank[0/16] 06/24/2025 13:15:26 INFO stats.py:314 | Epoch[22] Step[135] GlobalStep[3149] Training Speed: 447.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:39:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:15:26 INFO stats.py:394 | Epoch[22] completed. Training Speed: 298.90 samples/sec across all devices. Epoch Time: 58.67 sec. Average Epoch Time: 58.67 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:39:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:15:37 INFO stats.py:314 | Epoch[23] Step[23] GlobalStep[3174] Training Speed: 425.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:39:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:15:38 INFO loss_tracker.py:84 | Epoch[23/NA] Step[24] GlobalStep[3175/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0058] loss_depth[0.0157] total_loss[0.0222] Rank[0/16] 06/24/2025 13:15:48 INFO stats.py:314 | Epoch[23] Step[48] GlobalStep[3199] Training Speed: 428.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:39:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:15:48 INFO loss_tracker.py:84 | Epoch[23/NA] Step[49] GlobalStep[3200/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0062] loss_depth[0.0158] total_loss[0.0226] Rank[0/16] 06/24/2025 13:15:59 INFO stats.py:314 | Epoch[23] Step[73] GlobalStep[3224] Training Speed: 421.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:39:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:15:59 INFO loss_tracker.py:84 | Epoch[23/NA] Step[74] GlobalStep[3225/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0063] loss_depth[0.0155] total_loss[0.0225] Rank[0/16] 06/24/2025 13:16:09 INFO stats.py:314 | Epoch[23] Step[98] GlobalStep[3249] Training Speed: 420.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:39:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:16:10 INFO loss_tracker.py:84 | Epoch[23/NA] Step[99] GlobalStep[3250/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0059] loss_depth[0.0156] total_loss[0.0222] Rank[0/16] 06/24/2025 13:16:20 INFO stats.py:314 | Epoch[23] Step[123] GlobalStep[3274] Training Speed: 450.91 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:38:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:16:20 INFO loss_tracker.py:84 | Epoch[23/NA] Step[124] GlobalStep[3275/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0059] loss_depth[0.0157] total_loss[0.0222] Rank[0/16] 06/24/2025 13:16:24 INFO stats.py:394 | Epoch[23] completed. Training Speed: 300.52 samples/sec across all devices. Epoch Time: 58.35 sec. Average Epoch Time: 58.35 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:38:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:16:31 INFO stats.py:314 | Epoch[24] Step[11] GlobalStep[3299] Training Speed: 424.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:38:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:16:36 INFO loss_tracker.py:84 | Epoch[24/NA] Step[24] GlobalStep[3312/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0059] loss_depth[0.0155] total_loss[0.0221] Rank[0/16] 06/24/2025 13:16:41 INFO stats.py:314 | Epoch[24] Step[36] GlobalStep[3324] Training Speed: 428.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:38:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:16:47 INFO loss_tracker.py:84 | Epoch[24/NA] Step[49] GlobalStep[3337/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0057] loss_depth[0.0155] total_loss[0.0218] Rank[0/16] 06/24/2025 13:16:52 INFO stats.py:314 | Epoch[24] Step[61] GlobalStep[3349] Training Speed: 430.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:37:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:16:57 INFO loss_tracker.py:84 | Epoch[24/NA] Step[74] GlobalStep[3362/99999]: loss_noise_mse[0.0007] loss_fk_mse[0.0058] loss_depth[0.0155] total_loss[0.0219] Rank[0/16] 06/24/2025 13:17:02 INFO stats.py:314 | Epoch[24] Step[86] GlobalStep[3374] Training Speed: 420.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:37:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:17:08 INFO loss_tracker.py:84 | Epoch[24/NA] Step[99] GlobalStep[3387/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0058] loss_depth[0.0155] total_loss[0.0218] Rank[0/16] 06/24/2025 13:17:13 INFO stats.py:314 | Epoch[24] Step[111] GlobalStep[3399] Training Speed: 414.14 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:37:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:17:18 INFO loss_tracker.py:84 | Epoch[24/NA] Step[124] GlobalStep[3412/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0059] loss_depth[0.0154] total_loss[0.0219] Rank[0/16] 06/24/2025 13:17:22 INFO stats.py:314 | Epoch[24] Step[136] GlobalStep[3424] Training Speed: 449.72 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:36:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:17:22 INFO stats.py:394 | Epoch[24] completed. Training Speed: 303.27 samples/sec across all devices. Epoch Time: 57.82 sec. Average Epoch Time: 57.82 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:36:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:17:34 INFO stats.py:314 | Epoch[25] Step[24] GlobalStep[3449] Training Speed: 426.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:36:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:17:34 INFO loss_tracker.py:84 | Epoch[25/NA] Step[24] GlobalStep[3449/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0055] loss_depth[0.0154] total_loss[0.0215] Rank[0/16] 06/24/2025 13:17:44 INFO stats.py:314 | Epoch[25] Step[49] GlobalStep[3474] Training Speed: 436.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:36:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:17:44 INFO loss_tracker.py:84 | Epoch[25/NA] Step[49] GlobalStep[3474/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0059] loss_depth[0.0155] total_loss[0.0220] Rank[0/16] 06/24/2025 13:17:55 INFO stats.py:314 | Epoch[25] Step[74] GlobalStep[3499] Training Speed: 421.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:35:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:17:55 INFO loss_tracker.py:84 | Epoch[25/NA] Step[74] GlobalStep[3499/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0055] loss_depth[0.0156] total_loss[0.0217] Rank[0/16] 06/24/2025 13:18:05 INFO stats.py:314 | Epoch[25] Step[99] GlobalStep[3524] Training Speed: 432.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:35:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:18:05 INFO loss_tracker.py:84 | Epoch[25/NA] Step[99] GlobalStep[3524/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0059] loss_depth[0.0153] total_loss[0.0217] Rank[0/16] 06/24/2025 13:18:16 INFO stats.py:314 | Epoch[25] Step[124] GlobalStep[3549] Training Speed: 449.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:35:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:18:16 INFO loss_tracker.py:84 | Epoch[25/NA] Step[124] GlobalStep[3549/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0057] loss_depth[0.0153] total_loss[0.0216] Rank[0/16] 06/24/2025 13:18:20 INFO stats.py:394 | Epoch[25] completed. Training Speed: 305.23 samples/sec across all devices. Epoch Time: 57.45 sec. Average Epoch Time: 57.45 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:34:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:18:26 INFO stats.py:314 | Epoch[26] Step[12] GlobalStep[3574] Training Speed: 226.44 samples/sec across all devices. Average Step Time: 0.57 sec. Estimated Remaining Time: 11:34:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:18:31 INFO loss_tracker.py:84 | Epoch[26/NA] Step[24] GlobalStep[3586/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0059] loss_depth[0.0152] total_loss[0.0218] Rank[0/16] 06/24/2025 13:18:37 INFO stats.py:314 | Epoch[26] Step[37] GlobalStep[3599] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:34:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:18:42 INFO loss_tracker.py:84 | Epoch[26/NA] Step[49] GlobalStep[3611/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0056] loss_depth[0.0155] total_loss[0.0216] Rank[0/16] 06/24/2025 13:18:47 INFO stats.py:314 | Epoch[26] Step[62] GlobalStep[3624] Training Speed: 433.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:34:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:18:53 INFO loss_tracker.py:84 | Epoch[26/NA] Step[74] GlobalStep[3636/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0058] loss_depth[0.0153] total_loss[0.0216] Rank[0/16] 06/24/2025 13:18:58 INFO stats.py:314 | Epoch[26] Step[87] GlobalStep[3649] Training Speed: 433.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:34:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:19:03 INFO loss_tracker.py:84 | Epoch[26/NA] Step[99] GlobalStep[3661/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0056] loss_depth[0.0152] total_loss[0.0214] Rank[0/16] 06/24/2025 13:19:09 INFO stats.py:314 | Epoch[26] Step[112] GlobalStep[3674] Training Speed: 430.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:33:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:19:13 INFO loss_tracker.py:84 | Epoch[26/NA] Step[124] GlobalStep[3686/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0058] loss_depth[0.0154] total_loss[0.0217] Rank[0/16] 06/24/2025 13:19:18 INFO stats.py:394 | Epoch[26] completed. Training Speed: 300.86 samples/sec across all devices. Epoch Time: 58.29 sec. Average Epoch Time: 58.29 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:33:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:19:19 INFO stats.py:314 | Epoch[27] Step[0] GlobalStep[3699] Training Speed: 400.95 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:33:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:19:29 INFO loss_tracker.py:84 | Epoch[27/NA] Step[24] GlobalStep[3723/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0060] loss_depth[0.0152] total_loss[0.0217] Rank[0/16] 06/24/2025 13:19:30 INFO stats.py:314 | Epoch[27] Step[25] GlobalStep[3724] Training Speed: 430.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:33:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:19:40 INFO loss_tracker.py:84 | Epoch[27/NA] Step[49] GlobalStep[3748/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0059] loss_depth[0.0154] total_loss[0.0218] Rank[0/16] 06/24/2025 13:19:40 INFO stats.py:314 | Epoch[27] Step[50] GlobalStep[3749] Training Speed: 397.69 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:33:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:19:50 INFO loss_tracker.py:84 | Epoch[27/NA] Step[74] GlobalStep[3773/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0056] loss_depth[0.0152] total_loss[0.0213] Rank[0/16] 06/24/2025 13:19:51 INFO stats.py:314 | Epoch[27] Step[75] GlobalStep[3774] Training Speed: 423.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:32:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:20:01 INFO loss_tracker.py:84 | Epoch[27/NA] Step[99] GlobalStep[3798/99999]: loss_noise_mse[0.0006] loss_fk_mse[0.0058] loss_depth[0.0151] total_loss[0.0216] Rank[0/16] 06/24/2025 13:20:01 INFO stats.py:314 | Epoch[27] Step[100] GlobalStep[3799] Training Speed: 408.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:32:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:20:11 INFO loss_tracker.py:84 | Epoch[27/NA] Step[124] GlobalStep[3823/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0057] loss_depth[0.0153] total_loss[0.0215] Rank[0/16] 06/24/2025 13:20:11 INFO stats.py:314 | Epoch[27] Step[125] GlobalStep[3824] Training Speed: 422.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:31:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:20:15 INFO stats.py:394 | Epoch[27] completed. Training Speed: 306.01 samples/sec across all devices. Epoch Time: 57.30 sec. Average Epoch Time: 57.30 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:31:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:20:22 INFO stats.py:314 | Epoch[28] Step[13] GlobalStep[3849] Training Speed: 436.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:31:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:20:27 INFO loss_tracker.py:84 | Epoch[28/NA] Step[24] GlobalStep[3860/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0059] loss_depth[0.0151] total_loss[0.0215] Rank[0/16] 06/24/2025 13:20:33 INFO stats.py:314 | Epoch[28] Step[38] GlobalStep[3874] Training Speed: 430.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:31:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:20:38 INFO loss_tracker.py:84 | Epoch[28/NA] Step[49] GlobalStep[3885/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0058] loss_depth[0.0151] total_loss[0.0214] Rank[0/16] 06/24/2025 13:20:44 INFO stats.py:314 | Epoch[28] Step[63] GlobalStep[3899] Training Speed: 421.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:31:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:20:49 INFO loss_tracker.py:84 | Epoch[28/NA] Step[74] GlobalStep[3910/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0059] loss_depth[0.0150] total_loss[0.0215] Rank[0/16] 06/24/2025 13:20:55 INFO stats.py:314 | Epoch[28] Step[88] GlobalStep[3924] Training Speed: 436.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:31:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:20:59 INFO loss_tracker.py:84 | Epoch[28/NA] Step[99] GlobalStep[3935/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0059] loss_depth[0.0151] total_loss[0.0215] Rank[0/16] 06/24/2025 13:21:05 INFO stats.py:314 | Epoch[28] Step[113] GlobalStep[3949] Training Speed: 434.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:30:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:21:10 INFO loss_tracker.py:84 | Epoch[28/NA] Step[124] GlobalStep[3960/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0055] loss_depth[0.0150] total_loss[0.0210] Rank[0/16] 06/24/2025 13:21:14 INFO stats.py:394 | Epoch[28] completed. Training Speed: 298.76 samples/sec across all devices. Epoch Time: 58.70 sec. Average Epoch Time: 58.70 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:30:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:21:16 INFO stats.py:314 | Epoch[29] Step[1] GlobalStep[3974] Training Speed: 433.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:30:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:21:26 INFO loss_tracker.py:84 | Epoch[29/NA] Step[24] GlobalStep[3997/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0059] loss_depth[0.0149] total_loss[0.0213] Rank[0/16] 06/24/2025 13:21:26 INFO stats.py:314 | Epoch[29] Step[26] GlobalStep[3999] Training Speed: 418.19 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:30:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:21:27 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_0
Rank[14/16] 06/24/2025 13:21:27 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[9/16] 06/24/2025 13:21:27 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[11/16] 06/24/2025 13:21:27 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[12/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[2/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[13/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[6/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[15/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[10/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[8/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[4/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[1/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[7/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[3/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[5/16] 06/24/2025 13:21:28 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[0/16] 06/24/2025 13:21:28 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_0/model.safetensors
Rank[0/16] 06/24/2025 13:21:29 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_0/optimizer.bin
Rank[0/16] 06/24/2025 13:21:29 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_0/scheduler.bin
Rank[0/16] 06/24/2025 13:21:29 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_0/sampler.bin
Rank[0/16] 06/24/2025 13:21:29 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_0/random_states_0.pkl
Rank[0/16] 06/24/2025 13:21:29 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_0/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 13:21:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 3999 to /job_data/checkpoints/checkpoint_0
Rank[0/16] 06/24/2025 13:21:39 INFO loss_tracker.py:84 | Epoch[29/NA] Step[49] GlobalStep[4022/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0057] loss_depth[0.0150] total_loss[0.0213]
Rank[0/16] 06/24/2025 13:21:40 INFO stats.py:314 | Epoch[29] Step[51] GlobalStep[4024] Training Speed: 411.03 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:31:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 13:21:49 INFO loss_tracker.py:84 | Epoch[29/NA] Step[74] GlobalStep[4047/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0062] loss_depth[0.0151] total_loss[0.0218]
Rank[0/16] 06/24/2025 13:21:50 INFO stats.py:314 | Epoch[29] Step[76] GlobalStep[4049] Training Speed: 415.81 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:30:52. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 13:22:00 INFO loss_tracker.py:84 | Epoch[29/NA] Step[99] GlobalStep[4072/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0063] loss_depth[0.0148] total_loss[0.0216] Rank[0/16] 06/24/2025 13:22:01 INFO stats.py:314 | Epoch[29] Step[101] GlobalStep[4074] Training Speed: 430.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:30:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:22:10 INFO loss_tracker.py:84 | Epoch[29/NA] Step[124] GlobalStep[4097/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0061] loss_depth[0.0150] total_loss[0.0216] Rank[0/16] 06/24/2025 13:22:11 INFO stats.py:314 | Epoch[29] Step[126] GlobalStep[4099] Training Speed: 446.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:30:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:22:14 INFO stats.py:394 | Epoch[29] completed. Training Speed: 289.84 samples/sec across all devices. Epoch Time: 60.50 sec. Average Epoch Time: 60.50 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 11:29:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:22:22 INFO stats.py:314 | Epoch[30] Step[14] GlobalStep[4124] Training Speed: 411.39 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:30:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:22:26 INFO loss_tracker.py:84 | Epoch[30/NA] Step[24] GlobalStep[4134/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0058] loss_depth[0.0150] total_loss[0.0212] Rank[0/16] 06/24/2025 13:22:32 INFO stats.py:314 | Epoch[30] Step[39] GlobalStep[4149] Training Speed: 418.78 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:29:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:22:37 INFO loss_tracker.py:84 | Epoch[30/NA] Step[49] GlobalStep[4159/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0060] loss_depth[0.0150] total_loss[0.0215] Rank[0/16] 06/24/2025 13:22:44 INFO stats.py:314 | Epoch[30] Step[64] GlobalStep[4174] Training Speed: 432.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:29:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:22:48 INFO loss_tracker.py:84 | Epoch[30/NA] Step[74] GlobalStep[4184/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0061] loss_depth[0.0150] total_loss[0.0215] Rank[0/16] 06/24/2025 13:22:54 INFO stats.py:314 | Epoch[30] Step[89] GlobalStep[4199] Training Speed: 424.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:29:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:22:59 INFO loss_tracker.py:84 | Epoch[30/NA] Step[99] GlobalStep[4209/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0056] loss_depth[0.0148] total_loss[0.0209] Rank[0/16] 06/24/2025 13:23:05 INFO stats.py:314 | Epoch[30] Step[114] GlobalStep[4224] Training Speed: 411.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:29:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:23:09 INFO loss_tracker.py:84 | Epoch[30/NA] Step[124] GlobalStep[4234/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0057] loss_depth[0.0148] total_loss[0.0210] Rank[0/16] 06/24/2025 13:23:13 INFO stats.py:394 | Epoch[30] completed. Training Speed: 298.72 samples/sec across all devices. Epoch Time: 58.70 sec. Average Epoch Time: 58.70 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:28:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:23:15 INFO stats.py:314 | Epoch[31] Step[2] GlobalStep[4249] Training Speed: 428.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:29:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:23:25 INFO loss_tracker.py:84 | Epoch[31/NA] Step[24] GlobalStep[4271/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0057] loss_depth[0.0150] total_loss[0.0211] Rank[0/16] 06/24/2025 13:23:26 INFO stats.py:314 | Epoch[31] Step[27] GlobalStep[4274] Training Speed: 433.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:28:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:23:35 INFO loss_tracker.py:84 | Epoch[31/NA] Step[49] GlobalStep[4296/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0055] loss_depth[0.0148] total_loss[0.0208] Rank[0/16] 06/24/2025 13:23:36 INFO stats.py:314 | Epoch[31] Step[52] GlobalStep[4299] Training Speed: 422.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:28:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:23:46 INFO loss_tracker.py:84 | Epoch[31/NA] Step[74] GlobalStep[4321/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0059] loss_depth[0.0149] total_loss[0.0212] Rank[0/16] 06/24/2025 13:23:47 INFO stats.py:314 | Epoch[31] Step[77] GlobalStep[4324] Training Speed: 413.80 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:28:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:23:56 INFO loss_tracker.py:84 | Epoch[31/NA] Step[99] GlobalStep[4346/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0054] loss_depth[0.0148] total_loss[0.0207] Rank[0/16] 06/24/2025 13:23:57 INFO stats.py:314 | Epoch[31] Step[102] GlobalStep[4349] Training Speed: 433.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:27:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:24:07 INFO loss_tracker.py:84 | Epoch[31/NA] Step[124] GlobalStep[4371/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0058] loss_depth[0.0150] total_loss[0.0212] Rank[0/16] 06/24/2025 13:24:08 INFO stats.py:314 | Epoch[31] Step[127] GlobalStep[4374] Training Speed: 452.62 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:27:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:24:11 INFO stats.py:394 | Epoch[31] completed. Training Speed: 302.72 samples/sec across all devices. Epoch Time: 57.93 sec. Average Epoch Time: 57.93 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:27:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:24:18 INFO stats.py:314 | Epoch[32] Step[15] GlobalStep[4399] Training Speed: 426.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:27:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:24:23 INFO loss_tracker.py:84 | Epoch[32/NA] Step[24] GlobalStep[4408/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0059] loss_depth[0.0150] total_loss[0.0213] Rank[0/16] 06/24/2025 13:24:29 INFO stats.py:314 | Epoch[32] Step[40] GlobalStep[4424] Training Speed: 434.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:27:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:24:33 INFO loss_tracker.py:84 | Epoch[32/NA] Step[49] GlobalStep[4433/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0054] loss_depth[0.0148] total_loss[0.0206] Rank[0/16] 06/24/2025 13:24:40 INFO stats.py:314 | Epoch[32] Step[65] GlobalStep[4449] Training Speed: 406.20 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:26:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:24:44 INFO loss_tracker.py:84 | Epoch[32/NA] Step[74] GlobalStep[4458/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0062] loss_depth[0.0148] total_loss[0.0214] Rank[0/16] 06/24/2025 13:24:51 INFO stats.py:314 | Epoch[32] Step[90] GlobalStep[4474] Training Speed: 424.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:26:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:24:54 INFO loss_tracker.py:84 | Epoch[32/NA] Step[99] GlobalStep[4483/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0058] loss_depth[0.0149] total_loss[0.0210] Rank[0/16] 06/24/2025 13:25:00 INFO stats.py:314 | Epoch[32] Step[115] GlobalStep[4499] Training Speed: 436.63 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:26:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:25:05 INFO loss_tracker.py:84 | Epoch[32/NA] Step[124] GlobalStep[4508/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0057] loss_depth[0.0148] total_loss[0.0209] Rank[0/16] 06/24/2025 13:25:09 INFO stats.py:394 | Epoch[32] completed. Training Speed: 302.71 samples/sec across all devices. Epoch Time: 57.93 sec. Average Epoch Time: 57.93 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:25:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:25:12 INFO stats.py:314 | Epoch[33] Step[3] GlobalStep[4524] Training Speed: 432.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:26:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:25:20 INFO loss_tracker.py:84 | Epoch[33/NA] Step[24] GlobalStep[4545/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0148] total_loss[0.0207] Rank[0/16] 06/24/2025 13:25:22 INFO stats.py:314 | Epoch[33] Step[28] GlobalStep[4549] Training Speed: 433.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:25:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:25:31 INFO loss_tracker.py:84 | Epoch[33/NA] Step[49] GlobalStep[4570/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0060] loss_depth[0.0148] total_loss[0.0212] Rank[0/16] 06/24/2025 13:25:33 INFO stats.py:314 | Epoch[33] Step[53] GlobalStep[4574] Training Speed: 434.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:25:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:25:41 INFO loss_tracker.py:84 | Epoch[33/NA] Step[74] GlobalStep[4595/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0147] total_loss[0.0206] Rank[0/16] 06/24/2025 13:25:42 INFO stats.py:314 | Epoch[33] Step[78] GlobalStep[4599] Training Speed: 436.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:25:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:25:52 INFO loss_tracker.py:84 | Epoch[33/NA] Step[99] GlobalStep[4620/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0060] loss_depth[0.0148] total_loss[0.0212] Rank[0/16] 06/24/2025 13:25:53 INFO stats.py:314 | Epoch[33] Step[103] GlobalStep[4624] Training Speed: 434.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:24:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:26:01 INFO loss_tracker.py:84 | Epoch[33/NA] Step[124] GlobalStep[4645/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0147] total_loss[0.0206] Rank[0/16] 06/24/2025 13:26:03 INFO stats.py:314 | Epoch[33] Step[128] GlobalStep[4649] Training Speed: 450.98 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:24:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:26:06 INFO stats.py:394 | Epoch[33] completed. Training Speed: 310.10 samples/sec across all devices. Epoch Time: 56.55 sec. Average Epoch Time: 56.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 11:24:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:26:14 INFO stats.py:314 | Epoch[34] Step[16] GlobalStep[4674] Training Speed: 436.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:24:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:26:17 INFO loss_tracker.py:84 | Epoch[34/NA] Step[24] GlobalStep[4682/99999]: loss_noise_mse[0.0005] loss_fk_mse[0.0054] loss_depth[0.0146] total_loss[0.0204] Rank[0/16] 06/24/2025 13:26:24 INFO stats.py:314 | Epoch[34] Step[41] GlobalStep[4699] Training Speed: 239.91 samples/sec across all devices. Average Step Time: 0.53 sec. Estimated Remaining Time: 11:24:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:26:28 INFO loss_tracker.py:84 | Epoch[34/NA] Step[49] GlobalStep[4707/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0059] loss_depth[0.0147] total_loss[0.0210] Rank[0/16] 06/24/2025 13:26:35 INFO stats.py:314 | Epoch[34] Step[66] GlobalStep[4724] Training Speed: 445.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:23:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:26:38 INFO loss_tracker.py:84 | Epoch[34/NA] Step[74] GlobalStep[4732/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0147] total_loss[0.0206] Rank[0/16] 06/24/2025 13:26:45 INFO stats.py:314 | Epoch[34] Step[91] GlobalStep[4749] Training Speed: 236.41 samples/sec across all devices. Average Step Time: 0.54 sec. Estimated Remaining Time: 11:23:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:26:49 INFO loss_tracker.py:84 | Epoch[34/NA] Step[99] GlobalStep[4757/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0059] loss_depth[0.0146] total_loss[0.0209] Rank[0/16] 06/24/2025 13:26:56 INFO stats.py:314 | Epoch[34] Step[116] GlobalStep[4774] Training Speed: 433.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:23:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:26:59 INFO loss_tracker.py:84 | Epoch[34/NA] Step[124] GlobalStep[4782/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0057] loss_depth[0.0147] total_loss[0.0208] Rank[0/16] 06/24/2025 13:27:03 INFO stats.py:394 | Epoch[34] completed. Training Speed: 303.51 samples/sec across all devices. Epoch Time: 57.78 sec. Average Epoch Time: 57.78 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:22:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:27:06 INFO stats.py:314 | Epoch[35] Step[4] GlobalStep[4799] Training Speed: 413.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:22:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:27:15 INFO loss_tracker.py:84 | Epoch[35/NA] Step[24] GlobalStep[4819/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0146] total_loss[0.0206] Rank[0/16] 06/24/2025 13:27:17 INFO stats.py:314 | Epoch[35] Step[29] GlobalStep[4824] Training Speed: 436.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:22:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:27:25 INFO loss_tracker.py:84 | Epoch[35/NA] Step[49] GlobalStep[4844/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0147] total_loss[0.0206] Rank[0/16] 06/24/2025 13:27:28 INFO stats.py:314 | Epoch[35] Step[54] GlobalStep[4849] Training Speed: 229.49 samples/sec across all devices. Average Step Time: 0.56 sec. Estimated Remaining Time: 11:22:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:27:36 INFO loss_tracker.py:84 | Epoch[35/NA] Step[74] GlobalStep[4869/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0061] loss_depth[0.0147] total_loss[0.0211] Rank[0/16] 06/24/2025 13:27:38 INFO stats.py:314 | Epoch[35] Step[79] GlobalStep[4874] Training Speed: 422.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:22:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:27:46 INFO loss_tracker.py:84 | Epoch[35/NA] Step[99] GlobalStep[4894/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0057] loss_depth[0.0146] total_loss[0.0207] Rank[0/16] 06/24/2025 13:27:49 INFO stats.py:314 | Epoch[35] Step[104] GlobalStep[4899] Training Speed: 418.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:22:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:27:56 INFO loss_tracker.py:84 | Epoch[35/NA] Step[124] GlobalStep[4919/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0063] loss_depth[0.0145] total_loss[0.0212] Rank[0/16] 06/24/2025 13:27:58 INFO stats.py:314 | Epoch[35] Step[129] GlobalStep[4924] Training Speed: 453.39 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:21:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:28:01 INFO stats.py:394 | Epoch[35] completed. Training Speed: 305.53 samples/sec across all devices. Epoch Time: 57.40 sec. Average Epoch Time: 57.40 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:21:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:28:10 INFO stats.py:314 | Epoch[36] Step[17] GlobalStep[4949] Training Speed: 433.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:21:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:28:13 INFO loss_tracker.py:84 | Epoch[36/NA] Step[24] GlobalStep[4956/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0059] loss_depth[0.0145] total_loss[0.0208] Rank[0/16] 06/24/2025 13:28:20 INFO stats.py:314 | Epoch[36] Step[42] GlobalStep[4974] Training Speed: 420.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:21:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:28:23 INFO loss_tracker.py:84 | Epoch[36/NA] Step[49] GlobalStep[4981/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0056] loss_depth[0.0145] total_loss[0.0205] Rank[0/16] 06/24/2025 13:28:31 INFO stats.py:314 | Epoch[36] Step[67] GlobalStep[4999] Training Speed: 421.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:21:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:28:34 INFO loss_tracker.py:84 | Epoch[36/NA] Step[74] GlobalStep[5006/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0059] loss_depth[0.0147] total_loss[0.0209] Rank[0/16] 06/24/2025 13:28:41 INFO stats.py:314 | Epoch[36] Step[92] GlobalStep[5024] Training Speed: 434.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:20:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:28:44 INFO loss_tracker.py:84 | Epoch[36/NA] Step[99] GlobalStep[5031/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0051] loss_depth[0.0145] total_loss[0.0200] Rank[0/16] 06/24/2025 13:28:52 INFO stats.py:314 | Epoch[36] Step[117] GlobalStep[5049] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:20:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:28:55 INFO loss_tracker.py:84 | Epoch[36/NA] Step[124] GlobalStep[5056/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0145] total_loss[0.0207] Rank[0/16] 06/24/2025 13:28:59 INFO stats.py:394 | Epoch[36] completed. Training Speed: 302.21 samples/sec across all devices. Epoch Time: 58.03 sec. Average Epoch Time: 58.03 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:20:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:29:02 INFO stats.py:314 | Epoch[37] Step[5] GlobalStep[5074] Training Speed: 437.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:20:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:29:10 INFO loss_tracker.py:84 | Epoch[37/NA] Step[24] GlobalStep[5093/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0058] loss_depth[0.0145] total_loss[0.0206] Rank[0/16] 06/24/2025 13:29:13 INFO stats.py:314 | Epoch[37] Step[30] GlobalStep[5099] Training Speed: 422.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:19:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:29:20 INFO loss_tracker.py:84 | Epoch[37/NA] Step[49] GlobalStep[5118/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0144] total_loss[0.0204] Rank[0/16] 06/24/2025 13:29:22 INFO stats.py:314 | Epoch[37] Step[55] GlobalStep[5124] Training Speed: 423.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:19:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:29:30 INFO loss_tracker.py:84 | Epoch[37/NA] Step[74] GlobalStep[5143/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0055] loss_depth[0.0146] total_loss[0.0205] Rank[0/16] 06/24/2025 13:29:33 INFO stats.py:314 | Epoch[37] Step[80] GlobalStep[5149] Training Speed: 434.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:19:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:29:40 INFO loss_tracker.py:84 | Epoch[37/NA] Step[99] GlobalStep[5168/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0144] total_loss[0.0205] Rank[0/16] 06/24/2025 13:29:43 INFO stats.py:314 | Epoch[37] Step[105] GlobalStep[5174] Training Speed: 434.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:18:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:29:51 INFO loss_tracker.py:84 | Epoch[37/NA] Step[124] GlobalStep[5193/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0058] loss_depth[0.0144] total_loss[0.0206] Rank[0/16] 06/24/2025 13:29:53 INFO stats.py:314 | Epoch[37] Step[130] GlobalStep[5199] Training Speed: 452.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:18:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:29:55 INFO stats.py:394 | Epoch[37] completed. Training Speed: 310.98 samples/sec across all devices. Epoch Time: 56.39 sec. Average Epoch Time: 56.39 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 11:18:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:30:03 INFO stats.py:314 | Epoch[38] Step[18] GlobalStep[5224] Training Speed: 408.38 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:18:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:30:06 INFO loss_tracker.py:84 | Epoch[38/NA] Step[24] GlobalStep[5230/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0145] total_loss[0.0204] Rank[0/16] 06/24/2025 13:30:15 INFO stats.py:314 | Epoch[38] Step[43] GlobalStep[5249] Training Speed: 381.91 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 11:18:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:30:17 INFO loss_tracker.py:84 | Epoch[38/NA] Step[49] GlobalStep[5255/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0144] total_loss[0.0205] Rank[0/16] 06/24/2025 13:30:24 INFO stats.py:314 | Epoch[38] Step[68] GlobalStep[5274] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:17:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:30:27 INFO loss_tracker.py:84 | Epoch[38/NA] Step[74] GlobalStep[5280/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0051] loss_depth[0.0144] total_loss[0.0199] Rank[0/16] 06/24/2025 13:30:35 INFO stats.py:314 | Epoch[38] Step[93] GlobalStep[5299] Training Speed: 436.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:17:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:30:38 INFO loss_tracker.py:84 | Epoch[38/NA] Step[99] GlobalStep[5305/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0144] total_loss[0.0205] Rank[0/16] 06/24/2025 13:30:45 INFO stats.py:314 | Epoch[38] Step[118] GlobalStep[5324] Training Speed: 406.13 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:17:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:30:47 INFO loss_tracker.py:84 | Epoch[38/NA] Step[124] GlobalStep[5330/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0061] loss_depth[0.0143] total_loss[0.0207] Rank[0/16] 06/24/2025 13:30:52 INFO stats.py:394 | Epoch[38] completed. Training Speed: 310.51 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 11:16:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:30:56 INFO stats.py:314 | Epoch[39] Step[6] GlobalStep[5349] Training Speed: 402.11 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:16:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:31:02 INFO loss_tracker.py:84 | Epoch[39/NA] Step[24] GlobalStep[5367/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0054] loss_depth[0.0144] total_loss[0.0202] Rank[0/16] 06/24/2025 13:31:06 INFO stats.py:314 | Epoch[39] Step[31] GlobalStep[5374] Training Speed: 437.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:16:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:31:14 INFO loss_tracker.py:84 | Epoch[39/NA] Step[49] GlobalStep[5392/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0144] total_loss[0.0203] Rank[0/16] 06/24/2025 13:31:17 INFO stats.py:314 | Epoch[39] Step[56] GlobalStep[5399] Training Speed: 433.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:16:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:31:24 INFO loss_tracker.py:84 | Epoch[39/NA] Step[74] GlobalStep[5417/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0051] loss_depth[0.0145] total_loss[0.0199] Rank[0/16] 06/24/2025 13:31:27 INFO stats.py:314 | Epoch[39] Step[81] GlobalStep[5424] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:16:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:31:35 INFO loss_tracker.py:84 | Epoch[39/NA] Step[99] GlobalStep[5442/99999]: loss_noise_mse[0.0004] loss_fk_mse[0.0055] loss_depth[0.0144] total_loss[0.0202] Rank[0/16] 06/24/2025 13:31:38 INFO stats.py:314 | Epoch[39] Step[106] GlobalStep[5449] Training Speed: 423.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:15:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:31:44 INFO loss_tracker.py:84 | Epoch[39/NA] Step[124] GlobalStep[5467/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0143] total_loss[0.0203] Rank[0/16] 06/24/2025 13:31:47 INFO stats.py:314 | Epoch[39] Step[131] GlobalStep[5474] Training Speed: 451.58 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:15:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:31:49 INFO stats.py:394 | Epoch[39] completed. Training Speed: 307.24 samples/sec across all devices. Epoch Time: 57.08 sec. Average Epoch Time: 57.08 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:15:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:31:58 INFO stats.py:314 | Epoch[40] Step[19] GlobalStep[5499] Training Speed: 428.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:15:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:32:00 INFO loss_tracker.py:84 | Epoch[40/NA] Step[24] GlobalStep[5504/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0053] loss_depth[0.0144] total_loss[0.0200] Rank[0/16] 06/24/2025 13:32:09 INFO stats.py:314 | Epoch[40] Step[44] GlobalStep[5524] Training Speed: 431.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:14:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:32:11 INFO loss_tracker.py:84 | Epoch[40/NA] Step[49] GlobalStep[5529/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0054] loss_depth[0.0144] total_loss[0.0201] Rank[0/16] 06/24/2025 13:32:19 INFO stats.py:314 | Epoch[40] Step[69] GlobalStep[5549] Training Speed: 433.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:14:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:32:21 INFO loss_tracker.py:84 | Epoch[40/NA] Step[74] GlobalStep[5554/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0048] loss_depth[0.0143] total_loss[0.0194] Rank[0/16] 06/24/2025 13:32:29 INFO stats.py:314 | Epoch[40] Step[94] GlobalStep[5574] Training Speed: 436.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:14:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:32:31 INFO loss_tracker.py:84 | Epoch[40/NA] Step[99] GlobalStep[5579/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0144] total_loss[0.0204] Rank[0/16] 06/24/2025 13:32:40 INFO stats.py:314 | Epoch[40] Step[119] GlobalStep[5599] Training Speed: 434.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:14:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:32:41 INFO loss_tracker.py:84 | Epoch[40/NA] Step[124] GlobalStep[5604/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0054] loss_depth[0.0144] total_loss[0.0201] Rank[0/16] 06/24/2025 13:32:46 INFO stats.py:394 | Epoch[40] completed. Training Speed: 307.79 samples/sec across all devices. Epoch Time: 56.97 sec. Average Epoch Time: 56.97 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:13:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:32:50 INFO stats.py:314 | Epoch[41] Step[7] GlobalStep[5624] Training Speed: 437.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:13:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:32:58 INFO loss_tracker.py:84 | Epoch[41/NA] Step[24] GlobalStep[5641/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0059] loss_depth[0.0143] total_loss[0.0206] Rank[0/16] 06/24/2025 13:33:01 INFO stats.py:314 | Epoch[41] Step[32] GlobalStep[5649] Training Speed: 435.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:13:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:33:08 INFO loss_tracker.py:84 | Epoch[41/NA] Step[49] GlobalStep[5666/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0055] loss_depth[0.0142] total_loss[0.0200] Rank[0/16] 06/24/2025 13:33:11 INFO stats.py:314 | Epoch[41] Step[57] GlobalStep[5674] Training Speed: 431.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:13:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:33:18 INFO loss_tracker.py:84 | Epoch[41/NA] Step[74] GlobalStep[5691/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0141] total_loss[0.0203] Rank[0/16] 06/24/2025 13:33:22 INFO stats.py:314 | Epoch[41] Step[82] GlobalStep[5699] Training Speed: 431.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:13:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:33:29 INFO loss_tracker.py:84 | Epoch[41/NA] Step[99] GlobalStep[5716/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0062] loss_depth[0.0144] total_loss[0.0208] Rank[0/16] 06/24/2025 13:33:32 INFO stats.py:314 | Epoch[41] Step[107] GlobalStep[5724] Training Speed: 432.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:12:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:33:39 INFO loss_tracker.py:84 | Epoch[41/NA] Step[124] GlobalStep[5741/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0144] total_loss[0.0204] Rank[0/16] 06/24/2025 13:33:42 INFO stats.py:314 | Epoch[41] Step[132] GlobalStep[5749] Training Speed: 451.99 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:12:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:33:44 INFO stats.py:394 | Epoch[41] completed. Training Speed: 302.80 samples/sec across all devices. Epoch Time: 57.91 sec. Average Epoch Time: 57.91 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:12:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:33:53 INFO stats.py:314 | Epoch[42] Step[20] GlobalStep[5774] Training Speed: 434.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:12:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:33:55 INFO loss_tracker.py:84 | Epoch[42/NA] Step[24] GlobalStep[5778/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0142] total_loss[0.0204] Rank[0/16] 06/24/2025 13:34:04 INFO stats.py:314 | Epoch[42] Step[45] GlobalStep[5799] Training Speed: 437.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:12:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:34:05 INFO loss_tracker.py:84 | Epoch[42/NA] Step[49] GlobalStep[5803/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0144] total_loss[0.0204] Rank[0/16] 06/24/2025 13:34:14 INFO stats.py:314 | Epoch[42] Step[70] GlobalStep[5824] Training Speed: 437.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:11:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:34:16 INFO loss_tracker.py:84 | Epoch[42/NA] Step[74] GlobalStep[5828/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0055] loss_depth[0.0144] total_loss[0.0201] Rank[0/16] 06/24/2025 13:34:24 INFO stats.py:314 | Epoch[42] Step[95] GlobalStep[5849] Training Speed: 431.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:11:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:34:26 INFO loss_tracker.py:84 | Epoch[42/NA] Step[99] GlobalStep[5853/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0142] total_loss[0.0202] Rank[0/16] 06/24/2025 13:34:35 INFO stats.py:314 | Epoch[42] Step[120] GlobalStep[5874] Training Speed: 447.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:11:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:34:36 INFO loss_tracker.py:84 | Epoch[42/NA] Step[124] GlobalStep[5878/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0053] loss_depth[0.0142] total_loss[0.0198] Rank[0/16] 06/24/2025 13:34:41 INFO stats.py:394 | Epoch[42] completed. Training Speed: 307.65 samples/sec across all devices. Epoch Time: 57.00 sec. Average Epoch Time: 57.00 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:11:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:34:46 INFO stats.py:314 | Epoch[43] Step[8] GlobalStep[5899] Training Speed: 434.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:11:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:34:52 INFO loss_tracker.py:84 | Epoch[43/NA] Step[24] GlobalStep[5915/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0141] total_loss[0.0200] Rank[0/16] 06/24/2025 13:34:56 INFO stats.py:314 | Epoch[43] Step[33] GlobalStep[5924] Training Speed: 421.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:11:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:35:03 INFO loss_tracker.py:84 | Epoch[43/NA] Step[49] GlobalStep[5940/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0062] loss_depth[0.0142] total_loss[0.0207] Rank[0/16] 06/24/2025 13:35:07 INFO stats.py:314 | Epoch[43] Step[58] GlobalStep[5949] Training Speed: 427.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:10:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:35:13 INFO loss_tracker.py:84 | Epoch[43/NA] Step[74] GlobalStep[5965/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0053] loss_depth[0.0142] total_loss[0.0198] Rank[0/16] 06/24/2025 13:35:17 INFO stats.py:314 | Epoch[43] Step[83] GlobalStep[5974] Training Speed: 433.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:10:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:35:24 INFO loss_tracker.py:84 | Epoch[43/NA] Step[99] GlobalStep[5990/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0143] total_loss[0.0202] Rank[0/16] 06/24/2025 13:35:27 INFO stats.py:314 | Epoch[43] Step[108] GlobalStep[5999] Training Speed: 431.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:10:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:35:33 INFO loss_tracker.py:84 | Epoch[43/NA] Step[124] GlobalStep[6015/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0142] total_loss[0.0203] Rank[0/16] 06/24/2025 13:35:37 INFO stats.py:314 | Epoch[43] Step[133] GlobalStep[6024] Training Speed: 449.47 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:09:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:35:38 INFO stats.py:394 | Epoch[43] completed. Training Speed: 304.92 samples/sec across all devices. Epoch Time: 57.51 sec. Average Epoch Time: 57.51 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:09:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:35:48 INFO stats.py:314 | Epoch[44] Step[21] GlobalStep[6049] Training Speed: 435.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:09:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:35:49 INFO loss_tracker.py:84 | Epoch[44/NA] Step[24] GlobalStep[6052/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0055] loss_depth[0.0142] total_loss[0.0200] Rank[0/16] 06/24/2025 13:35:59 INFO stats.py:314 | Epoch[44] Step[46] GlobalStep[6074] Training Speed: 433.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:09:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:36:00 INFO loss_tracker.py:84 | Epoch[44/NA] Step[49] GlobalStep[6077/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0061] loss_depth[0.0142] total_loss[0.0206] Rank[0/16] 06/24/2025 13:36:09 INFO stats.py:314 | Epoch[44] Step[71] GlobalStep[6099] Training Speed: 432.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:09:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:36:11 INFO loss_tracker.py:84 | Epoch[44/NA] Step[74] GlobalStep[6102/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0060] loss_depth[0.0142] total_loss[0.0205] Rank[0/16] 06/24/2025 13:36:20 INFO stats.py:314 | Epoch[44] Step[96] GlobalStep[6124] Training Speed: 435.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:09:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:36:21 INFO loss_tracker.py:84 | Epoch[44/NA] Step[99] GlobalStep[6127/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0142] total_loss[0.0202] Rank[0/16] 06/24/2025 13:36:31 INFO stats.py:314 | Epoch[44] Step[121] GlobalStep[6149] Training Speed: 448.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:09:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:36:32 INFO loss_tracker.py:84 | Epoch[44/NA] Step[124] GlobalStep[6152/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0055] loss_depth[0.0142] total_loss[0.0200] Rank[0/16] 06/24/2025 13:36:36 INFO stats.py:394 | Epoch[44] completed. Training Speed: 302.11 samples/sec across all devices. Epoch Time: 58.05 sec. Average Epoch Time: 58.05 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:08:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:36:41 INFO stats.py:314 | Epoch[45] Step[9] GlobalStep[6174] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:08:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:36:48 INFO loss_tracker.py:84 | Epoch[45/NA] Step[24] GlobalStep[6189/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0060] loss_depth[0.0142] total_loss[0.0205] Rank[0/16] 06/24/2025 13:36:52 INFO stats.py:314 | Epoch[45] Step[34] GlobalStep[6199] Training Speed: 430.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:08:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:36:59 INFO loss_tracker.py:84 | Epoch[45/NA] Step[49] GlobalStep[6214/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0052] loss_depth[0.0141] total_loss[0.0197] Rank[0/16] 06/24/2025 13:37:02 INFO stats.py:314 | Epoch[45] Step[59] GlobalStep[6224] Training Speed: 438.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:08:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:37:09 INFO loss_tracker.py:84 | Epoch[45/NA] Step[74] GlobalStep[6239/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0141] total_loss[0.0201] Rank[0/16] 06/24/2025 13:37:13 INFO stats.py:314 | Epoch[45] Step[84] GlobalStep[6249] Training Speed: 431.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:08:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:37:20 INFO loss_tracker.py:84 | Epoch[45/NA] Step[99] GlobalStep[6264/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0053] loss_depth[0.0141] total_loss[0.0197] Rank[0/16] 06/24/2025 13:37:24 INFO stats.py:314 | Epoch[45] Step[109] GlobalStep[6274] Training Speed: 405.19 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:08:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:37:30 INFO loss_tracker.py:84 | Epoch[45/NA] Step[124] GlobalStep[6289/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0140] total_loss[0.0199] Rank[0/16] 06/24/2025 13:37:34 INFO stats.py:314 | Epoch[45] Step[134] GlobalStep[6299] Training Speed: 452.89 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:07:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:37:34 INFO stats.py:394 | Epoch[45] completed. Training Speed: 302.13 samples/sec across all devices. Epoch Time: 58.04 sec. Average Epoch Time: 58.04 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:07:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:37:45 INFO stats.py:314 | Epoch[46] Step[22] GlobalStep[6324] Training Speed: 423.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:07:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:37:46 INFO loss_tracker.py:84 | Epoch[46/NA] Step[24] GlobalStep[6326/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0053] loss_depth[0.0141] total_loss[0.0196] Rank[0/16] 06/24/2025 13:37:56 INFO stats.py:314 | Epoch[46] Step[47] GlobalStep[6349] Training Speed: 434.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:07:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:37:57 INFO loss_tracker.py:84 | Epoch[46/NA] Step[49] GlobalStep[6351/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0141] total_loss[0.0201] Rank[0/16] 06/24/2025 13:38:06 INFO stats.py:314 | Epoch[46] Step[72] GlobalStep[6374] Training Speed: 432.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:07:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:38:07 INFO loss_tracker.py:84 | Epoch[46/NA] Step[74] GlobalStep[6376/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0048] loss_depth[0.0142] total_loss[0.0193] Rank[0/16] 06/24/2025 13:38:17 INFO stats.py:314 | Epoch[46] Step[97] GlobalStep[6399] Training Speed: 431.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:07:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:38:18 INFO loss_tracker.py:84 | Epoch[46/NA] Step[99] GlobalStep[6401/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0141] total_loss[0.0200] Rank[0/16] 06/24/2025 13:38:27 INFO stats.py:314 | Epoch[46] Step[122] GlobalStep[6424] Training Speed: 452.45 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:06:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:38:28 INFO loss_tracker.py:84 | Epoch[46/NA] Step[124] GlobalStep[6426/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0054] loss_depth[0.0139] total_loss[0.0197] Rank[0/16] 06/24/2025 13:38:33 INFO stats.py:394 | Epoch[46] completed. Training Speed: 300.24 samples/sec across all devices. Epoch Time: 58.41 sec. Average Epoch Time: 58.41 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:06:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:38:38 INFO stats.py:314 | Epoch[47] Step[10] GlobalStep[6449] Training Speed: 426.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:06:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:38:44 INFO loss_tracker.py:84 | Epoch[47/NA] Step[24] GlobalStep[6463/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0057] loss_depth[0.0141] total_loss[0.0200] Rank[0/16] 06/24/2025 13:38:49 INFO stats.py:314 | Epoch[47] Step[35] GlobalStep[6474] Training Speed: 405.69 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:06:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:38:55 INFO loss_tracker.py:84 | Epoch[47/NA] Step[49] GlobalStep[6488/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0054] loss_depth[0.0141] total_loss[0.0198] Rank[0/16] 06/24/2025 13:39:00 INFO stats.py:314 | Epoch[47] Step[60] GlobalStep[6499] Training Speed: 428.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:06:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:39:05 INFO loss_tracker.py:84 | Epoch[47/NA] Step[74] GlobalStep[6513/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0052] loss_depth[0.0141] total_loss[0.0196] Rank[0/16] 06/24/2025 13:39:10 INFO stats.py:314 | Epoch[47] Step[85] GlobalStep[6524] Training Speed: 429.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:06:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:39:16 INFO loss_tracker.py:84 | Epoch[47/NA] Step[99] GlobalStep[6538/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0140] total_loss[0.0198] Rank[0/16] 06/24/2025 13:39:20 INFO stats.py:314 | Epoch[47] Step[110] GlobalStep[6549] Training Speed: 434.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:05:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:39:26 INFO loss_tracker.py:84 | Epoch[47/NA] Step[124] GlobalStep[6563/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0052] loss_depth[0.0139] total_loss[0.0194] Rank[0/16] 06/24/2025 13:39:30 INFO stats.py:314 | Epoch[47] Step[135] GlobalStep[6574] Training Speed: 449.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:05:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:39:31 INFO stats.py:394 | Epoch[47] completed. Training Speed: 301.26 samples/sec across all devices. Epoch Time: 58.21 sec. Average Epoch Time: 58.21 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:05:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:39:42 INFO stats.py:314 | Epoch[48] Step[23] GlobalStep[6599] Training Speed: 434.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:05:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:39:42 INFO loss_tracker.py:84 | Epoch[48/NA] Step[24] GlobalStep[6600/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0140] total_loss[0.0199] Rank[0/16] 06/24/2025 13:39:53 INFO stats.py:314 | Epoch[48] Step[48] GlobalStep[6624] Training Speed: 428.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:05:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:39:53 INFO loss_tracker.py:84 | Epoch[48/NA] Step[49] GlobalStep[6625/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0055] loss_depth[0.0140] total_loss[0.0197] Rank[0/16] 06/24/2025 13:40:03 INFO stats.py:314 | Epoch[48] Step[73] GlobalStep[6649] Training Speed: 430.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:05:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:40:04 INFO loss_tracker.py:84 | Epoch[48/NA] Step[74] GlobalStep[6650/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0139] total_loss[0.0194] Rank[0/16] 06/24/2025 13:40:14 INFO stats.py:314 | Epoch[48] Step[98] GlobalStep[6674] Training Speed: 432.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:04:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:40:14 INFO loss_tracker.py:84 | Epoch[48/NA] Step[99] GlobalStep[6675/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0056] loss_depth[0.0141] total_loss[0.0200] Rank[0/16] 06/24/2025 13:40:24 INFO stats.py:314 | Epoch[48] Step[123] GlobalStep[6699] Training Speed: 449.92 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:04:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:40:25 INFO loss_tracker.py:84 | Epoch[48/NA] Step[124] GlobalStep[6700/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0059] loss_depth[0.0141] total_loss[0.0202] Rank[0/16] 06/24/2025 13:40:29 INFO stats.py:394 | Epoch[48] completed. Training Speed: 300.53 samples/sec across all devices. Epoch Time: 58.35 sec. Average Epoch Time: 58.35 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 11:04:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:40:35 INFO stats.py:314 | Epoch[49] Step[11] GlobalStep[6724] Training Speed: 429.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:04:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:40:41 INFO loss_tracker.py:84 | Epoch[49/NA] Step[24] GlobalStep[6737/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0140] total_loss[0.0200] Rank[0/16] 06/24/2025 13:40:46 INFO stats.py:314 | Epoch[49] Step[36] GlobalStep[6749] Training Speed: 435.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:04:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:40:51 INFO loss_tracker.py:84 | Epoch[49/NA] Step[49] GlobalStep[6762/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0051] loss_depth[0.0140] total_loss[0.0193] Rank[0/16] 06/24/2025 13:40:56 INFO stats.py:314 | Epoch[49] Step[61] GlobalStep[6774] Training Speed: 435.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:04:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:41:02 INFO loss_tracker.py:84 | Epoch[49/NA] Step[74] GlobalStep[6787/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0140] total_loss[0.0198] Rank[0/16] 06/24/2025 13:41:07 INFO stats.py:314 | Epoch[49] Step[86] GlobalStep[6799] Training Speed: 429.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:03:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:41:12 INFO loss_tracker.py:84 | Epoch[49/NA] Step[99] GlobalStep[6812/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0050] loss_depth[0.0139] total_loss[0.0191] Rank[0/16] 06/24/2025 13:41:17 INFO stats.py:314 | Epoch[49] Step[111] GlobalStep[6824] Training Speed: 433.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:03:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:41:22 INFO loss_tracker.py:84 | Epoch[49/NA] Step[124] GlobalStep[6837/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0051] loss_depth[0.0141] total_loss[0.0194] Rank[0/16] 06/24/2025 13:41:27 INFO stats.py:314 | Epoch[49] Step[136] GlobalStep[6849] Training Speed: 452.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:03:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:41:27 INFO stats.py:394 | Epoch[49] completed. Training Speed: 304.97 samples/sec across all devices. Epoch Time: 57.50 sec. Average Epoch Time: 57.50 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:03:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:41:38 INFO stats.py:314 | Epoch[50] Step[24] GlobalStep[6874] Training Speed: 434.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:03:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:41:38 INFO loss_tracker.py:84 | Epoch[50/NA] Step[24] GlobalStep[6874/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0139] total_loss[0.0201] Rank[0/16] 06/24/2025 13:41:49 INFO stats.py:314 | Epoch[50] Step[49] GlobalStep[6899] Training Speed: 433.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:03:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:41:49 INFO loss_tracker.py:84 | Epoch[50/NA] Step[49] GlobalStep[6899/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0139] total_loss[0.0201] Rank[0/16] 06/24/2025 13:41:58 INFO stats.py:314 | Epoch[50] Step[74] GlobalStep[6924] Training Speed: 433.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:02:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:41:59 INFO loss_tracker.py:84 | Epoch[50/NA] Step[74] GlobalStep[6924/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0139] total_loss[0.0200] Rank[0/16] 06/24/2025 13:42:09 INFO stats.py:314 | Epoch[50] Step[99] GlobalStep[6949] Training Speed: 436.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:02:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:42:09 INFO loss_tracker.py:84 | Epoch[50/NA] Step[99] GlobalStep[6949/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0059] loss_depth[0.0140] total_loss[0.0201] Rank[0/16] 06/24/2025 13:42:19 INFO stats.py:314 | Epoch[50] Step[124] GlobalStep[6974] Training Speed: 453.09 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 11:02:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:42:19 INFO loss_tracker.py:84 | Epoch[50/NA] Step[124] GlobalStep[6974/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0140] total_loss[0.0195] Rank[0/16] 06/24/2025 13:42:24 INFO stats.py:394 | Epoch[50] completed. Training Speed: 307.49 samples/sec across all devices. Epoch Time: 57.03 sec. Average Epoch Time: 57.03 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:01:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:42:30 INFO stats.py:314 | Epoch[51] Step[12] GlobalStep[6999] Training Speed: 445.44 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:02:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:42:35 INFO loss_tracker.py:84 | Epoch[51/NA] Step[24] GlobalStep[7011/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0139] total_loss[0.0196] Rank[0/16] 06/24/2025 13:42:41 INFO stats.py:314 | Epoch[51] Step[37] GlobalStep[7024] Training Speed: 434.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 11:01:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:42:46 INFO loss_tracker.py:84 | Epoch[51/NA] Step[49] GlobalStep[7036/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0139] total_loss[0.0195] Rank[0/16] 06/24/2025 13:42:51 INFO stats.py:314 | Epoch[51] Step[62] GlobalStep[7049] Training Speed: 426.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:01:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:42:56 INFO loss_tracker.py:84 | Epoch[51/NA] Step[74] GlobalStep[7061/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0140] total_loss[0.0198] Rank[0/16] 06/24/2025 13:43:01 INFO stats.py:314 | Epoch[51] Step[87] GlobalStep[7074] Training Speed: 429.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:01:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:43:07 INFO loss_tracker.py:84 | Epoch[51/NA] Step[99] GlobalStep[7086/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0138] total_loss[0.0194] Rank[0/16] 06/24/2025 13:43:12 INFO stats.py:314 | Epoch[51] Step[112] GlobalStep[7099] Training Speed: 421.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 11:01:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:43:17 INFO loss_tracker.py:84 | Epoch[51/NA] Step[124] GlobalStep[7111/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0139] total_loss[0.0195] Rank[0/16] 06/24/2025 13:43:21 INFO stats.py:394 | Epoch[51] completed. Training Speed: 307.07 samples/sec across all devices. Epoch Time: 57.11 sec. Average Epoch Time: 57.11 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 11:00:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:43:22 INFO stats.py:314 | Epoch[52] Step[0] GlobalStep[7124] Training Speed: 365.84 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 11:00:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:43:32 INFO loss_tracker.py:84 | Epoch[52/NA] Step[24] GlobalStep[7148/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0139] total_loss[0.0201] Rank[0/16] 06/24/2025 13:43:33 INFO stats.py:314 | Epoch[52] Step[25] GlobalStep[7149] Training Speed: 400.00 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:00:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:43:43 INFO loss_tracker.py:84 | Epoch[52/NA] Step[49] GlobalStep[7173/99999]: loss_noise_mse[0.0003] loss_fk_mse[0.0058] loss_depth[0.0139] total_loss[0.0199] Rank[0/16] 06/24/2025 13:43:43 INFO stats.py:314 | Epoch[52] Step[50] GlobalStep[7174] Training Speed: 417.27 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 11:00:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:43:53 INFO loss_tracker.py:84 | Epoch[52/NA] Step[74] GlobalStep[7198/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0139] total_loss[0.0197] Rank[0/16] 06/24/2025 13:43:53 INFO stats.py:314 | Epoch[52] Step[75] GlobalStep[7199] Training Speed: 395.50 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 11:00:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:44:03 INFO loss_tracker.py:84 | Epoch[52/NA] Step[99] GlobalStep[7223/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0138] total_loss[0.0201] Rank[0/16] 06/24/2025 13:44:04 INFO stats.py:314 | Epoch[52] Step[100] GlobalStep[7224] Training Speed: 428.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:59:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:44:13 INFO loss_tracker.py:84 | Epoch[52/NA] Step[124] GlobalStep[7248/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0140] total_loss[0.0202] Rank[0/16] 06/24/2025 13:44:14 INFO stats.py:314 | Epoch[52] Step[125] GlobalStep[7249] Training Speed: 435.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:59:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:44:17 INFO stats.py:394 | Epoch[52] completed. Training Speed: 309.54 samples/sec across all devices. Epoch Time: 56.65 sec. Average Epoch Time: 56.65 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:59:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:44:25 INFO stats.py:314 | Epoch[53] Step[13] GlobalStep[7274] Training Speed: 431.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:59:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:44:29 INFO loss_tracker.py:84 | Epoch[53/NA] Step[24] GlobalStep[7285/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0139] total_loss[0.0197] Rank[0/16] 06/24/2025 13:44:35 INFO stats.py:314 | Epoch[53] Step[38] GlobalStep[7299] Training Speed: 432.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:59:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:44:40 INFO loss_tracker.py:84 | Epoch[53/NA] Step[49] GlobalStep[7310/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0138] total_loss[0.0200] Rank[0/16] 06/24/2025 13:44:46 INFO stats.py:314 | Epoch[53] Step[63] GlobalStep[7324] Training Speed: 433.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:59:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:44:50 INFO loss_tracker.py:84 | Epoch[53/NA] Step[74] GlobalStep[7335/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0138] total_loss[0.0193] Rank[0/16] 06/24/2025 13:44:56 INFO stats.py:314 | Epoch[53] Step[88] GlobalStep[7349] Training Speed: 435.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:58:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:45:01 INFO loss_tracker.py:84 | Epoch[53/NA] Step[99] GlobalStep[7360/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0138] total_loss[0.0192] Rank[0/16] 06/24/2025 13:45:07 INFO stats.py:314 | Epoch[53] Step[113] GlobalStep[7374] Training Speed: 426.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:58:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:45:11 INFO loss_tracker.py:84 | Epoch[53/NA] Step[124] GlobalStep[7385/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0138] total_loss[0.0201] Rank[0/16] 06/24/2025 13:45:15 INFO stats.py:394 | Epoch[53] completed. Training Speed: 303.40 samples/sec across all devices. Epoch Time: 57.80 sec. Average Epoch Time: 57.80 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:58:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:45:17 INFO stats.py:314 | Epoch[54] Step[1] GlobalStep[7399] Training Speed: 415.22 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:58:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:45:27 INFO loss_tracker.py:84 | Epoch[54/NA] Step[24] GlobalStep[7422/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0140] total_loss[0.0200] Rank[0/16] 06/24/2025 13:45:28 INFO stats.py:314 | Epoch[54] Step[26] GlobalStep[7424] Training Speed: 426.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:58:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:45:37 INFO loss_tracker.py:84 | Epoch[54/NA] Step[49] GlobalStep[7447/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0138] total_loss[0.0193] Rank[0/16] 06/24/2025 13:45:38 INFO stats.py:314 | Epoch[54] Step[51] GlobalStep[7449] Training Speed: 417.49 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:57:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:45:48 INFO loss_tracker.py:84 | Epoch[54/NA] Step[74] GlobalStep[7472/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0139] total_loss[0.0194] Rank[0/16] 06/24/2025 13:45:48 INFO stats.py:314 | Epoch[54] Step[76] GlobalStep[7474] Training Speed: 433.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:57:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:45:58 INFO loss_tracker.py:84 | Epoch[54/NA] Step[99] GlobalStep[7497/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0139] total_loss[0.0197] Rank[0/16] 06/24/2025 13:45:58 INFO stats.py:314 | Epoch[54] Step[101] GlobalStep[7499] Training Speed: 430.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:57:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:46:08 INFO loss_tracker.py:84 | Epoch[54/NA] Step[124] GlobalStep[7522/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0139] total_loss[0.0200] Rank[0/16] 06/24/2025 13:46:09 INFO stats.py:314 | Epoch[54] Step[126] GlobalStep[7524] Training Speed: 449.73 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:57:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:46:12 INFO stats.py:394 | Epoch[54] completed. Training Speed: 307.68 samples/sec across all devices. Epoch Time: 56.99 sec. Average Epoch Time: 56.99 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:56:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:46:19 INFO stats.py:314 | Epoch[55] Step[14] GlobalStep[7549] Training Speed: 434.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:56:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:46:24 INFO loss_tracker.py:84 | Epoch[55/NA] Step[24] GlobalStep[7559/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0139] total_loss[0.0200] Rank[0/16] 06/24/2025 13:46:30 INFO stats.py:314 | Epoch[55] Step[39] GlobalStep[7574] Training Speed: 432.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:56:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:46:34 INFO loss_tracker.py:84 | Epoch[55/NA] Step[49] GlobalStep[7584/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0139] total_loss[0.0194] Rank[0/16] 06/24/2025 13:46:40 INFO stats.py:314 | Epoch[55] Step[64] GlobalStep[7599] Training Speed: 432.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:56:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:46:45 INFO loss_tracker.py:84 | Epoch[55/NA] Step[74] GlobalStep[7609/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0138] total_loss[0.0199] Rank[0/16] 06/24/2025 13:46:51 INFO stats.py:314 | Epoch[55] Step[89] GlobalStep[7624] Training Speed: 431.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:56:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:46:54 INFO loss_tracker.py:84 | Epoch[55/NA] Step[99] GlobalStep[7634/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0138] total_loss[0.0200] Rank[0/16] 06/24/2025 13:47:01 INFO stats.py:314 | Epoch[55] Step[114] GlobalStep[7649] Training Speed: 428.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:56:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:47:05 INFO loss_tracker.py:84 | Epoch[55/NA] Step[124] GlobalStep[7659/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0138] total_loss[0.0197] Rank[0/16] 06/24/2025 13:47:09 INFO stats.py:394 | Epoch[55] completed. Training Speed: 306.49 samples/sec across all devices. Epoch Time: 57.21 sec. Average Epoch Time: 57.21 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:55:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:47:12 INFO stats.py:314 | Epoch[56] Step[2] GlobalStep[7674] Training Speed: 425.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:55:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:47:20 INFO loss_tracker.py:84 | Epoch[56/NA] Step[24] GlobalStep[7696/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0138] total_loss[0.0196] Rank[0/16] 06/24/2025 13:47:22 INFO stats.py:314 | Epoch[56] Step[27] GlobalStep[7699] Training Speed: 432.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:55:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:47:31 INFO loss_tracker.py:84 | Epoch[56/NA] Step[49] GlobalStep[7721/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0139] total_loss[0.0198] Rank[0/16] 06/24/2025 13:47:33 INFO stats.py:314 | Epoch[56] Step[52] GlobalStep[7724] Training Speed: 431.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:55:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:47:42 INFO loss_tracker.py:84 | Epoch[56/NA] Step[74] GlobalStep[7746/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0138] total_loss[0.0199] Rank[0/16] 06/24/2025 13:47:43 INFO stats.py:314 | Epoch[56] Step[77] GlobalStep[7749] Training Speed: 396.35 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:55:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:47:53 INFO loss_tracker.py:84 | Epoch[56/NA] Step[99] GlobalStep[7771/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0137] total_loss[0.0198] Rank[0/16] 06/24/2025 13:47:54 INFO stats.py:314 | Epoch[56] Step[102] GlobalStep[7774] Training Speed: 422.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:55:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:48:02 INFO loss_tracker.py:84 | Epoch[56/NA] Step[124] GlobalStep[7796/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0138] total_loss[0.0195] Rank[0/16] 06/24/2025 13:48:04 INFO stats.py:314 | Epoch[56] Step[127] GlobalStep[7799] Training Speed: 448.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:54:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:48:07 INFO stats.py:394 | Epoch[56] completed. Training Speed: 304.56 samples/sec across all devices. Epoch Time: 57.58 sec. Average Epoch Time: 57.58 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:54:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:48:15 INFO stats.py:314 | Epoch[57] Step[15] GlobalStep[7824] Training Speed: 430.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:54:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:48:18 INFO loss_tracker.py:84 | Epoch[57/NA] Step[24] GlobalStep[7833/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0139] total_loss[0.0200] Rank[0/16] 06/24/2025 13:48:25 INFO stats.py:314 | Epoch[57] Step[40] GlobalStep[7849] Training Speed: 438.83 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:54:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:48:29 INFO loss_tracker.py:84 | Epoch[57/NA] Step[49] GlobalStep[7858/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0138] total_loss[0.0197] Rank[0/16] 06/24/2025 13:48:35 INFO stats.py:314 | Epoch[57] Step[65] GlobalStep[7874] Training Speed: 431.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:54:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:48:39 INFO loss_tracker.py:84 | Epoch[57/NA] Step[74] GlobalStep[7883/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0066] loss_depth[0.0138] total_loss[0.0206] Rank[0/16] 06/24/2025 13:48:46 INFO stats.py:314 | Epoch[57] Step[90] GlobalStep[7899] Training Speed: 436.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:53:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:48:50 INFO loss_tracker.py:84 | Epoch[57/NA] Step[99] GlobalStep[7908/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0062] loss_depth[0.0137] total_loss[0.0201] Rank[0/16] 06/24/2025 13:48:56 INFO stats.py:314 | Epoch[57] Step[115] GlobalStep[7924] Training Speed: 435.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:53:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:49:00 INFO loss_tracker.py:84 | Epoch[57/NA] Step[124] GlobalStep[7933/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0138] total_loss[0.0198] Rank[0/16] 06/24/2025 13:49:04 INFO stats.py:394 | Epoch[57] completed. Training Speed: 307.17 samples/sec across all devices. Epoch Time: 57.09 sec. Average Epoch Time: 57.09 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:53:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:49:07 INFO stats.py:314 | Epoch[58] Step[3] GlobalStep[7949] Training Speed: 435.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:53:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:49:16 INFO loss_tracker.py:84 | Epoch[58/NA] Step[24] GlobalStep[7970/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0138] total_loss[0.0194] Rank[0/16] 06/24/2025 13:49:17 INFO stats.py:314 | Epoch[58] Step[28] GlobalStep[7974] Training Speed: 430.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:53:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:49:26 INFO loss_tracker.py:84 | Epoch[58/NA] Step[49] GlobalStep[7995/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0063] loss_depth[0.0139] total_loss[0.0203] Rank[0/16] 06/24/2025 13:49:28 INFO stats.py:314 | Epoch[58] Step[53] GlobalStep[7999] Training Speed: 416.03 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:53:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:49:29 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_1 Rank[11/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[7/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[2/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[14/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[10/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[15/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[6/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[4/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1
Rank[13/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[8/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[3/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[9/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[5/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[12/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[1/16] 06/24/2025 13:49:29 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1 Rank[0/16] 06/24/2025 13:49:30 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_1/model.safetensors Rank[0/16] 06/24/2025 13:49:31 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_1/optimizer.bin Rank[0/16] 06/24/2025 13:49:31 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_1/scheduler.bin Rank[0/16] 06/24/2025 13:49:31 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_1/sampler.bin Rank[0/16] 06/24/2025 13:49:31 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_1/random_states_0.pkl Rank[0/16] 06/24/2025 13:49:31 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_1/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 13:49:31 INFO checkpoint.py:110 | Save checkpoint at the end of step 7999 to /job_data/checkpoints/checkpoint_1
Rank[0/16] 06/24/2025 13:49:39 INFO loss_tracker.py:84 | Epoch[58/NA] Step[74] GlobalStep[8020/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0137] total_loss[0.0198] Rank[0/16] 06/24/2025 13:49:41 INFO stats.py:314 | Epoch[58] Step[78] GlobalStep[8024] Training Speed: 419.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:53:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:49:50 INFO loss_tracker.py:84 | Epoch[58/NA] Step[99] GlobalStep[8045/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0137] total_loss[0.0194] Rank[0/16] 06/24/2025 13:49:51 INFO stats.py:314 | Epoch[58] Step[103] GlobalStep[8049] Training Speed: 432.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:53:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:49:59 INFO loss_tracker.py:84 | Epoch[58/NA] Step[124] GlobalStep[8070/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0138] total_loss[0.0194] Rank[0/16] 06/24/2025 13:50:01 INFO stats.py:314 | Epoch[58] Step[128] GlobalStep[8074] Training Speed: 451.56 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:52:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:50:04 INFO stats.py:394 | Epoch[58] completed. Training Speed: 294.95 samples/sec across all devices. Epoch Time: 59.45 sec. Average Epoch Time: 59.45 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 10:52:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:50:12 INFO stats.py:314 | Epoch[59] Step[16] GlobalStep[8099] Training Speed: 436.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:52:42. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 13:50:15 INFO loss_tracker.py:84 | Epoch[59/NA] Step[24] GlobalStep[8107/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0137] total_loss[0.0191] Rank[0/16] 06/24/2025 13:50:22 INFO stats.py:314 | Epoch[59] Step[41] GlobalStep[8124] Training Speed: 427.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:52:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:50:26 INFO loss_tracker.py:84 | Epoch[59/NA] Step[49] GlobalStep[8132/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0050] loss_depth[0.0136] total_loss[0.0189] Rank[0/16] 06/24/2025 13:50:33 INFO stats.py:314 | Epoch[59] Step[66] GlobalStep[8149] Training Speed: 434.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:52:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:50:37 INFO loss_tracker.py:84 | Epoch[59/NA] Step[74] GlobalStep[8157/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0137] total_loss[0.0197] Rank[0/16] 06/24/2025 13:50:43 INFO stats.py:314 | Epoch[59] Step[91] GlobalStep[8174] Training Speed: 433.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:52:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:50:47 INFO loss_tracker.py:84 | Epoch[59/NA] Step[99] GlobalStep[8182/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0138] total_loss[0.0192] Rank[0/16] 06/24/2025 13:50:54 INFO stats.py:314 | Epoch[59] Step[116] GlobalStep[8199] Training Speed: 436.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:51:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:50:57 INFO loss_tracker.py:84 | Epoch[59/NA] Step[124] GlobalStep[8207/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0137] total_loss[0.0191] Rank[0/16] 06/24/2025 13:51:01 INFO stats.py:394 | Epoch[59] completed. Training Speed: 305.95 samples/sec across all devices. Epoch Time: 57.32 sec. Average Epoch Time: 57.32 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:51:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 13:51:04 INFO stats.py:314 | Epoch[60] Step[4] GlobalStep[8224] Training Speed: 430.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:51:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:51:13 INFO loss_tracker.py:84 | Epoch[60/NA] Step[24] GlobalStep[8244/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0137] total_loss[0.0194] Rank[0/16] 06/24/2025 13:51:15 INFO stats.py:314 | Epoch[60] Step[29] GlobalStep[8249] Training Speed: 428.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:51:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:51:23 INFO loss_tracker.py:84 | Epoch[60/NA] Step[49] GlobalStep[8269/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0138] total_loss[0.0199] Rank[0/16] 06/24/2025 13:51:25 INFO stats.py:314 | Epoch[60] Step[54] GlobalStep[8274] Training Speed: 417.47 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:51:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:51:33 INFO loss_tracker.py:84 | Epoch[60/NA] Step[74] GlobalStep[8294/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0138] total_loss[0.0192] Rank[0/16] 06/24/2025 13:51:35 INFO stats.py:314 | Epoch[60] Step[79] GlobalStep[8299] Training Speed: 430.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:50:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:51:43 INFO loss_tracker.py:84 | Epoch[60/NA] Step[99] GlobalStep[8319/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0137] total_loss[0.0191] Rank[0/16] 06/24/2025 13:51:45 INFO stats.py:314 | Epoch[60] Step[104] GlobalStep[8324] Training Speed: 434.20 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:50:36. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 13:51:53 INFO loss_tracker.py:84 | Epoch[60/NA] Step[124] GlobalStep[8344/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0051] loss_depth[0.0138] total_loss[0.0191] Rank[0/16] 06/24/2025 13:51:55 INFO stats.py:314 | Epoch[60] Step[129] GlobalStep[8349] Training Speed: 447.83 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:50:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:51:58 INFO stats.py:394 | Epoch[60] completed. Training Speed: 308.91 samples/sec across all devices. Epoch Time: 56.77 sec. Average Epoch Time: 56.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:50:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:52:06 INFO stats.py:314 | Epoch[61] Step[17] GlobalStep[8374] Training Speed: 433.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:50:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:52:09 INFO loss_tracker.py:84 | Epoch[61/NA] Step[24] GlobalStep[8381/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0137] total_loss[0.0197] Rank[0/16] 06/24/2025 13:52:16 INFO stats.py:314 | Epoch[61] Step[42] GlobalStep[8399] Training Speed: 434.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:49:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:52:19 INFO loss_tracker.py:84 | Epoch[61/NA] Step[49] GlobalStep[8406/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0138] total_loss[0.0194] Rank[0/16] 06/24/2025 13:52:27 INFO stats.py:314 | Epoch[61] Step[67] GlobalStep[8424] Training Speed: 436.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:49:41. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 13:52:29 INFO loss_tracker.py:84 | Epoch[61/NA] Step[74] GlobalStep[8431/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0137] total_loss[0.0192] Rank[0/16] 06/24/2025 13:52:37 INFO stats.py:314 | Epoch[61] Step[92] GlobalStep[8449] Training Speed: 433.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:49:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:52:40 INFO loss_tracker.py:84 | Epoch[61/NA] Step[99] GlobalStep[8456/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0062] loss_depth[0.0137] total_loss[0.0200] Rank[0/16] 06/24/2025 13:52:47 INFO stats.py:314 | Epoch[61] Step[117] GlobalStep[8474] Training Speed: 432.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:49:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:52:50 INFO loss_tracker.py:84 | Epoch[61/NA] Step[124] GlobalStep[8481/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0136] total_loss[0.0193] Rank[0/16] 06/24/2025 13:52:54 INFO stats.py:394 | Epoch[61] completed. Training Speed: 310.41 samples/sec across all devices. Epoch Time: 56.49 sec. Average Epoch Time: 56.49 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:48:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:52:58 INFO stats.py:314 | Epoch[62] Step[5] GlobalStep[8499] Training Speed: 442.73 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:49:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:53:05 INFO loss_tracker.py:84 | Epoch[62/NA] Step[24] GlobalStep[8518/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0050] loss_depth[0.0137] total_loss[0.0189] Rank[0/16] 06/24/2025 13:53:08 INFO stats.py:314 | Epoch[62] Step[30] GlobalStep[8524] Training Speed: 432.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:48:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:53:16 INFO loss_tracker.py:84 | Epoch[62/NA] Step[49] GlobalStep[8543/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0137] total_loss[0.0192] Rank[0/16] 06/24/2025 13:53:18 INFO stats.py:314 | Epoch[62] Step[55] GlobalStep[8549] Training Speed: 434.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:48:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:53:26 INFO loss_tracker.py:84 | Epoch[62/NA] Step[74] GlobalStep[8568/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0138] total_loss[0.0192] Rank[0/16] 06/24/2025 13:53:28 INFO stats.py:314 | Epoch[62] Step[80] GlobalStep[8574] Training Speed: 432.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:48:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:53:36 INFO loss_tracker.py:84 | Epoch[62/NA] Step[99] GlobalStep[8593/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0136] total_loss[0.0191] Rank[0/16] 06/24/2025 13:53:39 INFO stats.py:314 | Epoch[62] Step[105] GlobalStep[8599] Training Speed: 429.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:48:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:53:46 INFO loss_tracker.py:84 | Epoch[62/NA] Step[124] GlobalStep[8618/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0137] total_loss[0.0192] Rank[0/16] 06/24/2025 13:53:48 INFO stats.py:314 | Epoch[62] Step[130] GlobalStep[8624] Training Speed: 454.07 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:47:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:53:51 INFO stats.py:394 | Epoch[62] completed. Training Speed: 309.76 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:47:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:54:00 INFO stats.py:314 | Epoch[63] Step[18] GlobalStep[8649] Training Speed: 431.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:47:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:54:02 INFO loss_tracker.py:84 | Epoch[63/NA] Step[24] GlobalStep[8655/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0137] total_loss[0.0196] Rank[0/16] 06/24/2025 13:54:10 INFO stats.py:314 | Epoch[63] Step[43] GlobalStep[8674] Training Speed: 435.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:47:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:54:12 INFO loss_tracker.py:84 | Epoch[63/NA] Step[49] GlobalStep[8680/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0137] total_loss[0.0195] Rank[0/16] 06/24/2025 13:54:20 INFO stats.py:314 | Epoch[63] Step[68] GlobalStep[8699] Training Speed: 435.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:47:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:54:22 INFO loss_tracker.py:84 | Epoch[63/NA] Step[74] GlobalStep[8705/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0137] total_loss[0.0195] Rank[0/16] 06/24/2025 13:54:30 INFO stats.py:314 | Epoch[63] Step[93] GlobalStep[8724] Training Speed: 433.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:46:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:54:33 INFO loss_tracker.py:84 | Epoch[63/NA] Step[99] GlobalStep[8730/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0061] loss_depth[0.0137] total_loss[0.0200] Rank[0/16] 06/24/2025 13:54:40 INFO stats.py:314 | Epoch[63] Step[118] GlobalStep[8749] Training Speed: 436.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:46:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:54:42 INFO loss_tracker.py:84 | Epoch[63/NA] Step[124] GlobalStep[8755/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0136] total_loss[0.0195] Rank[0/16] 06/24/2025 13:54:46 INFO stats.py:394 | Epoch[63] completed. Training Speed: 314.90 samples/sec across all devices. Epoch Time: 55.69 sec. Average Epoch Time: 55.69 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:46:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:54:50 INFO stats.py:314 | Epoch[64] Step[6] GlobalStep[8774] Training Speed: 436.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:46:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:54:57 INFO loss_tracker.py:84 | Epoch[64/NA] Step[24] GlobalStep[8792/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0051] loss_depth[0.0137] total_loss[0.0190] Rank[0/16] 06/24/2025 13:55:00 INFO stats.py:314 | Epoch[64] Step[31] GlobalStep[8799] Training Speed: 436.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:45:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:55:08 INFO loss_tracker.py:84 | Epoch[64/NA] Step[49] GlobalStep[8817/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0136] total_loss[0.0194] Rank[0/16] 06/24/2025 13:55:11 INFO stats.py:314 | Epoch[64] Step[56] GlobalStep[8824] Training Speed: 437.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:45:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:55:18 INFO loss_tracker.py:84 | Epoch[64/NA] Step[74] GlobalStep[8842/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0137] total_loss[0.0192] Rank[0/16] 06/24/2025 13:55:20 INFO stats.py:314 | Epoch[64] Step[81] GlobalStep[8849] Training Speed: 435.29 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:45:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:55:28 INFO loss_tracker.py:84 | Epoch[64/NA] Step[99] GlobalStep[8867/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0136] total_loss[0.0192] Rank[0/16] 06/24/2025 13:55:30 INFO stats.py:314 | Epoch[64] Step[106] GlobalStep[8874] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:45:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:55:38 INFO loss_tracker.py:84 | Epoch[64/NA] Step[124] GlobalStep[8892/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0136] total_loss[0.0197] Rank[0/16] 06/24/2025 13:55:40 INFO stats.py:314 | Epoch[64] Step[131] GlobalStep[8899] Training Speed: 452.57 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:44:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:55:42 INFO stats.py:394 | Epoch[64] completed. Training Speed: 316.02 samples/sec across all devices. Epoch Time: 55.49 sec. Average Epoch Time: 55.49 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:44:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:55:51 INFO stats.py:314 | Epoch[65] Step[19] GlobalStep[8924] Training Speed: 426.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:44:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:55:54 INFO loss_tracker.py:84 | Epoch[65/NA] Step[24] GlobalStep[8929/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0136] total_loss[0.0192] Rank[0/16] 06/24/2025 13:56:02 INFO stats.py:314 | Epoch[65] Step[44] GlobalStep[8949] Training Speed: 434.49 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:44:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:56:04 INFO loss_tracker.py:84 | Epoch[65/NA] Step[49] GlobalStep[8954/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0136] total_loss[0.0196] Rank[0/16] 06/24/2025 13:56:12 INFO stats.py:314 | Epoch[65] Step[69] GlobalStep[8974] Training Speed: 428.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:44:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:56:14 INFO loss_tracker.py:84 | Epoch[65/NA] Step[74] GlobalStep[8979/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0047] loss_depth[0.0136] total_loss[0.0185] Rank[0/16] 06/24/2025 13:56:23 INFO stats.py:314 | Epoch[65] Step[94] GlobalStep[8999] Training Speed: 433.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:44:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:56:25 INFO loss_tracker.py:84 | Epoch[65/NA] Step[99] GlobalStep[9004/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0137] total_loss[0.0195] Rank[0/16] 06/24/2025 13:56:33 INFO stats.py:314 | Epoch[65] Step[119] GlobalStep[9024] Training Speed: 424.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:43:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:56:35 INFO loss_tracker.py:84 | Epoch[65/NA] Step[124] GlobalStep[9029/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0136] total_loss[0.0195] Rank[0/16] 06/24/2025 13:56:39 INFO stats.py:394 | Epoch[65] completed. Training Speed: 305.53 samples/sec across all devices. Epoch Time: 57.40 sec. Average Epoch Time: 57.40 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:43:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:56:44 INFO stats.py:314 | Epoch[66] Step[7] GlobalStep[9049] Training Speed: 422.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:43:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:56:51 INFO loss_tracker.py:84 | Epoch[66/NA] Step[24] GlobalStep[9066/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0136] total_loss[0.0191] Rank[0/16] 06/24/2025 13:56:55 INFO stats.py:314 | Epoch[66] Step[32] GlobalStep[9074] Training Speed: 424.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:43:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:01 INFO loss_tracker.py:84 | Epoch[66/NA] Step[49] GlobalStep[9091/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0138] total_loss[0.0199] Rank[0/16] 06/24/2025 13:57:04 INFO stats.py:314 | Epoch[66] Step[57] GlobalStep[9099] Training Speed: 438.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:43:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:12 INFO loss_tracker.py:84 | Epoch[66/NA] Step[74] GlobalStep[9116/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0135] total_loss[0.0197] Rank[0/16] 06/24/2025 13:57:15 INFO stats.py:314 | Epoch[66] Step[82] GlobalStep[9124] Training Speed: 395.53 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:43:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:22 INFO loss_tracker.py:84 | Epoch[66/NA] Step[99] GlobalStep[9141/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0135] total_loss[0.0194] Rank[0/16] 06/24/2025 13:57:25 INFO stats.py:314 | Epoch[66] Step[107] GlobalStep[9149] Training Speed: 438.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:42:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:57:32 INFO loss_tracker.py:84 | Epoch[66/NA] Step[124] GlobalStep[9166/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0135] total_loss[0.0193] Rank[0/16] 06/24/2025 13:57:35 INFO stats.py:314 | Epoch[66] Step[132] GlobalStep[9174] Training Speed: 453.61 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:42:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:36 INFO stats.py:394 | Epoch[66] completed. Training Speed: 307.61 samples/sec across all devices. Epoch Time: 57.01 sec. Average Epoch Time: 57.01 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:42:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:46 INFO stats.py:314 | Epoch[67] Step[20] GlobalStep[9199] Training Speed: 430.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:42:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:47 INFO loss_tracker.py:84 | Epoch[67/NA] Step[24] GlobalStep[9203/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0136] total_loss[0.0198] Rank[0/16] 06/24/2025 13:57:56 INFO stats.py:314 | Epoch[67] Step[45] GlobalStep[9224] Training Speed: 428.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:42:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:57:58 INFO loss_tracker.py:84 | Epoch[67/NA] Step[49] GlobalStep[9228/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0136] total_loss[0.0193] Rank[0/16] 06/24/2025 13:58:06 INFO stats.py:314 | Epoch[67] Step[70] GlobalStep[9249] Training Speed: 427.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:41:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:58:08 INFO loss_tracker.py:84 | Epoch[67/NA] Step[74] GlobalStep[9253/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0136] total_loss[0.0186] Rank[0/16] 06/24/2025 13:58:17 INFO stats.py:314 | Epoch[67] Step[95] GlobalStep[9274] Training Speed: 437.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:41:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:58:19 INFO loss_tracker.py:84 | Epoch[67/NA] Step[99] GlobalStep[9278/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0061] loss_depth[0.0136] total_loss[0.0199] Rank[0/16] 06/24/2025 13:58:27 INFO stats.py:314 | Epoch[67] Step[120] GlobalStep[9299] Training Speed: 448.85 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:41:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:58:29 INFO loss_tracker.py:84 | Epoch[67/NA] Step[124] GlobalStep[9303/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0061] loss_depth[0.0136] total_loss[0.0198] Rank[0/16] 06/24/2025 13:58:33 INFO stats.py:394 | Epoch[67] completed. Training Speed: 308.32 samples/sec across all devices. Epoch Time: 56.88 sec. Average Epoch Time: 56.88 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:41:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:58:38 INFO stats.py:314 | Epoch[68] Step[8] GlobalStep[9324] Training Speed: 429.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:41:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:58:45 INFO loss_tracker.py:84 | Epoch[68/NA] Step[24] GlobalStep[9340/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0135] total_loss[0.0190] Rank[0/16] 06/24/2025 13:58:48 INFO stats.py:314 | Epoch[68] Step[33] GlobalStep[9349] Training Speed: 433.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:41:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:58:55 INFO loss_tracker.py:84 | Epoch[68/NA] Step[49] GlobalStep[9365/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0135] total_loss[0.0196] Rank[0/16] 06/24/2025 13:58:59 INFO stats.py:314 | Epoch[68] Step[58] GlobalStep[9374] Training Speed: 435.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:40:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:59:05 INFO loss_tracker.py:84 | Epoch[68/NA] Step[74] GlobalStep[9390/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0062] loss_depth[0.0137] total_loss[0.0201] Rank[0/16] 06/24/2025 13:59:09 INFO stats.py:314 | Epoch[68] Step[83] GlobalStep[9399] Training Speed: 435.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:40:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:59:16 INFO loss_tracker.py:84 | Epoch[68/NA] Step[99] GlobalStep[9415/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0136] total_loss[0.0190] Rank[0/16] 06/24/2025 13:59:20 INFO stats.py:314 | Epoch[68] Step[108] GlobalStep[9424] Training Speed: 435.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:40:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:59:26 INFO loss_tracker.py:84 | Epoch[68/NA] Step[124] GlobalStep[9440/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0136] total_loss[0.0193] Rank[0/16] 06/24/2025 13:59:29 INFO stats.py:314 | Epoch[68] Step[133] GlobalStep[9449] Training Speed: 451.41 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:40:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:59:30 INFO stats.py:394 | Epoch[68] completed. Training Speed: 307.02 samples/sec across all devices. Epoch Time: 57.12 sec. Average Epoch Time: 57.12 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:40:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 13:59:41 INFO stats.py:314 | Epoch[69] Step[21] GlobalStep[9474] Training Speed: 417.17 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:40:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:59:42 INFO loss_tracker.py:84 | Epoch[69/NA] Step[24] GlobalStep[9477/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0136] total_loss[0.0192] Rank[0/16] 06/24/2025 13:59:51 INFO stats.py:314 | Epoch[69] Step[46] GlobalStep[9499] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:40:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 13:59:53 INFO loss_tracker.py:84 | Epoch[69/NA] Step[49] GlobalStep[9502/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0057] loss_depth[0.0136] total_loss[0.0194] Rank[0/16] 06/24/2025 14:00:02 INFO stats.py:314 | Epoch[69] Step[71] GlobalStep[9524] Training Speed: 433.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:39:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:00:03 INFO loss_tracker.py:84 | Epoch[69/NA] Step[74] GlobalStep[9527/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0060] loss_depth[0.0136] total_loss[0.0197] Rank[0/16] 06/24/2025 14:00:13 INFO stats.py:314 | Epoch[69] Step[96] GlobalStep[9549] Training Speed: 425.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:39:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:00:14 INFO loss_tracker.py:84 | Epoch[69/NA] Step[99] GlobalStep[9552/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0051] loss_depth[0.0135] total_loss[0.0188] Rank[0/16] 06/24/2025 14:00:23 INFO stats.py:314 | Epoch[69] Step[121] GlobalStep[9574] Training Speed: 449.16 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:39:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:00:24 INFO loss_tracker.py:84 | Epoch[69/NA] Step[124] GlobalStep[9577/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0136] total_loss[0.0194] Rank[0/16] 06/24/2025 14:00:28 INFO stats.py:394 | Epoch[69] completed. Training Speed: 303.43 samples/sec across all devices. Epoch Time: 57.79 sec. Average Epoch Time: 57.79 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:39:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:00:33 INFO stats.py:314 | Epoch[70] Step[9] GlobalStep[9599] Training Speed: 431.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:39:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:00:40 INFO loss_tracker.py:84 | Epoch[70/NA] Step[24] GlobalStep[9614/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0136] total_loss[0.0193] Rank[0/16] 06/24/2025 14:00:44 INFO stats.py:314 | Epoch[70] Step[34] GlobalStep[9624] Training Speed: 428.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:39:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:00:50 INFO loss_tracker.py:84 | Epoch[70/NA] Step[49] GlobalStep[9639/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0052] loss_depth[0.0135] total_loss[0.0189] Rank[0/16] 06/24/2025 14:00:54 INFO stats.py:314 | Epoch[70] Step[59] GlobalStep[9649] Training Speed: 436.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:38:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:01:01 INFO loss_tracker.py:84 | Epoch[70/NA] Step[74] GlobalStep[9664/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0135] total_loss[0.0189] Rank[0/16] 06/24/2025 14:01:05 INFO stats.py:314 | Epoch[70] Step[84] GlobalStep[9674] Training Speed: 435.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:38:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:01:11 INFO loss_tracker.py:84 | Epoch[70/NA] Step[99] GlobalStep[9689/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0050] loss_depth[0.0135] total_loss[0.0187] Rank[0/16] 06/24/2025 14:01:15 INFO stats.py:314 | Epoch[70] Step[109] GlobalStep[9699] Training Speed: 423.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:38:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:01:21 INFO loss_tracker.py:84 | Epoch[70/NA] Step[124] GlobalStep[9714/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0056] loss_depth[0.0134] total_loss[0.0192] Rank[0/16] 06/24/2025 14:01:25 INFO stats.py:314 | Epoch[70] Step[134] GlobalStep[9724] Training Speed: 451.11 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:38:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:01:25 INFO stats.py:394 | Epoch[70] completed. Training Speed: 306.21 samples/sec across all devices. Epoch Time: 57.27 sec. Average Epoch Time: 57.27 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:38:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:01:37 INFO stats.py:314 | Epoch[71] Step[22] GlobalStep[9749] Training Speed: 430.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:38:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:01:37 INFO loss_tracker.py:84 | Epoch[71/NA] Step[24] GlobalStep[9751/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0136] total_loss[0.0190] Rank[0/16] 06/24/2025 14:01:47 INFO stats.py:314 | Epoch[71] Step[47] GlobalStep[9774] Training Speed: 408.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:37:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:01:48 INFO loss_tracker.py:84 | Epoch[71/NA] Step[49] GlobalStep[9776/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:01:57 INFO stats.py:314 | Epoch[71] Step[72] GlobalStep[9799] Training Speed: 429.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:37:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:01:58 INFO loss_tracker.py:84 | Epoch[71/NA] Step[74] GlobalStep[9801/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0136] total_loss[0.0192] Rank[0/16] 06/24/2025 14:02:08 INFO stats.py:314 | Epoch[71] Step[97] GlobalStep[9824] Training Speed: 431.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:37:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:02:08 INFO loss_tracker.py:84 | Epoch[71/NA] Step[99] GlobalStep[9826/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0135] total_loss[0.0190] Rank[0/16] 06/24/2025 14:02:18 INFO stats.py:314 | Epoch[71] Step[122] GlobalStep[9849] Training Speed: 451.93 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:37:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:02:19 INFO loss_tracker.py:84 | Epoch[71/NA] Step[124] GlobalStep[9851/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0135] total_loss[0.0195] Rank[0/16] 06/24/2025 14:02:23 INFO stats.py:394 | Epoch[71] completed. Training Speed: 303.39 samples/sec across all devices. Epoch Time: 57.80 sec. Average Epoch Time: 57.80 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:37:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:02:29 INFO stats.py:314 | Epoch[72] Step[10] GlobalStep[9874] Training Speed: 434.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:37:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:02:35 INFO loss_tracker.py:84 | Epoch[72/NA] Step[24] GlobalStep[9888/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0135] total_loss[0.0190] Rank[0/16] 06/24/2025 14:02:40 INFO stats.py:314 | Epoch[72] Step[35] GlobalStep[9899] Training Speed: 420.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:37:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:02:46 INFO loss_tracker.py:84 | Epoch[72/NA] Step[49] GlobalStep[9913/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0135] total_loss[0.0187] Rank[0/16] 06/24/2025 14:02:50 INFO stats.py:314 | Epoch[72] Step[60] GlobalStep[9924] Training Speed: 416.84 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:36:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:02:56 INFO loss_tracker.py:84 | Epoch[72/NA] Step[74] GlobalStep[9938/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0060] loss_depth[0.0135] total_loss[0.0196] Rank[0/16] 06/24/2025 14:03:01 INFO stats.py:314 | Epoch[72] Step[85] GlobalStep[9949] Training Speed: 406.77 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:36:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:03:07 INFO loss_tracker.py:84 | Epoch[72/NA] Step[99] GlobalStep[9963/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0135] total_loss[0.0188] Rank[0/16] 06/24/2025 14:03:11 INFO stats.py:314 | Epoch[72] Step[110] GlobalStep[9974] Training Speed: 423.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:36:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:03:17 INFO loss_tracker.py:84 | Epoch[72/NA] Step[124] GlobalStep[9988/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0135] total_loss[0.0193] Rank[0/16] 06/24/2025 14:03:21 INFO stats.py:314 | Epoch[72] Step[135] GlobalStep[9999] Training Speed: 446.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:36:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:03:22 INFO stats.py:394 | Epoch[72] completed. Training Speed: 300.60 samples/sec across all devices. Epoch Time: 58.34 sec. Average Epoch Time: 58.34 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 10:36:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:03:33 INFO stats.py:314 | Epoch[73] Step[23] GlobalStep[10024] Training Speed: 428.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:36:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:03:33 INFO loss_tracker.py:84 | Epoch[73/NA] Step[24] GlobalStep[10025/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0136] total_loss[0.0195] Rank[0/16] 06/24/2025 14:03:43 INFO stats.py:314 | Epoch[73] Step[48] GlobalStep[10049] Training Speed: 431.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:35:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:03:43 INFO loss_tracker.py:84 | Epoch[73/NA] Step[49] GlobalStep[10050/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0135] total_loss[0.0189] Rank[0/16] 06/24/2025 14:03:53 INFO stats.py:314 | Epoch[73] Step[73] GlobalStep[10074] Training Speed: 425.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:35:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:03:54 INFO loss_tracker.py:84 | Epoch[73/NA] Step[74] GlobalStep[10075/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0135] total_loss[0.0194] Rank[0/16] 06/24/2025 14:04:03 INFO stats.py:314 | Epoch[73] Step[98] GlobalStep[10099] Training Speed: 435.42 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:35:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:04:04 INFO loss_tracker.py:84 | Epoch[73/NA] Step[99] GlobalStep[10100/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0062] loss_depth[0.0134] total_loss[0.0198] Rank[0/16] 06/24/2025 14:04:14 INFO stats.py:314 | Epoch[73] Step[123] GlobalStep[10124] Training Speed: 451.59 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:35:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:04:14 INFO loss_tracker.py:84 | Epoch[73/NA] Step[124] GlobalStep[10125/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0135] total_loss[0.0185] Rank[0/16] 06/24/2025 14:04:18 INFO stats.py:394 | Epoch[73] completed. Training Speed: 309.19 samples/sec across all devices. Epoch Time: 56.72 sec. Average Epoch Time: 56.72 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:34:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:04:24 INFO stats.py:314 | Epoch[74] Step[11] GlobalStep[10149] Training Speed: 437.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:34:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:04:30 INFO loss_tracker.py:84 | Epoch[74/NA] Step[24] GlobalStep[10162/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0135] total_loss[0.0192] Rank[0/16] 06/24/2025 14:04:35 INFO stats.py:314 | Epoch[74] Step[36] GlobalStep[10174] Training Speed: 432.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:34:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:04:40 INFO loss_tracker.py:84 | Epoch[74/NA] Step[49] GlobalStep[10187/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:04:45 INFO stats.py:314 | Epoch[74] Step[61] GlobalStep[10199] Training Speed: 437.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:34:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:04:51 INFO loss_tracker.py:84 | Epoch[74/NA] Step[74] GlobalStep[10212/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0134] total_loss[0.0186] Rank[0/16] 06/24/2025 14:04:55 INFO stats.py:314 | Epoch[74] Step[86] GlobalStep[10224] Training Speed: 433.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:34:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:01 INFO loss_tracker.py:84 | Epoch[74/NA] Step[99] GlobalStep[10237/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0061] loss_depth[0.0135] total_loss[0.0198] Rank[0/16] 06/24/2025 14:05:05 INFO stats.py:314 | Epoch[74] Step[111] GlobalStep[10249] Training Speed: 433.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:34:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:10 INFO loss_tracker.py:84 | Epoch[74/NA] Step[124] GlobalStep[10262/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0135] total_loss[0.0186] Rank[0/16] 06/24/2025 14:05:15 INFO stats.py:314 | Epoch[74] Step[136] GlobalStep[10274] Training Speed: 451.36 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:33:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:15 INFO stats.py:394 | Epoch[74] completed. Training Speed: 309.86 samples/sec across all devices. Epoch Time: 56.59 sec. Average Epoch Time: 56.59 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:33:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:05:26 INFO stats.py:314 | Epoch[75] Step[24] GlobalStep[10299] Training Speed: 440.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:33:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:26 INFO loss_tracker.py:84 | Epoch[75/NA] Step[24] GlobalStep[10299/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0134] total_loss[0.0192] Rank[0/16] 06/24/2025 14:05:36 INFO stats.py:314 | Epoch[75] Step[49] GlobalStep[10324] Training Speed: 429.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:33:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:37 INFO loss_tracker.py:84 | Epoch[75/NA] Step[49] GlobalStep[10324/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0055] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:05:47 INFO stats.py:314 | Epoch[75] Step[74] GlobalStep[10349] Training Speed: 438.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:33:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:47 INFO loss_tracker.py:84 | Epoch[75/NA] Step[74] GlobalStep[10349/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0135] total_loss[0.0193] Rank[0/16] 06/24/2025 14:05:58 INFO stats.py:314 | Epoch[75] Step[99] GlobalStep[10374] Training Speed: 408.83 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:33:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:05:58 INFO loss_tracker.py:84 | Epoch[75/NA] Step[99] GlobalStep[10374/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0135] total_loss[0.0186] Rank[0/16] 06/24/2025 14:06:08 INFO stats.py:314 | Epoch[75] Step[124] GlobalStep[10399] Training Speed: 454.21 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:32:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:06:08 INFO loss_tracker.py:84 | Epoch[75/NA] Step[124] GlobalStep[10399/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0191] Rank[0/16] 06/24/2025 14:06:12 INFO stats.py:394 | Epoch[75] completed. Training Speed: 306.19 samples/sec across all devices. Epoch Time: 57.27 sec. Average Epoch Time: 57.27 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:32:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:06:19 INFO stats.py:314 | Epoch[76] Step[12] GlobalStep[10424] Training Speed: 424.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:32:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:06:24 INFO loss_tracker.py:84 | Epoch[76/NA] Step[24] GlobalStep[10436/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0135] total_loss[0.0195] Rank[0/16] 06/24/2025 14:06:29 INFO stats.py:314 | Epoch[76] Step[37] GlobalStep[10449] Training Speed: 415.17 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:32:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:06:34 INFO loss_tracker.py:84 | Epoch[76/NA] Step[49] GlobalStep[10461/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:06:39 INFO stats.py:314 | Epoch[76] Step[62] GlobalStep[10474] Training Speed: 440.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:32:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:06:44 INFO loss_tracker.py:84 | Epoch[76/NA] Step[74] GlobalStep[10486/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0135] total_loss[0.0193] Rank[0/16] 06/24/2025 14:06:50 INFO stats.py:314 | Epoch[76] Step[87] GlobalStep[10499] Training Speed: 443.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:32:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:06:55 INFO loss_tracker.py:84 | Epoch[76/NA] Step[99] GlobalStep[10511/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0134] total_loss[0.0188] Rank[0/16] 06/24/2025 14:07:00 INFO stats.py:314 | Epoch[76] Step[112] GlobalStep[10524] Training Speed: 431.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:31:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:07:04 INFO loss_tracker.py:84 | Epoch[76/NA] Step[124] GlobalStep[10536/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0134] total_loss[0.0190] Rank[0/16] 06/24/2025 14:07:09 INFO stats.py:394 | Epoch[76] completed. Training Speed: 310.60 samples/sec across all devices. Epoch Time: 56.46 sec. Average Epoch Time: 56.46 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:31:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:07:10 INFO stats.py:314 | Epoch[77] Step[0] GlobalStep[10549] Training Speed: 345.39 samples/sec across all devices. Average Step Time: 0.37 sec. Estimated Remaining Time: 10:31:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:07:20 INFO loss_tracker.py:84 | Epoch[77/NA] Step[24] GlobalStep[10573/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0134] total_loss[0.0192] Rank[0/16] 06/24/2025 14:07:20 INFO stats.py:314 | Epoch[77] Step[25] GlobalStep[10574] Training Speed: 400.02 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:31:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:07:30 INFO loss_tracker.py:84 | Epoch[77/NA] Step[49] GlobalStep[10598/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:07:31 INFO stats.py:314 | Epoch[77] Step[50] GlobalStep[10599] Training Speed: 429.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:31:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:07:40 INFO loss_tracker.py:84 | Epoch[77/NA] Step[74] GlobalStep[10623/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:07:41 INFO stats.py:314 | Epoch[77] Step[75] GlobalStep[10624] Training Speed: 427.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:30:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:07:51 INFO loss_tracker.py:84 | Epoch[77/NA] Step[99] GlobalStep[10648/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0050] loss_depth[0.0134] total_loss[0.0186] Rank[0/16] 06/24/2025 14:07:51 INFO stats.py:314 | Epoch[77] Step[100] GlobalStep[10649] Training Speed: 407.43 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:30:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:01 INFO loss_tracker.py:84 | Epoch[77/NA] Step[124] GlobalStep[10673/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:08:01 INFO stats.py:314 | Epoch[77] Step[125] GlobalStep[10674] Training Speed: 419.02 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:30:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:05 INFO stats.py:394 | Epoch[77] completed. Training Speed: 312.49 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:30:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:12 INFO stats.py:314 | Epoch[78] Step[13] GlobalStep[10699] Training Speed: 291.05 samples/sec across all devices. Average Step Time: 0.44 sec. Estimated Remaining Time: 10:30:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:08:16 INFO loss_tracker.py:84 | Epoch[78/NA] Step[24] GlobalStep[10710/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0134] total_loss[0.0182] Rank[0/16] 06/24/2025 14:08:22 INFO stats.py:314 | Epoch[78] Step[38] GlobalStep[10724] Training Speed: 431.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:30:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:27 INFO loss_tracker.py:84 | Epoch[78/NA] Step[49] GlobalStep[10735/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0060] loss_depth[0.0133] total_loss[0.0195] Rank[0/16] 06/24/2025 14:08:33 INFO stats.py:314 | Epoch[78] Step[63] GlobalStep[10749] Training Speed: 429.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:29:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:37 INFO loss_tracker.py:84 | Epoch[78/NA] Step[74] GlobalStep[10760/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0134] total_loss[0.0190] Rank[0/16] 06/24/2025 14:08:43 INFO stats.py:314 | Epoch[78] Step[88] GlobalStep[10774] Training Speed: 400.24 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:29:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:48 INFO loss_tracker.py:84 | Epoch[78/NA] Step[99] GlobalStep[10785/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0135] total_loss[0.0188] Rank[0/16] 06/24/2025 14:08:53 INFO stats.py:314 | Epoch[78] Step[113] GlobalStep[10799] Training Speed: 431.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:29:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:08:58 INFO loss_tracker.py:84 | Epoch[78/NA] Step[124] GlobalStep[10810/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0134] total_loss[0.0187] Rank[0/16] 06/24/2025 14:09:02 INFO stats.py:394 | Epoch[78] completed. Training Speed: 305.72 samples/sec across all devices. Epoch Time: 57.36 sec. Average Epoch Time: 57.36 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:29:13. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 14:09:04 INFO stats.py:314 | Epoch[79] Step[1] GlobalStep[10824] Training Speed: 392.10 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 10:29:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:09:14 INFO loss_tracker.py:84 | Epoch[79/NA] Step[24] GlobalStep[10847/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0134] total_loss[0.0192] Rank[0/16] 06/24/2025 14:09:14 INFO stats.py:314 | Epoch[79] Step[26] GlobalStep[10849] Training Speed: 428.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:29:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:09:24 INFO loss_tracker.py:84 | Epoch[79/NA] Step[49] GlobalStep[10872/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0134] total_loss[0.0185] Rank[0/16] 06/24/2025 14:09:25 INFO stats.py:314 | Epoch[79] Step[51] GlobalStep[10874] Training Speed: 389.10 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 10:28:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:09:34 INFO loss_tracker.py:84 | Epoch[79/NA] Step[74] GlobalStep[10897/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0134] total_loss[0.0190] Rank[0/16] 06/24/2025 14:09:35 INFO stats.py:314 | Epoch[79] Step[76] GlobalStep[10899] Training Speed: 423.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:28:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:09:45 INFO loss_tracker.py:84 | Epoch[79/NA] Step[99] GlobalStep[10922/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0133] total_loss[0.0188] Rank[0/16] 06/24/2025 14:09:46 INFO stats.py:314 | Epoch[79] Step[101] GlobalStep[10924] Training Speed: 405.06 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:28:32. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 14:09:55 INFO loss_tracker.py:84 | Epoch[79/NA] Step[124] GlobalStep[10947/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0135] total_loss[0.0183] Rank[0/16] 06/24/2025 14:09:56 INFO stats.py:314 | Epoch[79] Step[126] GlobalStep[10949] Training Speed: 448.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:28:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:09:59 INFO stats.py:394 | Epoch[79] completed. Training Speed: 307.55 samples/sec across all devices. Epoch Time: 57.02 sec. Average Epoch Time: 57.02 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:28:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:10:06 INFO stats.py:314 | Epoch[80] Step[14] GlobalStep[10974] Training Speed: 420.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:28:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:10:10 INFO loss_tracker.py:84 | Epoch[80/NA] Step[24] GlobalStep[10984/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0134] total_loss[0.0187] Rank[0/16] 06/24/2025 14:10:17 INFO stats.py:314 | Epoch[80] Step[39] GlobalStep[10999] Training Speed: 424.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:27:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:10:21 INFO loss_tracker.py:84 | Epoch[80/NA] Step[49] GlobalStep[11009/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0134] total_loss[0.0186] Rank[0/16] 06/24/2025 14:10:27 INFO stats.py:314 | Epoch[80] Step[64] GlobalStep[11024] Training Speed: 437.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:27:45. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 14:10:31 INFO loss_tracker.py:84 | Epoch[80/NA] Step[74] GlobalStep[11034/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0134] total_loss[0.0193] Rank[0/16] 06/24/2025 14:10:38 INFO stats.py:314 | Epoch[80] Step[89] GlobalStep[11049] Training Speed: 435.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:27:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:10:42 INFO loss_tracker.py:84 | Epoch[80/NA] Step[99] GlobalStep[11059/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0134] total_loss[0.0190] Rank[0/16] 06/24/2025 14:10:48 INFO stats.py:314 | Epoch[80] Step[114] GlobalStep[11074] Training Speed: 432.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:27:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:10:52 INFO loss_tracker.py:84 | Epoch[80/NA] Step[124] GlobalStep[11084/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0134] total_loss[0.0187] Rank[0/16] 06/24/2025 14:10:56 INFO stats.py:394 | Epoch[80] completed. Training Speed: 307.87 samples/sec across all devices. Epoch Time: 56.96 sec. Average Epoch Time: 56.96 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:27:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:10:58 INFO stats.py:314 | Epoch[81] Step[2] GlobalStep[11099] Training Speed: 433.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:27:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:11:08 INFO loss_tracker.py:84 | Epoch[81/NA] Step[24] GlobalStep[11121/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0134] total_loss[0.0193] Rank[0/16] 06/24/2025 14:11:09 INFO stats.py:314 | Epoch[81] Step[27] GlobalStep[11124] Training Speed: 435.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:26:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:11:18 INFO loss_tracker.py:84 | Epoch[81/NA] Step[49] GlobalStep[11146/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0135] total_loss[0.0191] Rank[0/16] 06/24/2025 14:11:19 INFO stats.py:314 | Epoch[81] Step[52] GlobalStep[11149] Training Speed: 429.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:26:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:11:28 INFO loss_tracker.py:84 | Epoch[81/NA] Step[74] GlobalStep[11171/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0134] total_loss[0.0182] Rank[0/16] 06/24/2025 14:11:30 INFO stats.py:314 | Epoch[81] Step[77] GlobalStep[11174] Training Speed: 421.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:26:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:11:38 INFO loss_tracker.py:84 | Epoch[81/NA] Step[99] GlobalStep[11196/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0134] total_loss[0.0194] Rank[0/16] 06/24/2025 14:11:39 INFO stats.py:314 | Epoch[81] Step[102] GlobalStep[11199] Training Speed: 430.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:26:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:11:49 INFO loss_tracker.py:84 | Epoch[81/NA] Step[124] GlobalStep[11221/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0133] total_loss[0.0185] Rank[0/16] 06/24/2025 14:11:50 INFO stats.py:314 | Epoch[81] Step[127] GlobalStep[11224] Training Speed: 446.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:26:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:11:53 INFO stats.py:394 | Epoch[81] completed. Training Speed: 305.68 samples/sec across all devices. Epoch Time: 57.37 sec. Average Epoch Time: 57.37 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:25:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:12:01 INFO stats.py:314 | Epoch[82] Step[15] GlobalStep[11249] Training Speed: 436.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:25:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:12:05 INFO loss_tracker.py:84 | Epoch[82/NA] Step[24] GlobalStep[11258/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0134] total_loss[0.0186] Rank[0/16] 06/24/2025 14:12:12 INFO stats.py:314 | Epoch[82] Step[40] GlobalStep[11274] Training Speed: 410.07 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:25:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:12:15 INFO loss_tracker.py:84 | Epoch[82/NA] Step[49] GlobalStep[11283/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0061] loss_depth[0.0134] total_loss[0.0196] Rank[0/16] 06/24/2025 14:12:22 INFO stats.py:314 | Epoch[82] Step[65] GlobalStep[11299] Training Speed: 411.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:25:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:12:26 INFO loss_tracker.py:84 | Epoch[82/NA] Step[74] GlobalStep[11308/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0134] total_loss[0.0188] Rank[0/16] 06/24/2025 14:12:33 INFO stats.py:314 | Epoch[82] Step[90] GlobalStep[11324] Training Speed: 431.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:25:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:12:36 INFO loss_tracker.py:84 | Epoch[82/NA] Step[99] GlobalStep[11333/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:12:42 INFO stats.py:314 | Epoch[82] Step[115] GlobalStep[11349] Training Speed: 431.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:25:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:12:46 INFO loss_tracker.py:84 | Epoch[82/NA] Step[124] GlobalStep[11358/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0134] total_loss[0.0191] Rank[0/16] 06/24/2025 14:12:51 INFO stats.py:394 | Epoch[82] completed. Training Speed: 304.82 samples/sec across all devices. Epoch Time: 57.53 sec. Average Epoch Time: 57.53 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:24:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:12:54 INFO stats.py:314 | Epoch[83] Step[3] GlobalStep[11374] Training Speed: 434.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:25:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:02 INFO loss_tracker.py:84 | Epoch[83/NA] Step[24] GlobalStep[11395/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0134] total_loss[0.0187] Rank[0/16] 06/24/2025 14:13:04 INFO stats.py:314 | Epoch[83] Step[28] GlobalStep[11399] Training Speed: 424.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:24:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:13 INFO loss_tracker.py:84 | Epoch[83/NA] Step[49] GlobalStep[11420/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0134] total_loss[0.0191] Rank[0/16] 06/24/2025 14:13:14 INFO stats.py:314 | Epoch[83] Step[53] GlobalStep[11424] Training Speed: 414.92 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:24:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:23 INFO loss_tracker.py:84 | Epoch[83/NA] Step[74] GlobalStep[11445/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0190] Rank[0/16] 06/24/2025 14:13:24 INFO stats.py:314 | Epoch[83] Step[78] GlobalStep[11449] Training Speed: 436.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:24:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:13:33 INFO loss_tracker.py:84 | Epoch[83/NA] Step[99] GlobalStep[11470/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0134] total_loss[0.0186] Rank[0/16] 06/24/2025 14:13:35 INFO stats.py:314 | Epoch[83] Step[103] GlobalStep[11474] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:24:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:43 INFO loss_tracker.py:84 | Epoch[83/NA] Step[124] GlobalStep[11495/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0134] total_loss[0.0191] Rank[0/16] 06/24/2025 14:13:45 INFO stats.py:314 | Epoch[83] Step[128] GlobalStep[11499] Training Speed: 451.41 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:23:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:47 INFO stats.py:394 | Epoch[83] completed. Training Speed: 310.47 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:23:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:55 INFO stats.py:314 | Epoch[84] Step[16] GlobalStep[11524] Training Speed: 402.36 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:23:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:13:59 INFO loss_tracker.py:84 | Epoch[84/NA] Step[24] GlobalStep[11532/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0188] Rank[0/16] 06/24/2025 14:14:06 INFO stats.py:314 | Epoch[84] Step[41] GlobalStep[11549] Training Speed: 424.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:23:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:14:09 INFO loss_tracker.py:84 | Epoch[84/NA] Step[49] GlobalStep[11557/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0191] Rank[0/16] 06/24/2025 14:14:16 INFO stats.py:314 | Epoch[84] Step[66] GlobalStep[11574] Training Speed: 428.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:23:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:14:19 INFO loss_tracker.py:84 | Epoch[84/NA] Step[74] GlobalStep[11582/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:14:26 INFO stats.py:314 | Epoch[84] Step[91] GlobalStep[11599] Training Speed: 422.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:23:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:14:30 INFO loss_tracker.py:84 | Epoch[84/NA] Step[99] GlobalStep[11607/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:14:37 INFO stats.py:314 | Epoch[84] Step[116] GlobalStep[11624] Training Speed: 425.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:22:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:14:40 INFO loss_tracker.py:84 | Epoch[84/NA] Step[124] GlobalStep[11632/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0190] Rank[0/16] 06/24/2025 14:14:44 INFO stats.py:394 | Epoch[84] completed. Training Speed: 308.51 samples/sec across all devices. Epoch Time: 56.84 sec. Average Epoch Time: 56.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:22:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:14:47 INFO stats.py:314 | Epoch[85] Step[4] GlobalStep[11649] Training Speed: 433.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:22:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:14:56 INFO loss_tracker.py:84 | Epoch[85/NA] Step[24] GlobalStep[11669/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0063] loss_depth[0.0133] total_loss[0.0197] Rank[0/16] 06/24/2025 14:14:58 INFO stats.py:314 | Epoch[85] Step[29] GlobalStep[11674] Training Speed: 404.76 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:22:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:15:06 INFO loss_tracker.py:84 | Epoch[85/NA] Step[49] GlobalStep[11694/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0133] total_loss[0.0185] Rank[0/16] 06/24/2025 14:15:08 INFO stats.py:314 | Epoch[85] Step[54] GlobalStep[11699] Training Speed: 418.56 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:22:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:15:17 INFO loss_tracker.py:84 | Epoch[85/NA] Step[74] GlobalStep[11719/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0133] total_loss[0.0185] Rank[0/16] 06/24/2025 14:15:18 INFO stats.py:314 | Epoch[85] Step[79] GlobalStep[11724] Training Speed: 432.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:22:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:15:27 INFO loss_tracker.py:84 | Epoch[85/NA] Step[99] GlobalStep[11744/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0134] total_loss[0.0191] Rank[0/16] 06/24/2025 14:15:29 INFO stats.py:314 | Epoch[85] Step[104] GlobalStep[11749] Training Speed: 425.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:22:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:15:37 INFO loss_tracker.py:84 | Epoch[85/NA] Step[124] GlobalStep[11769/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0134] total_loss[0.0188] Rank[0/16] 06/24/2025 14:15:39 INFO stats.py:314 | Epoch[85] Step[129] GlobalStep[11774] Training Speed: 448.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:21:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:15:41 INFO stats.py:394 | Epoch[85] completed. Training Speed: 308.04 samples/sec across all devices. Epoch Time: 56.93 sec. Average Epoch Time: 56.93 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:21:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:15:50 INFO stats.py:314 | Epoch[86] Step[17] GlobalStep[11799] Training Speed: 434.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:21:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:15:53 INFO loss_tracker.py:84 | Epoch[86/NA] Step[24] GlobalStep[11806/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0191] Rank[0/16] 06/24/2025 14:16:00 INFO stats.py:314 | Epoch[86] Step[42] GlobalStep[11824] Training Speed: 431.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:21:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:16:03 INFO loss_tracker.py:84 | Epoch[86/NA] Step[49] GlobalStep[11831/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:16:10 INFO stats.py:314 | Epoch[86] Step[67] GlobalStep[11849] Training Speed: 430.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:21:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:16:13 INFO loss_tracker.py:84 | Epoch[86/NA] Step[74] GlobalStep[11856/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0134] total_loss[0.0191] Rank[0/16] 06/24/2025 14:16:20 INFO stats.py:314 | Epoch[86] Step[92] GlobalStep[11874] Training Speed: 428.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:20:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:16:23 INFO loss_tracker.py:84 | Epoch[86/NA] Step[99] GlobalStep[11881/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:16:31 INFO stats.py:314 | Epoch[86] Step[117] GlobalStep[11899] Training Speed: 413.94 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:20:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:16:34 INFO loss_tracker.py:84 | Epoch[86/NA] Step[124] GlobalStep[11906/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:16:38 INFO stats.py:394 | Epoch[86] completed. Training Speed: 308.31 samples/sec across all devices. Epoch Time: 56.88 sec. Average Epoch Time: 56.88 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:20:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:16:41 INFO stats.py:314 | Epoch[87] Step[5] GlobalStep[11924] Training Speed: 411.55 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:20:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:16:49 INFO loss_tracker.py:84 | Epoch[87/NA] Step[24] GlobalStep[11943/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:16:52 INFO stats.py:314 | Epoch[87] Step[30] GlobalStep[11949] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:20:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:17:00 INFO loss_tracker.py:84 | Epoch[87/NA] Step[49] GlobalStep[11968/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:17:02 INFO stats.py:314 | Epoch[87] Step[55] GlobalStep[11974] Training Speed: 423.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:20:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:17:10 INFO loss_tracker.py:84 | Epoch[87/NA] Step[74] GlobalStep[11993/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:17:12 INFO stats.py:314 | Epoch[87] Step[80] GlobalStep[11999] Training Speed: 429.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:19:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:17:13 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_2 Rank[10/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[8/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[11/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[6/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[13/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[1/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[14/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[3/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2
Rank[12/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[15/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[5/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[9/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[7/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[4/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[2/16] 06/24/2025 14:17:13 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2 Rank[0/16] 06/24/2025 14:17:14 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_2/model.safetensors Rank[0/16] 06/24/2025 14:17:15 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_2/optimizer.bin Rank[0/16] 06/24/2025 14:17:15 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_2/scheduler.bin Rank[0/16] 06/24/2025 14:17:15 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_2/sampler.bin Rank[0/16] 06/24/2025 14:17:15 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_2/random_states_0.pkl Rank[0/16] 06/24/2025 14:17:15 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_2/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 14:17:15 INFO checkpoint.py:110 | Save checkpoint at the end of step 11999 to /job_data/checkpoints/checkpoint_2
Rank[0/16] 06/24/2025 14:17:23 INFO loss_tracker.py:84 | Epoch[87/NA] Step[99] GlobalStep[12018/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0133] total_loss[0.0192] Rank[0/16] 06/24/2025 14:17:25 INFO stats.py:314 | Epoch[87] Step[105] GlobalStep[12024] Training Speed: 431.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:19:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:17:33 INFO loss_tracker.py:84 | Epoch[87/NA] Step[124] GlobalStep[12043/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0134] total_loss[0.0187] Rank[0/16] 06/24/2025 14:17:35 INFO stats.py:314 | Epoch[87] Step[130] GlobalStep[12049] Training Speed: 441.29 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:19:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:17:37 INFO stats.py:394 | Epoch[87] completed. Training Speed: 298.06 samples/sec across all devices. Epoch Time: 58.83 sec. Average Epoch Time: 58.83 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 10:19:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:17:45 INFO stats.py:314 | Epoch[88] Step[18] GlobalStep[12074] Training Speed: 439.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:19:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:17:48 INFO loss_tracker.py:84 | Epoch[88/NA] Step[24] GlobalStep[12080/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:17:56 INFO stats.py:314 | Epoch[88] Step[43] GlobalStep[12099] Training Speed: 417.91 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:19:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 14:17:58 INFO loss_tracker.py:84 | Epoch[88/NA] Step[49] GlobalStep[12105/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0133] total_loss[0.0181] Rank[0/16] 06/24/2025 14:18:06 INFO stats.py:314 | Epoch[88] Step[68] GlobalStep[12124] Training Speed: 426.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:19:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:18:08 INFO loss_tracker.py:84 | Epoch[88/NA] Step[74] GlobalStep[12130/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0134] total_loss[0.0192] Rank[0/16] 06/24/2025 14:18:16 INFO stats.py:314 | Epoch[88] Step[93] GlobalStep[12149] Training Speed: 415.51 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:18:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:18:19 INFO loss_tracker.py:84 | Epoch[88/NA] Step[99] GlobalStep[12155/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:18:26 INFO stats.py:314 | Epoch[88] Step[118] GlobalStep[12174] Training Speed: 237.70 samples/sec across all devices. Average Step Time: 0.54 sec. Estimated Remaining Time: 10:18:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:18:29 INFO loss_tracker.py:84 | Epoch[88/NA] Step[124] GlobalStep[12180/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:18:33 INFO stats.py:394 | Epoch[88] completed. Training Speed: 311.48 samples/sec across all devices. Epoch Time: 56.30 sec. Average Epoch Time: 56.30 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:18:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:18:37 INFO stats.py:314 | Epoch[89] Step[6] GlobalStep[12199] Training Speed: 439.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:18:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:18:44 INFO loss_tracker.py:84 | Epoch[89/NA] Step[24] GlobalStep[12217/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0189] Rank[0/16] 06/24/2025 14:18:47 INFO stats.py:314 | Epoch[89] Step[31] GlobalStep[12224] Training Speed: 429.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:18:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:18:55 INFO loss_tracker.py:84 | Epoch[89/NA] Step[49] GlobalStep[12242/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0133] total_loss[0.0193] Rank[0/16] 06/24/2025 14:18:58 INFO stats.py:314 | Epoch[89] Step[56] GlobalStep[12249] Training Speed: 418.57 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:18:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:19:05 INFO loss_tracker.py:84 | Epoch[89/NA] Step[74] GlobalStep[12267/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0134] total_loss[0.0189] Rank[0/16] 06/24/2025 14:19:08 INFO stats.py:314 | Epoch[89] Step[81] GlobalStep[12274] Training Speed: 427.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:17:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:19:16 INFO loss_tracker.py:84 | Epoch[89/NA] Step[99] GlobalStep[12292/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:19:19 INFO stats.py:314 | Epoch[89] Step[106] GlobalStep[12299] Training Speed: 423.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:17:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:19:26 INFO loss_tracker.py:84 | Epoch[89/NA] Step[124] GlobalStep[12317/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0134] total_loss[0.0193] Rank[0/16] 06/24/2025 14:19:28 INFO stats.py:314 | Epoch[89] Step[131] GlobalStep[12324] Training Speed: 450.04 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:17:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:19:30 INFO stats.py:394 | Epoch[89] completed. Training Speed: 309.19 samples/sec across all devices. Epoch Time: 56.72 sec. Average Epoch Time: 56.72 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:17:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:19:39 INFO stats.py:314 | Epoch[90] Step[19] GlobalStep[12349] Training Speed: 426.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:17:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:19:41 INFO loss_tracker.py:84 | Epoch[90/NA] Step[24] GlobalStep[12354/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:19:50 INFO stats.py:314 | Epoch[90] Step[44] GlobalStep[12374] Training Speed: 227.23 samples/sec across all devices. Average Step Time: 0.56 sec. Estimated Remaining Time: 10:17:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:19:52 INFO loss_tracker.py:84 | Epoch[90/NA] Step[49] GlobalStep[12379/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0061] loss_depth[0.0132] total_loss[0.0195] Rank[0/16] 06/24/2025 14:20:00 INFO stats.py:314 | Epoch[90] Step[69] GlobalStep[12399] Training Speed: 410.23 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:16:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:20:02 INFO loss_tracker.py:84 | Epoch[90/NA] Step[74] GlobalStep[12404/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0190] Rank[0/16] 06/24/2025 14:20:10 INFO stats.py:314 | Epoch[90] Step[94] GlobalStep[12424] Training Speed: 229.45 samples/sec across all devices. Average Step Time: 0.56 sec. Estimated Remaining Time: 10:16:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:20:12 INFO loss_tracker.py:84 | Epoch[90/NA] Step[99] GlobalStep[12429/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:20:20 INFO stats.py:314 | Epoch[90] Step[119] GlobalStep[12449] Training Speed: 436.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:16:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:20:22 INFO loss_tracker.py:84 | Epoch[90/NA] Step[124] GlobalStep[12454/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0133] total_loss[0.0182] Rank[0/16] 06/24/2025 14:20:27 INFO stats.py:394 | Epoch[90] completed. Training Speed: 308.83 samples/sec across all devices. Epoch Time: 56.78 sec. Average Epoch Time: 56.78 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:16:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:20:31 INFO stats.py:314 | Epoch[91] Step[7] GlobalStep[12474] Training Speed: 226.40 samples/sec across all devices. Average Step Time: 0.57 sec. Estimated Remaining Time: 10:16:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:20:39 INFO loss_tracker.py:84 | Epoch[91/NA] Step[24] GlobalStep[12491/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:20:42 INFO stats.py:314 | Epoch[91] Step[32] GlobalStep[12499] Training Speed: 422.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:16:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:20:49 INFO loss_tracker.py:84 | Epoch[91/NA] Step[49] GlobalStep[12516/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0060] loss_depth[0.0133] total_loss[0.0194] Rank[0/16] 06/24/2025 14:20:52 INFO stats.py:314 | Epoch[91] Step[57] GlobalStep[12524] Training Speed: 422.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:15:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:20:59 INFO loss_tracker.py:84 | Epoch[91/NA] Step[74] GlobalStep[12541/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:21:02 INFO stats.py:314 | Epoch[91] Step[82] GlobalStep[12549] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:15:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:21:09 INFO loss_tracker.py:84 | Epoch[91/NA] Step[99] GlobalStep[12566/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:21:13 INFO stats.py:314 | Epoch[91] Step[107] GlobalStep[12574] Training Speed: 430.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:15:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:21:20 INFO loss_tracker.py:84 | Epoch[91/NA] Step[124] GlobalStep[12591/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:21:22 INFO stats.py:314 | Epoch[91] Step[132] GlobalStep[12599] Training Speed: 448.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:15:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:21:24 INFO stats.py:394 | Epoch[91] completed. Training Speed: 307.94 samples/sec across all devices. Epoch Time: 56.95 sec. Average Epoch Time: 56.95 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:15:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:21:34 INFO stats.py:314 | Epoch[92] Step[20] GlobalStep[12624] Training Speed: 429.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:15:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:21:35 INFO loss_tracker.py:84 | Epoch[92/NA] Step[24] GlobalStep[12628/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:21:44 INFO stats.py:314 | Epoch[92] Step[45] GlobalStep[12649] Training Speed: 414.69 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:14:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:21:45 INFO loss_tracker.py:84 | Epoch[92/NA] Step[49] GlobalStep[12653/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0133] total_loss[0.0188] Rank[0/16] 06/24/2025 14:21:54 INFO stats.py:314 | Epoch[92] Step[70] GlobalStep[12674] Training Speed: 425.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:14:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:21:56 INFO loss_tracker.py:84 | Epoch[92/NA] Step[74] GlobalStep[12678/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0059] loss_depth[0.0132] total_loss[0.0193] Rank[0/16] 06/24/2025 14:22:04 INFO stats.py:314 | Epoch[92] Step[95] GlobalStep[12699] Training Speed: 433.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:14:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:22:06 INFO loss_tracker.py:84 | Epoch[92/NA] Step[99] GlobalStep[12703/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0188] Rank[0/16] 06/24/2025 14:22:15 INFO stats.py:314 | Epoch[92] Step[120] GlobalStep[12724] Training Speed: 451.46 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:14:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:22:16 INFO loss_tracker.py:84 | Epoch[92/NA] Step[124] GlobalStep[12728/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0133] total_loss[0.0192] Rank[0/16] 06/24/2025 14:22:21 INFO stats.py:394 | Epoch[92] completed. Training Speed: 308.16 samples/sec across all devices. Epoch Time: 56.91 sec. Average Epoch Time: 56.91 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:14:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:22:25 INFO stats.py:314 | Epoch[93] Step[8] GlobalStep[12749] Training Speed: 425.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:14:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:22:32 INFO loss_tracker.py:84 | Epoch[93/NA] Step[24] GlobalStep[12765/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:22:36 INFO stats.py:314 | Epoch[93] Step[33] GlobalStep[12774] Training Speed: 425.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:13:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:22:42 INFO loss_tracker.py:84 | Epoch[93/NA] Step[49] GlobalStep[12790/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0133] total_loss[0.0191] Rank[0/16] 06/24/2025 14:22:46 INFO stats.py:314 | Epoch[93] Step[58] GlobalStep[12799] Training Speed: 435.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:13:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:22:53 INFO loss_tracker.py:84 | Epoch[93/NA] Step[74] GlobalStep[12815/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0190] Rank[0/16] 06/24/2025 14:22:56 INFO stats.py:314 | Epoch[93] Step[83] GlobalStep[12824] Training Speed: 436.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:13:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:23:03 INFO loss_tracker.py:84 | Epoch[93/NA] Step[99] GlobalStep[12840/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:23:06 INFO stats.py:314 | Epoch[93] Step[108] GlobalStep[12849] Training Speed: 423.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:13:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:23:12 INFO loss_tracker.py:84 | Epoch[93/NA] Step[124] GlobalStep[12865/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0133] total_loss[0.0184] Rank[0/16] 06/24/2025 14:23:16 INFO stats.py:314 | Epoch[93] Step[133] GlobalStep[12874] Training Speed: 451.36 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:13:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:23:17 INFO stats.py:394 | Epoch[93] completed. Training Speed: 310.21 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:13:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:23:27 INFO stats.py:314 | Epoch[94] Step[21] GlobalStep[12899] Training Speed: 412.18 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:12:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:23:28 INFO loss_tracker.py:84 | Epoch[94/NA] Step[24] GlobalStep[12902/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:23:37 INFO stats.py:314 | Epoch[94] Step[46] GlobalStep[12924] Training Speed: 429.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:12:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:23:38 INFO loss_tracker.py:84 | Epoch[94/NA] Step[49] GlobalStep[12927/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:23:47 INFO stats.py:314 | Epoch[94] Step[71] GlobalStep[12949] Training Speed: 434.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:12:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:23:48 INFO loss_tracker.py:84 | Epoch[94/NA] Step[74] GlobalStep[12952/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:23:57 INFO stats.py:314 | Epoch[94] Step[96] GlobalStep[12974] Training Speed: 438.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:12:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:23:59 INFO loss_tracker.py:84 | Epoch[94/NA] Step[99] GlobalStep[12977/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:24:07 INFO stats.py:314 | Epoch[94] Step[121] GlobalStep[12999] Training Speed: 450.87 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:12:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:24:08 INFO loss_tracker.py:84 | Epoch[94/NA] Step[124] GlobalStep[13002/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0133] total_loss[0.0185] Rank[0/16] 06/24/2025 14:24:13 INFO stats.py:394 | Epoch[94] completed. Training Speed: 316.43 samples/sec across all devices. Epoch Time: 55.42 sec. Average Epoch Time: 55.42 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 10:11:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:24:18 INFO stats.py:314 | Epoch[95] Step[9] GlobalStep[13024] Training Speed: 426.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:11:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:24:24 INFO loss_tracker.py:84 | Epoch[95/NA] Step[24] GlobalStep[13039/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:24:28 INFO stats.py:314 | Epoch[95] Step[34] GlobalStep[13049] Training Speed: 428.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:11:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:24:35 INFO loss_tracker.py:84 | Epoch[95/NA] Step[49] GlobalStep[13064/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:24:38 INFO stats.py:314 | Epoch[95] Step[59] GlobalStep[13074] Training Speed: 407.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:11:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:24:45 INFO loss_tracker.py:84 | Epoch[95/NA] Step[74] GlobalStep[13089/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0133] total_loss[0.0185] Rank[0/16] 06/24/2025 14:24:48 INFO stats.py:314 | Epoch[95] Step[84] GlobalStep[13099] Training Speed: 428.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:11:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:24:55 INFO loss_tracker.py:84 | Epoch[95/NA] Step[99] GlobalStep[13114/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0132] total_loss[0.0189] Rank[0/16] 06/24/2025 14:24:59 INFO stats.py:314 | Epoch[95] Step[109] GlobalStep[13124] Training Speed: 450.54 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:11:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:25:05 INFO loss_tracker.py:84 | Epoch[95/NA] Step[124] GlobalStep[13139/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:25:08 INFO stats.py:314 | Epoch[95] Step[134] GlobalStep[13149] Training Speed: 453.30 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:10:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:25:09 INFO stats.py:394 | Epoch[95] completed. Training Speed: 309.67 samples/sec across all devices. Epoch Time: 56.63 sec. Average Epoch Time: 56.63 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:10:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:25:19 INFO stats.py:314 | Epoch[96] Step[22] GlobalStep[13174] Training Speed: 435.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:10:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:25:20 INFO loss_tracker.py:84 | Epoch[96/NA] Step[24] GlobalStep[13176/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:25:30 INFO stats.py:314 | Epoch[96] Step[47] GlobalStep[13199] Training Speed: 260.21 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 10:10:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:25:31 INFO loss_tracker.py:84 | Epoch[96/NA] Step[49] GlobalStep[13201/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0132] total_loss[0.0189] Rank[0/16] 06/24/2025 14:25:40 INFO stats.py:314 | Epoch[96] Step[72] GlobalStep[13224] Training Speed: 438.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:10:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:25:41 INFO loss_tracker.py:84 | Epoch[96/NA] Step[74] GlobalStep[13226/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0132] total_loss[0.0179] Rank[0/16] 06/24/2025 14:25:50 INFO stats.py:314 | Epoch[96] Step[97] GlobalStep[13249] Training Speed: 429.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:09:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:25:51 INFO loss_tracker.py:84 | Epoch[96/NA] Step[99] GlobalStep[13251/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:26:00 INFO stats.py:314 | Epoch[96] Step[122] GlobalStep[13274] Training Speed: 456.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:09:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:26:01 INFO loss_tracker.py:84 | Epoch[96/NA] Step[124] GlobalStep[13276/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0132] total_loss[0.0184] Rank[0/16] 06/24/2025 14:26:05 INFO stats.py:394 | Epoch[96] completed. Training Speed: 312.35 samples/sec across all devices. Epoch Time: 56.14 sec. Average Epoch Time: 56.14 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:09:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:26:11 INFO stats.py:314 | Epoch[97] Step[10] GlobalStep[13299] Training Speed: 432.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:09:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:26:17 INFO loss_tracker.py:84 | Epoch[97/NA] Step[24] GlobalStep[13313/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0133] total_loss[0.0192] Rank[0/16] 06/24/2025 14:26:21 INFO stats.py:314 | Epoch[97] Step[35] GlobalStep[13324] Training Speed: 427.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:09:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:26:27 INFO loss_tracker.py:84 | Epoch[97/NA] Step[49] GlobalStep[13338/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:26:32 INFO stats.py:314 | Epoch[97] Step[60] GlobalStep[13349] Training Speed: 420.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:09:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:26:37 INFO loss_tracker.py:84 | Epoch[97/NA] Step[74] GlobalStep[13363/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0133] total_loss[0.0190] Rank[0/16] 06/24/2025 14:26:42 INFO stats.py:314 | Epoch[97] Step[85] GlobalStep[13374] Training Speed: 414.20 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:08:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:26:47 INFO loss_tracker.py:84 | Epoch[97/NA] Step[99] GlobalStep[13388/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0189] Rank[0/16] 06/24/2025 14:26:52 INFO stats.py:314 | Epoch[97] Step[110] GlobalStep[13399] Training Speed: 430.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:08:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:26:57 INFO loss_tracker.py:84 | Epoch[97/NA] Step[124] GlobalStep[13413/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0132] total_loss[0.0189] Rank[0/16] 06/24/2025 14:27:02 INFO stats.py:314 | Epoch[97] Step[135] GlobalStep[13424] Training Speed: 449.19 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:08:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:27:02 INFO stats.py:394 | Epoch[97] completed. Training Speed: 309.77 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:08:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:27:13 INFO stats.py:314 | Epoch[98] Step[23] GlobalStep[13449] Training Speed: 411.16 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:08:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:27:13 INFO loss_tracker.py:84 | Epoch[98/NA] Step[24] GlobalStep[13450/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:27:23 INFO stats.py:314 | Epoch[98] Step[48] GlobalStep[13474] Training Speed: 435.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:08:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:27:24 INFO loss_tracker.py:84 | Epoch[98/NA] Step[49] GlobalStep[13475/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0060] loss_depth[0.0132] total_loss[0.0193] Rank[0/16] 06/24/2025 14:27:33 INFO stats.py:314 | Epoch[98] Step[73] GlobalStep[13499] Training Speed: 431.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:07:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:27:34 INFO loss_tracker.py:84 | Epoch[98/NA] Step[74] GlobalStep[13500/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0132] total_loss[0.0182] Rank[0/16] 06/24/2025 14:27:44 INFO stats.py:314 | Epoch[98] Step[98] GlobalStep[13524] Training Speed: 426.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:07:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:27:44 INFO loss_tracker.py:84 | Epoch[98/NA] Step[99] GlobalStep[13525/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:27:54 INFO stats.py:314 | Epoch[98] Step[123] GlobalStep[13549] Training Speed: 449.62 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:07:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:27:54 INFO loss_tracker.py:84 | Epoch[98/NA] Step[124] GlobalStep[13550/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0132] total_loss[0.0182] Rank[0/16] 06/24/2025 14:27:59 INFO stats.py:394 | Epoch[98] completed. Training Speed: 309.04 samples/sec across all devices. Epoch Time: 56.74 sec. Average Epoch Time: 56.74 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:07:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:28:04 INFO stats.py:314 | Epoch[99] Step[11] GlobalStep[13574] Training Speed: 433.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:07:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:28:10 INFO loss_tracker.py:84 | Epoch[99/NA] Step[24] GlobalStep[13587/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0187] Rank[0/16] 06/24/2025 14:28:15 INFO stats.py:314 | Epoch[99] Step[36] GlobalStep[13599] Training Speed: 422.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:07:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:28:20 INFO loss_tracker.py:84 | Epoch[99/NA] Step[49] GlobalStep[13612/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:28:25 INFO stats.py:314 | Epoch[99] Step[61] GlobalStep[13624] Training Speed: 421.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:06:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:28:30 INFO loss_tracker.py:84 | Epoch[99/NA] Step[74] GlobalStep[13637/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:28:35 INFO stats.py:314 | Epoch[99] Step[86] GlobalStep[13649] Training Speed: 404.87 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:06:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:28:41 INFO loss_tracker.py:84 | Epoch[99/NA] Step[99] GlobalStep[13662/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:28:46 INFO stats.py:314 | Epoch[99] Step[111] GlobalStep[13674] Training Speed: 432.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:06:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:28:51 INFO loss_tracker.py:84 | Epoch[99/NA] Step[124] GlobalStep[13687/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0133] total_loss[0.0193] Rank[0/16] 06/24/2025 14:28:55 INFO stats.py:314 | Epoch[99] Step[136] GlobalStep[13699] Training Speed: 450.89 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:06:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:28:55 INFO stats.py:394 | Epoch[99] completed. Training Speed: 308.67 samples/sec across all devices. Epoch Time: 56.81 sec. Average Epoch Time: 56.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:06:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:29:07 INFO stats.py:314 | Epoch[100] Step[24] GlobalStep[13724] Training Speed: 431.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:06:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:29:07 INFO loss_tracker.py:84 | Epoch[100/NA] Step[24] GlobalStep[13724/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:29:17 INFO stats.py:314 | Epoch[100] Step[49] GlobalStep[13749] Training Speed: 427.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:06:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:29:17 INFO loss_tracker.py:84 | Epoch[100/NA] Step[49] GlobalStep[13749/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:29:27 INFO stats.py:314 | Epoch[100] Step[74] GlobalStep[13774] Training Speed: 432.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:05:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:29:27 INFO loss_tracker.py:84 | Epoch[100/NA] Step[74] GlobalStep[13774/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:29:38 INFO stats.py:314 | Epoch[100] Step[99] GlobalStep[13799] Training Speed: 435.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:05:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:29:38 INFO loss_tracker.py:84 | Epoch[100/NA] Step[99] GlobalStep[13799/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0132] total_loss[0.0182] Rank[0/16] 06/24/2025 14:29:47 INFO stats.py:314 | Epoch[100] Step[124] GlobalStep[13824] Training Speed: 452.43 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:05:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:29:47 INFO loss_tracker.py:84 | Epoch[100/NA] Step[124] GlobalStep[13824/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0133] total_loss[0.0186] Rank[0/16] 06/24/2025 14:29:52 INFO stats.py:394 | Epoch[100] completed. Training Speed: 312.25 samples/sec across all devices. Epoch Time: 56.16 sec. Average Epoch Time: 56.16 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:05:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:29:59 INFO stats.py:314 | Epoch[101] Step[12] GlobalStep[13849] Training Speed: 423.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:05:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:30:03 INFO loss_tracker.py:84 | Epoch[101/NA] Step[24] GlobalStep[13861/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0132] total_loss[0.0180] Rank[0/16] 06/24/2025 14:30:09 INFO stats.py:314 | Epoch[101] Step[37] GlobalStep[13874] Training Speed: 435.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:05:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:30:14 INFO loss_tracker.py:84 | Epoch[101/NA] Step[49] GlobalStep[13886/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:30:20 INFO stats.py:314 | Epoch[101] Step[62] GlobalStep[13899] Training Speed: 421.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:04:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:30:24 INFO loss_tracker.py:84 | Epoch[101/NA] Step[74] GlobalStep[13911/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0132] total_loss[0.0181] Rank[0/16] 06/24/2025 14:30:29 INFO stats.py:314 | Epoch[101] Step[87] GlobalStep[13924] Training Speed: 433.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:04:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:30:34 INFO loss_tracker.py:84 | Epoch[101/NA] Step[99] GlobalStep[13936/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0184] Rank[0/16] 06/24/2025 14:30:40 INFO stats.py:314 | Epoch[101] Step[112] GlobalStep[13949] Training Speed: 436.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:04:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:30:44 INFO loss_tracker.py:84 | Epoch[101/NA] Step[124] GlobalStep[13961/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0132] total_loss[0.0181]
Rank[0/16] 06/24/2025 14:30:48 INFO stats.py:394 | Epoch[101] completed. Training Speed: 308.80 samples/sec across all devices. Epoch Time: 56.79 sec. Average Epoch Time: 56.79 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:04:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:30:50 INFO stats.py:314 | Epoch[102] Step[0] GlobalStep[13974] Training Speed: 360.98 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 10:04:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:00 INFO loss_tracker.py:84 | Epoch[102/NA] Step[24] GlobalStep[13998/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0182] Rank[0/16] 06/24/2025 14:31:01 INFO stats.py:314 | Epoch[102] Step[25] GlobalStep[13999] Training Speed: 424.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:04:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:10 INFO loss_tracker.py:84 | Epoch[102/NA] Step[49] GlobalStep[14023/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0133] total_loss[0.0189] Rank[0/16] 06/24/2025 14:31:10 INFO stats.py:314 | Epoch[102] Step[50] GlobalStep[14024] Training Speed: 434.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:03:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:21 INFO loss_tracker.py:84 | Epoch[102/NA] Step[74] GlobalStep[14048/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:31:21 INFO stats.py:314 | Epoch[102] Step[75] GlobalStep[14049] Training Speed: 397.64 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:03:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:31:31 INFO loss_tracker.py:84 | Epoch[102/NA] Step[99] GlobalStep[14073/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:31:31 INFO stats.py:314 | Epoch[102] Step[100] GlobalStep[14074] Training Speed: 409.94 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:03:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:41 INFO loss_tracker.py:84 | Epoch[102/NA] Step[124] GlobalStep[14098/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:31:41 INFO stats.py:314 | Epoch[102] Step[125] GlobalStep[14099] Training Speed: 439.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 10:03:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:45 INFO stats.py:394 | Epoch[102] completed. Training Speed: 307.96 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 10:03:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:52 INFO stats.py:314 | Epoch[103] Step[13] GlobalStep[14124] Training Speed: 410.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:03:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:31:57 INFO loss_tracker.py:84 | Epoch[103/NA] Step[24] GlobalStep[14135/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0132] total_loss[0.0190] Rank[0/16] 06/24/2025 14:32:03 INFO stats.py:314 | Epoch[103] Step[38] GlobalStep[14149] Training Speed: 431.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:02:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:32:07 INFO loss_tracker.py:84 | Epoch[103/NA] Step[49] GlobalStep[14160/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0133] total_loss[0.0188] Rank[0/16] 06/24/2025 14:32:13 INFO stats.py:314 | Epoch[103] Step[63] GlobalStep[14174] Training Speed: 412.73 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 10:02:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:32:17 INFO loss_tracker.py:84 | Epoch[103/NA] Step[74] GlobalStep[14185/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:32:23 INFO stats.py:314 | Epoch[103] Step[88] GlobalStep[14199] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:02:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:32:27 INFO loss_tracker.py:84 | Epoch[103/NA] Step[99] GlobalStep[14210/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:32:33 INFO stats.py:314 | Epoch[103] Step[113] GlobalStep[14224] Training Speed: 430.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:02:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:32:38 INFO loss_tracker.py:84 | Epoch[103/NA] Step[124] GlobalStep[14235/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0184] Rank[0/16] 06/24/2025 14:32:42 INFO stats.py:394 | Epoch[103] completed. Training Speed: 310.82 samples/sec across all devices. Epoch Time: 56.42 sec. Average Epoch Time: 56.42 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:02:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:32:43 INFO stats.py:314 | Epoch[104] Step[1] GlobalStep[14249] Training Speed: 428.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:02:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:32:53 INFO loss_tracker.py:84 | Epoch[104/NA] Step[24] GlobalStep[14272/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:32:54 INFO stats.py:314 | Epoch[104] Step[26] GlobalStep[14274] Training Speed: 432.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:01:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:33:03 INFO loss_tracker.py:84 | Epoch[104/NA] Step[49] GlobalStep[14297/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0182] Rank[0/16] 06/24/2025 14:33:04 INFO stats.py:314 | Epoch[104] Step[51] GlobalStep[14299] Training Speed: 404.86 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 10:01:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:33:14 INFO loss_tracker.py:84 | Epoch[104/NA] Step[74] GlobalStep[14322/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:33:14 INFO stats.py:314 | Epoch[104] Step[76] GlobalStep[14324] Training Speed: 428.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:01:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:33:24 INFO loss_tracker.py:84 | Epoch[104/NA] Step[99] GlobalStep[14347/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0132] total_loss[0.0189] Rank[0/16] 06/24/2025 14:33:25 INFO stats.py:314 | Epoch[104] Step[101] GlobalStep[14349] Training Speed: 429.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:01:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:33:33 INFO loss_tracker.py:84 | Epoch[104/NA] Step[124] GlobalStep[14372/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0132] total_loss[0.0182] Rank[0/16] 06/24/2025 14:33:34 INFO stats.py:314 | Epoch[104] Step[126] GlobalStep[14374] Training Speed: 449.13 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 10:01:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:33:38 INFO stats.py:394 | Epoch[104] completed. Training Speed: 312.22 samples/sec across all devices. Epoch Time: 56.17 sec. Average Epoch Time: 56.17 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 10:00:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:33:45 INFO stats.py:314 | Epoch[105] Step[14] GlobalStep[14399] Training Speed: 425.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:00:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:33:49 INFO loss_tracker.py:84 | Epoch[105/NA] Step[24] GlobalStep[14409/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:33:56 INFO stats.py:314 | Epoch[105] Step[39] GlobalStep[14424] Training Speed: 430.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:00:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:34:00 INFO loss_tracker.py:84 | Epoch[105/NA] Step[49] GlobalStep[14434/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:34:06 INFO stats.py:314 | Epoch[105] Step[64] GlobalStep[14449] Training Speed: 430.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:00:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:34:10 INFO loss_tracker.py:84 | Epoch[105/NA] Step[74] GlobalStep[14459/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:34:16 INFO stats.py:314 | Epoch[105] Step[89] GlobalStep[14474] Training Speed: 428.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 10:00:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:34:20 INFO loss_tracker.py:84 | Epoch[105/NA] Step[99] GlobalStep[14484/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:34:26 INFO stats.py:314 | Epoch[105] Step[114] GlobalStep[14499] Training Speed: 261.38 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 10:00:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:34:30 INFO loss_tracker.py:84 | Epoch[105/NA] Step[124] GlobalStep[14509/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0131] total_loss[0.0190] Rank[0/16] 06/24/2025 14:34:34 INFO stats.py:394 | Epoch[105] completed. Training Speed: 313.04 samples/sec across all devices. Epoch Time: 56.02 sec. Average Epoch Time: 56.02 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:59:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:34:36 INFO stats.py:314 | Epoch[106] Step[2] GlobalStep[14524] Training Speed: 428.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:59:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:34:46 INFO loss_tracker.py:84 | Epoch[106/NA] Step[24] GlobalStep[14546/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:34:47 INFO stats.py:314 | Epoch[106] Step[27] GlobalStep[14549] Training Speed: 433.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:59:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:34:56 INFO loss_tracker.py:84 | Epoch[106/NA] Step[49] GlobalStep[14571/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:34:57 INFO stats.py:314 | Epoch[106] Step[52] GlobalStep[14574] Training Speed: 432.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:59:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:35:06 INFO loss_tracker.py:84 | Epoch[106/NA] Step[74] GlobalStep[14596/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0184] Rank[0/16] 06/24/2025 14:35:07 INFO stats.py:314 | Epoch[106] Step[77] GlobalStep[14599] Training Speed: 433.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:59:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:35:17 INFO loss_tracker.py:84 | Epoch[106/NA] Step[99] GlobalStep[14621/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0132] total_loss[0.0190] Rank[0/16] 06/24/2025 14:35:18 INFO stats.py:314 | Epoch[106] Step[102] GlobalStep[14624] Training Speed: 430.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:59:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:35:27 INFO loss_tracker.py:84 | Epoch[106/NA] Step[124] GlobalStep[14646/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:35:28 INFO stats.py:314 | Epoch[106] Step[127] GlobalStep[14649] Training Speed: 449.18 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:58:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:35:31 INFO stats.py:394 | Epoch[106] completed. Training Speed: 306.38 samples/sec across all devices. Epoch Time: 57.24 sec. Average Epoch Time: 57.24 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:58:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:35:39 INFO stats.py:314 | Epoch[107] Step[15] GlobalStep[14674] Training Speed: 425.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:58:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:35:43 INFO loss_tracker.py:84 | Epoch[107/NA] Step[24] GlobalStep[14683/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:35:49 INFO stats.py:314 | Epoch[107] Step[40] GlobalStep[14699] Training Speed: 432.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:58:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:35:53 INFO loss_tracker.py:84 | Epoch[107/NA] Step[49] GlobalStep[14708/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:36:00 INFO stats.py:314 | Epoch[107] Step[65] GlobalStep[14724] Training Speed: 427.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:58:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:36:03 INFO loss_tracker.py:84 | Epoch[107/NA] Step[74] GlobalStep[14733/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:36:10 INFO stats.py:314 | Epoch[107] Step[90] GlobalStep[14749] Training Speed: 421.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:58:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:36:13 INFO loss_tracker.py:84 | Epoch[107/NA] Step[99] GlobalStep[14758/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0191] Rank[0/16] 06/24/2025 14:36:20 INFO stats.py:314 | Epoch[107] Step[115] GlobalStep[14774] Training Speed: 433.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:57:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:36:24 INFO loss_tracker.py:84 | Epoch[107/NA] Step[124] GlobalStep[14783/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:36:28 INFO stats.py:394 | Epoch[107] completed. Training Speed: 309.45 samples/sec across all devices. Epoch Time: 56.67 sec. Average Epoch Time: 56.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:57:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:36:30 INFO stats.py:314 | Epoch[108] Step[3] GlobalStep[14799] Training Speed: 434.74 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:57:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:36:40 INFO loss_tracker.py:84 | Epoch[108/NA] Step[24] GlobalStep[14820/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0190] Rank[0/16] 06/24/2025 14:36:41 INFO stats.py:314 | Epoch[108] Step[28] GlobalStep[14824] Training Speed: 429.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:57:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:36:49 INFO loss_tracker.py:84 | Epoch[108/NA] Step[49] GlobalStep[14845/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0045] loss_depth[0.0132] total_loss[0.0177] Rank[0/16] 06/24/2025 14:36:51 INFO stats.py:314 | Epoch[108] Step[53] GlobalStep[14849] Training Speed: 430.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:57:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:37:00 INFO loss_tracker.py:84 | Epoch[108/NA] Step[74] GlobalStep[14870/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0131] total_loss[0.0187] Rank[0/16] 06/24/2025 14:37:01 INFO stats.py:314 | Epoch[108] Step[78] GlobalStep[14874] Training Speed: 438.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:57:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:37:10 INFO loss_tracker.py:84 | Epoch[108/NA] Step[99] GlobalStep[14895/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0131] total_loss[0.0187] Rank[0/16] 06/24/2025 14:37:11 INFO stats.py:314 | Epoch[108] Step[103] GlobalStep[14899] Training Speed: 432.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:56:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:37:20 INFO loss_tracker.py:84 | Epoch[108/NA] Step[124] GlobalStep[14920/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:37:21 INFO stats.py:314 | Epoch[108] Step[128] GlobalStep[14924] Training Speed: 447.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:56:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:37:24 INFO stats.py:394 | Epoch[108] completed. Training Speed: 311.37 samples/sec across all devices. Epoch Time: 56.32 sec. Average Epoch Time: 56.32 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:56:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:37:32 INFO stats.py:314 | Epoch[109] Step[16] GlobalStep[14949] Training Speed: 402.61 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:56:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:37:35 INFO loss_tracker.py:84 | Epoch[109/NA] Step[24] GlobalStep[14957/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:37:42 INFO stats.py:314 | Epoch[109] Step[41] GlobalStep[14974] Training Speed: 431.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:56:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:37:46 INFO loss_tracker.py:84 | Epoch[109/NA] Step[49] GlobalStep[14982/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0062] loss_depth[0.0132] total_loss[0.0194] Rank[0/16] 06/24/2025 14:37:52 INFO stats.py:314 | Epoch[109] Step[66] GlobalStep[14999] Training Speed: 433.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:56:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:37:55 INFO loss_tracker.py:84 | Epoch[109/NA] Step[74] GlobalStep[15007/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:38:03 INFO stats.py:314 | Epoch[109] Step[91] GlobalStep[15024] Training Speed: 435.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:55:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:38:05 INFO loss_tracker.py:84 | Epoch[109/NA] Step[99] GlobalStep[15032/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0132] total_loss[0.0192] Rank[0/16] 06/24/2025 14:38:12 INFO stats.py:314 | Epoch[109] Step[116] GlobalStep[15049] Training Speed: 428.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:55:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:38:15 INFO loss_tracker.py:84 | Epoch[109/NA] Step[124] GlobalStep[15057/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:38:20 INFO stats.py:394 | Epoch[109] completed. Training Speed: 315.37 samples/sec across all devices. Epoch Time: 55.61 sec. Average Epoch Time: 55.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:55:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:38:23 INFO stats.py:314 | Epoch[110] Step[4] GlobalStep[15074] Training Speed: 431.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:55:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:38:31 INFO loss_tracker.py:84 | Epoch[110/NA] Step[24] GlobalStep[15094/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:38:33 INFO stats.py:314 | Epoch[110] Step[29] GlobalStep[15099] Training Speed: 424.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:55:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:38:41 INFO loss_tracker.py:84 | Epoch[110/NA] Step[49] GlobalStep[15119/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:38:43 INFO stats.py:314 | Epoch[110] Step[54] GlobalStep[15124] Training Speed: 431.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:55:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:38:51 INFO loss_tracker.py:84 | Epoch[110/NA] Step[74] GlobalStep[15144/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:38:53 INFO stats.py:314 | Epoch[110] Step[79] GlobalStep[15149] Training Speed: 430.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:54:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:39:02 INFO loss_tracker.py:84 | Epoch[110/NA] Step[99] GlobalStep[15169/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:39:04 INFO stats.py:314 | Epoch[110] Step[104] GlobalStep[15174] Training Speed: 433.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:54:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:39:12 INFO loss_tracker.py:84 | Epoch[110/NA] Step[124] GlobalStep[15194/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0186] Rank[0/16] 06/24/2025 14:39:13 INFO stats.py:314 | Epoch[110] Step[129] GlobalStep[15199] Training Speed: 448.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:54:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:39:16 INFO stats.py:394 | Epoch[110] completed. Training Speed: 313.74 samples/sec across all devices. Epoch Time: 55.89 sec. Average Epoch Time: 55.89 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:54:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:39:24 INFO stats.py:314 | Epoch[111] Step[17] GlobalStep[15224] Training Speed: 428.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:54:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:39:27 INFO loss_tracker.py:84 | Epoch[111/NA] Step[24] GlobalStep[15231/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0132] total_loss[0.0181] Rank[0/16] 06/24/2025 14:39:34 INFO stats.py:314 | Epoch[111] Step[42] GlobalStep[15249] Training Speed: 435.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:54:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:39:37 INFO loss_tracker.py:84 | Epoch[111/NA] Step[49] GlobalStep[15256/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0190] Rank[0/16] 06/24/2025 14:39:45 INFO stats.py:314 | Epoch[111] Step[67] GlobalStep[15274] Training Speed: 407.18 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:53:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:39:47 INFO loss_tracker.py:84 | Epoch[111/NA] Step[74] GlobalStep[15281/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:39:55 INFO stats.py:314 | Epoch[111] Step[92] GlobalStep[15299] Training Speed: 433.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:53:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:39:58 INFO loss_tracker.py:84 | Epoch[111/NA] Step[99] GlobalStep[15306/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:40:05 INFO stats.py:314 | Epoch[111] Step[117] GlobalStep[15324] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:53:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:40:08 INFO loss_tracker.py:84 | Epoch[111/NA] Step[124] GlobalStep[15331/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:40:12 INFO stats.py:394 | Epoch[111] completed. Training Speed: 311.07 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:53:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:40:16 INFO stats.py:314 | Epoch[112] Step[5] GlobalStep[15349] Training Speed: 433.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:53:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:40:23 INFO loss_tracker.py:84 | Epoch[112/NA] Step[24] GlobalStep[15368/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:40:26 INFO stats.py:314 | Epoch[112] Step[30] GlobalStep[15374] Training Speed: 430.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:53:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:40:34 INFO loss_tracker.py:84 | Epoch[112/NA] Step[49] GlobalStep[15393/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:40:36 INFO stats.py:314 | Epoch[112] Step[55] GlobalStep[15399] Training Speed: 428.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:52:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:40:44 INFO loss_tracker.py:84 | Epoch[112/NA] Step[74] GlobalStep[15418/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:40:46 INFO stats.py:314 | Epoch[112] Step[80] GlobalStep[15424] Training Speed: 430.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:52:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:40:54 INFO loss_tracker.py:84 | Epoch[112/NA] Step[99] GlobalStep[15443/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:40:56 INFO stats.py:314 | Epoch[112] Step[105] GlobalStep[15449] Training Speed: 433.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:52:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:41:04 INFO loss_tracker.py:84 | Epoch[112/NA] Step[124] GlobalStep[15468/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 14:41:06 INFO stats.py:314 | Epoch[112] Step[130] GlobalStep[15474] Training Speed: 444.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:52:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:41:08 INFO stats.py:394 | Epoch[112] completed. Training Speed: 313.29 samples/sec across all devices. Epoch Time: 55.97 sec. Average Epoch Time: 55.97 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:52:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:41:17 INFO stats.py:314 | Epoch[113] Step[18] GlobalStep[15499] Training Speed: 429.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:52:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:41:20 INFO loss_tracker.py:84 | Epoch[113/NA] Step[24] GlobalStep[15505/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0131] total_loss[0.0189] Rank[0/16] 06/24/2025 14:41:27 INFO stats.py:314 | Epoch[113] Step[43] GlobalStep[15524] Training Speed: 407.78 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:51:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:41:30 INFO loss_tracker.py:84 | Epoch[113/NA] Step[49] GlobalStep[15530/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0181] Rank[0/16] 06/24/2025 14:41:38 INFO stats.py:314 | Epoch[113] Step[68] GlobalStep[15549] Training Speed: 435.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:51:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:41:40 INFO loss_tracker.py:84 | Epoch[113/NA] Step[74] GlobalStep[15555/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0132] total_loss[0.0188] Rank[0/16] 06/24/2025 14:41:48 INFO stats.py:314 | Epoch[113] Step[93] GlobalStep[15574] Training Speed: 432.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:51:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:41:50 INFO loss_tracker.py:84 | Epoch[113/NA] Step[99] GlobalStep[15580/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:41:58 INFO stats.py:314 | Epoch[113] Step[118] GlobalStep[15599] Training Speed: 428.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:51:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:42:00 INFO loss_tracker.py:84 | Epoch[113/NA] Step[124] GlobalStep[15605/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0131] total_loss[0.0189] Rank[0/16] 06/24/2025 14:42:05 INFO stats.py:394 | Epoch[113] completed. Training Speed: 310.16 samples/sec across all devices. Epoch Time: 56.54 sec. Average Epoch Time: 56.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:51:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:42:09 INFO stats.py:314 | Epoch[114] Step[6] GlobalStep[15624] Training Speed: 246.56 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 9:51:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:42:16 INFO loss_tracker.py:84 | Epoch[114/NA] Step[24] GlobalStep[15642/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:42:19 INFO stats.py:314 | Epoch[114] Step[31] GlobalStep[15649] Training Speed: 433.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:50:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:42:26 INFO loss_tracker.py:84 | Epoch[114/NA] Step[49] GlobalStep[15667/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0190] Rank[0/16] 06/24/2025 14:42:29 INFO stats.py:314 | Epoch[114] Step[56] GlobalStep[15674] Training Speed: 248.22 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 9:50:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:42:37 INFO loss_tracker.py:84 | Epoch[114/NA] Step[74] GlobalStep[15692/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:42:39 INFO stats.py:314 | Epoch[114] Step[81] GlobalStep[15699] Training Speed: 423.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:50:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:42:47 INFO loss_tracker.py:84 | Epoch[114/NA] Step[99] GlobalStep[15717/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0132] total_loss[0.0183] Rank[0/16] 06/24/2025 14:42:50 INFO stats.py:314 | Epoch[114] Step[106] GlobalStep[15724] Training Speed: 433.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:50:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:42:58 INFO loss_tracker.py:84 | Epoch[114/NA] Step[124] GlobalStep[15742/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0181] Rank[0/16] 06/24/2025 14:43:00 INFO stats.py:314 | Epoch[114] Step[131] GlobalStep[15749] Training Speed: 447.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:50:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:43:02 INFO stats.py:394 | Epoch[114] completed. Training Speed: 306.75 samples/sec across all devices. Epoch Time: 57.17 sec. Average Epoch Time: 57.17 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:50:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:43:11 INFO stats.py:314 | Epoch[115] Step[19] GlobalStep[15774] Training Speed: 264.97 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 9:50:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:43:13 INFO loss_tracker.py:84 | Epoch[115/NA] Step[24] GlobalStep[15779/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0131] total_loss[0.0187] Rank[0/16] 06/24/2025 14:43:21 INFO stats.py:314 | Epoch[115] Step[44] GlobalStep[15799] Training Speed: 425.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:49:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:43:23 INFO loss_tracker.py:84 | Epoch[115/NA] Step[49] GlobalStep[15804/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:43:32 INFO stats.py:314 | Epoch[115] Step[69] GlobalStep[15824] Training Speed: 434.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:49:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:43:34 INFO loss_tracker.py:84 | Epoch[115/NA] Step[74] GlobalStep[15829/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0187] Rank[0/16] 06/24/2025 14:43:42 INFO stats.py:314 | Epoch[115] Step[94] GlobalStep[15849] Training Speed: 437.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:49:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:43:44 INFO loss_tracker.py:84 | Epoch[115/NA] Step[99] GlobalStep[15854/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0044] loss_depth[0.0132] total_loss[0.0177] Rank[0/16] 06/24/2025 14:43:53 INFO stats.py:314 | Epoch[115] Step[119] GlobalStep[15874] Training Speed: 433.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:49:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:43:54 INFO loss_tracker.py:84 | Epoch[115/NA] Step[124] GlobalStep[15879/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:43:59 INFO stats.py:394 | Epoch[115] completed. Training Speed: 307.55 samples/sec across all devices. Epoch Time: 57.02 sec. Average Epoch Time: 57.02 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:49:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:44:03 INFO stats.py:314 | Epoch[116] Step[7] GlobalStep[15899] Training Speed: 430.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:49:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:44:10 INFO loss_tracker.py:84 | Epoch[116/NA] Step[24] GlobalStep[15916/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:44:13 INFO stats.py:314 | Epoch[116] Step[32] GlobalStep[15924] Training Speed: 431.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:48:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:44:20 INFO loss_tracker.py:84 | Epoch[116/NA] Step[49] GlobalStep[15941/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0132] total_loss[0.0185] Rank[0/16] 06/24/2025 14:44:23 INFO stats.py:314 | Epoch[116] Step[57] GlobalStep[15949] Training Speed: 423.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:48:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:44:30 INFO loss_tracker.py:84 | Epoch[116/NA] Step[74] GlobalStep[15966/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0131] total_loss[0.0187] Rank[0/16] 06/24/2025 14:44:34 INFO stats.py:314 | Epoch[116] Step[82] GlobalStep[15974] Training Speed: 403.62 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:48:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:44:41 INFO loss_tracker.py:84 | Epoch[116/NA] Step[99] GlobalStep[15991/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0131] total_loss[0.0188] Rank[0/16] 06/24/2025 14:44:44 INFO stats.py:314 | Epoch[116] Step[107] GlobalStep[15999] Training Speed: 435.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:48:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:44:44 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 14:44:45 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_3
Rank[7/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[1/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[12/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[5/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[3/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[15/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[6/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[10/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[2/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[8/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[4/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[13/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[11/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[14/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[9/16] 06/24/2025 14:44:45 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[0/16] 06/24/2025 14:44:46 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_3/model.safetensors
Rank[0/16] 06/24/2025 14:44:47 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_3/optimizer.bin
Rank[0/16] 06/24/2025 14:44:47 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_3/scheduler.bin
Rank[0/16] 06/24/2025 14:44:47 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_3/sampler.bin
Rank[0/16] 06/24/2025 14:44:47 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_3/random_states_0.pkl
Rank[0/16] 06/24/2025 14:44:47 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_3/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 14:44:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 15999 to /job_data/checkpoints/checkpoint_3
Rank[0/16] 06/24/2025 14:44:54 INFO loss_tracker.py:84 | Epoch[116/NA] Step[124] GlobalStep[16016/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182]
Rank[0/16] 06/24/2025 14:44:57 INFO stats.py:314 | Epoch[116] Step[132] GlobalStep[16024] Training Speed: 449.82 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:48:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 14:44:58 INFO stats.py:394 | Epoch[116] completed. Training Speed: 294.90 samples/sec across all devices. Epoch Time: 59.47 sec. Average Epoch Time: 59.47 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 9:48:18. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 14:45:08 INFO stats.py:314 | Epoch[117] Step[20] GlobalStep[16049] Training Speed: 434.96 samples/sec across all devices.
Average Step Time: 0.29 sec. Estimated Remaining Time: 9:48:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:45:10 INFO loss_tracker.py:84 | Epoch[117/NA] Step[24] GlobalStep[16053/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:45:18 INFO stats.py:314 | Epoch[117] Step[45] GlobalStep[16074] Training Speed: 428.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:48:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:45:20 INFO loss_tracker.py:84 | Epoch[117/NA] Step[49] GlobalStep[16078/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0131] total_loss[0.0191] Rank[0/16] 06/24/2025 14:45:28 INFO stats.py:314 | Epoch[117] Step[70] GlobalStep[16099] Training Speed: 434.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:47:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:45:30 INFO loss_tracker.py:84 | Epoch[117/NA] Step[74] GlobalStep[16103/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:45:39 INFO stats.py:314 | Epoch[117] Step[95] GlobalStep[16124] Training Speed: 431.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:47:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:45:40 INFO loss_tracker.py:84 | Epoch[117/NA] Step[99] GlobalStep[16128/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:45:49 INFO stats.py:314 | Epoch[117] Step[120] GlobalStep[16149] Training Speed: 445.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:47:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:45:50 INFO loss_tracker.py:84 | Epoch[117/NA] Step[124] GlobalStep[16153/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 14:45:54 INFO stats.py:394 | Epoch[117] completed. Training Speed: 312.08 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:47:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:45:59 INFO stats.py:314 | Epoch[118] Step[8] GlobalStep[16174] Training Speed: 433.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:47:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:46:06 INFO loss_tracker.py:84 | Epoch[118/NA] Step[24] GlobalStep[16190/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:46:10 INFO stats.py:314 | Epoch[118] Step[33] GlobalStep[16199] Training Speed: 435.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:47:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:46:16 INFO loss_tracker.py:84 | Epoch[118/NA] Step[49] GlobalStep[16215/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:46:20 INFO stats.py:314 | Epoch[118] Step[58] GlobalStep[16224] Training Speed: 431.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:46:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:46:26 INFO loss_tracker.py:84 | Epoch[118/NA] Step[74] GlobalStep[16240/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:46:30 INFO stats.py:314 | Epoch[118] Step[83] GlobalStep[16249] Training Speed: 432.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:46:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:46:37 INFO loss_tracker.py:84 | Epoch[118/NA] Step[99] GlobalStep[16265/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0187] Rank[0/16] 06/24/2025 14:46:40 INFO stats.py:314 | Epoch[118] Step[108] GlobalStep[16274] Training Speed: 432.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:46:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:46:47 INFO loss_tracker.py:84 | Epoch[118/NA] Step[124] GlobalStep[16290/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0190] Rank[0/16] 06/24/2025 14:46:50 INFO stats.py:314 | Epoch[118] Step[133] GlobalStep[16299] Training Speed: 450.01 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:46:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:46:51 INFO stats.py:394 | Epoch[118] completed. Training Speed: 309.03 samples/sec across all devices. Epoch Time: 56.75 sec. Average Epoch Time: 56.75 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:46:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:47:01 INFO stats.py:314 | Epoch[119] Step[21] GlobalStep[16324] Training Speed: 434.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:46:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:47:03 INFO loss_tracker.py:84 | Epoch[119/NA] Step[24] GlobalStep[16327/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0132] total_loss[0.0187] Rank[0/16] 06/24/2025 14:47:12 INFO stats.py:314 | Epoch[119] Step[46] GlobalStep[16349] Training Speed: 434.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:45:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:47:13 INFO loss_tracker.py:84 | Epoch[119/NA] Step[49] GlobalStep[16352/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0178] Rank[0/16] 06/24/2025 14:47:22 INFO stats.py:314 | Epoch[119] Step[71] GlobalStep[16374] Training Speed: 434.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:45:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:47:23 INFO loss_tracker.py:84 | Epoch[119/NA] Step[74] GlobalStep[16377/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0187] Rank[0/16] 06/24/2025 14:47:32 INFO stats.py:314 | Epoch[119] Step[96] GlobalStep[16399] Training Speed: 427.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:45:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:47:34 INFO loss_tracker.py:84 | Epoch[119/NA] Step[99] GlobalStep[16402/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:47:43 INFO stats.py:314 | Epoch[119] Step[121] GlobalStep[16424] Training Speed: 449.37 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:45:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:47:44 INFO loss_tracker.py:84 | Epoch[119/NA] Step[124] GlobalStep[16427/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:47:48 INFO stats.py:394 | Epoch[119] completed. Training Speed: 307.49 samples/sec across all devices. Epoch Time: 57.03 sec. Average Epoch Time: 57.03 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:45:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:47:53 INFO stats.py:314 | Epoch[120] Step[9] GlobalStep[16449] Training Speed: 435.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:45:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:48:00 INFO loss_tracker.py:84 | Epoch[120/NA] Step[24] GlobalStep[16464/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0178] Rank[0/16] 06/24/2025 14:48:04 INFO stats.py:314 | Epoch[120] Step[34] GlobalStep[16474] Training Speed: 421.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:45:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:48:10 INFO loss_tracker.py:84 | Epoch[120/NA] Step[49] GlobalStep[16489/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 14:48:14 INFO stats.py:314 | Epoch[120] Step[59] GlobalStep[16499] Training Speed: 246.07 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 9:44:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:48:20 INFO loss_tracker.py:84 | Epoch[120/NA] Step[74] GlobalStep[16514/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:48:24 INFO stats.py:314 | Epoch[120] Step[84] GlobalStep[16524] Training Speed: 432.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:44:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:48:30 INFO loss_tracker.py:84 | Epoch[120/NA] Step[99] GlobalStep[16539/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:48:34 INFO stats.py:314 | Epoch[120] Step[109] GlobalStep[16549] Training Speed: 423.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:44:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:48:40 INFO loss_tracker.py:84 | Epoch[120/NA] Step[124] GlobalStep[16564/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:48:44 INFO stats.py:314 | Epoch[120] Step[134] GlobalStep[16574] Training Speed: 451.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:44:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:48:45 INFO stats.py:394 | Epoch[120] completed. Training Speed: 311.26 samples/sec across all devices. Epoch Time: 56.34 sec. Average Epoch Time: 56.34 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:44:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:48:55 INFO stats.py:314 | Epoch[121] Step[22] GlobalStep[16599] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:44:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:48:56 INFO loss_tracker.py:84 | Epoch[121/NA] Step[24] GlobalStep[16601/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:49:05 INFO stats.py:314 | Epoch[121] Step[47] GlobalStep[16624] Training Speed: 434.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:43:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:49:06 INFO loss_tracker.py:84 | Epoch[121/NA] Step[49] GlobalStep[16626/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0189] Rank[0/16] 06/24/2025 14:49:16 INFO stats.py:314 | Epoch[121] Step[72] GlobalStep[16649] Training Speed: 432.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:43:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:49:16 INFO loss_tracker.py:84 | Epoch[121/NA] Step[74] GlobalStep[16651/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:49:26 INFO stats.py:314 | Epoch[121] Step[97] GlobalStep[16674] Training Speed: 435.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:43:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:49:27 INFO loss_tracker.py:84 | Epoch[121/NA] Step[99] GlobalStep[16676/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:49:36 INFO stats.py:314 | Epoch[121] Step[122] GlobalStep[16699] Training Speed: 451.34 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:43:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:49:37 INFO loss_tracker.py:84 | Epoch[121/NA] Step[124] GlobalStep[16701/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0190] Rank[0/16] 06/24/2025 14:49:41 INFO stats.py:394 | Epoch[121] completed. Training Speed: 310.04 samples/sec across all devices. Epoch Time: 56.56 sec. Average Epoch Time: 56.56 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:43:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:49:47 INFO stats.py:314 | Epoch[122] Step[10] GlobalStep[16724] Training Speed: 411.77 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:43:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:49:52 INFO loss_tracker.py:84 | Epoch[122/NA] Step[24] GlobalStep[16738/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 14:49:57 INFO stats.py:314 | Epoch[122] Step[35] GlobalStep[16749] Training Speed: 435.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:42:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:50:03 INFO loss_tracker.py:84 | Epoch[122/NA] Step[49] GlobalStep[16763/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0188] Rank[0/16] 06/24/2025 14:50:08 INFO stats.py:314 | Epoch[122] Step[60] GlobalStep[16774] Training Speed: 435.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:42:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:50:13 INFO loss_tracker.py:84 | Epoch[122/NA] Step[74] GlobalStep[16788/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0131] total_loss[0.0188] Rank[0/16] 06/24/2025 14:50:17 INFO stats.py:314 | Epoch[122] Step[85] GlobalStep[16799] Training Speed: 438.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:42:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:50:24 INFO loss_tracker.py:84 | Epoch[122/NA] Step[99] GlobalStep[16813/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:50:28 INFO stats.py:314 | Epoch[122] Step[110] GlobalStep[16824] Training Speed: 434.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:42:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:50:33 INFO loss_tracker.py:84 | Epoch[122/NA] Step[124] GlobalStep[16838/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0131] total_loss[0.0189] Rank[0/16] 06/24/2025 14:50:37 INFO stats.py:314 | Epoch[122] Step[135] GlobalStep[16849] Training Speed: 452.38 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:42:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:50:38 INFO stats.py:394 | Epoch[122] completed. Training Speed: 309.93 samples/sec across all devices. Epoch Time: 56.58 sec. Average Epoch Time: 56.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:42:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:50:48 INFO stats.py:314 | Epoch[123] Step[23] GlobalStep[16874] Training Speed: 438.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:41:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:50:49 INFO loss_tracker.py:84 | Epoch[123/NA] Step[24] GlobalStep[16875/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 14:50:58 INFO stats.py:314 | Epoch[123] Step[48] GlobalStep[16899] Training Speed: 442.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:41:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:50:59 INFO loss_tracker.py:84 | Epoch[123/NA] Step[49] GlobalStep[16900/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:51:09 INFO stats.py:314 | Epoch[123] Step[73] GlobalStep[16924] Training Speed: 425.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:41:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:51:09 INFO loss_tracker.py:84 | Epoch[123/NA] Step[74] GlobalStep[16925/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:51:19 INFO stats.py:314 | Epoch[123] Step[98] GlobalStep[16949] Training Speed: 436.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:41:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:51:20 INFO loss_tracker.py:84 | Epoch[123/NA] Step[99] GlobalStep[16950/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 14:51:29 INFO stats.py:314 | Epoch[123] Step[123] GlobalStep[16974] Training Speed: 454.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:41:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:51:30 INFO loss_tracker.py:84 | Epoch[123/NA] Step[124] GlobalStep[16975/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 14:51:34 INFO stats.py:394 | Epoch[123] completed. Training Speed: 310.33 samples/sec across all devices. Epoch Time: 56.51 sec. Average Epoch Time: 56.51 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:41:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:51:40 INFO stats.py:314 | Epoch[124] Step[11] GlobalStep[16999] Training Speed: 433.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:41:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:51:46 INFO loss_tracker.py:84 | Epoch[124/NA] Step[24] GlobalStep[17012/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:51:51 INFO stats.py:314 | Epoch[124] Step[36] GlobalStep[17024] Training Speed: 430.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:40:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:51:56 INFO loss_tracker.py:84 | Epoch[124/NA] Step[49] GlobalStep[17037/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:52:01 INFO stats.py:314 | Epoch[124] Step[61] GlobalStep[17049] Training Speed: 434.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:40:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:52:06 INFO loss_tracker.py:84 | Epoch[124/NA] Step[74] GlobalStep[17062/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0181] Rank[0/16] 06/24/2025 14:52:11 INFO stats.py:314 | Epoch[124] Step[86] GlobalStep[17074] Training Speed: 410.51 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:40:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:52:16 INFO loss_tracker.py:84 | Epoch[124/NA] Step[99] GlobalStep[17087/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:52:21 INFO stats.py:314 | Epoch[124] Step[111] GlobalStep[17099] Training Speed: 415.71 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:40:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:52:26 INFO loss_tracker.py:84 | Epoch[124/NA] Step[124] GlobalStep[17112/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 14:52:30 INFO stats.py:314 | Epoch[124] Step[136] GlobalStep[17124] Training Speed: 451.32 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:39:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:52:30 INFO stats.py:394 | Epoch[124] completed. Training Speed: 311.56 samples/sec across all devices. Epoch Time: 56.28 sec. Average Epoch Time: 56.28 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:39:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:52:42 INFO stats.py:314 | Epoch[125] Step[24] GlobalStep[17149] Training Speed: 429.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:39:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:52:42 INFO loss_tracker.py:84 | Epoch[125/NA] Step[24] GlobalStep[17149/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:52:52 INFO stats.py:314 | Epoch[125] Step[49] GlobalStep[17174] Training Speed: 426.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:39:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:52:52 INFO loss_tracker.py:84 | Epoch[125/NA] Step[49] GlobalStep[17174/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:53:02 INFO stats.py:314 | Epoch[125] Step[74] GlobalStep[17199] Training Speed: 430.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:39:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:53:02 INFO loss_tracker.py:84 | Epoch[125/NA] Step[74] GlobalStep[17199/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:53:13 INFO stats.py:314 | Epoch[125] Step[99] GlobalStep[17224] Training Speed: 429.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:39:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:53:13 INFO loss_tracker.py:84 | Epoch[125/NA] Step[99] GlobalStep[17224/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:53:23 INFO stats.py:314 | Epoch[125] Step[124] GlobalStep[17249] Training Speed: 453.75 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:39:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:53:23 INFO loss_tracker.py:84 | Epoch[125/NA] Step[124] GlobalStep[17249/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 14:53:27 INFO stats.py:394 | Epoch[125] completed. Training Speed: 307.48 samples/sec across all devices. Epoch Time: 57.03 sec. Average Epoch Time: 57.03 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:38:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:53:34 INFO stats.py:314 | Epoch[126] Step[12] GlobalStep[17274] Training Speed: 403.64 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:38:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:53:39 INFO loss_tracker.py:84 | Epoch[126/NA] Step[24] GlobalStep[17286/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:53:44 INFO stats.py:314 | Epoch[126] Step[37] GlobalStep[17299] Training Speed: 433.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:38:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:53:49 INFO loss_tracker.py:84 | Epoch[126/NA] Step[49] GlobalStep[17311/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 14:53:54 INFO stats.py:314 | Epoch[126] Step[62] GlobalStep[17324] Training Speed: 435.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:38:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:53:59 INFO loss_tracker.py:84 | Epoch[126/NA] Step[74] GlobalStep[17336/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:54:05 INFO stats.py:314 | Epoch[126] Step[87] GlobalStep[17349] Training Speed: 432.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:38:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:54:10 INFO loss_tracker.py:84 | Epoch[126/NA] Step[99] GlobalStep[17361/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0044] loss_depth[0.0130] total_loss[0.0175] Rank[0/16] 06/24/2025 14:54:15 INFO stats.py:314 | Epoch[126] Step[112] GlobalStep[17374] Training Speed: 430.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:38:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:54:20 INFO loss_tracker.py:84 | Epoch[126/NA] Step[124] GlobalStep[17386/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0044] loss_depth[0.0130] total_loss[0.0175] Rank[0/16] 06/24/2025 14:54:24 INFO stats.py:394 | Epoch[126] completed. 
Training Speed: 307.72 samples/sec across all devices. Epoch Time: 56.99 sec. Average Epoch Time: 56.99 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:37:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:54:25 INFO stats.py:314 | Epoch[127] Step[0] GlobalStep[17399] Training Speed: 371.35 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 9:37:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:54:36 INFO loss_tracker.py:84 | Epoch[127/NA] Step[24] GlobalStep[17423/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:54:36 INFO stats.py:314 | Epoch[127] Step[25] GlobalStep[17424] Training Speed: 425.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:37:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:54:46 INFO loss_tracker.py:84 | Epoch[127/NA] Step[49] GlobalStep[17448/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 14:54:46 INFO stats.py:314 | Epoch[127] Step[50] GlobalStep[17449] Training Speed: 434.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:37:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:54:56 INFO loss_tracker.py:84 | Epoch[127/NA] Step[74] GlobalStep[17473/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0179] Rank[0/16] 06/24/2025 14:54:56 INFO stats.py:314 | Epoch[127] Step[75] GlobalStep[17474] Training Speed: 415.69 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:37:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:55:06 INFO loss_tracker.py:84 | Epoch[127/NA] Step[99] GlobalStep[17498/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0131] total_loss[0.0177] Rank[0/16] 06/24/2025 14:55:07 INFO stats.py:314 | Epoch[127] Step[100] GlobalStep[17499] Training Speed: 420.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:37:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:55:16 INFO loss_tracker.py:84 | Epoch[127/NA] Step[124] GlobalStep[17523/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:55:16 INFO stats.py:314 | Epoch[127] Step[125] GlobalStep[17524] Training Speed: 438.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:36:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:55:21 INFO stats.py:394 | Epoch[127] completed. Training Speed: 313.00 samples/sec across all devices. Epoch Time: 56.02 sec. Average Epoch Time: 56.02 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:36:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:55:27 INFO stats.py:314 | Epoch[128] Step[13] GlobalStep[17549] Training Speed: 435.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:36:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:55:32 INFO loss_tracker.py:84 | Epoch[128/NA] Step[24] GlobalStep[17560/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 14:55:38 INFO stats.py:314 | Epoch[128] Step[38] GlobalStep[17574] Training Speed: 434.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:36:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:55:42 INFO loss_tracker.py:84 | Epoch[128/NA] Step[49] GlobalStep[17585/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0184] Rank[0/16] 06/24/2025 14:55:48 INFO stats.py:314 | Epoch[128] Step[63] GlobalStep[17599] Training Speed: 429.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:36:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:55:52 INFO loss_tracker.py:84 | Epoch[128/NA] Step[74] GlobalStep[17610/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:55:58 INFO stats.py:314 | Epoch[128] Step[88] GlobalStep[17624] Training Speed: 425.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:36:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:56:03 INFO loss_tracker.py:84 | Epoch[128/NA] Step[99] GlobalStep[17635/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 14:56:08 INFO stats.py:314 | Epoch[128] Step[113] GlobalStep[17649] Training Speed: 421.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:36:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:56:13 INFO loss_tracker.py:84 | Epoch[128/NA] Step[124] GlobalStep[17660/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 14:56:17 INFO stats.py:394 | Epoch[128] completed. Training Speed: 310.75 samples/sec across all devices. Epoch Time: 56.43 sec. Average Epoch Time: 56.43 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:35:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:56:19 INFO stats.py:314 | Epoch[129] Step[1] GlobalStep[17674] Training Speed: 435.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:35:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:56:28 INFO loss_tracker.py:84 | Epoch[129/NA] Step[24] GlobalStep[17697/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 14:56:29 INFO stats.py:314 | Epoch[129] Step[26] GlobalStep[17699] Training Speed: 393.00 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 9:35:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:56:39 INFO loss_tracker.py:84 | Epoch[129/NA] Step[49] GlobalStep[17722/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0131] total_loss[0.0189] Rank[0/16] 06/24/2025 14:56:39 INFO stats.py:314 | Epoch[129] Step[51] GlobalStep[17724] Training Speed: 259.19 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 9:35:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:56:49 INFO loss_tracker.py:84 | Epoch[129/NA] Step[74] GlobalStep[17747/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:56:49 INFO stats.py:314 | Epoch[129] Step[76] GlobalStep[17749] Training Speed: 429.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:35:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:56:59 INFO loss_tracker.py:84 | Epoch[129/NA] Step[99] GlobalStep[17772/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:57:00 INFO stats.py:314 | Epoch[129] Step[101] GlobalStep[17774] Training Speed: 429.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:35:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:57:09 INFO loss_tracker.py:84 | Epoch[129/NA] Step[124] GlobalStep[17797/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 14:57:10 INFO stats.py:314 | Epoch[129] Step[126] GlobalStep[17799] Training Speed: 449.84 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:34:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:57:13 INFO stats.py:394 | Epoch[129] completed. Training Speed: 312.47 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:34:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:57:20 INFO stats.py:314 | Epoch[130] Step[14] GlobalStep[17824] Training Speed: 430.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:34:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:57:25 INFO loss_tracker.py:84 | Epoch[130/NA] Step[24] GlobalStep[17834/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0184] Rank[0/16] 06/24/2025 14:57:31 INFO stats.py:314 | Epoch[130] Step[39] GlobalStep[17849] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:34:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:57:35 INFO loss_tracker.py:84 | Epoch[130/NA] Step[49] GlobalStep[17859/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:57:41 INFO stats.py:314 | Epoch[130] Step[64] GlobalStep[17874] Training Speed: 431.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:34:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:57:45 INFO loss_tracker.py:84 | Epoch[130/NA] Step[74] GlobalStep[17884/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0131] total_loss[0.0188] Rank[0/16] 06/24/2025 14:57:51 INFO stats.py:314 | Epoch[130] Step[89] GlobalStep[17899] Training Speed: 404.73 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:34:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:57:55 INFO loss_tracker.py:84 | Epoch[130/NA] Step[99] GlobalStep[17909/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 14:58:01 INFO stats.py:314 | Epoch[130] Step[114] GlobalStep[17924] Training Speed: 425.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:33:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:58:05 INFO loss_tracker.py:84 | Epoch[130/NA] Step[124] GlobalStep[17934/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:58:10 INFO stats.py:394 | Epoch[130] completed. Training Speed: 310.17 samples/sec across all devices. Epoch Time: 56.54 sec. Average Epoch Time: 56.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:33:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:58:12 INFO stats.py:314 | Epoch[131] Step[2] GlobalStep[17949] Training Speed: 435.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:33:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:58:21 INFO loss_tracker.py:84 | Epoch[131/NA] Step[24] GlobalStep[17971/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 14:58:22 INFO stats.py:314 | Epoch[131] Step[27] GlobalStep[17974] Training Speed: 435.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:33:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:58:31 INFO loss_tracker.py:84 | Epoch[131/NA] Step[49] GlobalStep[17996/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0131] total_loss[0.0186] Rank[0/16] 06/24/2025 14:58:32 INFO stats.py:314 | Epoch[131] Step[52] GlobalStep[17999] Training Speed: 414.69 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:33:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:58:41 INFO loss_tracker.py:84 | Epoch[131/NA] Step[74] GlobalStep[18021/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 14:58:42 INFO stats.py:314 | Epoch[131] Step[77] GlobalStep[18024] Training Speed: 409.86 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:33:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:58:52 INFO loss_tracker.py:84 | Epoch[131/NA] Step[99] GlobalStep[18046/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 14:58:53 INFO stats.py:314 | Epoch[131] Step[102] GlobalStep[18049] Training Speed: 434.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:33:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:59:01 INFO loss_tracker.py:84 | Epoch[131/NA] Step[124] GlobalStep[18071/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 14:59:03 INFO stats.py:314 | Epoch[131] Step[127] GlobalStep[18074] Training Speed: 448.33 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:32:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:59:06 INFO stats.py:394 | Epoch[131] completed. Training Speed: 311.00 samples/sec across all devices. Epoch Time: 56.39 sec. Average Epoch Time: 56.39 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:32:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:59:13 INFO stats.py:314 | Epoch[132] Step[15] GlobalStep[18099] Training Speed: 437.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:32:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:59:17 INFO loss_tracker.py:84 | Epoch[132/NA] Step[24] GlobalStep[18108/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 14:59:24 INFO stats.py:314 | Epoch[132] Step[40] GlobalStep[18124] Training Speed: 422.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:32:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:59:27 INFO loss_tracker.py:84 | Epoch[132/NA] Step[49] GlobalStep[18133/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 14:59:34 INFO stats.py:314 | Epoch[132] Step[65] GlobalStep[18149] Training Speed: 405.68 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:32:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:59:37 INFO loss_tracker.py:84 | Epoch[132/NA] Step[74] GlobalStep[18158/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 14:59:44 INFO stats.py:314 | Epoch[132] Step[90] GlobalStep[18174] Training Speed: 430.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:32:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 14:59:47 INFO loss_tracker.py:84 | Epoch[132/NA] Step[99] GlobalStep[18183/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 14:59:54 INFO stats.py:314 | Epoch[132] Step[115] GlobalStep[18199] Training Speed: 429.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:31:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 14:59:57 INFO loss_tracker.py:84 | Epoch[132/NA] Step[124] GlobalStep[18208/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:00:01 INFO stats.py:394 | Epoch[132] completed. Training Speed: 318.10 samples/sec across all devices. Epoch Time: 55.13 sec. Average Epoch Time: 55.13 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 9:31:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:00:04 INFO stats.py:314 | Epoch[133] Step[3] GlobalStep[18224] Training Speed: 428.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:31:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:00:12 INFO loss_tracker.py:84 | Epoch[133/NA] Step[24] GlobalStep[18245/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 15:00:14 INFO stats.py:314 | Epoch[133] Step[28] GlobalStep[18249] Training Speed: 405.65 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:31:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:00:23 INFO loss_tracker.py:84 | Epoch[133/NA] Step[49] GlobalStep[18270/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:00:24 INFO stats.py:314 | Epoch[133] Step[53] GlobalStep[18274] Training Speed: 432.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:31:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:00:33 INFO loss_tracker.py:84 | Epoch[133/NA] Step[74] GlobalStep[18295/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0060] loss_depth[0.0130] total_loss[0.0190] Rank[0/16] 06/24/2025 15:00:34 INFO stats.py:314 | Epoch[133] Step[78] GlobalStep[18299] Training Speed: 434.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:30:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:00:43 INFO loss_tracker.py:84 | Epoch[133/NA] Step[99] GlobalStep[18320/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:00:44 INFO stats.py:314 | Epoch[133] Step[103] GlobalStep[18324] Training Speed: 428.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:30:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:00:53 INFO loss_tracker.py:84 | Epoch[133/NA] Step[124] GlobalStep[18345/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0185] Rank[0/16] 06/24/2025 15:00:54 INFO stats.py:314 | Epoch[133] Step[128] GlobalStep[18349] Training Speed: 448.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:30:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:00:57 INFO stats.py:394 | Epoch[133] completed. Training Speed: 314.63 samples/sec across all devices. Epoch Time: 55.74 sec. Average Epoch Time: 55.74 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:30:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:01:05 INFO stats.py:314 | Epoch[134] Step[16] GlobalStep[18374] Training Speed: 426.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:30:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:01:08 INFO loss_tracker.py:84 | Epoch[134/NA] Step[24] GlobalStep[18382/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:01:15 INFO stats.py:314 | Epoch[134] Step[41] GlobalStep[18399] Training Speed: 428.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:30:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:01:19 INFO loss_tracker.py:84 | Epoch[134/NA] Step[49] GlobalStep[18407/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 15:01:26 INFO stats.py:314 | Epoch[134] Step[66] GlobalStep[18424] Training Speed: 432.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:30:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:01:29 INFO loss_tracker.py:84 | Epoch[134/NA] Step[74] GlobalStep[18432/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0130] total_loss[0.0188] Rank[0/16] 06/24/2025 15:01:36 INFO stats.py:314 | Epoch[134] Step[91] GlobalStep[18449] Training Speed: 419.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:29:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:01:40 INFO loss_tracker.py:84 | Epoch[134/NA] Step[99] GlobalStep[18457/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 15:01:46 INFO stats.py:314 | Epoch[134] Step[116] GlobalStep[18474] Training Speed: 431.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:29:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:01:50 INFO loss_tracker.py:84 | Epoch[134/NA] Step[124] GlobalStep[18482/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 15:01:54 INFO stats.py:394 | Epoch[134] completed. Training Speed: 305.72 samples/sec across all devices. Epoch Time: 57.36 sec. Average Epoch Time: 57.36 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:29:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:01:57 INFO stats.py:314 | Epoch[135] Step[4] GlobalStep[18499] Training Speed: 436.73 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:29:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:02:05 INFO loss_tracker.py:84 | Epoch[135/NA] Step[24] GlobalStep[18519/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 15:02:07 INFO stats.py:314 | Epoch[135] Step[29] GlobalStep[18524] Training Speed: 431.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:29:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:02:16 INFO loss_tracker.py:84 | Epoch[135/NA] Step[49] GlobalStep[18544/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0183] Rank[0/16] 06/24/2025 15:02:18 INFO stats.py:314 | Epoch[135] Step[54] GlobalStep[18549] Training Speed: 428.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:29:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:02:27 INFO loss_tracker.py:84 | Epoch[135/NA] Step[74] GlobalStep[18569/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 15:02:28 INFO stats.py:314 | Epoch[135] Step[79] GlobalStep[18574] Training Speed: 430.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:29:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:02:37 INFO loss_tracker.py:84 | Epoch[135/NA] Step[99] GlobalStep[18594/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 15:02:39 INFO stats.py:314 | Epoch[135] Step[104] GlobalStep[18599] Training Speed: 420.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:28:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:02:47 INFO loss_tracker.py:84 | Epoch[135/NA] Step[124] GlobalStep[18619/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 15:02:49 INFO stats.py:314 | Epoch[135] Step[129] GlobalStep[18624] Training Speed: 450.24 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:28:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:02:51 INFO stats.py:394 | Epoch[135] completed. Training Speed: 306.34 samples/sec across all devices. Epoch Time: 57.24 sec. Average Epoch Time: 57.24 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:28:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:00 INFO stats.py:314 | Epoch[136] Step[17] GlobalStep[18649] Training Speed: 430.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:28:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:03 INFO loss_tracker.py:84 | Epoch[136/NA] Step[24] GlobalStep[18656/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0184] Rank[0/16] 06/24/2025 15:03:10 INFO stats.py:314 | Epoch[136] Step[42] GlobalStep[18674] Training Speed: 433.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:28:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:13 INFO loss_tracker.py:84 | Epoch[136/NA] Step[49] GlobalStep[18681/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:03:20 INFO stats.py:314 | Epoch[136] Step[67] GlobalStep[18699] Training Speed: 408.81 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:28:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:03:23 INFO loss_tracker.py:84 | Epoch[136/NA] Step[74] GlobalStep[18706/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:03:31 INFO stats.py:314 | Epoch[136] Step[92] GlobalStep[18724] Training Speed: 434.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:27:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:34 INFO loss_tracker.py:84 | Epoch[136/NA] Step[99] GlobalStep[18731/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:03:41 INFO stats.py:314 | Epoch[136] Step[117] GlobalStep[18749] Training Speed: 437.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:27:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:43 INFO loss_tracker.py:84 | Epoch[136/NA] Step[124] GlobalStep[18756/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0130] total_loss[0.0187] Rank[0/16] 06/24/2025 15:03:47 INFO stats.py:394 | Epoch[136] completed. Training Speed: 313.31 samples/sec across all devices. Epoch Time: 55.97 sec. Average Epoch Time: 55.97 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:27:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:51 INFO stats.py:314 | Epoch[137] Step[5] GlobalStep[18774] Training Speed: 422.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:27:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:03:58 INFO loss_tracker.py:84 | Epoch[137/NA] Step[24] GlobalStep[18793/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:04:01 INFO stats.py:314 | Epoch[137] Step[30] GlobalStep[18799] Training Speed: 431.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:27:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:04:09 INFO loss_tracker.py:84 | Epoch[137/NA] Step[49] GlobalStep[18818/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 15:04:11 INFO stats.py:314 | Epoch[137] Step[55] GlobalStep[18824] Training Speed: 432.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:27:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:04:19 INFO loss_tracker.py:84 | Epoch[137/NA] Step[74] GlobalStep[18843/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0130] total_loss[0.0189] Rank[0/16] 06/24/2025 15:04:21 INFO stats.py:314 | Epoch[137] Step[80] GlobalStep[18849] Training Speed: 427.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:26:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:04:29 INFO loss_tracker.py:84 | Epoch[137/NA] Step[99] GlobalStep[18868/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0184] Rank[0/16] 06/24/2025 15:04:31 INFO stats.py:314 | Epoch[137] Step[105] GlobalStep[18874] Training Speed: 427.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:26:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:04:39 INFO loss_tracker.py:84 | Epoch[137/NA] Step[124] GlobalStep[18893/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0131] total_loss[0.0188] Rank[0/16] 06/24/2025 15:04:41 INFO stats.py:314 | Epoch[137] Step[130] GlobalStep[18899] Training Speed: 446.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:26:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:04:43 INFO stats.py:394 | Epoch[137] completed. Training Speed: 314.62 samples/sec across all devices. Epoch Time: 55.74 sec. Average Epoch Time: 55.74 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:26:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:04:52 INFO stats.py:314 | Epoch[138] Step[18] GlobalStep[18924] Training Speed: 432.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:26:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:04:55 INFO loss_tracker.py:84 | Epoch[138/NA] Step[24] GlobalStep[18930/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0184] Rank[0/16] 06/24/2025 15:05:02 INFO stats.py:314 | Epoch[138] Step[43] GlobalStep[18949] Training Speed: 406.92 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:26:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:05:05 INFO loss_tracker.py:84 | Epoch[138/NA] Step[49] GlobalStep[18955/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0130] total_loss[0.0176] Rank[0/16] 06/24/2025 15:05:13 INFO stats.py:314 | Epoch[138] Step[68] GlobalStep[18974] Training Speed: 432.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:25:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:05:15 INFO loss_tracker.py:84 | Epoch[138/NA] Step[74] GlobalStep[18980/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0186] Rank[0/16] 06/24/2025 15:05:22 INFO stats.py:314 | Epoch[138] Step[93] GlobalStep[18999] Training Speed: 437.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:25:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:05:25 INFO loss_tracker.py:84 | Epoch[138/NA] Step[99] GlobalStep[19005/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:05:33 INFO stats.py:314 | Epoch[138] Step[118] GlobalStep[19024] Training Speed: 431.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:25:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:05:35 INFO loss_tracker.py:84 | Epoch[138/NA] Step[124] GlobalStep[19030/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:05:40 INFO stats.py:394 | Epoch[138] completed. Training Speed: 310.81 samples/sec across all devices. Epoch Time: 56.42 sec. Average Epoch Time: 56.42 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:25:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:05:43 INFO stats.py:314 | Epoch[139] Step[6] GlobalStep[19049] Training Speed: 430.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:25:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:05:52 INFO loss_tracker.py:84 | Epoch[139/NA] Step[24] GlobalStep[19067/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:05:54 INFO stats.py:314 | Epoch[139] Step[31] GlobalStep[19074] Training Speed: 431.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:25:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:06:02 INFO loss_tracker.py:84 | Epoch[139/NA] Step[49] GlobalStep[19092/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:06:04 INFO stats.py:314 | Epoch[139] Step[56] GlobalStep[19099] Training Speed: 433.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:25:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:06:12 INFO loss_tracker.py:84 | Epoch[139/NA] Step[74] GlobalStep[19117/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:06:15 INFO stats.py:314 | Epoch[139] Step[81] GlobalStep[19124] Training Speed: 431.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:24:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:06:22 INFO loss_tracker.py:84 | Epoch[139/NA] Step[99] GlobalStep[19142/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:06:25 INFO stats.py:314 | Epoch[139] Step[106] GlobalStep[19149] Training Speed: 425.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:24:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:06:33 INFO loss_tracker.py:84 | Epoch[139/NA] Step[124] GlobalStep[19167/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:06:35 INFO stats.py:314 | Epoch[139] Step[131] GlobalStep[19174] Training Speed: 448.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:24:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:06:37 INFO stats.py:394 | Epoch[139] completed. Training Speed: 307.16 samples/sec across all devices. Epoch Time: 57.09 sec. Average Epoch Time: 57.09 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:24:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:06:46 INFO stats.py:314 | Epoch[140] Step[19] GlobalStep[19199] Training Speed: 430.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:24:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:06:48 INFO loss_tracker.py:84 | Epoch[140/NA] Step[24] GlobalStep[19204/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0131] total_loss[0.0184] Rank[0/16] 06/24/2025 15:06:57 INFO stats.py:314 | Epoch[140] Step[44] GlobalStep[19224] Training Speed: 422.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:24:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:06:58 INFO loss_tracker.py:84 | Epoch[140/NA] Step[49] GlobalStep[19229/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:07:07 INFO stats.py:314 | Epoch[140] Step[69] GlobalStep[19249] Training Speed: 431.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:23:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:07:09 INFO loss_tracker.py:84 | Epoch[140/NA] Step[74] GlobalStep[19254/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:07:17 INFO stats.py:314 | Epoch[140] Step[94] GlobalStep[19274] Training Speed: 436.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:23:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:07:19 INFO loss_tracker.py:84 | Epoch[140/NA] Step[99] GlobalStep[19279/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 15:07:27 INFO stats.py:314 | Epoch[140] Step[119] GlobalStep[19299] Training Speed: 430.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:23:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:07:29 INFO loss_tracker.py:84 | Epoch[140/NA] Step[124] GlobalStep[19304/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 15:07:34 INFO stats.py:394 | Epoch[140] completed. Training Speed: 308.28 samples/sec across all devices. Epoch Time: 56.88 sec. Average Epoch Time: 56.88 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:23:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:07:38 INFO stats.py:314 | Epoch[141] Step[7] GlobalStep[19324] Training Speed: 435.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:23:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:07:45 INFO loss_tracker.py:84 | Epoch[141/NA] Step[24] GlobalStep[19341/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:07:48 INFO stats.py:314 | Epoch[141] Step[32] GlobalStep[19349] Training Speed: 432.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:23:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:07:55 INFO loss_tracker.py:84 | Epoch[141/NA] Step[49] GlobalStep[19366/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0131] total_loss[0.0178] Rank[0/16] 06/24/2025 15:07:58 INFO stats.py:314 | Epoch[141] Step[57] GlobalStep[19374] Training Speed: 419.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:23:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:08:05 INFO loss_tracker.py:84 | Epoch[141/NA] Step[74] GlobalStep[19391/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:08:08 INFO stats.py:314 | Epoch[141] Step[82] GlobalStep[19399] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:22:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:08:15 INFO loss_tracker.py:84 | Epoch[141/NA] Step[99] GlobalStep[19416/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:08:18 INFO stats.py:314 | Epoch[141] Step[107] GlobalStep[19424] Training Speed: 424.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:22:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:08:25 INFO loss_tracker.py:84 | Epoch[141/NA] Step[124] GlobalStep[19441/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 15:08:28 INFO stats.py:314 | Epoch[141] Step[132] GlobalStep[19449] Training Speed: 449.64 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:22:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:08:29 INFO stats.py:394 | Epoch[141] completed. Training Speed: 313.70 samples/sec across all devices. Epoch Time: 55.90 sec. Average Epoch Time: 55.90 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:22:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:08:39 INFO stats.py:314 | Epoch[142] Step[20] GlobalStep[19474] Training Speed: 434.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:22:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:08:40 INFO loss_tracker.py:84 | Epoch[142/NA] Step[24] GlobalStep[19478/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:08:49 INFO stats.py:314 | Epoch[142] Step[45] GlobalStep[19499] Training Speed: 434.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:22:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:08:51 INFO loss_tracker.py:84 | Epoch[142/NA] Step[49] GlobalStep[19503/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 15:08:59 INFO stats.py:314 | Epoch[142] Step[70] GlobalStep[19524] Training Speed: 432.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:21:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:09:01 INFO loss_tracker.py:84 | Epoch[142/NA] Step[74] GlobalStep[19528/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0130] total_loss[0.0188] Rank[0/16] 06/24/2025 15:09:10 INFO stats.py:314 | Epoch[142] Step[95] GlobalStep[19549] Training Speed: 429.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:21:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:09:12 INFO loss_tracker.py:84 | Epoch[142/NA] Step[99] GlobalStep[19553/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:09:20 INFO stats.py:314 | Epoch[142] Step[120] GlobalStep[19574] Training Speed: 447.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:21:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:09:21 INFO loss_tracker.py:84 | Epoch[142/NA] Step[124] GlobalStep[19578/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0184] Rank[0/16] 06/24/2025 15:09:26 INFO stats.py:394 | Epoch[142] completed. Training Speed: 310.49 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:21:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:09:31 INFO stats.py:314 | Epoch[143] Step[8] GlobalStep[19599] Training Speed: 432.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:21:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:09:37 INFO loss_tracker.py:84 | Epoch[143/NA] Step[24] GlobalStep[19615/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:09:41 INFO stats.py:314 | Epoch[143] Step[33] GlobalStep[19624] Training Speed: 433.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:21:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:09:48 INFO loss_tracker.py:84 | Epoch[143/NA] Step[49] GlobalStep[19640/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:09:51 INFO stats.py:314 | Epoch[143] Step[58] GlobalStep[19649] Training Speed: 430.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:20:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:09:58 INFO loss_tracker.py:84 | Epoch[143/NA] Step[74] GlobalStep[19665/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0130] total_loss[0.0185] Rank[0/16] 06/24/2025 15:10:02 INFO stats.py:314 | Epoch[143] Step[83] GlobalStep[19674] Training Speed: 423.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:20:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:10:08 INFO loss_tracker.py:84 | Epoch[143/NA] Step[99] GlobalStep[19690/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:10:12 INFO stats.py:314 | Epoch[143] Step[108] GlobalStep[19699] Training Speed: 432.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:20:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:10:18 INFO loss_tracker.py:84 | Epoch[143/NA] Step[124] GlobalStep[19715/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:10:22 INFO stats.py:314 | Epoch[143] Step[133] GlobalStep[19724] Training Speed: 449.30 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:20:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:10:23 INFO stats.py:394 | Epoch[143] completed. Training Speed: 308.08 samples/sec across all devices. Epoch Time: 56.92 sec. Average Epoch Time: 56.92 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:20:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:10:33 INFO stats.py:314 | Epoch[144] Step[21] GlobalStep[19749] Training Speed: 433.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:20:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:10:35 INFO loss_tracker.py:84 | Epoch[144/NA] Step[24] GlobalStep[19752/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 15:10:43 INFO stats.py:314 | Epoch[144] Step[46] GlobalStep[19774] Training Speed: 433.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:20:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:10:45 INFO loss_tracker.py:84 | Epoch[144/NA] Step[49] GlobalStep[19777/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0042] loss_depth[0.0130] total_loss[0.0173] Rank[0/16] 06/24/2025 15:10:54 INFO stats.py:314 | Epoch[144] Step[71] GlobalStep[19799] Training Speed: 435.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:19:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:10:56 INFO loss_tracker.py:84 | Epoch[144/NA] Step[74] GlobalStep[19802/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:11:04 INFO stats.py:314 | Epoch[144] Step[96] GlobalStep[19824] Training Speed: 432.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:19:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:11:06 INFO loss_tracker.py:84 | Epoch[144/NA] Step[99] GlobalStep[19827/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:11:15 INFO stats.py:314 | Epoch[144] Step[121] GlobalStep[19849] Training Speed: 450.08 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:19:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:11:16 INFO loss_tracker.py:84 | Epoch[144/NA] Step[124] GlobalStep[19852/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:11:20 INFO stats.py:394 | Epoch[144] completed. Training Speed: 307.08 samples/sec across all devices. Epoch Time: 57.11 sec. Average Epoch Time: 57.11 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:19:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:11:25 INFO stats.py:314 | Epoch[145] Step[9] GlobalStep[19874] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:19:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:11:32 INFO loss_tracker.py:84 | Epoch[145/NA] Step[24] GlobalStep[19889/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:11:36 INFO stats.py:314 | Epoch[145] Step[34] GlobalStep[19899] Training Speed: 431.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:19:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:11:42 INFO loss_tracker.py:84 | Epoch[145/NA] Step[49] GlobalStep[19914/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0058] loss_depth[0.0130] total_loss[0.0189] Rank[0/16] 06/24/2025 15:11:46 INFO stats.py:314 | Epoch[145] Step[59] GlobalStep[19924] Training Speed: 433.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:18:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:11:52 INFO loss_tracker.py:84 | Epoch[145/NA] Step[74] GlobalStep[19939/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0130] total_loss[0.0188] Rank[0/16] 06/24/2025 15:11:56 INFO stats.py:314 | Epoch[145] Step[84] GlobalStep[19949] Training Speed: 432.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:18:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:12:02 INFO loss_tracker.py:84 | Epoch[145/NA] Step[99] GlobalStep[19964/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:12:06 INFO stats.py:314 | Epoch[145] Step[109] GlobalStep[19974] Training Speed: 425.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:18:36. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:12:13 INFO loss_tracker.py:84 | Epoch[145/NA] Step[124] GlobalStep[19989/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:12:16 INFO stats.py:314 | Epoch[145] Step[134] GlobalStep[19999] Training Speed: 450.71 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:18:23. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:12:16 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint.
Rank[0/16] 06/24/2025 15:12:17 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_4
Rank[10/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[9/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[12/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[8/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[14/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[5/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[11/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[6/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[13/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[2/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[15/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[3/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[4/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[7/16] 06/24/2025 15:12:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[1/16] 06/24/2025 15:12:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[0/16] 06/24/2025 15:12:18 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_4/model.safetensors
Rank[0/16] 06/24/2025 15:12:19 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_4/optimizer.bin
Rank[0/16] 06/24/2025 15:12:19 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_4/scheduler.bin
Rank[0/16] 06/24/2025 15:12:19 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_4/sampler.bin
Rank[0/16] 06/24/2025 15:12:19 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_4/random_states_0.pkl
Rank[0/16] 06/24/2025 15:12:19 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_4/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 15:12:19 INFO checkpoint.py:110 | Save checkpoint at the end of step 19999 to /job_data/checkpoints/checkpoint_4
Rank[0/16] 06/24/2025 15:12:20 INFO stats.py:394 | Epoch[145] completed. Training Speed: 292.81 samples/sec across all devices. Epoch Time: 59.89 sec. Average Epoch Time: 59.89 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 9:18:34. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:12:30 INFO stats.py:314 | Epoch[146] Step[22] GlobalStep[20024] Training Speed: 420.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:18:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:12:31 INFO loss_tracker.py:84 | Epoch[146/NA] Step[24] GlobalStep[20026/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:12:41 INFO stats.py:314 | Epoch[146] Step[47] GlobalStep[20049] Training Speed: 434.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:18:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:12:42 INFO loss_tracker.py:84 | Epoch[146/NA] Step[49] GlobalStep[20051/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:12:51 INFO stats.py:314 | Epoch[146] Step[72] GlobalStep[20074] Training Speed: 424.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:18:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:12:52 INFO loss_tracker.py:84 | Epoch[146/NA] Step[74] GlobalStep[20076/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0044] loss_depth[0.0130] total_loss[0.0175]
Rank[0/16] 06/24/2025 15:13:01 INFO stats.py:314 | Epoch[146] Step[97] GlobalStep[20099] Training Speed: 427.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:17:52. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:13:02 INFO loss_tracker.py:84 | Epoch[146/NA] Step[99] GlobalStep[20101/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:13:11 INFO stats.py:314 | Epoch[146] Step[122] GlobalStep[20124] Training Speed: 454.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:17:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:13:12 INFO loss_tracker.py:84 | Epoch[146/NA] Step[124] GlobalStep[20126/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0129] total_loss[0.0186] Rank[0/16] 06/24/2025 15:13:16 INFO stats.py:394 | Epoch[146] completed. Training Speed: 312.23 samples/sec across all devices. Epoch Time: 56.16 sec. Average Epoch Time: 56.16 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:17:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:13:21 INFO stats.py:314 | Epoch[147] Step[10] GlobalStep[20149] Training Speed: 430.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:17:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:13:27 INFO loss_tracker.py:84 | Epoch[147/NA] Step[24] GlobalStep[20163/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:13:32 INFO stats.py:314 | Epoch[147] Step[35] GlobalStep[20174] Training Speed: 433.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:17:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:13:38 INFO loss_tracker.py:84 | Epoch[147/NA] Step[49] GlobalStep[20188/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:13:42 INFO stats.py:314 | Epoch[147] Step[60] GlobalStep[20199] Training Speed: 430.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:17:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:13:48 INFO loss_tracker.py:84 | Epoch[147/NA] Step[74] GlobalStep[20213/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:13:52 INFO stats.py:314 | Epoch[147] Step[85] GlobalStep[20224] Training Speed: 428.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:16:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:13:58 INFO loss_tracker.py:84 | Epoch[147/NA] Step[99] GlobalStep[20238/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:14:02 INFO stats.py:314 | Epoch[147] Step[110] GlobalStep[20249] Training Speed: 434.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:16:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:14:08 INFO loss_tracker.py:84 | Epoch[147/NA] Step[124] GlobalStep[20263/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:14:12 INFO stats.py:314 | Epoch[147] Step[135] GlobalStep[20274] Training Speed: 450.23 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:16:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:14:12 INFO stats.py:394 | Epoch[147] completed. Training Speed: 311.70 samples/sec across all devices. Epoch Time: 56.26 sec. Average Epoch Time: 56.26 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:16:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:14:23 INFO stats.py:314 | Epoch[148] Step[23] GlobalStep[20299] Training Speed: 439.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:16:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:14:24 INFO loss_tracker.py:84 | Epoch[148/NA] Step[24] GlobalStep[20300/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:14:33 INFO stats.py:314 | Epoch[148] Step[48] GlobalStep[20324] Training Speed: 433.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:16:11. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:14:34 INFO loss_tracker.py:84 | Epoch[148/NA] Step[49] GlobalStep[20325/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:14:44 INFO stats.py:314 | Epoch[148] Step[73] GlobalStep[20349] Training Speed: 248.31 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 9:16:00. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:14:44 INFO loss_tracker.py:84 | Epoch[148/NA] Step[74] GlobalStep[20350/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:14:54 INFO stats.py:314 | Epoch[148] Step[98] GlobalStep[20374] Training Speed: 431.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:15:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:14:54 INFO loss_tracker.py:84 | Epoch[148/NA] Step[99] GlobalStep[20375/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0043] loss_depth[0.0129] total_loss[0.0173]
Rank[0/16] 06/24/2025 15:15:04 INFO stats.py:314 | Epoch[148] Step[123] GlobalStep[20399] Training Speed: 453.74 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:15:36. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:15:04 INFO loss_tracker.py:84 | Epoch[148/NA] Step[124] GlobalStep[20400/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0183]
Rank[0/16] 06/24/2025 15:15:08 INFO stats.py:394 | Epoch[148] completed. Training Speed: 312.04 samples/sec across all devices. Epoch Time: 56.20 sec. Average Epoch Time: 56.20 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:15:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:15:14 INFO stats.py:314 | Epoch[149] Step[11] GlobalStep[20424] Training Speed: 435.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:15:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:15:20 INFO loss_tracker.py:84 | Epoch[149/NA] Step[24] GlobalStep[20437/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:15:25 INFO stats.py:314 | Epoch[149] Step[36] GlobalStep[20449] Training Speed: 260.94 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 9:15:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:15:30 INFO loss_tracker.py:84 | Epoch[149/NA] Step[49] GlobalStep[20462/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:15:35 INFO stats.py:314 | Epoch[149] Step[61] GlobalStep[20474] Training Speed: 435.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:15:05. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:15:41 INFO loss_tracker.py:84 | Epoch[149/NA] Step[74] GlobalStep[20487/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:15:46 INFO stats.py:314 | Epoch[149] Step[86] GlobalStep[20499] Training Speed: 428.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:14:54. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:15:51 INFO loss_tracker.py:84 | Epoch[149/NA] Step[99] GlobalStep[20512/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:15:56 INFO stats.py:314 | Epoch[149] Step[111] GlobalStep[20524] Training Speed: 433.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:14:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:01 INFO loss_tracker.py:84 | Epoch[149/NA] Step[124] GlobalStep[20537/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0130] total_loss[0.0187] Rank[0/16] 06/24/2025 15:16:05 INFO stats.py:314 | Epoch[149] Step[136] GlobalStep[20549] Training Speed: 448.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:14:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:05 INFO stats.py:394 | Epoch[149] completed. Training Speed: 307.98 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:14:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:17 INFO stats.py:314 | Epoch[150] Step[24] GlobalStep[20574] Training Speed: 433.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:14:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:17 INFO loss_tracker.py:84 | Epoch[150/NA] Step[24] GlobalStep[20574/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0131] total_loss[0.0182] Rank[0/16] 06/24/2025 15:16:27 INFO stats.py:314 | Epoch[150] Step[49] GlobalStep[20599] Training Speed: 402.75 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 9:14:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:16:27 INFO loss_tracker.py:84 | Epoch[150/NA] Step[49] GlobalStep[20599/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0131] total_loss[0.0180] Rank[0/16] 06/24/2025 15:16:38 INFO stats.py:314 | Epoch[150] Step[74] GlobalStep[20624] Training Speed: 430.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:14:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:38 INFO loss_tracker.py:84 | Epoch[150/NA] Step[74] GlobalStep[20624/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:16:48 INFO stats.py:314 | Epoch[150] Step[99] GlobalStep[20649] Training Speed: 421.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:13:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:48 INFO loss_tracker.py:84 | Epoch[150/NA] Step[99] GlobalStep[20649/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:16:58 INFO stats.py:314 | Epoch[150] Step[124] GlobalStep[20674] Training Speed: 451.54 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:13:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:16:58 INFO loss_tracker.py:84 | Epoch[150/NA] Step[124] GlobalStep[20674/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:17:03 INFO stats.py:394 | Epoch[150] completed. Training Speed: 305.70 samples/sec across all devices. Epoch Time: 57.36 sec. Average Epoch Time: 57.36 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 9:13:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:17:09 INFO stats.py:314 | Epoch[151] Step[12] GlobalStep[20699] Training Speed: 426.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:13:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:17:14 INFO loss_tracker.py:84 | Epoch[151/NA] Step[24] GlobalStep[20711/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:17:20 INFO stats.py:314 | Epoch[151] Step[37] GlobalStep[20724] Training Speed: 432.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:13:19. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:17:24 INFO loss_tracker.py:84 | Epoch[151/NA] Step[49] GlobalStep[20736/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:17:29 INFO stats.py:314 | Epoch[151] Step[62] GlobalStep[20749] Training Speed: 427.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:13:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:17:34 INFO loss_tracker.py:84 | Epoch[151/NA] Step[74] GlobalStep[20761/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0186]
Rank[0/16] 06/24/2025 15:17:40 INFO stats.py:314 | Epoch[151] Step[87] GlobalStep[20774] Training Speed: 428.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:12:55. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:17:44 INFO loss_tracker.py:84 | Epoch[151/NA] Step[99] GlobalStep[20786/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:17:49 INFO stats.py:314 | Epoch[151] Step[112] GlobalStep[20799] Training Speed: 430.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:12:42. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:17:55 INFO loss_tracker.py:84 | Epoch[151/NA] Step[124] GlobalStep[20811/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:17:59 INFO stats.py:394 | Epoch[151] completed. Training Speed: 312.05 samples/sec across all devices. Epoch Time: 56.20 sec. Average Epoch Time: 56.20 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:12:30. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:18:00 INFO stats.py:314 | Epoch[152] Step[0] GlobalStep[20824] Training Speed: 369.18 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 9:12:31. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:18:10 INFO loss_tracker.py:84 | Epoch[152/NA] Step[24] GlobalStep[20848/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:18:10 INFO stats.py:314 | Epoch[152] Step[25] GlobalStep[20849] Training Speed: 420.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:12:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:18:21 INFO loss_tracker.py:84 | Epoch[152/NA] Step[49] GlobalStep[20873/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:18:21 INFO stats.py:314 | Epoch[152] Step[50] GlobalStep[20874] Training Speed: 409.71 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:12:10. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:18:30 INFO loss_tracker.py:84 | Epoch[152/NA] Step[74] GlobalStep[20898/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0129] total_loss[0.0186]
Rank[0/16] 06/24/2025 15:18:31 INFO stats.py:314 | Epoch[152] Step[75] GlobalStep[20899] Training Speed: 412.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:11:57. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:18:40 INFO loss_tracker.py:84 | Epoch[152/NA] Step[99] GlobalStep[20923/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:18:41 INFO stats.py:314 | Epoch[152] Step[100] GlobalStep[20924] Training Speed: 432.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:11:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:18:50 INFO loss_tracker.py:84 | Epoch[152/NA] Step[124] GlobalStep[20948/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0187] Rank[0/16] 06/24/2025 15:18:50 INFO stats.py:314 | Epoch[152] Step[125] GlobalStep[20949] Training Speed: 437.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:11:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:18:54 INFO stats.py:394 | Epoch[152] completed. Training Speed: 316.16 samples/sec across all devices. Epoch Time: 55.47 sec. Average Epoch Time: 55.47 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 9:11:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:19:01 INFO stats.py:314 | Epoch[153] Step[13] GlobalStep[20974] Training Speed: 433.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:11:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:19:05 INFO loss_tracker.py:84 | Epoch[153/NA] Step[24] GlobalStep[20985/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:19:11 INFO stats.py:314 | Epoch[153] Step[38] GlobalStep[20999] Training Speed: 441.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:11:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:19:16 INFO loss_tracker.py:84 | Epoch[153/NA] Step[49] GlobalStep[21010/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:19:21 INFO stats.py:314 | Epoch[153] Step[63] GlobalStep[21024] Training Speed: 431.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:10:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:19:25 INFO loss_tracker.py:84 | Epoch[153/NA] Step[74] GlobalStep[21035/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0186] Rank[0/16] 06/24/2025 15:19:31 INFO stats.py:314 | Epoch[153] Step[88] GlobalStep[21049] Training Speed: 428.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:10:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:19:36 INFO loss_tracker.py:84 | Epoch[153/NA] Step[99] GlobalStep[21060/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 15:19:41 INFO stats.py:314 | Epoch[153] Step[113] GlobalStep[21074] Training Speed: 435.80 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:10:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:19:45 INFO loss_tracker.py:84 | Epoch[153/NA] Step[124] GlobalStep[21085/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:19:49 INFO stats.py:394 | Epoch[153] completed. Training Speed: 318.74 samples/sec across all devices. Epoch Time: 55.02 sec. Average Epoch Time: 55.02 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 9:10:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:19:51 INFO stats.py:314 | Epoch[154] Step[1] GlobalStep[21099] Training Speed: 427.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:10:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:20:00 INFO loss_tracker.py:84 | Epoch[154/NA] Step[24] GlobalStep[21122/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:20:01 INFO stats.py:314 | Epoch[154] Step[26] GlobalStep[21124] Training Speed: 239.56 samples/sec across all devices. Average Step Time: 0.53 sec. Estimated Remaining Time: 9:10:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:20:11 INFO loss_tracker.py:84 | Epoch[154/NA] Step[49] GlobalStep[21147/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:20:11 INFO stats.py:314 | Epoch[154] Step[51] GlobalStep[21149] Training Speed: 429.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:09:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:20:21 INFO loss_tracker.py:84 | Epoch[154/NA] Step[74] GlobalStep[21172/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0186] Rank[0/16] 06/24/2025 15:20:21 INFO stats.py:314 | Epoch[154] Step[76] GlobalStep[21174] Training Speed: 431.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:09:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:20:31 INFO loss_tracker.py:84 | Epoch[154/NA] Step[99] GlobalStep[21197/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:20:32 INFO stats.py:314 | Epoch[154] Step[101] GlobalStep[21199] Training Speed: 439.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:09:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:20:41 INFO loss_tracker.py:84 | Epoch[154/NA] Step[124] GlobalStep[21222/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:20:42 INFO stats.py:314 | Epoch[154] Step[126] GlobalStep[21224] Training Speed: 451.42 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:09:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:20:45 INFO stats.py:394 | Epoch[154] completed. Training Speed: 314.80 samples/sec across all devices. Epoch Time: 55.70 sec. Average Epoch Time: 55.70 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:09:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:20:52 INFO stats.py:314 | Epoch[155] Step[14] GlobalStep[21249] Training Speed: 438.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:09:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:20:57 INFO loss_tracker.py:84 | Epoch[155/NA] Step[24] GlobalStep[21259/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:21:03 INFO stats.py:314 | Epoch[155] Step[39] GlobalStep[21274] Training Speed: 438.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:09:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:21:07 INFO loss_tracker.py:84 | Epoch[155/NA] Step[49] GlobalStep[21284/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:21:13 INFO stats.py:314 | Epoch[155] Step[64] GlobalStep[21299] Training Speed: 422.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:08:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:21:17 INFO loss_tracker.py:84 | Epoch[155/NA] Step[74] GlobalStep[21309/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:21:23 INFO stats.py:314 | Epoch[155] Step[89] GlobalStep[21324] Training Speed: 431.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:08:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:21:27 INFO loss_tracker.py:84 | Epoch[155/NA] Step[99] GlobalStep[21334/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:21:34 INFO stats.py:314 | Epoch[155] Step[114] GlobalStep[21349] Training Speed: 434.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:08:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:21:37 INFO loss_tracker.py:84 | Epoch[155/NA] Step[124] GlobalStep[21359/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:21:41 INFO stats.py:394 | Epoch[155] completed. Training Speed: 311.87 samples/sec across all devices. Epoch Time: 56.23 sec. Average Epoch Time: 56.23 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:08:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:21:44 INFO stats.py:314 | Epoch[156] Step[2] GlobalStep[21374] Training Speed: 436.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:08:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:21:53 INFO loss_tracker.py:84 | Epoch[156/NA] Step[24] GlobalStep[21396/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:21:54 INFO stats.py:314 | Epoch[156] Step[27] GlobalStep[21399] Training Speed: 425.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:08:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:22:03 INFO loss_tracker.py:84 | Epoch[156/NA] Step[49] GlobalStep[21421/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:22:04 INFO stats.py:314 | Epoch[156] Step[52] GlobalStep[21424] Training Speed: 434.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:07:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:22:14 INFO loss_tracker.py:84 | Epoch[156/NA] Step[74] GlobalStep[21446/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:22:15 INFO stats.py:314 | Epoch[156] Step[77] GlobalStep[21449] Training Speed: 441.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:07:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:22:24 INFO loss_tracker.py:84 | Epoch[156/NA] Step[99] GlobalStep[21471/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:22:25 INFO stats.py:314 | Epoch[156] Step[102] GlobalStep[21474] Training Speed: 439.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:07:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:22:34 INFO loss_tracker.py:84 | Epoch[156/NA] Step[124] GlobalStep[21496/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0130] total_loss[0.0182] Rank[0/16] 06/24/2025 15:22:35 INFO stats.py:314 | Epoch[156] Step[127] GlobalStep[21499] Training Speed: 451.25 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:07:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:22:38 INFO stats.py:394 | Epoch[156] completed. Training Speed: 310.66 samples/sec across all devices. Epoch Time: 56.45 sec. Average Epoch Time: 56.45 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:07:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:22:45 INFO stats.py:314 | Epoch[157] Step[15] GlobalStep[21524] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:07:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:22:49 INFO loss_tracker.py:84 | Epoch[157/NA] Step[24] GlobalStep[21533/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:22:56 INFO stats.py:314 | Epoch[157] Step[40] GlobalStep[21549] Training Speed: 435.33 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:06:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:22:59 INFO loss_tracker.py:84 | Epoch[157/NA] Step[49] GlobalStep[21558/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:23:06 INFO stats.py:314 | Epoch[157] Step[65] GlobalStep[21574] Training Speed: 427.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:06:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:23:09 INFO loss_tracker.py:84 | Epoch[157/NA] Step[74] GlobalStep[21583/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:23:16 INFO stats.py:314 | Epoch[157] Step[90] GlobalStep[21599] Training Speed: 436.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:06:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:23:19 INFO loss_tracker.py:84 | Epoch[157/NA] Step[99] GlobalStep[21608/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 15:23:26 INFO stats.py:314 | Epoch[157] Step[115] GlobalStep[21624] Training Speed: 434.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:06:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:23:29 INFO loss_tracker.py:84 | Epoch[157/NA] Step[124] GlobalStep[21633/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 15:23:34 INFO stats.py:394 | Epoch[157] completed. Training Speed: 314.91 samples/sec across all devices. Epoch Time: 55.69 sec. Average Epoch Time: 55.69 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:06:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:23:36 INFO stats.py:314 | Epoch[158] Step[3] GlobalStep[21649] Training Speed: 428.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:06:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:23:45 INFO loss_tracker.py:84 | Epoch[158/NA] Step[24] GlobalStep[21670/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 15:23:46 INFO stats.py:314 | Epoch[158] Step[28] GlobalStep[21674] Training Speed: 432.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:06:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:23:55 INFO loss_tracker.py:84 | Epoch[158/NA] Step[49] GlobalStep[21695/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0129] total_loss[0.0186] Rank[0/16] 06/24/2025 15:23:57 INFO stats.py:314 | Epoch[158] Step[53] GlobalStep[21699] Training Speed: 407.00 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:05:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:24:05 INFO loss_tracker.py:84 | Epoch[158/NA] Step[74] GlobalStep[21720/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0044] loss_depth[0.0130] total_loss[0.0174] Rank[0/16] 06/24/2025 15:24:07 INFO stats.py:314 | Epoch[158] Step[78] GlobalStep[21724] Training Speed: 431.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:05:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:24:15 INFO loss_tracker.py:84 | Epoch[158/NA] Step[99] GlobalStep[21745/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 15:24:17 INFO stats.py:314 | Epoch[158] Step[103] GlobalStep[21749] Training Speed: 435.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:05:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:24:25 INFO loss_tracker.py:84 | Epoch[158/NA] Step[124] GlobalStep[21770/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:24:27 INFO stats.py:314 | Epoch[158] Step[128] GlobalStep[21774] Training Speed: 448.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:05:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:24:30 INFO stats.py:394 | Epoch[158] completed. Training Speed: 311.85 samples/sec across all devices. Epoch Time: 56.23 sec. Average Epoch Time: 56.23 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:05:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:24:38 INFO stats.py:314 | Epoch[159] Step[16] GlobalStep[21799] Training Speed: 434.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:05:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:24:41 INFO loss_tracker.py:84 | Epoch[159/NA] Step[24] GlobalStep[21807/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0059] loss_depth[0.0130] total_loss[0.0189] Rank[0/16] 06/24/2025 15:24:48 INFO stats.py:314 | Epoch[159] Step[41] GlobalStep[21824] Training Speed: 437.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:04:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:24:52 INFO loss_tracker.py:84 | Epoch[159/NA] Step[49] GlobalStep[21832/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0045] loss_depth[0.0130] total_loss[0.0175] Rank[0/16] 06/24/2025 15:24:58 INFO stats.py:314 | Epoch[159] Step[66] GlobalStep[21849] Training Speed: 437.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:04:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:25:02 INFO loss_tracker.py:84 | Epoch[159/NA] Step[74] GlobalStep[21857/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:25:09 INFO stats.py:314 | Epoch[159] Step[91] GlobalStep[21874] Training Speed: 435.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:04:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:25:12 INFO loss_tracker.py:84 | Epoch[159/NA] Step[99] GlobalStep[21882/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:25:19 INFO stats.py:314 | Epoch[159] Step[116] GlobalStep[21899] Training Speed: 426.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:04:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:25:22 INFO loss_tracker.py:84 | Epoch[159/NA] Step[124] GlobalStep[21907/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:25:26 INFO stats.py:394 | Epoch[159] completed. Training Speed: 310.50 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:04:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:25:29 INFO stats.py:314 | Epoch[160] Step[4] GlobalStep[21924] Training Speed: 434.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:04:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:25:37 INFO loss_tracker.py:84 | Epoch[160/NA] Step[24] GlobalStep[21944/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:25:40 INFO stats.py:314 | Epoch[160] Step[29] GlobalStep[21949] Training Speed: 247.22 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 9:04:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:25:48 INFO loss_tracker.py:84 | Epoch[160/NA] Step[49] GlobalStep[21969/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:25:50 INFO stats.py:314 | Epoch[160] Step[54] GlobalStep[21974] Training Speed: 427.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:03:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:25:58 INFO loss_tracker.py:84 | Epoch[160/NA] Step[74] GlobalStep[21994/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0045] loss_depth[0.0130] total_loss[0.0175] Rank[0/16] 06/24/2025 15:26:00 INFO stats.py:314 | Epoch[160] Step[79] GlobalStep[21999] Training Speed: 435.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:03:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:26:09 INFO loss_tracker.py:84 | Epoch[160/NA] Step[99] GlobalStep[22019/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:26:11 INFO stats.py:314 | Epoch[160] Step[104] GlobalStep[22024] Training Speed: 432.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:03:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:26:19 INFO loss_tracker.py:84 | Epoch[160/NA] Step[124] GlobalStep[22044/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0130] total_loss[0.0186] Rank[0/16] 06/24/2025 15:26:20 INFO stats.py:314 | Epoch[160] Step[129] GlobalStep[22049] Training Speed: 449.46 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:03:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:26:23 INFO stats.py:394 | Epoch[160] completed. Training Speed: 309.90 samples/sec across all devices. Epoch Time: 56.59 sec. Average Epoch Time: 56.59 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:03:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:26:31 INFO stats.py:314 | Epoch[161] Step[17] GlobalStep[22074] Training Speed: 425.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:03:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:26:34 INFO loss_tracker.py:84 | Epoch[161/NA] Step[24] GlobalStep[22081/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:26:41 INFO stats.py:314 | Epoch[161] Step[42] GlobalStep[22099] Training Speed: 422.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:02:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:26:45 INFO loss_tracker.py:84 | Epoch[161/NA] Step[49] GlobalStep[22106/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:26:52 INFO stats.py:314 | Epoch[161] Step[67] GlobalStep[22124] Training Speed: 433.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:02:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:26:54 INFO loss_tracker.py:84 | Epoch[161/NA] Step[74] GlobalStep[22131/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0130] total_loss[0.0176] Rank[0/16] 06/24/2025 15:27:02 INFO stats.py:314 | Epoch[161] Step[92] GlobalStep[22149] Training Speed: 423.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:02:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:27:05 INFO loss_tracker.py:84 | Epoch[161/NA] Step[99] GlobalStep[22156/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:27:12 INFO stats.py:314 | Epoch[161] Step[117] GlobalStep[22174] Training Speed: 433.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:02:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:27:15 INFO loss_tracker.py:84 | Epoch[161/NA] Step[124] GlobalStep[22181/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:27:19 INFO stats.py:394 | Epoch[161] completed. Training Speed: 312.23 samples/sec across all devices. Epoch Time: 56.16 sec. Average Epoch Time: 56.16 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:02:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:27:23 INFO stats.py:314 | Epoch[162] Step[5] GlobalStep[22199] Training Speed: 408.20 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:02:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:27:30 INFO loss_tracker.py:84 | Epoch[162/NA] Step[24] GlobalStep[22218/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:27:33 INFO stats.py:314 | Epoch[162] Step[30] GlobalStep[22224] Training Speed: 433.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:01:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:27:41 INFO loss_tracker.py:84 | Epoch[162/NA] Step[49] GlobalStep[22243/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:27:44 INFO stats.py:314 | Epoch[162] Step[55] GlobalStep[22249] Training Speed: 418.93 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:01:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:27:51 INFO loss_tracker.py:84 | Epoch[162/NA] Step[74] GlobalStep[22268/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:27:53 INFO stats.py:314 | Epoch[162] Step[80] GlobalStep[22274] Training Speed: 430.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:01:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:01 INFO loss_tracker.py:84 | Epoch[162/NA] Step[99] GlobalStep[22293/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:28:04 INFO stats.py:314 | Epoch[162] Step[105] GlobalStep[22299] Training Speed: 407.24 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:01:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:11 INFO loss_tracker.py:84 | Epoch[162/NA] Step[124] GlobalStep[22318/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0183] Rank[0/16] 06/24/2025 15:28:13 INFO stats.py:314 | Epoch[162] Step[130] GlobalStep[22324] Training Speed: 451.72 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 9:01:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:15 INFO stats.py:394 | Epoch[162] completed. Training Speed: 311.00 samples/sec across all devices. Epoch Time: 56.39 sec. Average Epoch Time: 56.39 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:01:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:28:24 INFO stats.py:314 | Epoch[163] Step[18] GlobalStep[22349] Training Speed: 437.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:01:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:27 INFO loss_tracker.py:84 | Epoch[163/NA] Step[24] GlobalStep[22355/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:28:35 INFO stats.py:314 | Epoch[163] Step[43] GlobalStep[22374] Training Speed: 434.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:00:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:37 INFO loss_tracker.py:84 | Epoch[163/NA] Step[49] GlobalStep[22380/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:28:45 INFO stats.py:314 | Epoch[163] Step[68] GlobalStep[22399] Training Speed: 436.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:00:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:47 INFO loss_tracker.py:84 | Epoch[163/NA] Step[74] GlobalStep[22405/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:28:55 INFO stats.py:314 | Epoch[163] Step[93] GlobalStep[22424] Training Speed: 408.90 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 9:00:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:28:57 INFO loss_tracker.py:84 | Epoch[163/NA] Step[99] GlobalStep[22430/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:29:05 INFO stats.py:314 | Epoch[163] Step[118] GlobalStep[22449] Training Speed: 433.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 9:00:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:29:08 INFO loss_tracker.py:84 | Epoch[163/NA] Step[124] GlobalStep[22455/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0058] loss_depth[0.0130] total_loss[0.0189] Rank[0/16] 06/24/2025 15:29:12 INFO stats.py:394 | Epoch[163] completed. Training Speed: 308.73 samples/sec across all devices. Epoch Time: 56.80 sec. Average Epoch Time: 56.80 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 9:00:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:29:16 INFO stats.py:314 | Epoch[164] Step[6] GlobalStep[22474] Training Speed: 434.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 9:00:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:29:23 INFO loss_tracker.py:84 | Epoch[164/NA] Step[24] GlobalStep[22492/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:29:26 INFO stats.py:314 | Epoch[164] Step[31] GlobalStep[22499] Training Speed: 425.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:59:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:29:34 INFO loss_tracker.py:84 | Epoch[164/NA] Step[49] GlobalStep[22517/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0177] Rank[0/16] 06/24/2025 15:29:36 INFO stats.py:314 | Epoch[164] Step[56] GlobalStep[22524] Training Speed: 418.82 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:59:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:29:44 INFO loss_tracker.py:84 | Epoch[164/NA] Step[74] GlobalStep[22542/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0130] total_loss[0.0176] Rank[0/16] 06/24/2025 15:29:47 INFO stats.py:314 | Epoch[164] Step[81] GlobalStep[22549] Training Speed: 404.65 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:59:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:29:54 INFO loss_tracker.py:84 | Epoch[164/NA] Step[99] GlobalStep[22567/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:29:57 INFO stats.py:314 | Epoch[164] Step[106] GlobalStep[22574] Training Speed: 432.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:59:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:30:05 INFO loss_tracker.py:84 | Epoch[164/NA] Step[124] GlobalStep[22592/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:30:07 INFO stats.py:314 | Epoch[164] Step[131] GlobalStep[22599] Training Speed: 449.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:59:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:30:09 INFO stats.py:394 | Epoch[164] completed. Training Speed: 308.98 samples/sec across all devices. Epoch Time: 56.76 sec. Average Epoch Time: 56.76 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:59:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:30:18 INFO stats.py:314 | Epoch[165] Step[19] GlobalStep[22624] Training Speed: 424.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:59:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:30:21 INFO loss_tracker.py:84 | Epoch[165/NA] Step[24] GlobalStep[22629/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:30:29 INFO stats.py:314 | Epoch[165] Step[44] GlobalStep[22649] Training Speed: 434.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:58:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:30:31 INFO loss_tracker.py:84 | Epoch[165/NA] Step[49] GlobalStep[22654/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0043] loss_depth[0.0129] total_loss[0.0173] Rank[0/16] 06/24/2025 15:30:40 INFO stats.py:314 | Epoch[165] Step[69] GlobalStep[22674] Training Speed: 403.93 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:58:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:30:42 INFO loss_tracker.py:84 | Epoch[165/NA] Step[74] GlobalStep[22679/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:30:50 INFO stats.py:314 | Epoch[165] Step[94] GlobalStep[22699] Training Speed: 432.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:58:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:30:52 INFO loss_tracker.py:84 | Epoch[165/NA] Step[99] GlobalStep[22704/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:31:00 INFO stats.py:314 | Epoch[165] Step[119] GlobalStep[22724] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:58:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:31:02 INFO loss_tracker.py:84 | Epoch[165/NA] Step[124] GlobalStep[22729/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0183] Rank[0/16] 06/24/2025 15:31:06 INFO stats.py:394 | Epoch[165] completed. Training Speed: 305.93 samples/sec across all devices. Epoch Time: 57.32 sec. Average Epoch Time: 57.32 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:58:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:31:11 INFO stats.py:314 | Epoch[166] Step[7] GlobalStep[22749] Training Speed: 425.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:58:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:31:18 INFO loss_tracker.py:84 | Epoch[166/NA] Step[24] GlobalStep[22766/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:31:21 INFO stats.py:314 | Epoch[166] Step[32] GlobalStep[22774] Training Speed: 425.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:58:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:31:28 INFO loss_tracker.py:84 | Epoch[166/NA] Step[49] GlobalStep[22791/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:31:31 INFO stats.py:314 | Epoch[166] Step[57] GlobalStep[22799] Training Speed: 425.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:57:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:31:38 INFO loss_tracker.py:84 | Epoch[166/NA] Step[74] GlobalStep[22816/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0130] total_loss[0.0183] Rank[0/16] 06/24/2025 15:31:41 INFO stats.py:314 | Epoch[166] Step[82] GlobalStep[22824] Training Speed: 253.71 samples/sec across all devices. Average Step Time: 0.50 sec. Estimated Remaining Time: 8:57:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:31:48 INFO loss_tracker.py:84 | Epoch[166/NA] Step[99] GlobalStep[22841/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:31:52 INFO stats.py:314 | Epoch[166] Step[107] GlobalStep[22849] Training Speed: 425.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:57:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:31:58 INFO loss_tracker.py:84 | Epoch[166/NA] Step[124] GlobalStep[22866/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0130] total_loss[0.0178] Rank[0/16] 06/24/2025 15:32:01 INFO stats.py:314 | Epoch[166] Step[132] GlobalStep[22874] Training Speed: 449.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:57:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:32:02 INFO stats.py:394 | Epoch[166] completed. Training Speed: 312.03 samples/sec across all devices. Epoch Time: 56.20 sec. Average Epoch Time: 56.20 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:57:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:32:12 INFO stats.py:314 | Epoch[167] Step[20] GlobalStep[22899] Training Speed: 428.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:57:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:32:14 INFO loss_tracker.py:84 | Epoch[167/NA] Step[24] GlobalStep[22903/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:32:22 INFO stats.py:314 | Epoch[167] Step[45] GlobalStep[22924] Training Speed: 404.72 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:56:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:32:24 INFO loss_tracker.py:84 | Epoch[167/NA] Step[49] GlobalStep[22928/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:32:33 INFO stats.py:314 | Epoch[167] Step[70] GlobalStep[22949] Training Speed: 426.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:56:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:32:34 INFO loss_tracker.py:84 | Epoch[167/NA] Step[74] GlobalStep[22953/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:32:43 INFO stats.py:314 | Epoch[167] Step[95] GlobalStep[22974] Training Speed: 432.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:56:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:32:45 INFO loss_tracker.py:84 | Epoch[167/NA] Step[99] GlobalStep[22978/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0183] Rank[0/16] 06/24/2025 15:32:53 INFO stats.py:314 | Epoch[167] Step[120] GlobalStep[22999] Training Speed: 449.96 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:56:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:32:55 INFO loss_tracker.py:84 | Epoch[167/NA] Step[124] GlobalStep[23003/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:32:59 INFO stats.py:394 | Epoch[167] completed. Training Speed: 309.94 samples/sec across all devices. Epoch Time: 56.58 sec. Average Epoch Time: 56.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:56:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:33:04 INFO stats.py:314 | Epoch[168] Step[8] GlobalStep[23024] Training Speed: 433.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:56:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:33:11 INFO loss_tracker.py:84 | Epoch[168/NA] Step[24] GlobalStep[23040/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:33:14 INFO stats.py:314 | Epoch[168] Step[33] GlobalStep[23049] Training Speed: 428.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:56:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:33:21 INFO loss_tracker.py:84 | Epoch[168/NA] Step[49] GlobalStep[23065/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0128] total_loss[0.0185] Rank[0/16] 06/24/2025 15:33:24 INFO stats.py:314 | Epoch[168] Step[58] GlobalStep[23074] Training Speed: 434.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:55:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:33:31 INFO loss_tracker.py:84 | Epoch[168/NA] Step[74] GlobalStep[23090/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:33:34 INFO stats.py:314 | Epoch[168] Step[83] GlobalStep[23099] Training Speed: 422.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:55:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:33:41 INFO loss_tracker.py:84 | Epoch[168/NA] Step[99] GlobalStep[23115/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:33:45 INFO stats.py:314 | Epoch[168] Step[108] GlobalStep[23124] Training Speed: 431.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:55:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:33:51 INFO loss_tracker.py:84 | Epoch[168/NA] Step[124] GlobalStep[23140/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0180] Rank[0/16] 06/24/2025 15:33:54 INFO stats.py:314 | Epoch[168] Step[133] GlobalStep[23149] Training Speed: 450.00 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:55:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:33:55 INFO stats.py:394 | Epoch[168] completed. Training Speed: 312.14 samples/sec across all devices. Epoch Time: 56.18 sec. Average Epoch Time: 56.18 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:55:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:34:05 INFO stats.py:314 | Epoch[169] Step[21] GlobalStep[23174] Training Speed: 438.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:55:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:34:07 INFO loss_tracker.py:84 | Epoch[169/NA] Step[24] GlobalStep[23177/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:34:16 INFO stats.py:314 | Epoch[169] Step[46] GlobalStep[23199] Training Speed: 434.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:54:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:34:17 INFO loss_tracker.py:84 | Epoch[169/NA] Step[49] GlobalStep[23202/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:34:26 INFO stats.py:314 | Epoch[169] Step[71] GlobalStep[23224] Training Speed: 435.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:54:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:34:27 INFO loss_tracker.py:84 | Epoch[169/NA] Step[74] GlobalStep[23227/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 15:34:36 INFO stats.py:314 | Epoch[169] Step[96] GlobalStep[23249] Training Speed: 430.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:54:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:34:37 INFO loss_tracker.py:84 | Epoch[169/NA] Step[99] GlobalStep[23252/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0129] total_loss[0.0175] Rank[0/16] 06/24/2025 15:34:46 INFO stats.py:314 | Epoch[169] Step[121] GlobalStep[23274] Training Speed: 452.21 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:54:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:34:47 INFO loss_tracker.py:84 | Epoch[169/NA] Step[124] GlobalStep[23277/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0056] loss_depth[0.0128] total_loss[0.0184] Rank[0/16] 06/24/2025 15:34:51 INFO stats.py:394 | Epoch[169] completed. Training Speed: 312.10 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:54:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:34:56 INFO stats.py:314 | Epoch[170] Step[9] GlobalStep[23299] Training Speed: 434.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:54:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:03 INFO loss_tracker.py:84 | Epoch[170/NA] Step[24] GlobalStep[23314/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:35:07 INFO stats.py:314 | Epoch[170] Step[34] GlobalStep[23324] Training Speed: 427.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:53:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:13 INFO loss_tracker.py:84 | Epoch[170/NA] Step[49] GlobalStep[23339/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0183] Rank[0/16] 06/24/2025 15:35:17 INFO stats.py:314 | Epoch[170] Step[59] GlobalStep[23349] Training Speed: 432.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:53:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:23 INFO loss_tracker.py:84 | Epoch[170/NA] Step[74] GlobalStep[23364/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0130] total_loss[0.0179] Rank[0/16] 06/24/2025 15:35:27 INFO stats.py:314 | Epoch[170] Step[84] GlobalStep[23374] Training Speed: 435.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:53:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:35:33 INFO loss_tracker.py:84 | Epoch[170/NA] Step[99] GlobalStep[23389/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0175] Rank[0/16] 06/24/2025 15:35:37 INFO stats.py:314 | Epoch[170] Step[109] GlobalStep[23399] Training Speed: 424.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:53:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:43 INFO loss_tracker.py:84 | Epoch[170/NA] Step[124] GlobalStep[23414/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0130] total_loss[0.0173] Rank[0/16] 06/24/2025 15:35:47 INFO stats.py:314 | Epoch[170] Step[134] GlobalStep[23424] Training Speed: 450.10 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:53:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:48 INFO stats.py:394 | Epoch[170] completed. Training Speed: 312.28 samples/sec across all devices. Epoch Time: 56.16 sec. Average Epoch Time: 56.16 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:53:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:58 INFO stats.py:314 | Epoch[171] Step[22] GlobalStep[23449] Training Speed: 426.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:53:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:35:59 INFO loss_tracker.py:84 | Epoch[171/NA] Step[24] GlobalStep[23451/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:36:08 INFO stats.py:314 | Epoch[171] Step[47] GlobalStep[23474] Training Speed: 434.80 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:52:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:36:09 INFO loss_tracker.py:84 | Epoch[171/NA] Step[49] GlobalStep[23476/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:36:18 INFO stats.py:314 | Epoch[171] Step[72] GlobalStep[23499] Training Speed: 431.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:52:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:36:19 INFO loss_tracker.py:84 | Epoch[171/NA] Step[74] GlobalStep[23501/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:36:29 INFO stats.py:314 | Epoch[171] Step[97] GlobalStep[23524] Training Speed: 421.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:52:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:36:29 INFO loss_tracker.py:84 | Epoch[171/NA] Step[99] GlobalStep[23526/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:36:39 INFO stats.py:314 | Epoch[171] Step[122] GlobalStep[23549] Training Speed: 447.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:52:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:36:39 INFO loss_tracker.py:84 | Epoch[171/NA] Step[124] GlobalStep[23551/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:36:44 INFO stats.py:394 | Epoch[171] completed. Training Speed: 312.47 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:52:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:36:49 INFO stats.py:314 | Epoch[172] Step[10] GlobalStep[23574] Training Speed: 414.21 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:52:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:36:55 INFO loss_tracker.py:84 | Epoch[172/NA] Step[24] GlobalStep[23588/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182] Rank[0/16] 06/24/2025 15:36:59 INFO stats.py:314 | Epoch[172] Step[35] GlobalStep[23599] Training Speed: 432.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:51:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:37:06 INFO loss_tracker.py:84 | Epoch[172/NA] Step[49] GlobalStep[23613/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:37:10 INFO stats.py:314 | Epoch[172] Step[60] GlobalStep[23624] Training Speed: 430.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:51:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:37:16 INFO loss_tracker.py:84 | Epoch[172/NA] Step[74] GlobalStep[23638/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0130] total_loss[0.0181] Rank[0/16] 06/24/2025 15:37:20 INFO stats.py:314 | Epoch[172] Step[85] GlobalStep[23649] Training Speed: 420.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:51:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:37:26 INFO loss_tracker.py:84 | Epoch[172/NA] Step[99] GlobalStep[23663/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:37:30 INFO stats.py:314 | Epoch[172] Step[110] GlobalStep[23674] Training Speed: 437.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:51:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:37:36 INFO loss_tracker.py:84 | Epoch[172/NA] Step[124] GlobalStep[23688/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:37:40 INFO stats.py:314 | Epoch[172] Step[135] GlobalStep[23699] Training Speed: 449.55 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:51:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:37:40 INFO stats.py:394 | Epoch[172] completed. Training Speed: 309.92 samples/sec across all devices. Epoch Time: 56.58 sec. Average Epoch Time: 56.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:51:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:37:51 INFO stats.py:314 | Epoch[173] Step[23] GlobalStep[23724] Training Speed: 424.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:51:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:37:51 INFO loss_tracker.py:84 | Epoch[173/NA] Step[24] GlobalStep[23725/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0184] Rank[0/16] 06/24/2025 15:38:01 INFO stats.py:314 | Epoch[173] Step[48] GlobalStep[23749] Training Speed: 430.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:50:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:38:01 INFO loss_tracker.py:84 | Epoch[173/NA] Step[49] GlobalStep[23750/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:38:11 INFO stats.py:314 | Epoch[173] Step[73] GlobalStep[23774] Training Speed: 434.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:50:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:38:12 INFO loss_tracker.py:84 | Epoch[173/NA] Step[74] GlobalStep[23775/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:38:22 INFO stats.py:314 | Epoch[173] Step[98] GlobalStep[23799] Training Speed: 235.88 samples/sec across all devices. Average Step Time: 0.54 sec. Estimated Remaining Time: 8:50:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:38:22 INFO loss_tracker.py:84 | Epoch[173/NA] Step[99] GlobalStep[23800/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 15:38:31 INFO stats.py:314 | Epoch[173] Step[123] GlobalStep[23824] Training Speed: 448.28 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:50:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:38:32 INFO loss_tracker.py:84 | Epoch[173/NA] Step[124] GlobalStep[23825/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0129] total_loss[0.0173] Rank[0/16] 06/24/2025 15:38:36 INFO stats.py:394 | Epoch[173] completed. Training Speed: 313.71 samples/sec across all devices. Epoch Time: 55.90 sec. Average Epoch Time: 55.90 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:50:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:38:42 INFO stats.py:314 | Epoch[174] Step[11] GlobalStep[23849] Training Speed: 430.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:50:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:38:48 INFO loss_tracker.py:84 | Epoch[174/NA] Step[24] GlobalStep[23862/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:38:53 INFO stats.py:314 | Epoch[174] Step[36] GlobalStep[23874] Training Speed: 426.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:49:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:38:58 INFO loss_tracker.py:84 | Epoch[174/NA] Step[49] GlobalStep[23887/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 15:39:03 INFO stats.py:314 | Epoch[174] Step[61] GlobalStep[23899] Training Speed: 419.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:49:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:39:08 INFO loss_tracker.py:84 | Epoch[174/NA] Step[74] GlobalStep[23912/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:39:13 INFO stats.py:314 | Epoch[174] Step[86] GlobalStep[23924] Training Speed: 439.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:49:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:39:19 INFO loss_tracker.py:84 | Epoch[174/NA] Step[99] GlobalStep[23937/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:39:23 INFO stats.py:314 | Epoch[174] Step[111] GlobalStep[23949] Training Speed: 421.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:49:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:39:29 INFO loss_tracker.py:84 | Epoch[174/NA] Step[124] GlobalStep[23962/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:39:33 INFO stats.py:314 | Epoch[174] Step[136] GlobalStep[23974] Training Speed: 445.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:49:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:39:33 INFO stats.py:394 | Epoch[174] completed. Training Speed: 307.98 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:49:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:39:44 INFO stats.py:314 | Epoch[175] Step[24] GlobalStep[23999] Training Speed: 423.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:49:03. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:39:44 INFO loss_tracker.py:84 | Epoch[175/NA] Step[24] GlobalStep[23999/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:39:45 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint.
Rank[0/16] 06/24/2025 15:39:46 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_5
Rank[8/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[10/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[4/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[12/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[11/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[14/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[6/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[15/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[3/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[13/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[1/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[9/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[5/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[7/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[2/16] 06/24/2025 15:39:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[0/16] 06/24/2025 15:39:47 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_5/model.safetensors
Rank[0/16] 06/24/2025 15:39:48 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_5/optimizer.bin
Rank[0/16] 06/24/2025 15:39:48 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_5/scheduler.bin
Rank[0/16] 06/24/2025 15:39:48 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_5/sampler.bin
Rank[0/16] 06/24/2025 15:39:48 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_5/random_states_0.pkl
Rank[0/16] 06/24/2025 15:39:48 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_5/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 15:39:48 INFO checkpoint.py:110 | Save checkpoint at the end of step 23999 to /job_data/checkpoints/checkpoint_5
Rank[0/16] 06/24/2025 15:39:58 INFO stats.py:314 | Epoch[175] Step[49] GlobalStep[24024] Training Speed: 430.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:49:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:39:58 INFO loss_tracker.py:84 | Epoch[175/NA] Step[49] GlobalStep[24024/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:40:09 INFO stats.py:314 | Epoch[175] Step[74] GlobalStep[24049] Training Speed: 424.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:48:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:40:09 INFO loss_tracker.py:84 | Epoch[175/NA] Step[74] GlobalStep[24049/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:40:19 INFO stats.py:314 | Epoch[175] Step[99] GlobalStep[24074] Training Speed: 433.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:48:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:40:19 INFO loss_tracker.py:84 | Epoch[175/NA] Step[99] GlobalStep[24074/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:40:29 INFO stats.py:314 | Epoch[175] Step[124] GlobalStep[24099] Training Speed: 450.73 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:48:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:40:30 INFO loss_tracker.py:84 | Epoch[175/NA] Step[124] GlobalStep[24099/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 15:40:34 INFO stats.py:394 | Epoch[175] completed. Training Speed: 287.48 samples/sec across all devices. Epoch Time: 61.00 sec. Average Epoch Time: 61.00 sec. Average Step Time: 0.45 sec. Estimated Remaining Time: 8:48:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:40:40 INFO stats.py:314 | Epoch[176] Step[12] GlobalStep[24124] Training Speed: 434.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:48:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:40:45 INFO loss_tracker.py:84 | Epoch[176/NA] Step[24] GlobalStep[24136/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:40:51 INFO stats.py:314 | Epoch[176] Step[37] GlobalStep[24149] Training Speed: 417.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:48:12. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:40:56 INFO loss_tracker.py:84 | Epoch[176/NA] Step[49] GlobalStep[24161/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:41:01 INFO stats.py:314 | Epoch[176] Step[62] GlobalStep[24174] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:48:01. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:41:06 INFO loss_tracker.py:84 | Epoch[176/NA] Step[74] GlobalStep[24186/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:41:11 INFO stats.py:314 | Epoch[176] Step[87] GlobalStep[24199] Training Speed: 429.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:47:50. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:41:16 INFO loss_tracker.py:84 | Epoch[176/NA] Step[99] GlobalStep[24211/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0129] total_loss[0.0172]
Rank[0/16] 06/24/2025 15:41:22 INFO stats.py:314 | Epoch[176] Step[112] GlobalStep[24224] Training Speed: 422.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:47:39. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:41:26 INFO loss_tracker.py:84 | Epoch[176/NA] Step[124] GlobalStep[24236/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:41:30 INFO stats.py:394 | Epoch[176] completed. Training Speed: 310.96 samples/sec across all devices. Epoch Time: 56.39 sec. Average Epoch Time: 56.39 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:47:25. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:41:32 INFO stats.py:314 | Epoch[177] Step[0] GlobalStep[24249] Training Speed: 360.29 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 8:47:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:41:42 INFO loss_tracker.py:84 | Epoch[177/NA] Step[24] GlobalStep[24273/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:41:42 INFO stats.py:314 | Epoch[177] Step[25] GlobalStep[24274] Training Speed: 407.03 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:47:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:41:52 INFO loss_tracker.py:84 | Epoch[177/NA] Step[49] GlobalStep[24298/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:41:52 INFO stats.py:314 | Epoch[177] Step[50] GlobalStep[24299] Training Speed: 409.47 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:47:05. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:42:02 INFO loss_tracker.py:84 | Epoch[177/NA] Step[74] GlobalStep[24323/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:42:02 INFO stats.py:314 | Epoch[177] Step[75] GlobalStep[24324] Training Speed: 412.90 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:46:54. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:42:12 INFO loss_tracker.py:84 | Epoch[177/NA] Step[99] GlobalStep[24348/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:42:13 INFO stats.py:314 | Epoch[177] Step[100] GlobalStep[24349] Training Speed: 378.67 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 8:46:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:42:22 INFO loss_tracker.py:84 | Epoch[177/NA] Step[124] GlobalStep[24373/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0129] total_loss[0.0186] Rank[0/16] 06/24/2025 15:42:23 INFO stats.py:314 | Epoch[177] Step[125] GlobalStep[24374] Training Speed: 430.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:46:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:42:27 INFO stats.py:394 | Epoch[177] completed. Training Speed: 312.54 samples/sec across all devices. Epoch Time: 56.11 sec. Average Epoch Time: 56.11 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:46:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:42:34 INFO stats.py:314 | Epoch[178] Step[13] GlobalStep[24399] Training Speed: 424.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:46:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:42:38 INFO loss_tracker.py:84 | Epoch[178/NA] Step[24] GlobalStep[24410/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 15:42:44 INFO stats.py:314 | Epoch[178] Step[38] GlobalStep[24424] Training Speed: 434.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:46:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:42:48 INFO loss_tracker.py:84 | Epoch[178/NA] Step[49] GlobalStep[24435/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0043] loss_depth[0.0129] total_loss[0.0173]
Rank[0/16] 06/24/2025 15:42:54 INFO stats.py:314 | Epoch[178] Step[63] GlobalStep[24449] Training Speed: 408.37 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:46:01. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:42:59 INFO loss_tracker.py:84 | Epoch[178/NA] Step[74] GlobalStep[24460/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:43:04 INFO stats.py:314 | Epoch[178] Step[88] GlobalStep[24474] Training Speed: 426.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:45:49. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:43:09 INFO loss_tracker.py:84 | Epoch[178/NA] Step[99] GlobalStep[24485/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0129] total_loss[0.0174]
Rank[0/16] 06/24/2025 15:43:15 INFO stats.py:314 | Epoch[178] Step[113] GlobalStep[24499] Training Speed: 429.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:45:39. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:43:19 INFO loss_tracker.py:84 | Epoch[178/NA] Step[124] GlobalStep[24510/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:43:23 INFO stats.py:394 | Epoch[178] completed. Training Speed: 309.75 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:45:26. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:43:25 INFO stats.py:314 | Epoch[179] Step[1] GlobalStep[24524] Training Speed: 433.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:45:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:43:35 INFO loss_tracker.py:84 | Epoch[179/NA] Step[24] GlobalStep[24547/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:43:36 INFO stats.py:314 | Epoch[179] Step[26] GlobalStep[24549] Training Speed: 432.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:45:18. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:43:44 INFO loss_tracker.py:84 | Epoch[179/NA] Step[49] GlobalStep[24572/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176]
Rank[0/16] 06/24/2025 15:43:45 INFO stats.py:314 | Epoch[179] Step[51] GlobalStep[24574] Training Speed: 391.89 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 8:45:05. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:43:55 INFO loss_tracker.py:84 | Epoch[179/NA] Step[74] GlobalStep[24597/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:43:56 INFO stats.py:314 | Epoch[179] Step[76] GlobalStep[24599] Training Speed: 430.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:44:56. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:05 INFO loss_tracker.py:84 | Epoch[179/NA] Step[99] GlobalStep[24622/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:44:06 INFO stats.py:314 | Epoch[179] Step[101] GlobalStep[24624] Training Speed: 426.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:44:43. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:15 INFO loss_tracker.py:84 | Epoch[179/NA] Step[124] GlobalStep[24647/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:44:16 INFO stats.py:314 | Epoch[179] Step[126] GlobalStep[24649] Training Speed: 445.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:44:32. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:20 INFO stats.py:394 | Epoch[179] completed. Training Speed: 311.00 samples/sec across all devices. Epoch Time: 56.39 sec. Average Epoch Time: 56.39 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:44:26. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:26 INFO stats.py:314 | Epoch[180] Step[14] GlobalStep[24674] Training Speed: 437.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:44:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:31 INFO loss_tracker.py:84 | Epoch[180/NA] Step[24] GlobalStep[24684/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0175]
Rank[0/16] 06/24/2025 15:44:37 INFO stats.py:314 | Epoch[180] Step[39] GlobalStep[24699] Training Speed: 426.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:44:11. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:41 INFO loss_tracker.py:84 | Epoch[180/NA] Step[49] GlobalStep[24709/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0130] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:44:47 INFO stats.py:314 | Epoch[180] Step[64] GlobalStep[24724] Training Speed: 434.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:43:59. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:44:51 INFO loss_tracker.py:84 | Epoch[180/NA] Step[74] GlobalStep[24734/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:44:57 INFO stats.py:314 | Epoch[180] Step[89] GlobalStep[24749] Training Speed: 433.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:43:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:02 INFO loss_tracker.py:84 | Epoch[180/NA] Step[99] GlobalStep[24759/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0129] total_loss[0.0174]
Rank[0/16] 06/24/2025 15:45:08 INFO stats.py:314 | Epoch[180] Step[114] GlobalStep[24774] Training Speed: 421.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:43:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:12 INFO loss_tracker.py:84 | Epoch[180/NA] Step[124] GlobalStep[24784/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:45:16 INFO stats.py:394 | Epoch[180] completed. Training Speed: 311.39 samples/sec across all devices. Epoch Time: 56.32 sec. Average Epoch Time: 56.32 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:43:26. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:18 INFO stats.py:314 | Epoch[181] Step[2] GlobalStep[24799] Training Speed: 423.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:43:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:27 INFO loss_tracker.py:84 | Epoch[181/NA] Step[24] GlobalStep[24821/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:45:28 INFO stats.py:314 | Epoch[181] Step[27] GlobalStep[24824] Training Speed: 429.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:43:16. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:37 INFO loss_tracker.py:84 | Epoch[181/NA] Step[49] GlobalStep[24846/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0129] total_loss[0.0174]
Rank[0/16] 06/24/2025 15:45:39 INFO stats.py:314 | Epoch[181] Step[52] GlobalStep[24849] Training Speed: 415.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:43:05. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:47 INFO loss_tracker.py:84 | Epoch[181/NA] Step[74] GlobalStep[24871/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:45:49 INFO stats.py:314 | Epoch[181] Step[77] GlobalStep[24874] Training Speed: 415.16 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:42:54. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:45:58 INFO loss_tracker.py:84 | Epoch[181/NA] Step[99] GlobalStep[24896/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:45:59 INFO stats.py:314 | Epoch[181] Step[102] GlobalStep[24899] Training Speed: 430.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:42:43. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:08 INFO loss_tracker.py:84 | Epoch[181/NA] Step[124] GlobalStep[24921/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:46:09 INFO stats.py:314 | Epoch[181] Step[127] GlobalStep[24924] Training Speed: 449.16 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:42:31. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:12 INFO stats.py:394 | Epoch[181] completed. Training Speed: 312.14 samples/sec across all devices. Epoch Time: 56.18 sec. Average Epoch Time: 56.18 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:42:25. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:20 INFO stats.py:314 | Epoch[182] Step[15] GlobalStep[24949] Training Speed: 408.56 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:42:22. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:23 INFO loss_tracker.py:84 | Epoch[182/NA] Step[24] GlobalStep[24958/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:46:30 INFO stats.py:314 | Epoch[182] Step[40] GlobalStep[24974] Training Speed: 429.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:42:10. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:34 INFO loss_tracker.py:84 | Epoch[182/NA] Step[49] GlobalStep[24983/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:46:41 INFO stats.py:314 | Epoch[182] Step[65] GlobalStep[24999] Training Speed: 421.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:42:01. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:44 INFO loss_tracker.py:84 | Epoch[182/NA] Step[74] GlobalStep[25008/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0129] total_loss[0.0172]
Rank[0/16] 06/24/2025 15:46:51 INFO stats.py:314 | Epoch[182] Step[90] GlobalStep[25024] Training Speed: 431.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:41:49. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:46:54 INFO loss_tracker.py:84 | Epoch[182/NA] Step[99] GlobalStep[25033/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0183]
Rank[0/16] 06/24/2025 15:47:01 INFO stats.py:314 | Epoch[182] Step[115] GlobalStep[25049] Training Speed: 426.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:41:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:47:04 INFO loss_tracker.py:84 | Epoch[182/NA] Step[124] GlobalStep[25058/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:47:08 INFO stats.py:394 | Epoch[182] completed. Training Speed: 311.39 samples/sec across all devices. Epoch Time: 56.31 sec. Average Epoch Time: 56.31 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:41:26. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:47:11 INFO stats.py:314 | Epoch[183] Step[3] GlobalStep[25074] Training Speed: 423.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:41:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:47:20 INFO loss_tracker.py:84 | Epoch[183/NA] Step[24] GlobalStep[25095/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0174]
Rank[0/16] 06/24/2025 15:47:21 INFO stats.py:314 | Epoch[183] Step[28] GlobalStep[25099] Training Speed: 420.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:41:16. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:47:30 INFO loss_tracker.py:84 | Epoch[183/NA] Step[49] GlobalStep[25120/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0185]
Rank[0/16] 06/24/2025 15:47:31 INFO stats.py:314 | Epoch[183] Step[53] GlobalStep[25124] Training Speed: 430.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:41:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:47:40 INFO loss_tracker.py:84 | Epoch[183/NA] Step[74] GlobalStep[25145/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:47:42 INFO stats.py:314 | Epoch[183] Step[78] GlobalStep[25149] Training Speed: 433.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:40:53. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:47:50 INFO loss_tracker.py:84 | Epoch[183/NA] Step[99] GlobalStep[25170/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0185]
Rank[0/16] 06/24/2025 15:47:52 INFO stats.py:314 | Epoch[183] Step[103] GlobalStep[25174] Training Speed: 429.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:40:42. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:00 INFO loss_tracker.py:84 | Epoch[183/NA] Step[124] GlobalStep[25195/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:48:02 INFO stats.py:314 | Epoch[183] Step[128] GlobalStep[25199] Training Speed: 448.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:40:30. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:05 INFO stats.py:394 | Epoch[183] completed. Training Speed: 312.42 samples/sec across all devices. Epoch Time: 56.13 sec. Average Epoch Time: 56.13 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:40:25. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:12 INFO stats.py:314 | Epoch[184] Step[16] GlobalStep[25224] Training Speed: 429.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:40:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:16 INFO loss_tracker.py:84 | Epoch[184/NA] Step[24] GlobalStep[25232/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:48:23 INFO stats.py:314 | Epoch[184] Step[41] GlobalStep[25249] Training Speed: 425.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:40:10. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:26 INFO loss_tracker.py:84 | Epoch[184/NA] Step[49] GlobalStep[25257/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:48:33 INFO stats.py:314 | Epoch[184] Step[66] GlobalStep[25274] Training Speed: 435.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:39:59. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:36 INFO loss_tracker.py:84 | Epoch[184/NA] Step[74] GlobalStep[25282/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:48:43 INFO stats.py:314 | Epoch[184] Step[91] GlobalStep[25299] Training Speed: 431.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:39:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:47 INFO loss_tracker.py:84 | Epoch[184/NA] Step[99] GlobalStep[25307/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:48:54 INFO stats.py:314 | Epoch[184] Step[116] GlobalStep[25324] Training Speed: 426.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:39:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:48:57 INFO loss_tracker.py:84 | Epoch[184/NA] Step[124] GlobalStep[25332/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0129] total_loss[0.0185]
Rank[0/16] 06/24/2025 15:49:02 INFO stats.py:394 | Epoch[184] completed. Training Speed: 306.87 samples/sec across all devices. Epoch Time: 57.14 sec. Average Epoch Time: 57.14 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:39:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:05 INFO stats.py:314 | Epoch[185] Step[4] GlobalStep[25349] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:39:29. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:13 INFO loss_tracker.py:84 | Epoch[185/NA] Step[24] GlobalStep[25369/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:49:15 INFO stats.py:314 | Epoch[185] Step[29] GlobalStep[25374] Training Speed: 431.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:39:18. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:23 INFO loss_tracker.py:84 | Epoch[185/NA] Step[49] GlobalStep[25394/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:49:26 INFO stats.py:314 | Epoch[185] Step[54] GlobalStep[25399] Training Speed: 430.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:39:07. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:33 INFO loss_tracker.py:84 | Epoch[185/NA] Step[74] GlobalStep[25419/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:49:36 INFO stats.py:314 | Epoch[185] Step[79] GlobalStep[25424] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:38:56. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:44 INFO loss_tracker.py:84 | Epoch[185/NA] Step[99] GlobalStep[25444/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:49:45 INFO stats.py:314 | Epoch[185] Step[104] GlobalStep[25449] Training Speed: 430.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:38:44. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:53 INFO loss_tracker.py:84 | Epoch[185/NA] Step[124] GlobalStep[25469/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:49:55 INFO stats.py:314 | Epoch[185] Step[129] GlobalStep[25474] Training Speed: 449.15 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:38:30. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:49:58 INFO stats.py:394 | Epoch[185] completed. Training Speed: 313.73 samples/sec across all devices. Epoch Time: 55.89 sec. Average Epoch Time: 55.89 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:38:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:06 INFO stats.py:314 | Epoch[186] Step[17] GlobalStep[25499] Training Speed: 424.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:38:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:09 INFO loss_tracker.py:84 | Epoch[186/NA] Step[24] GlobalStep[25506/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0175]
Rank[0/16] 06/24/2025 15:50:16 INFO stats.py:314 | Epoch[186] Step[42] GlobalStep[25524] Training Speed: 432.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:38:11. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:19 INFO loss_tracker.py:84 | Epoch[186/NA] Step[49] GlobalStep[25531/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0183]
Rank[0/16] 06/24/2025 15:50:26 INFO stats.py:314 | Epoch[186] Step[67] GlobalStep[25549] Training Speed: 428.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:38:00. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:29 INFO loss_tracker.py:84 | Epoch[186/NA] Step[74] GlobalStep[25556/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:50:37 INFO stats.py:314 | Epoch[186] Step[92] GlobalStep[25574] Training Speed: 432.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:37:50. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:40 INFO loss_tracker.py:84 | Epoch[186/NA] Step[99] GlobalStep[25581/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:50:47 INFO stats.py:314 | Epoch[186] Step[117] GlobalStep[25599] Training Speed: 417.57 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:37:39. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:50 INFO loss_tracker.py:84 | Epoch[186/NA] Step[124] GlobalStep[25606/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:50:54 INFO stats.py:394 | Epoch[186] completed. Training Speed: 311.14 samples/sec across all devices. Epoch Time: 56.36 sec. Average Epoch Time: 56.36 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:37:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:50:58 INFO stats.py:314 | Epoch[187] Step[5] GlobalStep[25624] Training Speed: 428.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:37:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:05 INFO loss_tracker.py:84 | Epoch[187/NA] Step[24] GlobalStep[25643/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:51:08 INFO stats.py:314 | Epoch[187] Step[30] GlobalStep[25649] Training Speed: 406.58 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:37:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:16 INFO loss_tracker.py:84 | Epoch[187/NA] Step[49] GlobalStep[25668/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:51:18 INFO stats.py:314 | Epoch[187] Step[55] GlobalStep[25674] Training Speed: 431.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:37:07. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:26 INFO loss_tracker.py:84 | Epoch[187/NA] Step[74] GlobalStep[25693/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:51:28 INFO stats.py:314 | Epoch[187] Step[80] GlobalStep[25699] Training Speed: 433.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:36:55. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:36 INFO loss_tracker.py:84 | Epoch[187/NA] Step[99] GlobalStep[25718/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176]
Rank[0/16] 06/24/2025 15:51:39 INFO stats.py:314 | Epoch[187] Step[105] GlobalStep[25724] Training Speed: 426.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:36:44. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:46 INFO loss_tracker.py:84 | Epoch[187/NA] Step[124] GlobalStep[25743/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0057] loss_depth[0.0129] total_loss[0.0186]
Rank[0/16] 06/24/2025 15:51:48 INFO stats.py:314 | Epoch[187] Step[130] GlobalStep[25749] Training Speed: 447.74 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:36:31. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:50 INFO stats.py:394 | Epoch[187] completed. Training Speed: 312.64 samples/sec across all devices. Epoch Time: 56.09 sec. Average Epoch Time: 56.09 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:36:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:51:59 INFO stats.py:314 | Epoch[188] Step[18] GlobalStep[25774] Training Speed: 423.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:36:22. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:02 INFO loss_tracker.py:84 | Epoch[188/NA] Step[24] GlobalStep[25780/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0177]
Rank[0/16] 06/24/2025 15:52:09 INFO stats.py:314 | Epoch[188] Step[43] GlobalStep[25799] Training Speed: 433.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:36:11. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:12 INFO loss_tracker.py:84 | Epoch[188/NA] Step[49] GlobalStep[25805/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178]
Rank[0/16] 06/24/2025 15:52:20 INFO stats.py:314 | Epoch[188] Step[68] GlobalStep[25824] Training Speed: 428.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:36:01. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:22 INFO loss_tracker.py:84 | Epoch[188/NA] Step[74] GlobalStep[25830/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:52:29 INFO stats.py:314 | Epoch[188] Step[93] GlobalStep[25849] Training Speed: 436.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:35:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:32 INFO loss_tracker.py:84 | Epoch[188/NA] Step[99] GlobalStep[25855/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0129] total_loss[0.0184]
Rank[0/16] 06/24/2025 15:52:40 INFO stats.py:314 | Epoch[188] Step[118] GlobalStep[25874] Training Speed: 408.30 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:35:37. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:42 INFO loss_tracker.py:84 | Epoch[188/NA] Step[124] GlobalStep[25880/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0176]
Rank[0/16] 06/24/2025 15:52:46 INFO stats.py:394 | Epoch[188] completed. Training Speed: 311.16 samples/sec across all devices. Epoch Time: 56.36 sec. Average Epoch Time: 56.36 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:35:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:50 INFO stats.py:314 | Epoch[189] Step[6] GlobalStep[25899] Training Speed: 408.10 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:35:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:52:58 INFO loss_tracker.py:84 | Epoch[189/NA] Step[24] GlobalStep[25917/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:53:01 INFO stats.py:314 | Epoch[189] Step[31] GlobalStep[25924] Training Speed: 431.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:35:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:08 INFO loss_tracker.py:84 | Epoch[189/NA] Step[49] GlobalStep[25942/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180]
Rank[0/16] 06/24/2025 15:53:11 INFO stats.py:314 | Epoch[189] Step[56] GlobalStep[25949] Training Speed: 429.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:35:05. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:18 INFO loss_tracker.py:84 | Epoch[189/NA] Step[74] GlobalStep[25967/99999]: loss_noise_mse[0.0002] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0183]
Rank[0/16] 06/24/2025 15:53:21 INFO stats.py:314 | Epoch[189] Step[81] GlobalStep[25974] Training Speed: 429.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:34:54. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:28 INFO loss_tracker.py:84 | Epoch[189/NA] Step[99] GlobalStep[25992/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:53:31 INFO stats.py:314 | Epoch[189] Step[106] GlobalStep[25999] Training Speed: 430.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:34:43. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:39 INFO loss_tracker.py:84 | Epoch[189/NA] Step[124] GlobalStep[26017/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0182]
Rank[0/16] 06/24/2025 15:53:41 INFO stats.py:314 | Epoch[189] Step[131] GlobalStep[26024] Training Speed: 448.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:34:31. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:43 INFO stats.py:394 | Epoch[189] completed. Training Speed: 310.56 samples/sec across all devices. Epoch Time: 56.47 sec. Average Epoch Time: 56.47 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:34:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:52 INFO stats.py:314 | Epoch[190] Step[19] GlobalStep[26049] Training Speed: 433.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:34:23. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:53:54 INFO loss_tracker.py:84 | Epoch[190/NA] Step[24] GlobalStep[26054/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:54:03 INFO stats.py:314 | Epoch[190] Step[44] GlobalStep[26074] Training Speed: 428.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:34:12. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:54:04 INFO loss_tracker.py:84 | Epoch[190/NA] Step[49] GlobalStep[26079/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0129] total_loss[0.0173]
Rank[0/16] 06/24/2025 15:54:13 INFO stats.py:314 | Epoch[190] Step[69] GlobalStep[26099] Training Speed: 443.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:34:00. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:54:15 INFO loss_tracker.py:84 | Epoch[190/NA] Step[74] GlobalStep[26104/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176]
Rank[0/16] 06/24/2025 15:54:23 INFO stats.py:314 | Epoch[190] Step[94] GlobalStep[26124] Training Speed: 433.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:33:50. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:54:25 INFO loss_tracker.py:84 | Epoch[190/NA] Step[99] GlobalStep[26129/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181]
Rank[0/16] 06/24/2025 15:54:33 INFO stats.py:314 | Epoch[190] Step[119] GlobalStep[26149] Training Speed: 419.51 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:33:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:54:35 INFO loss_tracker.py:84 | Epoch[190/NA] Step[124] GlobalStep[26154/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179]
Rank[0/16] 06/24/2025 15:54:39 INFO stats.py:394 | Epoch[190] completed. Training Speed: 311.02 samples/sec across all devices. Epoch Time: 56.38 sec. Average Epoch Time: 56.38 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:33:29. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:54:43 INFO stats.py:314 | Epoch[191] Step[7] GlobalStep[26174] Training Speed: 432.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:33:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 15:54:50 INFO loss_tracker.py:84 | Epoch[191/NA] Step[24] GlobalStep[26191/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 15:54:54 INFO stats.py:314 | Epoch[191] Step[32] GlobalStep[26199] Training Speed: 433.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:33:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:01 INFO loss_tracker.py:84 | Epoch[191/NA] Step[49] GlobalStep[26216/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:55:04 INFO stats.py:314 | Epoch[191] Step[57] GlobalStep[26224] Training Speed: 433.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:33:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:11 INFO loss_tracker.py:84 | Epoch[191/NA] Step[74] GlobalStep[26241/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 15:55:14 INFO stats.py:314 | Epoch[191] Step[82] GlobalStep[26249] Training Speed: 433.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:32:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:21 INFO loss_tracker.py:84 | Epoch[191/NA] Step[99] GlobalStep[26266/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 15:55:24 INFO stats.py:314 | Epoch[191] Step[107] GlobalStep[26274] Training Speed: 409.05 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:32:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:55:31 INFO loss_tracker.py:84 | Epoch[191/NA] Step[124] GlobalStep[26291/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 15:55:34 INFO stats.py:314 | Epoch[191] Step[132] GlobalStep[26299] Training Speed: 446.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:32:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:36 INFO stats.py:394 | Epoch[191] completed. Training Speed: 311.08 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:32:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:45 INFO stats.py:314 | Epoch[192] Step[20] GlobalStep[26324] Training Speed: 432.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:32:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:47 INFO loss_tracker.py:84 | Epoch[192/NA] Step[24] GlobalStep[26328/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 15:55:56 INFO stats.py:314 | Epoch[192] Step[45] GlobalStep[26349] Training Speed: 426.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:32:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:55:57 INFO loss_tracker.py:84 | Epoch[192/NA] Step[49] GlobalStep[26353/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 15:56:06 INFO stats.py:314 | Epoch[192] Step[70] GlobalStep[26374] Training Speed: 431.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:32:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:56:07 INFO loss_tracker.py:84 | Epoch[192/NA] Step[74] GlobalStep[26378/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 15:56:16 INFO stats.py:314 | Epoch[192] Step[95] GlobalStep[26399] Training Speed: 431.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:31:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:56:17 INFO loss_tracker.py:84 | Epoch[192/NA] Step[99] GlobalStep[26403/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0057] loss_depth[0.0129] total_loss[0.0186] Rank[0/16] 06/24/2025 15:56:25 INFO stats.py:314 | Epoch[192] Step[120] GlobalStep[26424] Training Speed: 447.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:31:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:56:27 INFO loss_tracker.py:84 | Epoch[192/NA] Step[124] GlobalStep[26428/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 15:56:31 INFO stats.py:394 | Epoch[192] completed. Training Speed: 314.99 samples/sec across all devices. Epoch Time: 55.67 sec. Average Epoch Time: 55.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:31:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:56:36 INFO stats.py:314 | Epoch[193] Step[8] GlobalStep[26449] Training Speed: 431.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:31:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:56:43 INFO loss_tracker.py:84 | Epoch[193/NA] Step[24] GlobalStep[26465/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 15:56:46 INFO stats.py:314 | Epoch[193] Step[33] GlobalStep[26474] Training Speed: 415.84 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:31:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:56:53 INFO loss_tracker.py:84 | Epoch[193/NA] Step[49] GlobalStep[26490/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 15:56:57 INFO stats.py:314 | Epoch[193] Step[58] GlobalStep[26499] Training Speed: 429.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:31:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:57:03 INFO loss_tracker.py:84 | Epoch[193/NA] Step[74] GlobalStep[26515/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:57:07 INFO stats.py:314 | Epoch[193] Step[83] GlobalStep[26524] Training Speed: 425.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:30:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:57:13 INFO loss_tracker.py:84 | Epoch[193/NA] Step[99] GlobalStep[26540/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 15:57:17 INFO stats.py:314 | Epoch[193] Step[108] GlobalStep[26549] Training Speed: 431.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:30:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:57:23 INFO loss_tracker.py:84 | Epoch[193/NA] Step[124] GlobalStep[26565/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 15:57:26 INFO stats.py:314 | Epoch[193] Step[133] GlobalStep[26574] Training Speed: 449.42 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:30:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:57:27 INFO stats.py:394 | Epoch[193] completed. Training Speed: 313.53 samples/sec across all devices. Epoch Time: 55.93 sec. Average Epoch Time: 55.93 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:30:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:57:37 INFO stats.py:314 | Epoch[194] Step[21] GlobalStep[26599] Training Speed: 432.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:30:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:57:39 INFO loss_tracker.py:84 | Epoch[194/NA] Step[24] GlobalStep[26602/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:57:48 INFO stats.py:314 | Epoch[194] Step[46] GlobalStep[26624] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:30:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:57:49 INFO loss_tracker.py:84 | Epoch[194/NA] Step[49] GlobalStep[26627/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 15:57:58 INFO stats.py:314 | Epoch[194] Step[71] GlobalStep[26649] Training Speed: 398.97 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:30:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:58:00 INFO loss_tracker.py:84 | Epoch[194/NA] Step[74] GlobalStep[26652/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 15:58:09 INFO stats.py:314 | Epoch[194] Step[96] GlobalStep[26674] Training Speed: 428.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:29:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:58:10 INFO loss_tracker.py:84 | Epoch[194/NA] Step[99] GlobalStep[26677/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0129] total_loss[0.0175] Rank[0/16] 06/24/2025 15:58:19 INFO stats.py:314 | Epoch[194] Step[121] GlobalStep[26699] Training Speed: 449.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:29:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:58:20 INFO loss_tracker.py:84 | Epoch[194/NA] Step[124] GlobalStep[26702/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 15:58:25 INFO stats.py:394 | Epoch[194] completed. Training Speed: 305.91 samples/sec across all devices. Epoch Time: 57.32 sec. Average Epoch Time: 57.32 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:29:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:58:30 INFO stats.py:314 | Epoch[195] Step[9] GlobalStep[26724] Training Speed: 432.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:29:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:58:36 INFO loss_tracker.py:84 | Epoch[195/NA] Step[24] GlobalStep[26739/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 15:58:40 INFO stats.py:314 | Epoch[195] Step[34] GlobalStep[26749] Training Speed: 429.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:29:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:58:46 INFO loss_tracker.py:84 | Epoch[195/NA] Step[49] GlobalStep[26764/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 15:58:50 INFO stats.py:314 | Epoch[195] Step[59] GlobalStep[26774] Training Speed: 432.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:29:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:58:56 INFO loss_tracker.py:84 | Epoch[195/NA] Step[74] GlobalStep[26789/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 15:59:00 INFO stats.py:314 | Epoch[195] Step[84] GlobalStep[26799] Training Speed: 432.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:28:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:59:07 INFO loss_tracker.py:84 | Epoch[195/NA] Step[99] GlobalStep[26814/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 15:59:11 INFO stats.py:314 | Epoch[195] Step[109] GlobalStep[26824] Training Speed: 437.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:28:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:59:17 INFO loss_tracker.py:84 | Epoch[195/NA] Step[124] GlobalStep[26839/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 15:59:21 INFO stats.py:314 | Epoch[195] Step[134] GlobalStep[26849] Training Speed: 449.37 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:28:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:59:21 INFO stats.py:394 | Epoch[195] completed. Training Speed: 308.77 samples/sec across all devices. Epoch Time: 56.79 sec. Average Epoch Time: 56.79 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:28:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:59:31 INFO stats.py:314 | Epoch[196] Step[22] GlobalStep[26874] Training Speed: 434.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:28:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:59:32 INFO loss_tracker.py:84 | Epoch[196/NA] Step[24] GlobalStep[26876/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 15:59:42 INFO stats.py:314 | Epoch[196] Step[47] GlobalStep[26899] Training Speed: 397.96 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:28:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 15:59:43 INFO loss_tracker.py:84 | Epoch[196/NA] Step[49] GlobalStep[26901/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 15:59:52 INFO stats.py:314 | Epoch[196] Step[72] GlobalStep[26924] Training Speed: 433.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:28:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 15:59:53 INFO loss_tracker.py:84 | Epoch[196/NA] Step[74] GlobalStep[26926/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 16:00:02 INFO stats.py:314 | Epoch[196] Step[97] GlobalStep[26949] Training Speed: 429.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:27:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:00:03 INFO loss_tracker.py:84 | Epoch[196/NA] Step[99] GlobalStep[26951/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0128] total_loss[0.0168] Rank[0/16] 06/24/2025 16:00:13 INFO stats.py:314 | Epoch[196] Step[122] GlobalStep[26974] Training Speed: 450.82 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:27:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:00:13 INFO loss_tracker.py:84 | Epoch[196/NA] Step[124] GlobalStep[26976/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0183] Rank[0/16] 06/24/2025 16:00:18 INFO stats.py:394 | Epoch[196] completed. Training Speed: 311.40 samples/sec across all devices. Epoch Time: 56.31 sec. Average Epoch Time: 56.31 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:27:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:00:23 INFO stats.py:314 | Epoch[197] Step[10] GlobalStep[26999] Training Speed: 352.70 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 8:27:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:00:30 INFO loss_tracker.py:84 | Epoch[197/NA] Step[24] GlobalStep[27013/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:00:34 INFO stats.py:314 | Epoch[197] Step[35] GlobalStep[27024] Training Speed: 426.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:27:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:00:40 INFO loss_tracker.py:84 | Epoch[197/NA] Step[49] GlobalStep[27038/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 16:00:44 INFO stats.py:314 | Epoch[197] Step[60] GlobalStep[27049] Training Speed: 431.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:27:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:00:50 INFO loss_tracker.py:84 | Epoch[197/NA] Step[74] GlobalStep[27063/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 16:00:55 INFO stats.py:314 | Epoch[197] Step[85] GlobalStep[27074] Training Speed: 429.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:27:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:01:00 INFO loss_tracker.py:84 | Epoch[197/NA] Step[99] GlobalStep[27088/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:01:05 INFO stats.py:314 | Epoch[197] Step[110] GlobalStep[27099] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:26:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:01:11 INFO loss_tracker.py:84 | Epoch[197/NA] Step[124] GlobalStep[27113/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:01:14 INFO stats.py:314 | Epoch[197] Step[135] GlobalStep[27124] Training Speed: 451.39 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:26:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:01:15 INFO stats.py:394 | Epoch[197] completed. Training Speed: 306.87 samples/sec across all devices. Epoch Time: 57.15 sec. Average Epoch Time: 57.15 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:26:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:01:26 INFO stats.py:314 | Epoch[198] Step[23] GlobalStep[27149] Training Speed: 402.60 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:26:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:01:26 INFO loss_tracker.py:84 | Epoch[198/NA] Step[24] GlobalStep[27150/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0129] total_loss[0.0184] Rank[0/16] 06/24/2025 16:01:36 INFO stats.py:314 | Epoch[198] Step[48] GlobalStep[27174] Training Speed: 424.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:26:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:01:36 INFO loss_tracker.py:84 | Epoch[198/NA] Step[49] GlobalStep[27175/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:01:46 INFO stats.py:314 | Epoch[198] Step[73] GlobalStep[27199] Training Speed: 431.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:26:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:01:47 INFO loss_tracker.py:84 | Epoch[198/NA] Step[74] GlobalStep[27200/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0129] total_loss[0.0173] Rank[0/16] 06/24/2025 16:01:56 INFO stats.py:314 | Epoch[198] Step[98] GlobalStep[27224] Training Speed: 432.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:25:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:01:57 INFO loss_tracker.py:84 | Epoch[198/NA] Step[99] GlobalStep[27225/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:02:07 INFO stats.py:314 | Epoch[198] Step[123] GlobalStep[27249] Training Speed: 449.62 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:25:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:02:07 INFO loss_tracker.py:84 | Epoch[198/NA] Step[124] GlobalStep[27250/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:02:11 INFO stats.py:394 | Epoch[198] completed. Training Speed: 309.76 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:25:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:02:17 INFO stats.py:314 | Epoch[199] Step[11] GlobalStep[27274] Training Speed: 431.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:25:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:02:23 INFO loss_tracker.py:84 | Epoch[199/NA] Step[24] GlobalStep[27287/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:02:28 INFO stats.py:314 | Epoch[199] Step[36] GlobalStep[27299] Training Speed: 429.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:25:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:02:33 INFO loss_tracker.py:84 | Epoch[199/NA] Step[49] GlobalStep[27312/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:02:38 INFO stats.py:314 | Epoch[199] Step[61] GlobalStep[27324] Training Speed: 434.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:25:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:02:44 INFO loss_tracker.py:84 | Epoch[199/NA] Step[74] GlobalStep[27337/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:02:49 INFO stats.py:314 | Epoch[199] Step[86] GlobalStep[27349] Training Speed: 426.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:25:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:02:54 INFO loss_tracker.py:84 | Epoch[199/NA] Step[99] GlobalStep[27362/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:02:59 INFO stats.py:314 | Epoch[199] Step[111] GlobalStep[27374] Training Speed: 433.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:24:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:03:04 INFO loss_tracker.py:84 | Epoch[199/NA] Step[124] GlobalStep[27387/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:03:08 INFO stats.py:314 | Epoch[199] Step[136] GlobalStep[27399] Training Speed: 453.53 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:24:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:03:08 INFO stats.py:394 | Epoch[199] completed. Training Speed: 307.36 samples/sec across all devices. Epoch Time: 57.05 sec. Average Epoch Time: 57.05 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:24:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:03:19 INFO stats.py:314 | Epoch[200] Step[24] GlobalStep[27424] Training Speed: 438.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:24:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:03:20 INFO loss_tracker.py:84 | Epoch[200/NA] Step[24] GlobalStep[27424/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:03:30 INFO stats.py:314 | Epoch[200] Step[49] GlobalStep[27449] Training Speed: 432.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:24:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:03:30 INFO loss_tracker.py:84 | Epoch[200/NA] Step[49] GlobalStep[27449/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 16:03:40 INFO stats.py:314 | Epoch[200] Step[74] GlobalStep[27474] Training Speed: 431.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:24:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:03:40 INFO loss_tracker.py:84 | Epoch[200/NA] Step[74] GlobalStep[27474/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0129] total_loss[0.0173] Rank[0/16] 06/24/2025 16:03:51 INFO stats.py:314 | Epoch[200] Step[99] GlobalStep[27499] Training Speed: 413.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:24:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:03:51 INFO loss_tracker.py:84 | Epoch[200/NA] Step[99] GlobalStep[27499/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:04:01 INFO stats.py:314 | Epoch[200] Step[124] GlobalStep[27524] Training Speed: 451.57 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:23:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:04:01 INFO loss_tracker.py:84 | Epoch[200/NA] Step[124] GlobalStep[27524/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:04:05 INFO stats.py:394 | Epoch[200] completed. Training Speed: 308.65 samples/sec across all devices. Epoch Time: 56.81 sec. Average Epoch Time: 56.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:23:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:04:12 INFO stats.py:314 | Epoch[201] Step[12] GlobalStep[27549] Training Speed: 430.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:23:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:04:17 INFO loss_tracker.py:84 | Epoch[201/NA] Step[24] GlobalStep[27561/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176] Rank[0/16] 06/24/2025 16:04:22 INFO stats.py:314 | Epoch[201] Step[37] GlobalStep[27574] Training Speed: 435.66 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:23:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:04:27 INFO loss_tracker.py:84 | Epoch[201/NA] Step[49] GlobalStep[27586/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:04:32 INFO stats.py:314 | Epoch[201] Step[62] GlobalStep[27599] Training Speed: 433.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:23:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:04:37 INFO loss_tracker.py:84 | Epoch[201/NA] Step[74] GlobalStep[27611/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0039] loss_depth[0.0128] total_loss[0.0168] Rank[0/16] 06/24/2025 16:04:42 INFO stats.py:314 | Epoch[201] Step[87] GlobalStep[27624] Training Speed: 435.58 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:23:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:04:47 INFO loss_tracker.py:84 | Epoch[201/NA] Step[99] GlobalStep[27636/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:04:52 INFO stats.py:314 | Epoch[201] Step[112] GlobalStep[27649] Training Speed: 433.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:22:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:04:57 INFO loss_tracker.py:84 | Epoch[201/NA] Step[124] GlobalStep[27661/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:05:02 INFO stats.py:394 | Epoch[201] completed. Training Speed: 311.60 samples/sec across all devices. Epoch Time: 56.28 sec. Average Epoch Time: 56.28 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:22:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:05:03 INFO stats.py:314 | Epoch[202] Step[0] GlobalStep[27674] Training Speed: 360.57 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 8:22:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:05:13 INFO loss_tracker.py:84 | Epoch[202/NA] Step[24] GlobalStep[27698/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:05:13 INFO stats.py:314 | Epoch[202] Step[25] GlobalStep[27699] Training Speed: 400.05 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:22:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:05:23 INFO loss_tracker.py:84 | Epoch[202/NA] Step[49] GlobalStep[27723/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:05:23 INFO stats.py:314 | Epoch[202] Step[50] GlobalStep[27724] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:22:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:05:33 INFO loss_tracker.py:84 | Epoch[202/NA] Step[74] GlobalStep[27748/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:05:33 INFO stats.py:314 | Epoch[202] Step[75] GlobalStep[27749] Training Speed: 404.57 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:22:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:05:43 INFO loss_tracker.py:84 | Epoch[202/NA] Step[99] GlobalStep[27773/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 16:05:44 INFO stats.py:314 | Epoch[202] Step[100] GlobalStep[27774] Training Speed: 418.74 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:22:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:05:53 INFO loss_tracker.py:84 | Epoch[202/NA] Step[124] GlobalStep[27798/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:05:54 INFO stats.py:314 | Epoch[202] Step[125] GlobalStep[27799] Training Speed: 423.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:21:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:05:57 INFO stats.py:394 | Epoch[202] completed. Training Speed: 313.63 samples/sec across all devices. Epoch Time: 55.91 sec. Average Epoch Time: 55.91 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:21:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:06:05 INFO stats.py:314 | Epoch[203] Step[13] GlobalStep[27824] Training Speed: 432.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:21:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:06:09 INFO loss_tracker.py:84 | Epoch[203/NA] Step[24] GlobalStep[27835/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:06:15 INFO stats.py:314 | Epoch[203] Step[38] GlobalStep[27849] Training Speed: 433.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:21:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:06:19 INFO loss_tracker.py:84 | Epoch[203/NA] Step[49] GlobalStep[27860/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:06:25 INFO stats.py:314 | Epoch[203] Step[63] GlobalStep[27874] Training Speed: 433.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:21:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:06:29 INFO loss_tracker.py:84 | Epoch[203/NA] Step[74] GlobalStep[27885/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0129] total_loss[0.0181] Rank[0/16] 06/24/2025 16:06:35 INFO stats.py:314 | Epoch[203] Step[88] GlobalStep[27899] Training Speed: 434.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:21:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:06:39 INFO loss_tracker.py:84 | Epoch[203/NA] Step[99] GlobalStep[27910/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:06:45 INFO stats.py:314 | Epoch[203] Step[113] GlobalStep[27924] Training Speed: 431.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:20:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:06:49 INFO loss_tracker.py:84 | Epoch[203/NA] Step[124] GlobalStep[27935/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0129] total_loss[0.0176]
Rank[0/16] 06/24/2025 16:06:54 INFO stats.py:394 | Epoch[203] completed. Training Speed: 311.13 samples/sec across all devices. Epoch Time: 56.36 sec. Average Epoch Time: 56.36 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:20:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:06:55 INFO stats.py:314 | Epoch[204] Step[1] GlobalStep[27949] Training Speed: 410.08 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:20:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:07:05 INFO loss_tracker.py:84 | Epoch[204/NA] Step[24] GlobalStep[27972/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:07:06 INFO stats.py:314 | Epoch[204] Step[26] GlobalStep[27974] Training Speed: 424.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:20:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:07:15 INFO loss_tracker.py:84 | Epoch[204/NA] Step[49] GlobalStep[27997/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0129] total_loss[0.0177] Rank[0/16] 06/24/2025 16:07:16 INFO stats.py:314 | Epoch[204] Step[51] GlobalStep[27999] Training Speed: 429.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:20:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:07:16 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint.
Rank[0/16] 06/24/2025 16:07:18 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_6 Rank[7/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[5/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[1/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[12/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[3/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[14/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[9/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[6/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[11/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[2/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[4/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[8/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[10/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6
Rank[15/16] 06/24/2025 16:07:18 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[13/16] 06/24/2025 16:07:19 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[0/16] 06/24/2025 16:07:19 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_6/model.safetensors Rank[0/16] 06/24/2025 16:07:20 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_6/optimizer.bin Rank[0/16] 06/24/2025 16:07:20 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_6/scheduler.bin Rank[0/16] 06/24/2025 16:07:20 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_6/sampler.bin Rank[0/16] 06/24/2025 16:07:20 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_6/random_states_0.pkl Rank[0/16] 06/24/2025 16:07:20 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_6/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 16:07:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 27999 to /job_data/checkpoints/checkpoint_6 Rank[0/16] 06/24/2025 16:07:30 INFO loss_tracker.py:84 | Epoch[204/NA] Step[74] GlobalStep[28022/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:07:31 INFO stats.py:314 | Epoch[204] Step[76] GlobalStep[28024] Training Speed: 430.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:20:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:07:40 INFO loss_tracker.py:84 | Epoch[204/NA] Step[99] GlobalStep[28047/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:07:41 INFO stats.py:314 | Epoch[204] Step[101] GlobalStep[28049] Training Speed: 426.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:20:13. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 16:07:50 INFO loss_tracker.py:84 | Epoch[204/NA] Step[124] GlobalStep[28072/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 16:07:51 INFO stats.py:314 | Epoch[204] Step[126] GlobalStep[28074] Training Speed: 446.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:20:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:07:55 INFO stats.py:394 | Epoch[204] completed. Training Speed: 288.33 samples/sec across all devices. Epoch Time: 60.82 sec. Average Epoch Time: 60.82 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 8:19:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:08:01 INFO stats.py:314 | Epoch[205] Step[14] GlobalStep[28099] Training Speed: 431.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:19:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:08:06 INFO loss_tracker.py:84 | Epoch[205/NA] Step[24] GlobalStep[28109/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:08:12 INFO stats.py:314 | Epoch[205] Step[39] GlobalStep[28124] Training Speed: 435.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:19:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:08:16 INFO loss_tracker.py:84 | Epoch[205/NA] Step[49] GlobalStep[28134/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:08:22 INFO stats.py:314 | Epoch[205] Step[64] GlobalStep[28149] Training Speed: 421.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:19:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:08:26 INFO loss_tracker.py:84 | Epoch[205/NA] Step[74] GlobalStep[28159/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:08:33 INFO stats.py:314 | Epoch[205] Step[89] GlobalStep[28174] Training Speed: 434.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:19:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:08:37 INFO loss_tracker.py:84 | Epoch[205/NA] Step[99] GlobalStep[28184/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0175] Rank[0/16] 06/24/2025 16:08:43 INFO stats.py:314 | Epoch[205] Step[114] GlobalStep[28199] Training Speed: 426.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:19:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:08:47 INFO loss_tracker.py:84 | Epoch[205/NA] Step[124] GlobalStep[28209/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:08:52 INFO stats.py:394 | Epoch[205] completed. Training Speed: 307.36 samples/sec across all devices. Epoch Time: 57.05 sec. Average Epoch Time: 57.05 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:18:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:08:54 INFO stats.py:314 | Epoch[206] Step[2] GlobalStep[28224] Training Speed: 429.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:18:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:09:02 INFO loss_tracker.py:84 | Epoch[206/NA] Step[24] GlobalStep[28246/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 16:09:03 INFO stats.py:314 | Epoch[206] Step[27] GlobalStep[28249] Training Speed: 429.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:18:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:09:13 INFO loss_tracker.py:84 | Epoch[206/NA] Step[49] GlobalStep[28271/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:09:14 INFO stats.py:314 | Epoch[206] Step[52] GlobalStep[28274] Training Speed: 428.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:18:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:09:23 INFO loss_tracker.py:84 | Epoch[206/NA] Step[74] GlobalStep[28296/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:09:24 INFO stats.py:314 | Epoch[206] Step[77] GlobalStep[28299] Training Speed: 435.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:18:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:09:33 INFO loss_tracker.py:84 | Epoch[206/NA] Step[99] GlobalStep[28321/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:09:34 INFO stats.py:314 | Epoch[206] Step[102] GlobalStep[28324] Training Speed: 432.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:18:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:09:43 INFO loss_tracker.py:84 | Epoch[206/NA] Step[124] GlobalStep[28346/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:09:44 INFO stats.py:314 | Epoch[206] Step[127] GlobalStep[28349] Training Speed: 449.52 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:18:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:09:47 INFO stats.py:394 | Epoch[206] completed. Training Speed: 317.75 samples/sec across all devices. Epoch Time: 55.19 sec. Average Epoch Time: 55.19 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 8:17:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:09:55 INFO stats.py:314 | Epoch[207] Step[15] GlobalStep[28374] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:17:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:09:58 INFO loss_tracker.py:84 | Epoch[207/NA] Step[24] GlobalStep[28383/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0128] total_loss[0.0170] Rank[0/16] 06/24/2025 16:10:04 INFO stats.py:314 | Epoch[207] Step[40] GlobalStep[28399] Training Speed: 431.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:17:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:10:08 INFO loss_tracker.py:84 | Epoch[207/NA] Step[49] GlobalStep[28408/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:10:15 INFO stats.py:314 | Epoch[207] Step[65] GlobalStep[28424] Training Speed: 431.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:17:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:10:18 INFO loss_tracker.py:84 | Epoch[207/NA] Step[74] GlobalStep[28433/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:10:24 INFO stats.py:314 | Epoch[207] Step[90] GlobalStep[28449] Training Speed: 432.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:17:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:10:29 INFO loss_tracker.py:84 | Epoch[207/NA] Step[99] GlobalStep[28458/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:10:35 INFO stats.py:314 | Epoch[207] Step[115] GlobalStep[28474] Training Speed: 429.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:17:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:10:38 INFO loss_tracker.py:84 | Epoch[207/NA] Step[124] GlobalStep[28483/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:10:43 INFO stats.py:394 | Epoch[207] completed. Training Speed: 314.21 samples/sec across all devices. Epoch Time: 55.81 sec. Average Epoch Time: 55.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:16:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:10:45 INFO stats.py:314 | Epoch[208] Step[3] GlobalStep[28499] Training Speed: 430.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:16:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:10:54 INFO loss_tracker.py:84 | Epoch[208/NA] Step[24] GlobalStep[28520/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:10:56 INFO stats.py:314 | Epoch[208] Step[28] GlobalStep[28524] Training Speed: 433.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:16:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:11:05 INFO loss_tracker.py:84 | Epoch[208/NA] Step[49] GlobalStep[28545/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0129] total_loss[0.0175] Rank[0/16] 06/24/2025 16:11:06 INFO stats.py:314 | Epoch[208] Step[53] GlobalStep[28549] Training Speed: 427.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:16:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:11:15 INFO loss_tracker.py:84 | Epoch[208/NA] Step[74] GlobalStep[28570/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:11:16 INFO stats.py:314 | Epoch[208] Step[78] GlobalStep[28574] Training Speed: 433.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:16:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:11:25 INFO loss_tracker.py:84 | Epoch[208/NA] Step[99] GlobalStep[28595/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:11:27 INFO stats.py:314 | Epoch[208] Step[103] GlobalStep[28599] Training Speed: 402.46 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:16:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:11:35 INFO loss_tracker.py:84 | Epoch[208/NA] Step[124] GlobalStep[28620/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:11:36 INFO stats.py:314 | Epoch[208] Step[128] GlobalStep[28624] Training Speed: 448.28 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:16:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:11:39 INFO stats.py:394 | Epoch[208] completed. Training Speed: 311.65 samples/sec across all devices. Epoch Time: 56.27 sec. Average Epoch Time: 56.27 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:15:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:11:47 INFO stats.py:314 | Epoch[209] Step[16] GlobalStep[28649] Training Speed: 422.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:15:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:11:50 INFO loss_tracker.py:84 | Epoch[209/NA] Step[24] GlobalStep[28657/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:11:57 INFO stats.py:314 | Epoch[209] Step[41] GlobalStep[28674] Training Speed: 422.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:15:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:12:01 INFO loss_tracker.py:84 | Epoch[209/NA] Step[49] GlobalStep[28682/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:12:08 INFO stats.py:314 | Epoch[209] Step[66] GlobalStep[28699] Training Speed: 428.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:15:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:12:11 INFO loss_tracker.py:84 | Epoch[209/NA] Step[74] GlobalStep[28707/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0129] total_loss[0.0173] Rank[0/16] 06/24/2025 16:12:18 INFO stats.py:314 | Epoch[209] Step[91] GlobalStep[28724] Training Speed: 264.80 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 8:15:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:12:21 INFO loss_tracker.py:84 | Epoch[209/NA] Step[99] GlobalStep[28732/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0129] total_loss[0.0180] Rank[0/16] 06/24/2025 16:12:29 INFO stats.py:314 | Epoch[209] Step[116] GlobalStep[28749] Training Speed: 429.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:15:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:12:32 INFO loss_tracker.py:84 | Epoch[209/NA] Step[124] GlobalStep[28757/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:12:36 INFO stats.py:394 | Epoch[209] completed. Training Speed: 306.24 samples/sec across all devices. Epoch Time: 57.26 sec. Average Epoch Time: 57.26 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:15:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:12:39 INFO stats.py:314 | Epoch[210] Step[4] GlobalStep[28774] Training Speed: 427.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:15:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:12:48 INFO loss_tracker.py:84 | Epoch[210/NA] Step[24] GlobalStep[28794/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:12:50 INFO stats.py:314 | Epoch[210] Step[29] GlobalStep[28799] Training Speed: 420.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:14:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:12:58 INFO loss_tracker.py:84 | Epoch[210/NA] Step[49] GlobalStep[28819/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:13:00 INFO stats.py:314 | Epoch[210] Step[54] GlobalStep[28824] Training Speed: 428.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:14:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:13:09 INFO loss_tracker.py:84 | Epoch[210/NA] Step[74] GlobalStep[28844/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:13:11 INFO stats.py:314 | Epoch[210] Step[79] GlobalStep[28849] Training Speed: 429.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:14:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:13:19 INFO loss_tracker.py:84 | Epoch[210/NA] Step[99] GlobalStep[28869/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:13:21 INFO stats.py:314 | Epoch[210] Step[104] GlobalStep[28874] Training Speed: 428.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:14:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:13:29 INFO loss_tracker.py:84 | Epoch[210/NA] Step[124] GlobalStep[28894/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 16:13:31 INFO stats.py:314 | Epoch[210] Step[129] GlobalStep[28899] Training Speed: 446.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:14:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:13:34 INFO stats.py:394 | Epoch[210] completed. Training Speed: 306.03 samples/sec across all devices. Epoch Time: 57.30 sec. Average Epoch Time: 57.30 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:14:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:13:42 INFO stats.py:314 | Epoch[211] Step[17] GlobalStep[28924] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:13:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:13:45 INFO loss_tracker.py:84 | Epoch[211/NA] Step[24] GlobalStep[28931/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0129] total_loss[0.0183] Rank[0/16] 06/24/2025 16:13:53 INFO stats.py:314 | Epoch[211] Step[42] GlobalStep[28949] Training Speed: 431.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:13:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:13:55 INFO loss_tracker.py:84 | Epoch[211/NA] Step[49] GlobalStep[28956/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0061] loss_depth[0.0128] total_loss[0.0190] Rank[0/16] 06/24/2025 16:14:03 INFO stats.py:314 | Epoch[211] Step[67] GlobalStep[28974] Training Speed: 435.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:13:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:14:05 INFO loss_tracker.py:84 | Epoch[211/NA] Step[74] GlobalStep[28981/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0182] Rank[0/16] 06/24/2025 16:14:13 INFO stats.py:314 | Epoch[211] Step[92] GlobalStep[28999] Training Speed: 420.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:13:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:14:16 INFO loss_tracker.py:84 | Epoch[211/NA] Step[99] GlobalStep[29006/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:14:23 INFO stats.py:314 | Epoch[211] Step[117] GlobalStep[29024] Training Speed: 434.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:13:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:14:26 INFO loss_tracker.py:84 | Epoch[211/NA] Step[124] GlobalStep[29031/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:14:30 INFO stats.py:394 | Epoch[211] completed. Training Speed: 310.62 samples/sec across all devices. Epoch Time: 56.45 sec. Average Epoch Time: 56.45 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:13:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:14:33 INFO stats.py:314 | Epoch[212] Step[5] GlobalStep[29049] Training Speed: 424.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:13:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:14:41 INFO loss_tracker.py:84 | Epoch[212/NA] Step[24] GlobalStep[29068/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:14:43 INFO stats.py:314 | Epoch[212] Step[30] GlobalStep[29074] Training Speed: 419.31 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:12:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:14:52 INFO loss_tracker.py:84 | Epoch[212/NA] Step[49] GlobalStep[29093/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:14:54 INFO stats.py:314 | Epoch[212] Step[55] GlobalStep[29099] Training Speed: 401.71 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:12:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:01 INFO loss_tracker.py:84 | Epoch[212/NA] Step[74] GlobalStep[29118/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 16:15:04 INFO stats.py:314 | Epoch[212] Step[80] GlobalStep[29124] Training Speed: 420.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:12:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:12 INFO loss_tracker.py:84 | Epoch[212/NA] Step[99] GlobalStep[29143/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:15:14 INFO stats.py:314 | Epoch[212] Step[105] GlobalStep[29149] Training Speed: 418.66 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:12:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:22 INFO loss_tracker.py:84 | Epoch[212/NA] Step[124] GlobalStep[29168/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0182] Rank[0/16] 06/24/2025 16:15:24 INFO stats.py:314 | Epoch[212] Step[130] GlobalStep[29174] Training Speed: 447.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:12:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:26 INFO stats.py:394 | Epoch[212] completed. Training Speed: 313.95 samples/sec across all devices. Epoch Time: 55.86 sec. Average Epoch Time: 55.86 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:12:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:15:35 INFO stats.py:314 | Epoch[213] Step[18] GlobalStep[29199] Training Speed: 431.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:11:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:37 INFO loss_tracker.py:84 | Epoch[213/NA] Step[24] GlobalStep[29205/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:15:45 INFO stats.py:314 | Epoch[213] Step[43] GlobalStep[29224] Training Speed: 406.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:11:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:48 INFO loss_tracker.py:84 | Epoch[213/NA] Step[49] GlobalStep[29230/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:15:55 INFO stats.py:314 | Epoch[213] Step[68] GlobalStep[29249] Training Speed: 430.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:11:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:15:58 INFO loss_tracker.py:84 | Epoch[213/NA] Step[74] GlobalStep[29255/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:16:06 INFO stats.py:314 | Epoch[213] Step[93] GlobalStep[29274] Training Speed: 420.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:11:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:16:08 INFO loss_tracker.py:84 | Epoch[213/NA] Step[99] GlobalStep[29280/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:16:16 INFO stats.py:314 | Epoch[213] Step[118] GlobalStep[29299] Training Speed: 423.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:11:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:16:18 INFO loss_tracker.py:84 | Epoch[213/NA] Step[124] GlobalStep[29305/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:16:23 INFO stats.py:394 | Epoch[213] completed. Training Speed: 308.43 samples/sec across all devices. Epoch Time: 56.86 sec. Average Epoch Time: 56.86 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:11:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:16:27 INFO stats.py:314 | Epoch[214] Step[6] GlobalStep[29324] Training Speed: 428.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:11:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:16:34 INFO loss_tracker.py:84 | Epoch[214/NA] Step[24] GlobalStep[29342/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:16:37 INFO stats.py:314 | Epoch[214] Step[31] GlobalStep[29349] Training Speed: 404.46 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:10:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:16:45 INFO loss_tracker.py:84 | Epoch[214/NA] Step[49] GlobalStep[29367/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 16:16:48 INFO stats.py:314 | Epoch[214] Step[56] GlobalStep[29374] Training Speed: 418.96 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:10:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:16:55 INFO loss_tracker.py:84 | Epoch[214/NA] Step[74] GlobalStep[29392/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:16:58 INFO stats.py:314 | Epoch[214] Step[81] GlobalStep[29399] Training Speed: 428.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:10:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:17:05 INFO loss_tracker.py:84 | Epoch[214/NA] Step[99] GlobalStep[29417/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:17:08 INFO stats.py:314 | Epoch[214] Step[106] GlobalStep[29424] Training Speed: 433.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:10:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:17:15 INFO loss_tracker.py:84 | Epoch[214/NA] Step[124] GlobalStep[29442/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:17:18 INFO stats.py:314 | Epoch[214] Step[131] GlobalStep[29449] Training Speed: 449.15 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:10:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:17:19 INFO stats.py:394 | Epoch[214] completed. Training Speed: 309.11 samples/sec across all devices. Epoch Time: 56.73 sec. Average Epoch Time: 56.73 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:10:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:17:29 INFO stats.py:314 | Epoch[215] Step[19] GlobalStep[29474] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:10:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:17:31 INFO loss_tracker.py:84 | Epoch[215/NA] Step[24] GlobalStep[29479/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:17:39 INFO stats.py:314 | Epoch[215] Step[44] GlobalStep[29499] Training Speed: 403.91 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:09:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:17:41 INFO loss_tracker.py:84 | Epoch[215/NA] Step[49] GlobalStep[29504/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0058] loss_depth[0.0128] total_loss[0.0187] Rank[0/16] 06/24/2025 16:17:49 INFO stats.py:314 | Epoch[215] Step[69] GlobalStep[29524] Training Speed: 429.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:09:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:17:52 INFO loss_tracker.py:84 | Epoch[215/NA] Step[74] GlobalStep[29529/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 16:17:59 INFO stats.py:314 | Epoch[215] Step[94] GlobalStep[29549] Training Speed: 435.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:09:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:18:02 INFO loss_tracker.py:84 | Epoch[215/NA] Step[99] GlobalStep[29554/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 16:18:10 INFO stats.py:314 | Epoch[215] Step[119] GlobalStep[29574] Training Speed: 433.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:09:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:18:12 INFO loss_tracker.py:84 | Epoch[215/NA] Step[124] GlobalStep[29579/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:18:16 INFO stats.py:394 | Epoch[215] completed. Training Speed: 308.23 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:09:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:18:21 INFO stats.py:314 | Epoch[216] Step[7] GlobalStep[29599] Training Speed: 405.31 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:09:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:18:28 INFO loss_tracker.py:84 | Epoch[216/NA] Step[24] GlobalStep[29616/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:18:31 INFO stats.py:314 | Epoch[216] Step[32] GlobalStep[29624] Training Speed: 426.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:08:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:18:38 INFO loss_tracker.py:84 | Epoch[216/NA] Step[49] GlobalStep[29641/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0184] Rank[0/16] 06/24/2025 16:18:41 INFO stats.py:314 | Epoch[216] Step[57] GlobalStep[29649] Training Speed: 429.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:08:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:18:48 INFO loss_tracker.py:84 | Epoch[216/NA] Step[74] GlobalStep[29666/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:18:51 INFO stats.py:314 | Epoch[216] Step[82] GlobalStep[29674] Training Speed: 420.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:08:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:18:58 INFO loss_tracker.py:84 | Epoch[216/NA] Step[99] GlobalStep[29691/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:19:01 INFO stats.py:314 | Epoch[216] Step[107] GlobalStep[29699] Training Speed: 429.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:08:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:19:08 INFO loss_tracker.py:84 | Epoch[216/NA] Step[124] GlobalStep[29716/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:19:11 INFO stats.py:314 | Epoch[216] Step[132] GlobalStep[29724] Training Speed: 448.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:08:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:19:12 INFO stats.py:394 | Epoch[216] completed. Training Speed: 312.14 samples/sec across all devices. Epoch Time: 56.18 sec. Average Epoch Time: 56.18 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:08:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:19:21 INFO stats.py:314 | Epoch[217] Step[20] GlobalStep[29749] Training Speed: 418.70 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:08:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:19:23 INFO loss_tracker.py:84 | Epoch[217/NA] Step[24] GlobalStep[29753/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:19:32 INFO stats.py:314 | Epoch[217] Step[45] GlobalStep[29774] Training Speed: 433.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:07:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:19:34 INFO loss_tracker.py:84 | Epoch[217/NA] Step[49] GlobalStep[29778/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:19:42 INFO stats.py:314 | Epoch[217] Step[70] GlobalStep[29799] Training Speed: 243.99 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 8:07:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:19:44 INFO loss_tracker.py:84 | Epoch[217/NA] Step[74] GlobalStep[29803/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:19:52 INFO stats.py:314 | Epoch[217] Step[95] GlobalStep[29824] Training Speed: 433.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:07:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:19:54 INFO loss_tracker.py:84 | Epoch[217/NA] Step[99] GlobalStep[29828/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0128] total_loss[0.0185] Rank[0/16] 06/24/2025 16:20:02 INFO stats.py:314 | Epoch[217] Step[120] GlobalStep[29849] Training Speed: 447.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:07:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:20:04 INFO loss_tracker.py:84 | Epoch[217/NA] Step[124] GlobalStep[29853/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:20:08 INFO stats.py:394 | Epoch[217] completed. Training Speed: 315.37 samples/sec across all devices. Epoch Time: 55.60 sec. Average Epoch Time: 55.60 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:07:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:20:13 INFO stats.py:314 | Epoch[218] Step[8] GlobalStep[29874] Training Speed: 435.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:07:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:20:19 INFO loss_tracker.py:84 | Epoch[218/NA] Step[24] GlobalStep[29890/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0184] Rank[0/16] 06/24/2025 16:20:23 INFO stats.py:314 | Epoch[218] Step[33] GlobalStep[29899] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:06:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:20:30 INFO loss_tracker.py:84 | Epoch[218/NA] Step[49] GlobalStep[29915/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:20:33 INFO stats.py:314 | Epoch[218] Step[58] GlobalStep[29924] Training Speed: 431.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:06:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:20:40 INFO loss_tracker.py:84 | Epoch[218/NA] Step[74] GlobalStep[29940/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0129] total_loss[0.0174] Rank[0/16] 06/24/2025 16:20:44 INFO stats.py:314 | Epoch[218] Step[83] GlobalStep[29949] Training Speed: 427.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:06:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:20:51 INFO loss_tracker.py:84 | Epoch[218/NA] Step[99] GlobalStep[29965/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:20:54 INFO stats.py:314 | Epoch[218] Step[108] GlobalStep[29974] Training Speed: 430.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:06:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:21:00 INFO loss_tracker.py:84 | Epoch[218/NA] Step[124] GlobalStep[29990/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:21:03 INFO stats.py:314 | Epoch[218] Step[133] GlobalStep[29999] Training Speed: 447.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:06:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:21:04 INFO stats.py:394 | Epoch[218] completed. Training Speed: 312.50 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:06:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:21:14 INFO stats.py:314 | Epoch[219] Step[21] GlobalStep[30024] Training Speed: 432.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:06:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:21:16 INFO loss_tracker.py:84 | Epoch[219/NA] Step[24] GlobalStep[30027/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:21:25 INFO stats.py:314 | Epoch[219] Step[46] GlobalStep[30049] Training Speed: 433.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:05:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:21:26 INFO loss_tracker.py:84 | Epoch[219/NA] Step[49] GlobalStep[30052/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:21:35 INFO stats.py:314 | Epoch[219] Step[71] GlobalStep[30074] Training Speed: 433.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:05:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:21:36 INFO loss_tracker.py:84 | Epoch[219/NA] Step[74] GlobalStep[30077/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:21:45 INFO stats.py:314 | Epoch[219] Step[96] GlobalStep[30099] Training Speed: 429.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:05:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:21:46 INFO loss_tracker.py:84 | Epoch[219/NA] Step[99] GlobalStep[30102/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 16:21:55 INFO stats.py:314 | Epoch[219] Step[121] GlobalStep[30124] Training Speed: 449.84 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:05:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:21:56 INFO loss_tracker.py:84 | Epoch[219/NA] Step[124] GlobalStep[30127/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:22:00 INFO stats.py:394 | Epoch[219] completed. Training Speed: 312.00 samples/sec across all devices. Epoch Time: 56.20 sec. Average Epoch Time: 56.20 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:05:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:22:05 INFO stats.py:314 | Epoch[220] Step[9] GlobalStep[30149] Training Speed: 428.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:05:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:22:12 INFO loss_tracker.py:84 | Epoch[220/NA] Step[24] GlobalStep[30164/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0128] total_loss[0.0170] Rank[0/16] 06/24/2025 16:22:16 INFO stats.py:314 | Epoch[220] Step[34] GlobalStep[30174] Training Speed: 433.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:05:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:22:22 INFO loss_tracker.py:84 | Epoch[220/NA] Step[49] GlobalStep[30189/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:22:26 INFO stats.py:314 | Epoch[220] Step[59] GlobalStep[30199] Training Speed: 430.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:04:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:22:33 INFO loss_tracker.py:84 | Epoch[220/NA] Step[74] GlobalStep[30214/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:22:37 INFO stats.py:314 | Epoch[220] Step[84] GlobalStep[30224] Training Speed: 425.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:04:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:22:43 INFO loss_tracker.py:84 | Epoch[220/NA] Step[99] GlobalStep[30239/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:22:47 INFO stats.py:314 | Epoch[220] Step[109] GlobalStep[30249] Training Speed: 430.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:04:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:22:53 INFO loss_tracker.py:84 | Epoch[220/NA] Step[124] GlobalStep[30264/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:22:57 INFO stats.py:314 | Epoch[220] Step[134] GlobalStep[30274] Training Speed: 449.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:04:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:22:58 INFO stats.py:394 | Epoch[220] completed. Training Speed: 306.84 samples/sec across all devices. Epoch Time: 57.15 sec. Average Epoch Time: 57.15 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 8:04:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:23:08 INFO stats.py:314 | Epoch[221] Step[22] GlobalStep[30299] Training Speed: 434.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:04:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:23:09 INFO loss_tracker.py:84 | Epoch[221/NA] Step[24] GlobalStep[30301/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:23:18 INFO stats.py:314 | Epoch[221] Step[47] GlobalStep[30324] Training Speed: 429.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:03:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:23:19 INFO loss_tracker.py:84 | Epoch[221/NA] Step[49] GlobalStep[30326/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0182] Rank[0/16] 06/24/2025 16:23:29 INFO stats.py:314 | Epoch[221] Step[72] GlobalStep[30349] Training Speed: 426.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:03:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:23:29 INFO loss_tracker.py:84 | Epoch[221/NA] Step[74] GlobalStep[30351/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:23:38 INFO stats.py:314 | Epoch[221] Step[97] GlobalStep[30374] Training Speed: 416.94 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 8:03:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:23:39 INFO loss_tracker.py:84 | Epoch[221/NA] Step[99] GlobalStep[30376/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:23:49 INFO stats.py:314 | Epoch[221] Step[122] GlobalStep[30399] Training Speed: 452.37 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:03:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:23:50 INFO loss_tracker.py:84 | Epoch[221/NA] Step[124] GlobalStep[30401/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:23:54 INFO stats.py:394 | Epoch[221] completed. Training Speed: 310.36 samples/sec across all devices. Epoch Time: 56.50 sec. Average Epoch Time: 56.50 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:03:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:23:59 INFO stats.py:314 | Epoch[222] Step[10] GlobalStep[30424] Training Speed: 437.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:03:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:24:05 INFO loss_tracker.py:84 | Epoch[222/NA] Step[24] GlobalStep[30438/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:24:10 INFO stats.py:314 | Epoch[222] Step[35] GlobalStep[30449] Training Speed: 421.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:03:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:24:16 INFO loss_tracker.py:84 | Epoch[222/NA] Step[49] GlobalStep[30463/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:24:20 INFO stats.py:314 | Epoch[222] Step[60] GlobalStep[30474] Training Speed: 429.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:02:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:24:26 INFO loss_tracker.py:84 | Epoch[222/NA] Step[74] GlobalStep[30488/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:24:31 INFO stats.py:314 | Epoch[222] Step[85] GlobalStep[30499] Training Speed: 435.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:02:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:24:36 INFO loss_tracker.py:84 | Epoch[222/NA] Step[99] GlobalStep[30513/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:24:40 INFO stats.py:314 | Epoch[222] Step[110] GlobalStep[30524] Training Speed: 433.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:02:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:24:46 INFO loss_tracker.py:84 | Epoch[222/NA] Step[124] GlobalStep[30538/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 16:24:50 INFO stats.py:314 | Epoch[222] Step[135] GlobalStep[30549] Training Speed: 451.26 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:02:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:24:50 INFO stats.py:394 | Epoch[222] completed. Training Speed: 311.34 samples/sec across all devices. Epoch Time: 56.32 sec. Average Epoch Time: 56.32 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:02:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:01 INFO stats.py:314 | Epoch[223] Step[23] GlobalStep[30574] Training Speed: 437.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:02:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:02 INFO loss_tracker.py:84 | Epoch[223/NA] Step[24] GlobalStep[30575/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:25:11 INFO stats.py:314 | Epoch[223] Step[48] GlobalStep[30599] Training Speed: 428.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:01:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:12 INFO loss_tracker.py:84 | Epoch[223/NA] Step[49] GlobalStep[30600/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:25:21 INFO stats.py:314 | Epoch[223] Step[73] GlobalStep[30624] Training Speed: 421.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:01:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:25:22 INFO loss_tracker.py:84 | Epoch[223/NA] Step[74] GlobalStep[30625/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:25:32 INFO stats.py:314 | Epoch[223] Step[98] GlobalStep[30649] Training Speed: 433.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:01:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:32 INFO loss_tracker.py:84 | Epoch[223/NA] Step[99] GlobalStep[30650/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:25:42 INFO stats.py:314 | Epoch[223] Step[123] GlobalStep[30674] Training Speed: 452.03 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:01:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:42 INFO loss_tracker.py:84 | Epoch[223/NA] Step[124] GlobalStep[30675/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:25:46 INFO stats.py:394 | Epoch[223] completed. Training Speed: 313.59 samples/sec across all devices. Epoch Time: 55.92 sec. Average Epoch Time: 55.92 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:01:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:52 INFO stats.py:314 | Epoch[224] Step[11] GlobalStep[30699] Training Speed: 428.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:01:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:25:58 INFO loss_tracker.py:84 | Epoch[224/NA] Step[24] GlobalStep[30712/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:26:03 INFO stats.py:314 | Epoch[224] Step[36] GlobalStep[30724] Training Speed: 425.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:01:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:26:08 INFO loss_tracker.py:84 | Epoch[224/NA] Step[49] GlobalStep[30737/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:26:12 INFO stats.py:314 | Epoch[224] Step[61] GlobalStep[30749] Training Speed: 436.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 8:00:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:26:18 INFO loss_tracker.py:84 | Epoch[224/NA] Step[74] GlobalStep[30762/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:26:23 INFO stats.py:314 | Epoch[224] Step[86] GlobalStep[30774] Training Speed: 403.29 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 8:00:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:26:29 INFO loss_tracker.py:84 | Epoch[224/NA] Step[99] GlobalStep[30787/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:26:33 INFO stats.py:314 | Epoch[224] Step[111] GlobalStep[30799] Training Speed: 428.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:00:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:26:38 INFO loss_tracker.py:84 | Epoch[224/NA] Step[124] GlobalStep[30812/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:26:42 INFO stats.py:314 | Epoch[224] Step[136] GlobalStep[30824] Training Speed: 451.52 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 8:00:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:26:42 INFO stats.py:394 | Epoch[224] completed. Training Speed: 312.21 samples/sec across all devices. Epoch Time: 56.17 sec. Average Epoch Time: 56.17 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 8:00:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:26:54 INFO stats.py:314 | Epoch[225] Step[24] GlobalStep[30849] Training Speed: 430.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:00:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:26:54 INFO loss_tracker.py:84 | Epoch[225/NA] Step[24] GlobalStep[30849/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:27:04 INFO stats.py:314 | Epoch[225] Step[49] GlobalStep[30874] Training Speed: 430.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 8:00:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:27:04 INFO loss_tracker.py:84 | Epoch[225/NA] Step[49] GlobalStep[30874/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:27:14 INFO stats.py:314 | Epoch[225] Step[74] GlobalStep[30899] Training Speed: 427.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:59:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:27:14 INFO loss_tracker.py:84 | Epoch[225/NA] Step[74] GlobalStep[30899/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0182] Rank[0/16] 06/24/2025 16:27:25 INFO stats.py:314 | Epoch[225] Step[99] GlobalStep[30924] Training Speed: 435.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:59:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:27:25 INFO loss_tracker.py:84 | Epoch[225/NA] Step[99] GlobalStep[30924/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:27:35 INFO stats.py:314 | Epoch[225] Step[124] GlobalStep[30949] Training Speed: 452.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:59:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:27:35 INFO loss_tracker.py:84 | Epoch[225/NA] Step[124] GlobalStep[30949/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:27:39 INFO stats.py:394 | Epoch[225] completed. Training Speed: 310.50 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:59:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:27:45 INFO stats.py:314 | Epoch[226] Step[12] GlobalStep[30974] Training Speed: 422.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:59:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:27:50 INFO loss_tracker.py:84 | Epoch[226/NA] Step[24] GlobalStep[30986/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:27:55 INFO stats.py:314 | Epoch[226] Step[37] GlobalStep[30999] Training Speed: 433.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:59:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:28:01 INFO loss_tracker.py:84 | Epoch[226/NA] Step[49] GlobalStep[31011/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 16:28:06 INFO stats.py:314 | Epoch[226] Step[62] GlobalStep[31024] Training Speed: 430.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:58:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:28:11 INFO loss_tracker.py:84 | Epoch[226/NA] Step[74] GlobalStep[31036/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:28:16 INFO stats.py:314 | Epoch[226] Step[87] GlobalStep[31049] Training Speed: 405.69 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:58:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:28:21 INFO loss_tracker.py:84 | Epoch[226/NA] Step[99] GlobalStep[31061/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:28:26 INFO stats.py:314 | Epoch[226] Step[112] GlobalStep[31074] Training Speed: 425.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:58:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:28:31 INFO loss_tracker.py:84 | Epoch[226/NA] Step[124] GlobalStep[31086/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0129] total_loss[0.0178] Rank[0/16] 06/24/2025 16:28:35 INFO stats.py:394 | Epoch[226] completed. Training Speed: 311.38 samples/sec across all devices. Epoch Time: 56.32 sec. Average Epoch Time: 56.32 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:58:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:28:36 INFO stats.py:314 | Epoch[227] Step[0] GlobalStep[31099] Training Speed: 356.46 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 7:58:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:28:47 INFO loss_tracker.py:84 | Epoch[227/NA] Step[24] GlobalStep[31123/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:28:47 INFO stats.py:314 | Epoch[227] Step[25] GlobalStep[31124] Training Speed: 421.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:58:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:28:57 INFO loss_tracker.py:84 | Epoch[227/NA] Step[49] GlobalStep[31148/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:28:57 INFO stats.py:314 | Epoch[227] Step[50] GlobalStep[31149] Training Speed: 431.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:58:03. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:29:07 INFO loss_tracker.py:84 | Epoch[227/NA] Step[74] GlobalStep[31173/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:29:07 INFO stats.py:314 | Epoch[227] Step[75] GlobalStep[31174] Training Speed: 408.68 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:57:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:29:17 INFO loss_tracker.py:84 | Epoch[227/NA] Step[99] GlobalStep[31198/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:29:17 INFO stats.py:314 | Epoch[227] Step[100] GlobalStep[31199] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:57:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:29:27 INFO loss_tracker.py:84 | Epoch[227/NA] Step[124] GlobalStep[31223/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:29:27 INFO stats.py:314 | Epoch[227] Step[125] GlobalStep[31224] Training Speed: 421.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:57:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:29:31 INFO stats.py:394 | Epoch[227] completed. Training Speed: 313.89 samples/sec across all devices. Epoch Time: 55.87 sec. Average Epoch Time: 55.87 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:57:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:29:38 INFO stats.py:314 | Epoch[228] Step[13] GlobalStep[31249] Training Speed: 423.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:57:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:29:42 INFO loss_tracker.py:84 | Epoch[228/NA] Step[24] GlobalStep[31260/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:29:48 INFO stats.py:314 | Epoch[228] Step[38] GlobalStep[31274] Training Speed: 433.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:57:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:29:53 INFO loss_tracker.py:84 | Epoch[228/NA] Step[49] GlobalStep[31285/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:29:58 INFO stats.py:314 | Epoch[228] Step[63] GlobalStep[31299] Training Speed: 405.49 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:56:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:30:03 INFO loss_tracker.py:84 | Epoch[228/NA] Step[74] GlobalStep[31310/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:30:08 INFO stats.py:314 | Epoch[228] Step[88] GlobalStep[31324] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:56:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:30:13 INFO loss_tracker.py:84 | Epoch[228/NA] Step[99] GlobalStep[31335/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:30:18 INFO stats.py:314 | Epoch[228] Step[113] GlobalStep[31349] Training Speed: 430.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:56:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:30:23 INFO loss_tracker.py:84 | Epoch[228/NA] Step[124] GlobalStep[31360/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:30:27 INFO stats.py:394 | Epoch[228] completed. Training Speed: 314.35 samples/sec across all devices. Epoch Time: 55.78 sec. Average Epoch Time: 55.78 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:56:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:30:29 INFO stats.py:314 | Epoch[229] Step[1] GlobalStep[31374] Training Speed: 376.22 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 7:56:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:30:38 INFO loss_tracker.py:84 | Epoch[229/NA] Step[24] GlobalStep[31397/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 16:30:39 INFO stats.py:314 | Epoch[229] Step[26] GlobalStep[31399] Training Speed: 436.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:56:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:30:49 INFO loss_tracker.py:84 | Epoch[229/NA] Step[49] GlobalStep[31422/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:30:50 INFO stats.py:314 | Epoch[229] Step[51] GlobalStep[31424] Training Speed: 431.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:56:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:30:59 INFO loss_tracker.py:84 | Epoch[229/NA] Step[74] GlobalStep[31447/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:31:00 INFO stats.py:314 | Epoch[229] Step[76] GlobalStep[31449] Training Speed: 426.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:55:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:31:09 INFO loss_tracker.py:84 | Epoch[229/NA] Step[99] GlobalStep[31472/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:31:10 INFO stats.py:314 | Epoch[229] Step[101] GlobalStep[31474] Training Speed: 430.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:55:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:31:19 INFO loss_tracker.py:84 | Epoch[229/NA] Step[124] GlobalStep[31497/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:31:20 INFO stats.py:314 | Epoch[229] Step[126] GlobalStep[31499] Training Speed: 447.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:55:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:31:23 INFO stats.py:394 | Epoch[229] completed. Training Speed: 310.78 samples/sec across all devices. Epoch Time: 56.43 sec. Average Epoch Time: 56.43 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:55:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:31:31 INFO stats.py:314 | Epoch[230] Step[14] GlobalStep[31524] Training Speed: 433.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:55:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:31:35 INFO loss_tracker.py:84 | Epoch[230/NA] Step[24] GlobalStep[31534/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0128] total_loss[0.0167] Rank[0/16] 06/24/2025 16:31:41 INFO stats.py:314 | Epoch[230] Step[39] GlobalStep[31549] Training Speed: 435.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:55:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:31:45 INFO loss_tracker.py:84 | Epoch[230/NA] Step[49] GlobalStep[31559/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:31:51 INFO stats.py:314 | Epoch[230] Step[64] GlobalStep[31574] Training Speed: 435.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:54:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:31:55 INFO loss_tracker.py:84 | Epoch[230/NA] Step[74] GlobalStep[31584/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:32:01 INFO stats.py:314 | Epoch[230] Step[89] GlobalStep[31599] Training Speed: 425.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:54:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:32:05 INFO loss_tracker.py:84 | Epoch[230/NA] Step[99] GlobalStep[31609/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0057] loss_depth[0.0128] total_loss[0.0185] Rank[0/16] 06/24/2025 16:32:12 INFO stats.py:314 | Epoch[230] Step[114] GlobalStep[31624] Training Speed: 434.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:54:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:32:15 INFO loss_tracker.py:84 | Epoch[230/NA] Step[124] GlobalStep[31634/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:32:20 INFO stats.py:394 | Epoch[230] completed. Training Speed: 311.72 samples/sec across all devices. Epoch Time: 56.26 sec. Average Epoch Time: 56.26 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:54:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:32:22 INFO stats.py:314 | Epoch[231] Step[2] GlobalStep[31649] Training Speed: 427.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:54:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:32:31 INFO loss_tracker.py:84 | Epoch[231/NA] Step[24] GlobalStep[31671/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:32:32 INFO stats.py:314 | Epoch[231] Step[27] GlobalStep[31674] Training Speed: 433.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:54:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:32:41 INFO loss_tracker.py:84 | Epoch[231/NA] Step[49] GlobalStep[31696/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:32:42 INFO stats.py:314 | Epoch[231] Step[52] GlobalStep[31699] Training Speed: 424.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:54:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:32:51 INFO loss_tracker.py:84 | Epoch[231/NA] Step[74] GlobalStep[31721/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 16:32:52 INFO stats.py:314 | Epoch[231] Step[77] GlobalStep[31724] Training Speed: 437.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:53:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:33:01 INFO loss_tracker.py:84 | Epoch[231/NA] Step[99] GlobalStep[31746/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:33:03 INFO stats.py:314 | Epoch[231] Step[102] GlobalStep[31749] Training Speed: 432.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:53:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:33:11 INFO loss_tracker.py:84 | Epoch[231/NA] Step[124] GlobalStep[31771/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:33:12 INFO stats.py:314 | Epoch[231] Step[127] GlobalStep[31774] Training Speed: 448.73 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:53:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:33:16 INFO stats.py:394 | Epoch[231] completed. Training Speed: 311.74 samples/sec across all devices. Epoch Time: 56.25 sec. Average Epoch Time: 56.25 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:53:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:33:23 INFO stats.py:314 | Epoch[232] Step[15] GlobalStep[31799] Training Speed: 431.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:53:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:33:27 INFO loss_tracker.py:84 | Epoch[232/NA] Step[24] GlobalStep[31808/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 16:33:34 INFO stats.py:314 | Epoch[232] Step[40] GlobalStep[31824] Training Speed: 432.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:53:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:33:37 INFO loss_tracker.py:84 | Epoch[232/NA] Step[49] GlobalStep[31833/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:33:44 INFO stats.py:314 | Epoch[232] Step[65] GlobalStep[31849] Training Speed: 426.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:53:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:33:48 INFO loss_tracker.py:84 | Epoch[232/NA] Step[74] GlobalStep[31858/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:33:54 INFO stats.py:314 | Epoch[232] Step[90] GlobalStep[31874] Training Speed: 409.51 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:52:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:33:58 INFO loss_tracker.py:84 | Epoch[232/NA] Step[99] GlobalStep[31883/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:34:04 INFO stats.py:314 | Epoch[232] Step[115] GlobalStep[31899] Training Speed: 430.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:52:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:34:07 INFO loss_tracker.py:84 | Epoch[232/NA] Step[124] GlobalStep[31908/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0128] total_loss[0.0183] Rank[0/16] 06/24/2025 16:34:12 INFO stats.py:394 | Epoch[232] completed. Training Speed: 311.95 samples/sec across all devices. Epoch Time: 56.21 sec. Average Epoch Time: 56.21 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:52:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:34:14 INFO stats.py:314 | Epoch[233] Step[3] GlobalStep[31924] Training Speed: 436.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:52:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:34:23 INFO loss_tracker.py:84 | Epoch[233/NA] Step[24] GlobalStep[31945/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:34:25 INFO stats.py:314 | Epoch[233] Step[28] GlobalStep[31949] Training Speed: 428.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:52:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:34:33 INFO loss_tracker.py:84 | Epoch[233/NA] Step[49] GlobalStep[31970/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:34:35 INFO stats.py:314 | Epoch[233] Step[53] GlobalStep[31974] Training Speed: 399.96 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:52:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:34:44 INFO loss_tracker.py:84 | Epoch[233/NA] Step[74] GlobalStep[31995/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 16:34:45 INFO stats.py:314 | Epoch[233] Step[78] GlobalStep[31999] Training Speed: 433.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:51:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:34:45 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 16:34:46 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_7 Rank[12/16] 06/24/2025 16:34:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[15/16] 06/24/2025 16:34:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[8/16] 06/24/2025 16:34:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[10/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[6/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[11/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[9/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 
Rank[13/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[14/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[3/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[1/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[7/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[4/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[5/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[2/16] 06/24/2025 16:34:47 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 Rank[0/16] 06/24/2025 16:34:47 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_7/model.safetensors Rank[0/16] 06/24/2025 16:34:48 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_7/optimizer.bin Rank[0/16] 06/24/2025 16:34:48 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_7/scheduler.bin Rank[0/16] 06/24/2025 16:34:48 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_7/sampler.bin Rank[0/16] 06/24/2025 16:34:48 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_7/random_states_0.pkl Rank[0/16] 06/24/2025 16:34:48 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_7/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 16:34:48 INFO checkpoint.py:110 | Save checkpoint at the end of step 31999 to /job_data/checkpoints/checkpoint_7 
Rank[0/16] 06/24/2025 16:34:56 INFO loss_tracker.py:84 | Epoch[233/NA] Step[99] GlobalStep[32020/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:34:58 INFO stats.py:314 | Epoch[233] Step[103] GlobalStep[32024] Training Speed: 427.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:51:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:35:07 INFO loss_tracker.py:84 | Epoch[233/NA] Step[124] GlobalStep[32045/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 16:35:08 INFO stats.py:314 | Epoch[233] Step[128] GlobalStep[32049] Training Speed: 442.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:51:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:35:11 INFO stats.py:394 | Epoch[233] completed. Training Speed: 298.75 samples/sec across all devices. Epoch Time: 58.70 sec. Average Epoch Time: 58.70 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 7:51:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:35:18 INFO stats.py:314 | Epoch[234] Step[16] GlobalStep[32074] Training Speed: 434.80 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:51:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:35:22 INFO loss_tracker.py:84 | Epoch[234/NA] Step[24] GlobalStep[32082/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:35:29 INFO stats.py:314 | Epoch[234] Step[41] GlobalStep[32099] Training Speed: 409.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:51:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:35:32 INFO loss_tracker.py:84 | Epoch[234/NA] Step[49] GlobalStep[32107/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:35:39 INFO stats.py:314 | Epoch[234] Step[66] GlobalStep[32124] Training Speed: 434.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:51:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:35:42 INFO loss_tracker.py:84 | Epoch[234/NA] Step[74] GlobalStep[32132/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:35:49 INFO stats.py:314 | Epoch[234] Step[91] GlobalStep[32149] Training Speed: 428.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:50:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:35:52 INFO loss_tracker.py:84 | Epoch[234/NA] Step[99] GlobalStep[32157/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0128] total_loss[0.0185] Rank[0/16] 06/24/2025 16:35:59 INFO stats.py:314 | Epoch[234] Step[116] GlobalStep[32174] Training Speed: 432.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:50:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:36:02 INFO loss_tracker.py:84 | Epoch[234/NA] Step[124] GlobalStep[32182/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:36:07 INFO stats.py:394 | Epoch[234] completed. Training Speed: 314.63 samples/sec across all devices. Epoch Time: 55.73 sec. Average Epoch Time: 55.73 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:50:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:36:10 INFO stats.py:314 | Epoch[235] Step[4] GlobalStep[32199] Training Speed: 431.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:50:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:36:17 INFO loss_tracker.py:84 | Epoch[235/NA] Step[24] GlobalStep[32219/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:36:20 INFO stats.py:314 | Epoch[235] Step[29] GlobalStep[32224] Training Speed: 431.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:50:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:36:28 INFO loss_tracker.py:84 | Epoch[235/NA] Step[49] GlobalStep[32244/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 16:36:30 INFO stats.py:314 | Epoch[235] Step[54] GlobalStep[32249] Training Speed: 430.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:50:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:36:38 INFO loss_tracker.py:84 | Epoch[235/NA] Step[74] GlobalStep[32269/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:36:40 INFO stats.py:314 | Epoch[235] Step[79] GlobalStep[32274] Training Speed: 433.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:50:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:36:49 INFO loss_tracker.py:84 | Epoch[235/NA] Step[99] GlobalStep[32294/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:36:51 INFO stats.py:314 | Epoch[235] Step[104] GlobalStep[32299] Training Speed: 426.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:49:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:36:59 INFO loss_tracker.py:84 | Epoch[235/NA] Step[124] GlobalStep[32319/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:37:00 INFO stats.py:314 | Epoch[235] Step[129] GlobalStep[32324] Training Speed: 450.32 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:49:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:37:03 INFO stats.py:394 | Epoch[235] completed. Training Speed: 309.42 samples/sec across all devices. Epoch Time: 56.67 sec. Average Epoch Time: 56.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:49:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:37:12 INFO stats.py:314 | Epoch[236] Step[17] GlobalStep[32349] Training Speed: 425.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:49:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:37:14 INFO loss_tracker.py:84 | Epoch[236/NA] Step[24] GlobalStep[32356/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:37:22 INFO stats.py:314 | Epoch[236] Step[42] GlobalStep[32374] Training Speed: 429.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:49:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:37:25 INFO loss_tracker.py:84 | Epoch[236/NA] Step[49] GlobalStep[32381/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:37:32 INFO stats.py:314 | Epoch[236] Step[67] GlobalStep[32399] Training Speed: 431.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:49:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:37:34 INFO loss_tracker.py:84 | Epoch[236/NA] Step[74] GlobalStep[32406/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0128] total_loss[0.0170] Rank[0/16] 06/24/2025 16:37:42 INFO stats.py:314 | Epoch[236] Step[92] GlobalStep[32424] Training Speed: 436.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:48:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:37:45 INFO loss_tracker.py:84 | Epoch[236/NA] Step[99] GlobalStep[32431/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:37:52 INFO stats.py:314 | Epoch[236] Step[117] GlobalStep[32449] Training Speed: 436.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:48:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:37:55 INFO loss_tracker.py:84 | Epoch[236/NA] Step[124] GlobalStep[32456/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:37:59 INFO stats.py:394 | Epoch[236] completed. Training Speed: 314.50 samples/sec across all devices. Epoch Time: 55.76 sec. Average Epoch Time: 55.76 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:48:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:38:02 INFO stats.py:314 | Epoch[237] Step[5] GlobalStep[32474] Training Speed: 436.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:48:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:38:10 INFO loss_tracker.py:84 | Epoch[237/NA] Step[24] GlobalStep[32493/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:38:13 INFO stats.py:314 | Epoch[237] Step[30] GlobalStep[32499] Training Speed: 437.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:48:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:38:21 INFO loss_tracker.py:84 | Epoch[237/NA] Step[49] GlobalStep[32518/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:38:23 INFO stats.py:314 | Epoch[237] Step[55] GlobalStep[32524] Training Speed: 434.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:48:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:38:31 INFO loss_tracker.py:84 | Epoch[237/NA] Step[74] GlobalStep[32543/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 16:38:34 INFO stats.py:314 | Epoch[237] Step[80] GlobalStep[32549] Training Speed: 437.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:48:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:38:42 INFO loss_tracker.py:84 | Epoch[237/NA] Step[99] GlobalStep[32568/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0128] total_loss[0.0184] Rank[0/16] 06/24/2025 16:38:44 INFO stats.py:314 | Epoch[237] Step[105] GlobalStep[32574] Training Speed: 434.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:47:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:38:52 INFO loss_tracker.py:84 | Epoch[237/NA] Step[124] GlobalStep[32593/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:38:54 INFO stats.py:314 | Epoch[237] Step[130] GlobalStep[32599] Training Speed: 451.36 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:47:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:38:56 INFO stats.py:394 | Epoch[237] completed. Training Speed: 307.26 samples/sec across all devices. Epoch Time: 57.07 sec. Average Epoch Time: 57.07 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:47:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:39:05 INFO stats.py:314 | Epoch[238] Step[18] GlobalStep[32624] Training Speed: 438.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:47:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:39:08 INFO loss_tracker.py:84 | Epoch[238/NA] Step[24] GlobalStep[32630/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:39:15 INFO stats.py:314 | Epoch[238] Step[43] GlobalStep[32649] Training Speed: 432.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:47:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:39:18 INFO loss_tracker.py:84 | Epoch[238/NA] Step[49] GlobalStep[32655/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 16:39:25 INFO stats.py:314 | Epoch[238] Step[68] GlobalStep[32674] Training Speed: 409.50 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:47:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:39:28 INFO loss_tracker.py:84 | Epoch[238/NA] Step[74] GlobalStep[32680/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 16:39:36 INFO stats.py:314 | Epoch[238] Step[93] GlobalStep[32699] Training Speed: 432.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:47:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:39:38 INFO loss_tracker.py:84 | Epoch[238/NA] Step[99] GlobalStep[32705/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:39:46 INFO stats.py:314 | Epoch[238] Step[118] GlobalStep[32724] Training Speed: 432.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:46:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:39:49 INFO loss_tracker.py:84 | Epoch[238/NA] Step[124] GlobalStep[32730/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 16:39:53 INFO stats.py:394 | Epoch[238] completed. Training Speed: 307.97 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:46:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:39:57 INFO stats.py:314 | Epoch[239] Step[6] GlobalStep[32749] Training Speed: 437.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:46:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:40:04 INFO loss_tracker.py:84 | Epoch[239/NA] Step[24] GlobalStep[32767/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:40:07 INFO stats.py:314 | Epoch[239] Step[31] GlobalStep[32774] Training Speed: 432.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:46:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:40:15 INFO loss_tracker.py:84 | Epoch[239/NA] Step[49] GlobalStep[32792/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:40:17 INFO stats.py:314 | Epoch[239] Step[56] GlobalStep[32799] Training Speed: 434.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:46:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:40:25 INFO loss_tracker.py:84 | Epoch[239/NA] Step[74] GlobalStep[32817/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0129] total_loss[0.0179] Rank[0/16] 06/24/2025 16:40:27 INFO stats.py:314 | Epoch[239] Step[81] GlobalStep[32824] Training Speed: 430.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:46:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:40:35 INFO loss_tracker.py:84 | Epoch[239/NA] Step[99] GlobalStep[32842/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 16:40:37 INFO stats.py:314 | Epoch[239] Step[106] GlobalStep[32849] Training Speed: 434.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:45:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:40:45 INFO loss_tracker.py:84 | Epoch[239/NA] Step[124] GlobalStep[32867/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:40:47 INFO stats.py:314 | Epoch[239] Step[131] GlobalStep[32874] Training Speed: 451.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:45:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:40:49 INFO stats.py:394 | Epoch[239] completed. Training Speed: 313.11 samples/sec across all devices. Epoch Time: 56.01 sec. Average Epoch Time: 56.01 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:45:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:40:58 INFO stats.py:314 | Epoch[240] Step[19] GlobalStep[32899] Training Speed: 434.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:45:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:41:00 INFO loss_tracker.py:84 | Epoch[240/NA] Step[24] GlobalStep[32904/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:41:09 INFO stats.py:314 | Epoch[240] Step[44] GlobalStep[32924] Training Speed: 422.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:45:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:41:11 INFO loss_tracker.py:84 | Epoch[240/NA] Step[49] GlobalStep[32929/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:41:19 INFO stats.py:314 | Epoch[240] Step[69] GlobalStep[32949] Training Speed: 433.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:45:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:41:21 INFO loss_tracker.py:84 | Epoch[240/NA] Step[74] GlobalStep[32954/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:41:29 INFO stats.py:314 | Epoch[240] Step[94] GlobalStep[32974] Training Speed: 436.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:45:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:41:31 INFO loss_tracker.py:84 | Epoch[240/NA] Step[99] GlobalStep[32979/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 16:41:39 INFO stats.py:314 | Epoch[240] Step[119] GlobalStep[32999] Training Speed: 437.84 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:44:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:41:41 INFO loss_tracker.py:84 | Epoch[240/NA] Step[124] GlobalStep[33004/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 16:41:46 INFO stats.py:394 | Epoch[240] completed. Training Speed: 309.34 samples/sec across all devices. Epoch Time: 56.69 sec. Average Epoch Time: 56.69 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:44:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:41:50 INFO stats.py:314 | Epoch[241] Step[7] GlobalStep[33024] Training Speed: 433.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:44:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:41:57 INFO loss_tracker.py:84 | Epoch[241/NA] Step[24] GlobalStep[33041/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:42:00 INFO stats.py:314 | Epoch[241] Step[32] GlobalStep[33049] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:44:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:42:07 INFO loss_tracker.py:84 | Epoch[241/NA] Step[49] GlobalStep[33066/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:42:10 INFO stats.py:314 | Epoch[241] Step[57] GlobalStep[33074] Training Speed: 430.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:44:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:42:17 INFO loss_tracker.py:84 | Epoch[241/NA] Step[74] GlobalStep[33091/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:42:20 INFO stats.py:314 | Epoch[241] Step[82] GlobalStep[33099] Training Speed: 433.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:44:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:42:27 INFO loss_tracker.py:84 | Epoch[241/NA] Step[99] GlobalStep[33116/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0183] Rank[0/16] 06/24/2025 16:42:30 INFO stats.py:314 | Epoch[241] Step[107] GlobalStep[33124] Training Speed: 264.99 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 7:43:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:42:37 INFO loss_tracker.py:84 | Epoch[241/NA] Step[124] GlobalStep[33141/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 16:42:39 INFO stats.py:314 | Epoch[241] Step[132] GlobalStep[33149] Training Speed: 449.83 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:43:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:42:41 INFO stats.py:394 | Epoch[241] completed. Training Speed: 317.76 samples/sec across all devices. Epoch Time: 55.19 sec. Average Epoch Time: 55.19 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 7:43:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:42:50 INFO stats.py:314 | Epoch[242] Step[20] GlobalStep[33174] Training Speed: 427.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:43:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:42:52 INFO loss_tracker.py:84 | Epoch[242/NA] Step[24] GlobalStep[33178/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:43:01 INFO stats.py:314 | Epoch[242] Step[45] GlobalStep[33199] Training Speed: 435.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:43:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:43:03 INFO loss_tracker.py:84 | Epoch[242/NA] Step[49] GlobalStep[33203/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:43:11 INFO stats.py:314 | Epoch[242] Step[70] GlobalStep[33224] Training Speed: 433.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:43:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:43:13 INFO loss_tracker.py:84 | Epoch[242/NA] Step[74] GlobalStep[33228/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:43:21 INFO stats.py:314 | Epoch[242] Step[95] GlobalStep[33249] Training Speed: 431.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:43:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:43:23 INFO loss_tracker.py:84 | Epoch[242/NA] Step[99] GlobalStep[33253/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 16:43:31 INFO stats.py:314 | Epoch[242] Step[120] GlobalStep[33274] Training Speed: 452.16 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:42:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:43:33 INFO loss_tracker.py:84 | Epoch[242/NA] Step[124] GlobalStep[33278/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:43:37 INFO stats.py:394 | Epoch[242] completed. Training Speed: 313.42 samples/sec across all devices. Epoch Time: 55.95 sec. Average Epoch Time: 55.95 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:42:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:43:41 INFO stats.py:314 | Epoch[243] Step[8] GlobalStep[33299] Training Speed: 431.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:42:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:43:48 INFO loss_tracker.py:84 | Epoch[243/NA] Step[24] GlobalStep[33315/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 16:43:52 INFO stats.py:314 | Epoch[243] Step[33] GlobalStep[33324] Training Speed: 430.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:42:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:43:58 INFO loss_tracker.py:84 | Epoch[243/NA] Step[49] GlobalStep[33340/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:44:02 INFO stats.py:314 | Epoch[243] Step[58] GlobalStep[33349] Training Speed: 432.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:42:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:44:08 INFO loss_tracker.py:84 | Epoch[243/NA] Step[74] GlobalStep[33365/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0182] Rank[0/16] 06/24/2025 16:44:12 INFO stats.py:314 | Epoch[243] Step[83] GlobalStep[33374] Training Speed: 434.77 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:42:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:44:19 INFO loss_tracker.py:84 | Epoch[243/NA] Step[99] GlobalStep[33390/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:44:22 INFO stats.py:314 | Epoch[243] Step[108] GlobalStep[33399] Training Speed: 428.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:42:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:44:28 INFO loss_tracker.py:84 | Epoch[243/NA] Step[124] GlobalStep[33415/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 16:44:31 INFO stats.py:314 | Epoch[243] Step[133] GlobalStep[33424] Training Speed: 450.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:41:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:44:33 INFO stats.py:394 | Epoch[243] completed. Training Speed: 314.30 samples/sec across all devices. Epoch Time: 55.79 sec. Average Epoch Time: 55.79 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:41:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:44:43 INFO stats.py:314 | Epoch[244] Step[21] GlobalStep[33449] Training Speed: 402.20 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:41:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:44:44 INFO loss_tracker.py:84 | Epoch[244/NA] Step[24] GlobalStep[33452/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0128] total_loss[0.0169] Rank[0/16] 06/24/2025 16:44:53 INFO stats.py:314 | Epoch[244] Step[46] GlobalStep[33474] Training Speed: 422.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:41:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:44:54 INFO loss_tracker.py:84 | Epoch[244/NA] Step[49] GlobalStep[33477/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 16:45:03 INFO stats.py:314 | Epoch[244] Step[71] GlobalStep[33499] Training Speed: 431.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:41:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:45:05 INFO loss_tracker.py:84 | Epoch[244/NA] Step[74] GlobalStep[33502/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:45:14 INFO stats.py:314 | Epoch[244] Step[96] GlobalStep[33524] Training Speed: 399.72 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:41:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:45:15 INFO loss_tracker.py:84 | Epoch[244/NA] Step[99] GlobalStep[33527/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:45:24 INFO stats.py:314 | Epoch[244] Step[121] GlobalStep[33549] Training Speed: 452.13 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:40:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:45:25 INFO loss_tracker.py:84 | Epoch[244/NA] Step[124] GlobalStep[33552/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0128] total_loss[0.0170] Rank[0/16] 06/24/2025 16:45:29 INFO stats.py:394 | Epoch[244] completed. Training Speed: 309.49 samples/sec across all devices. Epoch Time: 56.66 sec. Average Epoch Time: 56.66 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:40:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:45:34 INFO stats.py:314 | Epoch[245] Step[9] GlobalStep[33574] Training Speed: 434.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:40:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:45:41 INFO loss_tracker.py:84 | Epoch[245/NA] Step[24] GlobalStep[33589/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 16:45:45 INFO stats.py:314 | Epoch[245] Step[34] GlobalStep[33599] Training Speed: 434.05 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:40:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:45:51 INFO loss_tracker.py:84 | Epoch[245/NA] Step[49] GlobalStep[33614/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:45:55 INFO stats.py:314 | Epoch[245] Step[59] GlobalStep[33624] Training Speed: 429.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:40:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:46:02 INFO loss_tracker.py:84 | Epoch[245/NA] Step[74] GlobalStep[33639/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:46:06 INFO stats.py:314 | Epoch[245] Step[84] GlobalStep[33649] Training Speed: 418.49 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:40:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:46:12 INFO loss_tracker.py:84 | Epoch[245/NA] Step[99] GlobalStep[33664/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 16:46:16 INFO stats.py:314 | Epoch[245] Step[109] GlobalStep[33674] Training Speed: 423.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:40:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:46:22 INFO loss_tracker.py:84 | Epoch[245/NA] Step[124] GlobalStep[33689/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 16:46:26 INFO stats.py:314 | Epoch[245] Step[134] GlobalStep[33699] Training Speed: 448.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:39:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:46:26 INFO stats.py:394 | Epoch[245] completed. Training Speed: 307.09 samples/sec across all devices. Epoch Time: 57.10 sec. Average Epoch Time: 57.10 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:39:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:46:37 INFO stats.py:314 | Epoch[246] Step[22] GlobalStep[33724] Training Speed: 433.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:39:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:46:38 INFO loss_tracker.py:84 | Epoch[246/NA] Step[24] GlobalStep[33726/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:46:47 INFO stats.py:314 | Epoch[246] Step[47] GlobalStep[33749] Training Speed: 436.63 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:39:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:46:48 INFO loss_tracker.py:84 | Epoch[246/NA] Step[49] GlobalStep[33751/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:46:58 INFO stats.py:314 | Epoch[246] Step[72] GlobalStep[33774] Training Speed: 431.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:39:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:46:59 INFO loss_tracker.py:84 | Epoch[246/NA] Step[74] GlobalStep[33776/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:47:08 INFO stats.py:314 | Epoch[246] Step[97] GlobalStep[33799] Training Speed: 425.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:39:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:47:09 INFO loss_tracker.py:84 | Epoch[246/NA] Step[99] GlobalStep[33801/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:47:18 INFO stats.py:314 | Epoch[246] Step[122] GlobalStep[33824] Training Speed: 451.79 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:39:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:47:19 INFO loss_tracker.py:84 | Epoch[246/NA] Step[124] GlobalStep[33826/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 16:47:24 INFO stats.py:394 | Epoch[246] completed. Training Speed: 306.06 samples/sec across all devices. Epoch Time: 57.30 sec. Average Epoch Time: 57.30 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:38:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:47:29 INFO stats.py:314 | Epoch[247] Step[10] GlobalStep[33849] Training Speed: 432.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:38:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:47:35 INFO loss_tracker.py:84 | Epoch[247/NA] Step[24] GlobalStep[33863/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:47:40 INFO stats.py:314 | Epoch[247] Step[35] GlobalStep[33874] Training Speed: 434.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:38:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:47:45 INFO loss_tracker.py:84 | Epoch[247/NA] Step[49] GlobalStep[33888/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:47:49 INFO stats.py:314 | Epoch[247] Step[60] GlobalStep[33899] Training Speed: 432.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:38:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:47:56 INFO loss_tracker.py:84 | Epoch[247/NA] Step[74] GlobalStep[33913/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:48:00 INFO stats.py:314 | Epoch[247] Step[85] GlobalStep[33924] Training Speed: 446.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:38:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:48:06 INFO loss_tracker.py:84 | Epoch[247/NA] Step[99] GlobalStep[33938/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 16:48:10 INFO stats.py:314 | Epoch[247] Step[110] GlobalStep[33949] Training Speed: 423.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:38:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:48:16 INFO loss_tracker.py:84 | Epoch[247/NA] Step[124] GlobalStep[33963/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:48:20 INFO stats.py:314 | Epoch[247] Step[135] GlobalStep[33974] Training Speed: 452.37 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:37:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:48:21 INFO stats.py:394 | Epoch[247] completed. Training Speed: 306.64 samples/sec across all devices. Epoch Time: 57.19 sec. Average Epoch Time: 57.19 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:37:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:48:31 INFO stats.py:314 | Epoch[248] Step[23] GlobalStep[33999] Training Speed: 423.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:37:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:48:32 INFO loss_tracker.py:84 | Epoch[248/NA] Step[24] GlobalStep[34000/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:48:42 INFO stats.py:314 | Epoch[248] Step[48] GlobalStep[34024] Training Speed: 426.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:37:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:48:42 INFO loss_tracker.py:84 | Epoch[248/NA] Step[49] GlobalStep[34025/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0128] total_loss[0.0169] Rank[0/16] 06/24/2025 16:48:52 INFO stats.py:314 | Epoch[248] Step[73] GlobalStep[34049] Training Speed: 425.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:37:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:48:53 INFO loss_tracker.py:84 | Epoch[248/NA] Step[74] GlobalStep[34050/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 16:49:03 INFO stats.py:314 | Epoch[248] Step[98] GlobalStep[34074] Training Speed: 426.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:37:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:49:03 INFO loss_tracker.py:84 | Epoch[248/NA] Step[99] GlobalStep[34075/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 16:49:13 INFO stats.py:314 | Epoch[248] Step[123] GlobalStep[34099] Training Speed: 455.01 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:37:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:49:13 INFO loss_tracker.py:84 | Epoch[248/NA] Step[124] GlobalStep[34100/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:49:17 INFO stats.py:394 | Epoch[248] completed. Training Speed: 309.51 samples/sec across all devices. Epoch Time: 56.66 sec. Average Epoch Time: 56.66 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:37:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:49:23 INFO stats.py:314 | Epoch[249] Step[11] GlobalStep[34124] Training Speed: 431.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:36:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:49:29 INFO loss_tracker.py:84 | Epoch[249/NA] Step[24] GlobalStep[34137/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:49:34 INFO stats.py:314 | Epoch[249] Step[36] GlobalStep[34149] Training Speed: 434.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:36:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:49:40 INFO loss_tracker.py:84 | Epoch[249/NA] Step[49] GlobalStep[34162/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:49:44 INFO stats.py:314 | Epoch[249] Step[61] GlobalStep[34174] Training Speed: 433.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:36:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:49:49 INFO loss_tracker.py:84 | Epoch[249/NA] Step[74] GlobalStep[34187/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:49:55 INFO stats.py:314 | Epoch[249] Step[86] GlobalStep[34199] Training Speed: 434.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:36:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:00 INFO loss_tracker.py:84 | Epoch[249/NA] Step[99] GlobalStep[34212/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:50:04 INFO stats.py:314 | Epoch[249] Step[111] GlobalStep[34224] Training Speed: 419.43 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:36:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:09 INFO loss_tracker.py:84 | Epoch[249/NA] Step[124] GlobalStep[34237/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 16:50:14 INFO stats.py:314 | Epoch[249] Step[136] GlobalStep[34249] Training Speed: 450.44 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:36:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:14 INFO stats.py:394 | Epoch[249] completed. Training Speed: 312.24 samples/sec across all devices. Epoch Time: 56.16 sec. Average Epoch Time: 56.16 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:36:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:50:25 INFO stats.py:314 | Epoch[250] Step[24] GlobalStep[34274] Training Speed: 431.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:35:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:25 INFO loss_tracker.py:84 | Epoch[250/NA] Step[24] GlobalStep[34274/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 16:50:35 INFO stats.py:314 | Epoch[250] Step[49] GlobalStep[34299] Training Speed: 432.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:35:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:35 INFO loss_tracker.py:84 | Epoch[250/NA] Step[49] GlobalStep[34299/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 16:50:45 INFO stats.py:314 | Epoch[250] Step[74] GlobalStep[34324] Training Speed: 426.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:35:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:45 INFO loss_tracker.py:84 | Epoch[250/NA] Step[74] GlobalStep[34324/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:50:56 INFO stats.py:314 | Epoch[250] Step[99] GlobalStep[34349] Training Speed: 433.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:35:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:50:56 INFO loss_tracker.py:84 | Epoch[250/NA] Step[99] GlobalStep[34349/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:51:06 INFO stats.py:314 | Epoch[250] Step[124] GlobalStep[34374] Training Speed: 452.87 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:35:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:51:06 INFO loss_tracker.py:84 | Epoch[250/NA] Step[124] GlobalStep[34374/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0183] Rank[0/16] 06/24/2025 16:51:10 INFO stats.py:394 | Epoch[250] completed. Training Speed: 309.78 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:35:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:51:17 INFO stats.py:314 | Epoch[251] Step[12] GlobalStep[34399] Training Speed: 437.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:35:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:51:22 INFO loss_tracker.py:84 | Epoch[251/NA] Step[24] GlobalStep[34411/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:51:27 INFO stats.py:314 | Epoch[251] Step[37] GlobalStep[34424] Training Speed: 434.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:34:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:51:32 INFO loss_tracker.py:84 | Epoch[251/NA] Step[49] GlobalStep[34436/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:51:38 INFO stats.py:314 | Epoch[251] Step[62] GlobalStep[34449] Training Speed: 430.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:34:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:51:43 INFO loss_tracker.py:84 | Epoch[251/NA] Step[74] GlobalStep[34461/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:51:48 INFO stats.py:314 | Epoch[251] Step[87] GlobalStep[34474] Training Speed: 437.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:34:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:51:53 INFO loss_tracker.py:84 | Epoch[251/NA] Step[99] GlobalStep[34486/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:51:58 INFO stats.py:314 | Epoch[251] Step[112] GlobalStep[34499] Training Speed: 436.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:34:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:52:03 INFO loss_tracker.py:84 | Epoch[251/NA] Step[124] GlobalStep[34511/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:52:07 INFO stats.py:394 | Epoch[251] completed. Training Speed: 309.39 samples/sec across all devices. Epoch Time: 56.68 sec. Average Epoch Time: 56.68 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:34:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:52:08 INFO stats.py:314 | Epoch[252] Step[0] GlobalStep[34524] Training Speed: 355.11 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 7:34:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:52:19 INFO loss_tracker.py:84 | Epoch[252/NA] Step[24] GlobalStep[34548/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:52:19 INFO stats.py:314 | Epoch[252] Step[25] GlobalStep[34549] Training Speed: 421.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:33:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:52:29 INFO loss_tracker.py:84 | Epoch[252/NA] Step[49] GlobalStep[34573/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 16:52:29 INFO stats.py:314 | Epoch[252] Step[50] GlobalStep[34574] Training Speed: 433.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:33:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:52:40 INFO loss_tracker.py:84 | Epoch[252/NA] Step[74] GlobalStep[34598/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 16:52:40 INFO stats.py:314 | Epoch[252] Step[75] GlobalStep[34599] Training Speed: 424.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:33:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:52:50 INFO loss_tracker.py:84 | Epoch[252/NA] Step[99] GlobalStep[34623/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 16:52:50 INFO stats.py:314 | Epoch[252] Step[100] GlobalStep[34624] Training Speed: 428.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:33:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:00 INFO loss_tracker.py:84 | Epoch[252/NA] Step[124] GlobalStep[34648/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 16:53:00 INFO stats.py:314 | Epoch[252] Step[125] GlobalStep[34649] Training Speed: 436.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:33:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:04 INFO stats.py:394 | Epoch[252] completed. Training Speed: 307.29 samples/sec across all devices. Epoch Time: 57.07 sec. Average Epoch Time: 57.07 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:33:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:11 INFO stats.py:314 | Epoch[253] Step[13] GlobalStep[34674] Training Speed: 437.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:33:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:53:16 INFO loss_tracker.py:84 | Epoch[253/NA] Step[24] GlobalStep[34685/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 16:53:21 INFO stats.py:314 | Epoch[253] Step[38] GlobalStep[34699] Training Speed: 428.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:32:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:26 INFO loss_tracker.py:84 | Epoch[253/NA] Step[49] GlobalStep[34710/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 16:53:32 INFO stats.py:314 | Epoch[253] Step[63] GlobalStep[34724] Training Speed: 266.68 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 7:32:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:36 INFO loss_tracker.py:84 | Epoch[253/NA] Step[74] GlobalStep[34735/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:53:42 INFO stats.py:314 | Epoch[253] Step[88] GlobalStep[34749] Training Speed: 434.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:32:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:47 INFO loss_tracker.py:84 | Epoch[253/NA] Step[99] GlobalStep[34760/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:53:52 INFO stats.py:314 | Epoch[253] Step[113] GlobalStep[34774] Training Speed: 423.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:32:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:53:57 INFO loss_tracker.py:84 | Epoch[253/NA] Step[124] GlobalStep[34785/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] 
Rank[0/16] 06/24/2025 16:54:01 INFO stats.py:394 | Epoch[253] completed. Training Speed: 307.60 samples/sec across all devices. Epoch Time: 57.01 sec. Average Epoch Time: 57.01 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:32:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:54:03 INFO stats.py:314 | Epoch[254] Step[1] GlobalStep[34799] Training Speed: 432.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:32:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:54:13 INFO loss_tracker.py:84 | Epoch[254/NA] Step[24] GlobalStep[34822/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:54:13 INFO stats.py:314 | Epoch[254] Step[26] GlobalStep[34824] Training Speed: 430.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:32:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:54:23 INFO loss_tracker.py:84 | Epoch[254/NA] Step[49] GlobalStep[34847/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 16:54:24 INFO stats.py:314 | Epoch[254] Step[51] GlobalStep[34849] Training Speed: 424.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:31:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:54:33 INFO loss_tracker.py:84 | Epoch[254/NA] Step[74] GlobalStep[34872/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 16:54:34 INFO stats.py:314 | Epoch[254] Step[76] GlobalStep[34874] Training Speed: 430.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:31:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:54:43 INFO loss_tracker.py:84 | Epoch[254/NA] Step[99] GlobalStep[34897/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:54:44 INFO stats.py:314 | Epoch[254] Step[101] GlobalStep[34899] Training Speed: 433.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:31:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:54:54 INFO loss_tracker.py:84 | Epoch[254/NA] Step[124] GlobalStep[34922/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 16:54:54 INFO stats.py:314 | Epoch[254] Step[126] GlobalStep[34924] Training Speed: 450.78 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:31:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:54:58 INFO stats.py:394 | Epoch[254] completed. Training Speed: 308.49 samples/sec across all devices. Epoch Time: 56.84 sec. Average Epoch Time: 56.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:31:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:55:05 INFO stats.py:314 | Epoch[255] Step[14] GlobalStep[34949] Training Speed: 429.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:31:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:55:09 INFO loss_tracker.py:84 | Epoch[255/NA] Step[24] GlobalStep[34959/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 16:55:15 INFO stats.py:314 | Epoch[255] Step[39] GlobalStep[34974] Training Speed: 434.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:31:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:55:20 INFO loss_tracker.py:84 | Epoch[255/NA] Step[49] GlobalStep[34984/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0128] total_loss[0.0180] Rank[0/16] 06/24/2025 16:55:26 INFO stats.py:314 | Epoch[255] Step[64] GlobalStep[34999] Training Speed: 432.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:30:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:55:30 INFO loss_tracker.py:84 | Epoch[255/NA] Step[74] GlobalStep[35009/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:55:36 INFO stats.py:314 | Epoch[255] Step[89] GlobalStep[35024] Training Speed: 425.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:30:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:55:41 INFO loss_tracker.py:84 | Epoch[255/NA] Step[99] GlobalStep[35034/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:55:47 INFO stats.py:314 | Epoch[255] Step[114] GlobalStep[35049] Training Speed: 435.77 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:30:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:55:51 INFO loss_tracker.py:84 | Epoch[255/NA] Step[124] GlobalStep[35059/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 16:55:55 INFO stats.py:394 | Epoch[255] completed. Training Speed: 306.37 samples/sec across all devices. Epoch Time: 57.24 sec. Average Epoch Time: 57.24 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:30:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:55:58 INFO stats.py:314 | Epoch[256] Step[2] GlobalStep[35074] Training Speed: 427.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:30:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:56:07 INFO loss_tracker.py:84 | Epoch[256/NA] Step[24] GlobalStep[35096/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:56:08 INFO stats.py:314 | Epoch[256] Step[27] GlobalStep[35099] Training Speed: 420.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:30:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:56:17 INFO loss_tracker.py:84 | Epoch[256/NA] Step[49] GlobalStep[35121/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 16:56:18 INFO stats.py:314 | Epoch[256] Step[52] GlobalStep[35124] Training Speed: 430.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:29:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:56:27 INFO loss_tracker.py:84 | Epoch[256/NA] Step[74] GlobalStep[35146/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 16:56:28 INFO stats.py:314 | Epoch[256] Step[77] GlobalStep[35149] Training Speed: 431.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:29:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:56:37 INFO loss_tracker.py:84 | Epoch[256/NA] Step[99] GlobalStep[35171/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 16:56:38 INFO stats.py:314 | Epoch[256] Step[102] GlobalStep[35174] Training Speed: 435.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:29:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:56:47 INFO loss_tracker.py:84 | Epoch[256/NA] Step[124] GlobalStep[35196/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:56:48 INFO stats.py:314 | Epoch[256] Step[127] GlobalStep[35199] Training Speed: 452.52 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:29:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:56:52 INFO stats.py:394 | Epoch[256] completed. Training Speed: 308.55 samples/sec across all devices. Epoch Time: 56.83 sec. Average Epoch Time: 56.83 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:29:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:00 INFO stats.py:314 | Epoch[257] Step[15] GlobalStep[35224] Training Speed: 388.72 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 7:29:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:04 INFO loss_tracker.py:84 | Epoch[257/NA] Step[24] GlobalStep[35233/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 16:57:10 INFO stats.py:314 | Epoch[257] Step[40] GlobalStep[35249] Training Speed: 432.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:29:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:13 INFO loss_tracker.py:84 | Epoch[257/NA] Step[49] GlobalStep[35258/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:57:20 INFO stats.py:314 | Epoch[257] Step[65] GlobalStep[35274] Training Speed: 433.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:28:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:57:24 INFO loss_tracker.py:84 | Epoch[257/NA] Step[74] GlobalStep[35283/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:57:30 INFO stats.py:314 | Epoch[257] Step[90] GlobalStep[35299] Training Speed: 421.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:28:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:34 INFO loss_tracker.py:84 | Epoch[257/NA] Step[99] GlobalStep[35308/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 16:57:40 INFO stats.py:314 | Epoch[257] Step[115] GlobalStep[35324] Training Speed: 410.23 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:28:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:44 INFO loss_tracker.py:84 | Epoch[257/NA] Step[124] GlobalStep[35333/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:57:48 INFO stats.py:394 | Epoch[257] completed. Training Speed: 312.53 samples/sec across all devices. Epoch Time: 56.11 sec. Average Epoch Time: 56.11 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:28:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:51 INFO stats.py:314 | Epoch[258] Step[3] GlobalStep[35349] Training Speed: 408.13 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:28:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:57:59 INFO loss_tracker.py:84 | Epoch[258/NA] Step[24] GlobalStep[35370/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:58:01 INFO stats.py:314 | Epoch[258] Step[28] GlobalStep[35374] Training Speed: 430.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:28:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:58:10 INFO loss_tracker.py:84 | Epoch[258/NA] Step[49] GlobalStep[35395/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 16:58:11 INFO stats.py:314 | Epoch[258] Step[53] GlobalStep[35399] Training Speed: 428.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:28:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:58:20 INFO loss_tracker.py:84 | Epoch[258/NA] Step[74] GlobalStep[35420/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0128] total_loss[0.0182] Rank[0/16] 06/24/2025 16:58:22 INFO stats.py:314 | Epoch[258] Step[78] GlobalStep[35424] Training Speed: 418.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:27:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:58:31 INFO loss_tracker.py:84 | Epoch[258/NA] Step[99] GlobalStep[35445/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:58:32 INFO stats.py:314 | Epoch[258] Step[103] GlobalStep[35449] Training Speed: 433.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:27:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:58:41 INFO loss_tracker.py:84 | Epoch[258/NA] Step[124] GlobalStep[35470/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 16:58:42 INFO stats.py:314 | Epoch[258] Step[128] GlobalStep[35474] Training Speed: 448.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:27:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:58:45 INFO stats.py:394 | Epoch[258] completed. Training Speed: 306.99 samples/sec across all devices. Epoch Time: 57.12 sec. Average Epoch Time: 57.12 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:27:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:58:53 INFO stats.py:314 | Epoch[259] Step[16] GlobalStep[35499] Training Speed: 426.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:27:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:58:56 INFO loss_tracker.py:84 | Epoch[259/NA] Step[24] GlobalStep[35507/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 16:59:04 INFO stats.py:314 | Epoch[259] Step[41] GlobalStep[35524] Training Speed: 423.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:27:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:59:07 INFO loss_tracker.py:84 | Epoch[259/NA] Step[49] GlobalStep[35532/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 16:59:14 INFO stats.py:314 | Epoch[259] Step[66] GlobalStep[35549] Training Speed: 421.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:26:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:59:17 INFO loss_tracker.py:84 | Epoch[259/NA] Step[74] GlobalStep[35557/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:59:24 INFO stats.py:314 | Epoch[259] Step[91] GlobalStep[35574] Training Speed: 404.20 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:26:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:59:27 INFO loss_tracker.py:84 | Epoch[259/NA] Step[99] GlobalStep[35582/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 16:59:34 INFO stats.py:314 | Epoch[259] Step[116] GlobalStep[35599] Training Speed: 429.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:26:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 16:59:37 INFO loss_tracker.py:84 | Epoch[259/NA] Step[124] GlobalStep[35607/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 16:59:41 INFO stats.py:394 | Epoch[259] completed. Training Speed: 313.82 samples/sec across all devices. Epoch Time: 55.88 sec. Average Epoch Time: 55.88 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:26:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:59:44 INFO stats.py:314 | Epoch[260] Step[4] GlobalStep[35624] Training Speed: 428.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:26:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 16:59:52 INFO loss_tracker.py:84 | Epoch[260/NA] Step[24] GlobalStep[35644/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 16:59:54 INFO stats.py:314 | Epoch[260] Step[29] GlobalStep[35649] Training Speed: 405.73 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:26:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:00:03 INFO loss_tracker.py:84 | Epoch[260/NA] Step[49] GlobalStep[35669/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:00:05 INFO stats.py:314 | Epoch[260] Step[54] GlobalStep[35674] Training Speed: 428.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:26:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:00:13 INFO loss_tracker.py:84 | Epoch[260/NA] Step[74] GlobalStep[35694/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 17:00:15 INFO stats.py:314 | Epoch[260] Step[79] GlobalStep[35699] Training Speed: 430.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:25:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:00:23 INFO loss_tracker.py:84 | Epoch[260/NA] Step[99] GlobalStep[35719/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:00:25 INFO stats.py:314 | Epoch[260] Step[104] GlobalStep[35724] Training Speed: 431.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:25:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:00:33 INFO loss_tracker.py:84 | Epoch[260/NA] Step[124] GlobalStep[35744/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0176] Rank[0/16] 06/24/2025 17:00:35 INFO stats.py:314 | Epoch[260] Step[129] GlobalStep[35749] Training Speed: 447.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:25:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:00:37 INFO stats.py:394 | Epoch[260] completed. Training Speed: 312.18 samples/sec across all devices. Epoch Time: 56.17 sec. Average Epoch Time: 56.17 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:25:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:00:46 INFO stats.py:314 | Epoch[261] Step[17] GlobalStep[35774] Training Speed: 447.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:25:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:00:49 INFO loss_tracker.py:84 | Epoch[261/NA] Step[24] GlobalStep[35781/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:00:56 INFO stats.py:314 | Epoch[261] Step[42] GlobalStep[35799] Training Speed: 428.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:25:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:00:59 INFO loss_tracker.py:84 | Epoch[261/NA] Step[49] GlobalStep[35806/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:01:06 INFO stats.py:314 | Epoch[261] Step[67] GlobalStep[35824] Training Speed: 431.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:25:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:01:09 INFO loss_tracker.py:84 | Epoch[261/NA] Step[74] GlobalStep[35831/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:01:17 INFO stats.py:314 | Epoch[261] Step[92] GlobalStep[35849] Training Speed: 433.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:24:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:01:20 INFO loss_tracker.py:84 | Epoch[261/NA] Step[99] GlobalStep[35856/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:01:27 INFO stats.py:314 | Epoch[261] Step[117] GlobalStep[35874] Training Speed: 423.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:24:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:01:30 INFO loss_tracker.py:84 | Epoch[261/NA] Step[124] GlobalStep[35881/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:01:34 INFO stats.py:394 | Epoch[261] completed. Training Speed: 310.81 samples/sec across all devices. Epoch Time: 56.42 sec. Average Epoch Time: 56.42 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:24:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:01:37 INFO stats.py:314 | Epoch[262] Step[5] GlobalStep[35899] Training Speed: 433.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:24:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:01:45 INFO loss_tracker.py:84 | Epoch[262/NA] Step[24] GlobalStep[35918/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:01:48 INFO stats.py:314 | Epoch[262] Step[30] GlobalStep[35924] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:24:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:01:56 INFO loss_tracker.py:84 | Epoch[262/NA] Step[49] GlobalStep[35943/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:01:58 INFO stats.py:314 | Epoch[262] Step[55] GlobalStep[35949] Training Speed: 432.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:24:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:06 INFO loss_tracker.py:84 | Epoch[262/NA] Step[74] GlobalStep[35968/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 17:02:08 INFO stats.py:314 | Epoch[262] Step[80] GlobalStep[35974] Training Speed: 430.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:23:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:16 INFO loss_tracker.py:84 | Epoch[262/NA] Step[99] GlobalStep[35993/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:02:18 INFO stats.py:314 | Epoch[262] Step[105] GlobalStep[35999] Training Speed: 429.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:23:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:19 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 17:02:20 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_8 Rank[9/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[14/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[8/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[1/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[2/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[3/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[15/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[4/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[10/16] 06/24/2025 17:02:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[12/16] 06/24/2025 17:02:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[6/16] 06/24/2025 17:02:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[7/16] 06/24/2025 17:02:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[11/16] 06/24/2025 17:02:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[13/16] 06/24/2025 17:02:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to 
/job_data/checkpoints/checkpoint_8 Rank[5/16] 06/24/2025 17:02:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[0/16] 06/24/2025 17:02:21 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_8/model.safetensors Rank[0/16] 06/24/2025 17:02:22 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_8/optimizer.bin Rank[0/16] 06/24/2025 17:02:22 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_8/scheduler.bin Rank[0/16] 06/24/2025 17:02:22 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_8/sampler.bin Rank[0/16] 06/24/2025 17:02:22 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_8/random_states_0.pkl Rank[0/16] 06/24/2025 17:02:22 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_8/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 17:02:22 INFO checkpoint.py:110 | Save checkpoint at the end of step 35999 to /job_data/checkpoints/checkpoint_8 Rank[0/16] 06/24/2025 17:02:30 INFO loss_tracker.py:84 | Epoch[262/NA] Step[124] GlobalStep[36018/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:02:32 INFO stats.py:314 | Epoch[262] Step[130] GlobalStep[36024] Training Speed: 447.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:23:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:35 INFO stats.py:394 | Epoch[262] completed. Training Speed: 287.90 samples/sec across all devices. Epoch Time: 60.91 sec. Average Epoch Time: 60.91 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 7:23:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:44 INFO stats.py:314 | Epoch[263] Step[18] GlobalStep[36049] Training Speed: 430.02 samples/sec across all devices. 
Average Step Time: 0.30 sec. Estimated Remaining Time: 7:23:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:47 INFO loss_tracker.py:84 | Epoch[263/NA] Step[24] GlobalStep[36055/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 17:02:54 INFO stats.py:314 | Epoch[263] Step[43] GlobalStep[36074] Training Speed: 430.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:23:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:02:57 INFO loss_tracker.py:84 | Epoch[263/NA] Step[49] GlobalStep[36080/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:03:05 INFO stats.py:314 | Epoch[263] Step[68] GlobalStep[36099] Training Speed: 424.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:23:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:03:07 INFO loss_tracker.py:84 | Epoch[263/NA] Step[74] GlobalStep[36105/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:03:15 INFO stats.py:314 | Epoch[263] Step[93] GlobalStep[36124] Training Speed: 438.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:23:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:03:17 INFO loss_tracker.py:84 | Epoch[263/NA] Step[99] GlobalStep[36130/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:03:25 INFO stats.py:314 | Epoch[263] Step[118] GlobalStep[36149] Training Speed: 237.90 samples/sec across all devices. Average Step Time: 0.54 sec. Estimated Remaining Time: 7:22:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:03:27 INFO loss_tracker.py:84 | Epoch[263/NA] Step[124] GlobalStep[36155/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:03:31 INFO stats.py:394 | Epoch[263] completed. Training Speed: 308.21 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:22:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:03:35 INFO stats.py:314 | Epoch[264] Step[6] GlobalStep[36174] Training Speed: 427.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:22:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:03:43 INFO loss_tracker.py:84 | Epoch[264/NA] Step[24] GlobalStep[36192/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 17:03:46 INFO stats.py:314 | Epoch[264] Step[31] GlobalStep[36199] Training Speed: 420.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:22:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:03:53 INFO loss_tracker.py:84 | Epoch[264/NA] Step[49] GlobalStep[36217/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 17:03:56 INFO stats.py:314 | Epoch[264] Step[56] GlobalStep[36224] Training Speed: 432.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:22:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:04:03 INFO loss_tracker.py:84 | Epoch[264/NA] Step[74] GlobalStep[36242/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:04:06 INFO stats.py:314 | Epoch[264] Step[81] GlobalStep[36249] Training Speed: 425.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:22:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:04:13 INFO loss_tracker.py:84 | Epoch[264/NA] Step[99] GlobalStep[36267/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 17:04:16 INFO stats.py:314 | Epoch[264] Step[106] GlobalStep[36274] Training Speed: 427.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:21:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:04:23 INFO loss_tracker.py:84 | Epoch[264/NA] Step[124] GlobalStep[36292/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0127] total_loss[0.0184] Rank[0/16] 06/24/2025 17:04:26 INFO stats.py:314 | Epoch[264] Step[131] GlobalStep[36299] Training Speed: 454.04 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:21:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:04:28 INFO stats.py:394 | Epoch[264] completed. Training Speed: 310.56 samples/sec across all devices. Epoch Time: 56.47 sec. Average Epoch Time: 56.47 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:21:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:04:37 INFO stats.py:314 | Epoch[265] Step[19] GlobalStep[36324] Training Speed: 431.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:21:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:04:39 INFO loss_tracker.py:84 | Epoch[265/NA] Step[24] GlobalStep[36329/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:04:47 INFO stats.py:314 | Epoch[265] Step[44] GlobalStep[36349] Training Speed: 426.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:21:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:04:50 INFO loss_tracker.py:84 | Epoch[265/NA] Step[49] GlobalStep[36354/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:04:57 INFO stats.py:314 | Epoch[265] Step[69] GlobalStep[36374] Training Speed: 433.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:21:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:05:00 INFO loss_tracker.py:84 | Epoch[265/NA] Step[74] GlobalStep[36379/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:05:08 INFO stats.py:314 | Epoch[265] Step[94] GlobalStep[36399] Training Speed: 437.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:21:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:05:10 INFO loss_tracker.py:84 | Epoch[265/NA] Step[99] GlobalStep[36404/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:05:18 INFO stats.py:314 | Epoch[265] Step[119] GlobalStep[36424] Training Speed: 427.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:20:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:05:20 INFO loss_tracker.py:84 | Epoch[265/NA] Step[124] GlobalStep[36429/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:05:24 INFO stats.py:394 | Epoch[265] completed. Training Speed: 312.10 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:20:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:05:29 INFO stats.py:314 | Epoch[266] Step[7] GlobalStep[36449] Training Speed: 430.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:20:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:05:35 INFO loss_tracker.py:84 | Epoch[266/NA] Step[24] GlobalStep[36466/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:05:39 INFO stats.py:314 | Epoch[266] Step[32] GlobalStep[36474] Training Speed: 431.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:20:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:05:46 INFO loss_tracker.py:84 | Epoch[266/NA] Step[49] GlobalStep[36491/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 17:05:49 INFO stats.py:314 | Epoch[266] Step[57] GlobalStep[36499] Training Speed: 425.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:20:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:05:56 INFO loss_tracker.py:84 | Epoch[266/NA] Step[74] GlobalStep[36516/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:06:00 INFO stats.py:314 | Epoch[266] Step[82] GlobalStep[36524] Training Speed: 436.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:20:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:06:07 INFO loss_tracker.py:84 | Epoch[266/NA] Step[99] GlobalStep[36541/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0059] loss_depth[0.0127] total_loss[0.0187] Rank[0/16] 06/24/2025 17:06:10 INFO stats.py:314 | Epoch[266] Step[107] GlobalStep[36549] Training Speed: 423.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:20:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:06:17 INFO loss_tracker.py:84 | Epoch[266/NA] Step[124] GlobalStep[36566/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 17:06:20 INFO stats.py:314 | Epoch[266] Step[132] GlobalStep[36574] Training Speed: 447.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:19:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:06:21 INFO stats.py:394 | Epoch[266] completed. Training Speed: 307.24 samples/sec across all devices. Epoch Time: 57.08 sec. Average Epoch Time: 57.08 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:19:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:06:31 INFO stats.py:314 | Epoch[267] Step[20] GlobalStep[36599] Training Speed: 437.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:19:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:06:32 INFO loss_tracker.py:84 | Epoch[267/NA] Step[24] GlobalStep[36603/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:06:41 INFO stats.py:314 | Epoch[267] Step[45] GlobalStep[36624] Training Speed: 433.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:19:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:06:43 INFO loss_tracker.py:84 | Epoch[267/NA] Step[49] GlobalStep[36628/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:06:52 INFO stats.py:314 | Epoch[267] Step[70] GlobalStep[36649] Training Speed: 407.01 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:19:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:06:53 INFO loss_tracker.py:84 | Epoch[267/NA] Step[74] GlobalStep[36653/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:07:01 INFO stats.py:314 | Epoch[267] Step[95] GlobalStep[36674] Training Speed: 435.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:19:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:07:03 INFO loss_tracker.py:84 | Epoch[267/NA] Step[99] GlobalStep[36678/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:07:12 INFO stats.py:314 | Epoch[267] Step[120] GlobalStep[36699] Training Speed: 447.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:19:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:07:13 INFO loss_tracker.py:84 | Epoch[267/NA] Step[124] GlobalStep[36703/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 17:07:18 INFO stats.py:394 | Epoch[267] completed. Training Speed: 310.49 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:18:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:07:22 INFO stats.py:314 | Epoch[268] Step[8] GlobalStep[36724] Training Speed: 427.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:18:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:07:30 INFO loss_tracker.py:84 | Epoch[268/NA] Step[24] GlobalStep[36740/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:07:33 INFO stats.py:314 | Epoch[268] Step[33] GlobalStep[36749] Training Speed: 435.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:18:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:07:40 INFO loss_tracker.py:84 | Epoch[268/NA] Step[49] GlobalStep[36765/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:07:44 INFO stats.py:314 | Epoch[268] Step[58] GlobalStep[36774] Training Speed: 433.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:18:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:07:50 INFO loss_tracker.py:84 | Epoch[268/NA] Step[74] GlobalStep[36790/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:07:54 INFO stats.py:314 | Epoch[268] Step[83] GlobalStep[36799] Training Speed: 430.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:18:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:01 INFO loss_tracker.py:84 | Epoch[268/NA] Step[99] GlobalStep[36815/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:08:05 INFO stats.py:314 | Epoch[268] Step[108] GlobalStep[36824] Training Speed: 425.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:18:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:11 INFO loss_tracker.py:84 | Epoch[268/NA] Step[124] GlobalStep[36840/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:08:14 INFO stats.py:314 | Epoch[268] Step[133] GlobalStep[36849] Training Speed: 450.92 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:17:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:15 INFO stats.py:394 | Epoch[268] completed. Training Speed: 303.51 samples/sec across all devices. Epoch Time: 57.78 sec. Average Epoch Time: 57.78 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:17:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:08:26 INFO stats.py:314 | Epoch[269] Step[21] GlobalStep[36874] Training Speed: 435.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:17:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:27 INFO loss_tracker.py:84 | Epoch[269/NA] Step[24] GlobalStep[36877/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 17:08:36 INFO stats.py:314 | Epoch[269] Step[46] GlobalStep[36899] Training Speed: 431.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:17:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:38 INFO loss_tracker.py:84 | Epoch[269/NA] Step[49] GlobalStep[36902/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:08:46 INFO stats.py:314 | Epoch[269] Step[71] GlobalStep[36924] Training Speed: 434.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:17:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:48 INFO loss_tracker.py:84 | Epoch[269/NA] Step[74] GlobalStep[36927/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:08:57 INFO stats.py:314 | Epoch[269] Step[96] GlobalStep[36949] Training Speed: 422.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:17:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:08:59 INFO loss_tracker.py:84 | Epoch[269/NA] Step[99] GlobalStep[36952/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:09:07 INFO stats.py:314 | Epoch[269] Step[121] GlobalStep[36974] Training Speed: 448.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:17:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:09:09 INFO loss_tracker.py:84 | Epoch[269/NA] Step[124] GlobalStep[36977/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:09:13 INFO stats.py:394 | Epoch[269] completed. Training Speed: 305.42 samples/sec across all devices. Epoch Time: 57.42 sec. Average Epoch Time: 57.42 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:17:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:09:18 INFO stats.py:314 | Epoch[270] Step[9] GlobalStep[36999] Training Speed: 424.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:16:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:09:24 INFO loss_tracker.py:84 | Epoch[270/NA] Step[24] GlobalStep[37014/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:09:28 INFO stats.py:314 | Epoch[270] Step[34] GlobalStep[37024] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:16:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:09:34 INFO loss_tracker.py:84 | Epoch[270/NA] Step[49] GlobalStep[37039/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 17:09:38 INFO stats.py:314 | Epoch[270] Step[59] GlobalStep[37049] Training Speed: 434.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:16:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:09:44 INFO loss_tracker.py:84 | Epoch[270/NA] Step[74] GlobalStep[37064/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:09:48 INFO stats.py:314 | Epoch[270] Step[84] GlobalStep[37074] Training Speed: 427.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:16:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:09:55 INFO loss_tracker.py:84 | Epoch[270/NA] Step[99] GlobalStep[37089/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:09:59 INFO stats.py:314 | Epoch[270] Step[109] GlobalStep[37099] Training Speed: 430.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:16:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:10:04 INFO loss_tracker.py:84 | Epoch[270/NA] Step[124] GlobalStep[37114/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:10:08 INFO stats.py:314 | Epoch[270] Step[134] GlobalStep[37124] Training Speed: 452.07 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:16:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:10:09 INFO stats.py:394 | Epoch[270] completed. Training Speed: 313.96 samples/sec across all devices. Epoch Time: 55.86 sec. Average Epoch Time: 55.86 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:16:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:10:19 INFO stats.py:314 | Epoch[271] Step[22] GlobalStep[37149] Training Speed: 437.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:15:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:10:20 INFO loss_tracker.py:84 | Epoch[271/NA] Step[24] GlobalStep[37151/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:10:29 INFO stats.py:314 | Epoch[271] Step[47] GlobalStep[37174] Training Speed: 433.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:15:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:10:30 INFO loss_tracker.py:84 | Epoch[271/NA] Step[49] GlobalStep[37176/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:10:39 INFO stats.py:314 | Epoch[271] Step[72] GlobalStep[37199] Training Speed: 434.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:15:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:10:40 INFO loss_tracker.py:84 | Epoch[271/NA] Step[74] GlobalStep[37201/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:10:50 INFO stats.py:314 | Epoch[271] Step[97] GlobalStep[37224] Training Speed: 434.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:15:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:10:51 INFO loss_tracker.py:84 | Epoch[271/NA] Step[99] GlobalStep[37226/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:11:00 INFO stats.py:314 | Epoch[271] Step[122] GlobalStep[37249] Training Speed: 453.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:15:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:11:01 INFO loss_tracker.py:84 | Epoch[271/NA] Step[124] GlobalStep[37251/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:11:05 INFO stats.py:394 | Epoch[271] completed. Training Speed: 313.55 samples/sec across all devices. Epoch Time: 55.93 sec. Average Epoch Time: 55.93 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:15:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:11:10 INFO stats.py:314 | Epoch[272] Step[10] GlobalStep[37274] Training Speed: 434.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:14:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:11:16 INFO loss_tracker.py:84 | Epoch[272/NA] Step[24] GlobalStep[37288/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:11:20 INFO stats.py:314 | Epoch[272] Step[35] GlobalStep[37299] Training Speed: 422.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:14:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:11:26 INFO loss_tracker.py:84 | Epoch[272/NA] Step[49] GlobalStep[37313/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:11:31 INFO stats.py:314 | Epoch[272] Step[60] GlobalStep[37324] Training Speed: 431.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:14:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:11:36 INFO loss_tracker.py:84 | Epoch[272/NA] Step[74] GlobalStep[37338/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:11:40 INFO stats.py:314 | Epoch[272] Step[85] GlobalStep[37349] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:14:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:11:46 INFO loss_tracker.py:84 | Epoch[272/NA] Step[99] GlobalStep[37363/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:11:51 INFO stats.py:314 | Epoch[272] Step[110] GlobalStep[37374] Training Speed: 429.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:14:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:11:56 INFO loss_tracker.py:84 | Epoch[272/NA] Step[124] GlobalStep[37388/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:12:00 INFO stats.py:314 | Epoch[272] Step[135] GlobalStep[37399] Training Speed: 450.41 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:14:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:12:00 INFO stats.py:394 | Epoch[272] completed. Training Speed: 314.62 samples/sec across all devices. Epoch Time: 55.74 sec. Average Epoch Time: 55.74 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:14:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:12:11 INFO stats.py:314 | Epoch[273] Step[23] GlobalStep[37424] Training Speed: 435.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:13:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:12:12 INFO loss_tracker.py:84 | Epoch[273/NA] Step[24] GlobalStep[37425/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:12:21 INFO stats.py:314 | Epoch[273] Step[48] GlobalStep[37449] Training Speed: 431.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:13:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:12:22 INFO loss_tracker.py:84 | Epoch[273/NA] Step[49] GlobalStep[37450/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:12:32 INFO stats.py:314 | Epoch[273] Step[73] GlobalStep[37474] Training Speed: 424.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:13:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:12:33 INFO loss_tracker.py:84 | Epoch[273/NA] Step[74] GlobalStep[37475/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0171] Rank[0/16] 06/24/2025 17:12:42 INFO stats.py:314 | Epoch[273] Step[98] GlobalStep[37499] Training Speed: 430.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:13:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:12:42 INFO loss_tracker.py:84 | Epoch[273/NA] Step[99] GlobalStep[37500/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 17:12:53 INFO stats.py:314 | Epoch[273] Step[123] GlobalStep[37524] Training Speed: 450.61 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:13:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:12:53 INFO loss_tracker.py:84 | Epoch[273/NA] Step[124] GlobalStep[37525/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:12:57 INFO stats.py:394 | Epoch[273] completed. Training Speed: 308.90 samples/sec across all devices. Epoch Time: 56.77 sec. Average Epoch Time: 56.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:13:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:13:03 INFO stats.py:314 | Epoch[274] Step[11] GlobalStep[37549] Training Speed: 432.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:13:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:13:09 INFO loss_tracker.py:84 | Epoch[274/NA] Step[24] GlobalStep[37562/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:13:14 INFO stats.py:314 | Epoch[274] Step[36] GlobalStep[37574] Training Speed: 426.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:12:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:13:19 INFO loss_tracker.py:84 | Epoch[274/NA] Step[49] GlobalStep[37587/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:13:23 INFO stats.py:314 | Epoch[274] Step[61] GlobalStep[37599] Training Speed: 439.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:12:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:13:29 INFO loss_tracker.py:84 | Epoch[274/NA] Step[74] GlobalStep[37612/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:13:34 INFO stats.py:314 | Epoch[274] Step[86] GlobalStep[37624] Training Speed: 431.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:12:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:13:39 INFO loss_tracker.py:84 | Epoch[274/NA] Step[99] GlobalStep[37637/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:13:44 INFO stats.py:314 | Epoch[274] Step[111] GlobalStep[37649] Training Speed: 435.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:12:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:13:50 INFO loss_tracker.py:84 | Epoch[274/NA] Step[124] GlobalStep[37662/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:13:54 INFO stats.py:314 | Epoch[274] Step[136] GlobalStep[37674] Training Speed: 449.32 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:12:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:13:54 INFO stats.py:394 | Epoch[274] completed. Training Speed: 307.93 samples/sec across all devices. Epoch Time: 56.95 sec. Average Epoch Time: 56.95 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:12:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:14:05 INFO stats.py:314 | Epoch[275] Step[24] GlobalStep[37699] Training Speed: 437.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:11:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:14:05 INFO loss_tracker.py:84 | Epoch[275/NA] Step[24] GlobalStep[37699/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:14:15 INFO stats.py:314 | Epoch[275] Step[49] GlobalStep[37724] Training Speed: 433.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:11:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:14:15 INFO loss_tracker.py:84 | Epoch[275/NA] Step[49] GlobalStep[37724/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:14:26 INFO stats.py:314 | Epoch[275] Step[74] GlobalStep[37749] Training Speed: 391.69 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 7:11:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:14:26 INFO loss_tracker.py:84 | Epoch[275/NA] Step[74] GlobalStep[37749/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:14:36 INFO stats.py:314 | Epoch[275] Step[99] GlobalStep[37774] Training Speed: 431.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:11:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:14:36 INFO loss_tracker.py:84 | Epoch[275/NA] Step[99] GlobalStep[37774/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:14:46 INFO stats.py:314 | Epoch[275] Step[124] GlobalStep[37799] Training Speed: 450.70 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:11:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:14:46 INFO loss_tracker.py:84 | Epoch[275/NA] Step[124] GlobalStep[37799/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0128] total_loss[0.0168] Rank[0/16] 06/24/2025 17:14:50 INFO stats.py:394 | Epoch[275] completed. Training Speed: 311.57 samples/sec across all devices. Epoch Time: 56.28 sec. Average Epoch Time: 56.28 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:11:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:14:56 INFO stats.py:314 | Epoch[276] Step[12] GlobalStep[37824] Training Speed: 432.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:11:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:15:01 INFO loss_tracker.py:84 | Epoch[276/NA] Step[24] GlobalStep[37836/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0177] Rank[0/16] 06/24/2025 17:15:07 INFO stats.py:314 | Epoch[276] Step[37] GlobalStep[37849] Training Speed: 429.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:10:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:15:12 INFO loss_tracker.py:84 | Epoch[276/NA] Step[49] GlobalStep[37861/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:15:17 INFO stats.py:314 | Epoch[276] Step[62] GlobalStep[37874] Training Speed: 437.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:10:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:15:22 INFO loss_tracker.py:84 | Epoch[276/NA] Step[74] GlobalStep[37886/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:15:27 INFO stats.py:314 | Epoch[276] Step[87] GlobalStep[37899] Training Speed: 434.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:10:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:15:32 INFO loss_tracker.py:84 | Epoch[276/NA] Step[99] GlobalStep[37911/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:15:37 INFO stats.py:314 | Epoch[276] Step[112] GlobalStep[37924] Training Speed: 433.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:10:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:15:42 INFO loss_tracker.py:84 | Epoch[276/NA] Step[124] GlobalStep[37936/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0057] loss_depth[0.0127] total_loss[0.0185] Rank[0/16] 06/24/2025 17:15:46 INFO stats.py:394 | Epoch[276] completed. Training Speed: 313.06 samples/sec across all devices. Epoch Time: 56.01 sec. Average Epoch Time: 56.01 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:10:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:15:47 INFO stats.py:314 | Epoch[277] Step[0] GlobalStep[37949] Training Speed: 366.24 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 7:10:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:15:58 INFO loss_tracker.py:84 | Epoch[277/NA] Step[24] GlobalStep[37973/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 17:15:58 INFO stats.py:314 | Epoch[277] Step[25] GlobalStep[37974] Training Speed: 408.85 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:10:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:16:08 INFO loss_tracker.py:84 | Epoch[277/NA] Step[49] GlobalStep[37998/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:16:09 INFO stats.py:314 | Epoch[277] Step[50] GlobalStep[37999] Training Speed: 434.69 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:09:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:16:18 INFO loss_tracker.py:84 | Epoch[277/NA] Step[74] GlobalStep[38023/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:16:19 INFO stats.py:314 | Epoch[277] Step[75] GlobalStep[38024] Training Speed: 431.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:09:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:16:29 INFO loss_tracker.py:84 | Epoch[277/NA] Step[99] GlobalStep[38048/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:16:30 INFO stats.py:314 | Epoch[277] Step[100] GlobalStep[38049] Training Speed: 432.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:09:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:16:39 INFO loss_tracker.py:84 | Epoch[277/NA] Step[124] GlobalStep[38073/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:16:39 INFO stats.py:314 | Epoch[277] Step[125] GlobalStep[38074] Training Speed: 423.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:09:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:16:44 INFO stats.py:394 | Epoch[277] completed. Training Speed: 305.84 samples/sec across all devices. Epoch Time: 57.34 sec. Average Epoch Time: 57.34 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:09:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:16:51 INFO stats.py:314 | Epoch[278] Step[13] GlobalStep[38099] Training Speed: 432.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:09:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:16:55 INFO loss_tracker.py:84 | Epoch[278/NA] Step[24] GlobalStep[38110/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:17:00 INFO stats.py:314 | Epoch[278] Step[38] GlobalStep[38124] Training Speed: 437.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:09:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:17:05 INFO loss_tracker.py:84 | Epoch[278/NA] Step[49] GlobalStep[38135/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:17:11 INFO stats.py:314 | Epoch[278] Step[63] GlobalStep[38149] Training Speed: 429.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:08:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:17:15 INFO loss_tracker.py:84 | Epoch[278/NA] Step[74] GlobalStep[38160/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:17:21 INFO stats.py:314 | Epoch[278] Step[88] GlobalStep[38174] Training Speed: 431.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:08:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:17:26 INFO loss_tracker.py:84 | Epoch[278/NA] Step[99] GlobalStep[38185/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:17:31 INFO stats.py:314 | Epoch[278] Step[113] GlobalStep[38199] Training Speed: 426.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:08:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:17:36 INFO loss_tracker.py:84 | Epoch[278/NA] Step[124] GlobalStep[38210/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:17:40 INFO stats.py:394 | Epoch[278] completed. 
Training Speed: 312.50 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:08:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:17:42 INFO stats.py:314 | Epoch[279] Step[1] GlobalStep[38224] Training Speed: 435.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:08:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:17:51 INFO loss_tracker.py:84 | Epoch[279/NA] Step[24] GlobalStep[38247/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:17:52 INFO stats.py:314 | Epoch[279] Step[26] GlobalStep[38249] Training Speed: 407.09 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:08:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:18:02 INFO loss_tracker.py:84 | Epoch[279/NA] Step[49] GlobalStep[38272/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0128] total_loss[0.0170] Rank[0/16] 06/24/2025 17:18:02 INFO stats.py:314 | Epoch[279] Step[51] GlobalStep[38274] Training Speed: 436.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:07:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:18:12 INFO loss_tracker.py:84 | Epoch[279/NA] Step[74] GlobalStep[38297/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:18:13 INFO stats.py:314 | Epoch[279] Step[76] GlobalStep[38299] Training Speed: 438.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:07:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:18:22 INFO loss_tracker.py:84 | Epoch[279/NA] Step[99] GlobalStep[38322/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:18:23 INFO stats.py:314 | Epoch[279] Step[101] GlobalStep[38324] Training Speed: 433.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:07:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:18:32 INFO loss_tracker.py:84 | Epoch[279/NA] Step[124] GlobalStep[38347/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:18:33 INFO stats.py:314 | Epoch[279] Step[126] GlobalStep[38349] Training Speed: 448.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:07:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:18:37 INFO stats.py:394 | Epoch[279] completed. Training Speed: 309.37 samples/sec across all devices. Epoch Time: 56.68 sec. Average Epoch Time: 56.68 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:07:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:18:43 INFO stats.py:314 | Epoch[280] Step[14] GlobalStep[38374] Training Speed: 389.04 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 7:07:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:18:48 INFO loss_tracker.py:84 | Epoch[280/NA] Step[24] GlobalStep[38384/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:18:54 INFO stats.py:314 | Epoch[280] Step[39] GlobalStep[38399] Training Speed: 428.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:07:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:18:58 INFO loss_tracker.py:84 | Epoch[280/NA] Step[49] GlobalStep[38409/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:19:04 INFO stats.py:314 | Epoch[280] Step[64] GlobalStep[38424] Training Speed: 433.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:06:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:19:08 INFO loss_tracker.py:84 | Epoch[280/NA] Step[74] GlobalStep[38434/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:19:14 INFO stats.py:314 | Epoch[280] Step[89] GlobalStep[38449] Training Speed: 430.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:06:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:19:18 INFO loss_tracker.py:84 | Epoch[280/NA] Step[99] GlobalStep[38459/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0128] total_loss[0.0181] Rank[0/16] 06/24/2025 17:19:24 INFO stats.py:314 | Epoch[280] Step[114] GlobalStep[38474] Training Speed: 438.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:06:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:19:29 INFO loss_tracker.py:84 | Epoch[280/NA] Step[124] GlobalStep[38484/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:19:33 INFO stats.py:394 | Epoch[280] completed. Training Speed: 310.03 samples/sec across all devices. Epoch Time: 56.56 sec. Average Epoch Time: 56.56 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:06:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:19:35 INFO stats.py:314 | Epoch[281] Step[2] GlobalStep[38499] Training Speed: 433.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:06:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:19:44 INFO loss_tracker.py:84 | Epoch[281/NA] Step[24] GlobalStep[38521/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:19:45 INFO stats.py:314 | Epoch[281] Step[27] GlobalStep[38524] Training Speed: 432.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:06:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:19:55 INFO loss_tracker.py:84 | Epoch[281/NA] Step[49] GlobalStep[38546/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:19:56 INFO stats.py:314 | Epoch[281] Step[52] GlobalStep[38549] Training Speed: 429.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:06:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:20:05 INFO loss_tracker.py:84 | Epoch[281/NA] Step[74] GlobalStep[38571/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:20:06 INFO stats.py:314 | Epoch[281] Step[77] GlobalStep[38574] Training Speed: 412.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:05:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:20:16 INFO loss_tracker.py:84 | Epoch[281/NA] Step[99] GlobalStep[38596/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:20:17 INFO stats.py:314 | Epoch[281] Step[102] GlobalStep[38599] Training Speed: 413.32 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:05:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:20:25 INFO loss_tracker.py:84 | Epoch[281/NA] Step[124] GlobalStep[38621/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:20:26 INFO stats.py:314 | Epoch[281] Step[127] GlobalStep[38624] Training Speed: 449.79 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:05:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:20:30 INFO stats.py:394 | Epoch[281] completed. Training Speed: 307.99 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:05:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:20:37 INFO stats.py:314 | Epoch[282] Step[15] GlobalStep[38649] Training Speed: 428.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:05:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:20:41 INFO loss_tracker.py:84 | Epoch[282/NA] Step[24] GlobalStep[38658/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:20:48 INFO stats.py:314 | Epoch[282] Step[40] GlobalStep[38674] Training Speed: 434.80 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:05:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:20:52 INFO loss_tracker.py:84 | Epoch[282/NA] Step[49] GlobalStep[38683/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0127] total_loss[0.0183] Rank[0/16] 06/24/2025 17:20:58 INFO stats.py:314 | Epoch[282] Step[65] GlobalStep[38699] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:04:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:21:02 INFO loss_tracker.py:84 | Epoch[282/NA] Step[74] GlobalStep[38708/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:21:09 INFO stats.py:314 | Epoch[282] Step[90] GlobalStep[38724] Training Speed: 434.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:04:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:21:12 INFO loss_tracker.py:84 | Epoch[282/NA] Step[99] GlobalStep[38733/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 17:21:19 INFO stats.py:314 | Epoch[282] Step[115] GlobalStep[38749] Training Speed: 423.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:04:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:21:22 INFO loss_tracker.py:84 | Epoch[282/NA] Step[124] GlobalStep[38758/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 17:21:26 INFO stats.py:394 | Epoch[282] completed. Training Speed: 313.63 samples/sec across all devices. Epoch Time: 55.91 sec. Average Epoch Time: 55.91 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:04:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:21:29 INFO stats.py:314 | Epoch[283] Step[3] GlobalStep[38774] Training Speed: 418.84 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:04:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:21:38 INFO loss_tracker.py:84 | Epoch[283/NA] Step[24] GlobalStep[38795/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:21:39 INFO stats.py:314 | Epoch[283] Step[28] GlobalStep[38799] Training Speed: 406.11 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 7:04:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:21:48 INFO loss_tracker.py:84 | Epoch[283/NA] Step[49] GlobalStep[38820/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:21:49 INFO stats.py:314 | Epoch[283] Step[53] GlobalStep[38824] Training Speed: 413.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:04:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:21:58 INFO loss_tracker.py:84 | Epoch[283/NA] Step[74] GlobalStep[38845/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:21:59 INFO stats.py:314 | Epoch[283] Step[78] GlobalStep[38849] Training Speed: 419.06 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 7:03:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:22:08 INFO loss_tracker.py:84 | Epoch[283/NA] Step[99] GlobalStep[38870/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0128] total_loss[0.0174] Rank[0/16] 06/24/2025 17:22:10 INFO stats.py:314 | Epoch[283] Step[103] GlobalStep[38874] Training Speed: 261.29 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 7:03:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:22:18 INFO loss_tracker.py:84 | Epoch[283/NA] Step[124] GlobalStep[38895/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:22:19 INFO stats.py:314 | Epoch[283] Step[128] GlobalStep[38899] Training Speed: 449.60 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:03:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:22:22 INFO stats.py:394 | Epoch[283] completed. Training Speed: 313.48 samples/sec across all devices. Epoch Time: 55.94 sec. Average Epoch Time: 55.94 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:03:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:22:30 INFO stats.py:314 | Epoch[284] Step[16] GlobalStep[38924] Training Speed: 431.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:03:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:22:34 INFO loss_tracker.py:84 | Epoch[284/NA] Step[24] GlobalStep[38932/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:22:40 INFO stats.py:314 | Epoch[284] Step[41] GlobalStep[38949] Training Speed: 433.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:03:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:22:43 INFO loss_tracker.py:84 | Epoch[284/NA] Step[49] GlobalStep[38957/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:22:50 INFO stats.py:314 | Epoch[284] Step[66] GlobalStep[38974] Training Speed: 386.53 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 7:03:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:22:54 INFO loss_tracker.py:84 | Epoch[284/NA] Step[74] GlobalStep[38982/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:23:00 INFO stats.py:314 | Epoch[284] Step[91] GlobalStep[38999] Training Speed: 433.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:02:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:23:04 INFO loss_tracker.py:84 | Epoch[284/NA] Step[99] GlobalStep[39007/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:23:11 INFO stats.py:314 | Epoch[284] Step[116] GlobalStep[39024] Training Speed: 421.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:02:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:23:14 INFO loss_tracker.py:84 | Epoch[284/NA] Step[124] GlobalStep[39032/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:23:18 INFO stats.py:394 | Epoch[284] completed. Training Speed: 311.20 samples/sec across all devices. Epoch Time: 56.35 sec. Average Epoch Time: 56.35 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:02:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:23:21 INFO stats.py:314 | Epoch[285] Step[4] GlobalStep[39049] Training Speed: 431.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:02:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:23:30 INFO loss_tracker.py:84 | Epoch[285/NA] Step[24] GlobalStep[39069/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:23:32 INFO stats.py:314 | Epoch[285] Step[29] GlobalStep[39074] Training Speed: 432.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:02:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:23:40 INFO loss_tracker.py:84 | Epoch[285/NA] Step[49] GlobalStep[39094/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:23:42 INFO stats.py:314 | Epoch[285] Step[54] GlobalStep[39099] Training Speed: 420.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:02:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:23:50 INFO loss_tracker.py:84 | Epoch[285/NA] Step[74] GlobalStep[39119/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:23:53 INFO stats.py:314 | Epoch[285] Step[79] GlobalStep[39124] Training Speed: 253.98 samples/sec across all devices. Average Step Time: 0.50 sec. Estimated Remaining Time: 7:01:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:24:00 INFO loss_tracker.py:84 | Epoch[285/NA] Step[99] GlobalStep[39144/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:24:02 INFO stats.py:314 | Epoch[285] Step[104] GlobalStep[39149] Training Speed: 434.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:01:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:24:11 INFO loss_tracker.py:84 | Epoch[285/NA] Step[124] GlobalStep[39169/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:24:12 INFO stats.py:314 | Epoch[285] Step[129] GlobalStep[39174] Training Speed: 449.23 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 7:01:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:24:15 INFO stats.py:394 | Epoch[285] completed. Training Speed: 306.66 samples/sec across all devices. Epoch Time: 57.18 sec. Average Epoch Time: 57.18 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 7:01:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:24:23 INFO stats.py:314 | Epoch[286] Step[17] GlobalStep[39199] Training Speed: 434.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:01:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:24:27 INFO loss_tracker.py:84 | Epoch[286/NA] Step[24] GlobalStep[39206/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:24:34 INFO stats.py:314 | Epoch[286] Step[42] GlobalStep[39224] Training Speed: 434.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:01:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:24:37 INFO loss_tracker.py:84 | Epoch[286/NA] Step[49] GlobalStep[39231/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:24:44 INFO stats.py:314 | Epoch[286] Step[67] GlobalStep[39249] Training Speed: 381.79 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 7:01:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:24:47 INFO loss_tracker.py:84 | Epoch[286/NA] Step[74] GlobalStep[39256/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:24:55 INFO stats.py:314 | Epoch[286] Step[92] GlobalStep[39274] Training Speed: 436.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:00:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:24:57 INFO loss_tracker.py:84 | Epoch[286/NA] Step[99] GlobalStep[39281/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:25:05 INFO stats.py:314 | Epoch[286] Step[117] GlobalStep[39299] Training Speed: 433.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:00:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:25:07 INFO loss_tracker.py:84 | Epoch[286/NA] Step[124] GlobalStep[39306/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:25:11 INFO stats.py:394 | Epoch[286] completed. Training Speed: 313.01 samples/sec across all devices. Epoch Time: 56.02 sec. Average Epoch Time: 56.02 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 7:00:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:25:15 INFO stats.py:314 | Epoch[287] Step[5] GlobalStep[39324] Training Speed: 435.63 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:00:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:25:22 INFO loss_tracker.py:84 | Epoch[287/NA] Step[24] GlobalStep[39343/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:25:25 INFO stats.py:314 | Epoch[287] Step[30] GlobalStep[39349] Training Speed: 421.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:00:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:25:33 INFO loss_tracker.py:84 | Epoch[287/NA] Step[49] GlobalStep[39368/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:25:35 INFO stats.py:314 | Epoch[287] Step[55] GlobalStep[39374] Training Speed: 434.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 7:00:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:25:42 INFO loss_tracker.py:84 | Epoch[287/NA] Step[74] GlobalStep[39393/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:25:45 INFO stats.py:314 | Epoch[287] Step[80] GlobalStep[39399] Training Speed: 432.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 7:00:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:25:52 INFO loss_tracker.py:84 | Epoch[287/NA] Step[99] GlobalStep[39418/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:25:55 INFO stats.py:314 | Epoch[287] Step[105] GlobalStep[39424] Training Speed: 430.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:59:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:26:02 INFO loss_tracker.py:84 | Epoch[287/NA] Step[124] GlobalStep[39443/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:26:04 INFO stats.py:314 | Epoch[287] Step[130] GlobalStep[39449] Training Speed: 449.48 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:59:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:26:06 INFO stats.py:394 | Epoch[287] completed. Training Speed: 318.72 samples/sec across all devices. Epoch Time: 55.02 sec. Average Epoch Time: 55.02 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 6:59:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:26:15 INFO stats.py:314 | Epoch[288] Step[18] GlobalStep[39474] Training Speed: 433.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:59:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:26:17 INFO loss_tracker.py:84 | Epoch[288/NA] Step[24] GlobalStep[39480/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0183] Rank[0/16] 06/24/2025 17:26:26 INFO stats.py:314 | Epoch[288] Step[43] GlobalStep[39499] Training Speed: 419.56 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:59:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:26:28 INFO loss_tracker.py:84 | Epoch[288/NA] Step[49] GlobalStep[39505/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:26:36 INFO stats.py:314 | Epoch[288] Step[68] GlobalStep[39524] Training Speed: 432.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:59:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:26:38 INFO loss_tracker.py:84 | Epoch[288/NA] Step[74] GlobalStep[39530/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:26:46 INFO stats.py:314 | Epoch[288] Step[93] GlobalStep[39549] Training Speed: 431.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:58:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:26:48 INFO loss_tracker.py:84 | Epoch[288/NA] Step[99] GlobalStep[39555/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:26:56 INFO stats.py:314 | Epoch[288] Step[118] GlobalStep[39574] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:58:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:26:58 INFO loss_tracker.py:84 | Epoch[288/NA] Step[124] GlobalStep[39580/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:27:02 INFO stats.py:394 | Epoch[288] completed. Training Speed: 313.47 samples/sec across all devices. Epoch Time: 55.94 sec. Average Epoch Time: 55.94 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:58:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:27:06 INFO stats.py:314 | Epoch[289] Step[6] GlobalStep[39599] Training Speed: 427.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:58:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:27:14 INFO loss_tracker.py:84 | Epoch[289/NA] Step[24] GlobalStep[39617/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:27:16 INFO stats.py:314 | Epoch[289] Step[31] GlobalStep[39624] Training Speed: 431.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:58:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:27:24 INFO loss_tracker.py:84 | Epoch[289/NA] Step[49] GlobalStep[39642/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 17:27:27 INFO stats.py:314 | Epoch[289] Step[56] GlobalStep[39649] Training Speed: 426.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:58:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:27:35 INFO loss_tracker.py:84 | Epoch[289/NA] Step[74] GlobalStep[39667/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:27:37 INFO stats.py:314 | Epoch[289] Step[81] GlobalStep[39674] Training Speed: 431.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:58:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:27:45 INFO loss_tracker.py:84 | Epoch[289/NA] Step[99] GlobalStep[39692/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:27:48 INFO stats.py:314 | Epoch[289] Step[106] GlobalStep[39699] Training Speed: 431.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:57:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:27:55 INFO loss_tracker.py:84 | Epoch[289/NA] Step[124] GlobalStep[39717/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:27:57 INFO stats.py:314 | Epoch[289] Step[131] GlobalStep[39724] Training Speed: 449.45 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:57:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:27:59 INFO stats.py:394 | Epoch[289] completed. Training Speed: 310.08 samples/sec across all devices. Epoch Time: 56.55 sec. Average Epoch Time: 56.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:57:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:28:08 INFO stats.py:314 | Epoch[290] Step[19] GlobalStep[39749] Training Speed: 426.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:57:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:28:11 INFO loss_tracker.py:84 | Epoch[290/NA] Step[24] GlobalStep[39754/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:28:18 INFO stats.py:314 | Epoch[290] Step[44] GlobalStep[39774] Training Speed: 426.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:57:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:28:21 INFO loss_tracker.py:84 | Epoch[290/NA] Step[49] GlobalStep[39779/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:28:29 INFO stats.py:314 | Epoch[290] Step[69] GlobalStep[39799] Training Speed: 429.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:57:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:28:31 INFO loss_tracker.py:84 | Epoch[290/NA] Step[74] GlobalStep[39804/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:28:39 INFO stats.py:314 | Epoch[290] Step[94] GlobalStep[39824] Training Speed: 434.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:57:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:28:41 INFO loss_tracker.py:84 | Epoch[290/NA] Step[99] GlobalStep[39829/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:28:50 INFO stats.py:314 | Epoch[290] Step[119] GlobalStep[39849] Training Speed: 431.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:56:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:28:52 INFO loss_tracker.py:84 | Epoch[290/NA] Step[124] GlobalStep[39854/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:28:56 INFO stats.py:394 | Epoch[290] completed. Training Speed: 307.80 samples/sec across all devices. Epoch Time: 56.97 sec. Average Epoch Time: 56.97 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:56:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:29:00 INFO stats.py:314 | Epoch[291] Step[7] GlobalStep[39874] Training Speed: 424.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:56:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:29:08 INFO loss_tracker.py:84 | Epoch[291/NA] Step[24] GlobalStep[39891/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:29:11 INFO stats.py:314 | Epoch[291] Step[32] GlobalStep[39899] Training Speed: 434.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:56:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:29:18 INFO loss_tracker.py:84 | Epoch[291/NA] Step[49] GlobalStep[39916/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:29:21 INFO stats.py:314 | Epoch[291] Step[57] GlobalStep[39924] Training Speed: 432.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:56:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:29:29 INFO loss_tracker.py:84 | Epoch[291/NA] Step[74] GlobalStep[39941/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:29:32 INFO stats.py:314 | Epoch[291] Step[82] GlobalStep[39949] Training Speed: 419.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:56:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:29:39 INFO loss_tracker.py:84 | Epoch[291/NA] Step[99] GlobalStep[39966/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:29:42 INFO stats.py:314 | Epoch[291] Step[107] GlobalStep[39974] Training Speed: 417.48 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:55:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:29:49 INFO loss_tracker.py:84 | Epoch[291/NA] Step[124] GlobalStep[39991/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:29:52 INFO stats.py:314 | Epoch[291] Step[132] GlobalStep[39999] Training Speed: 445.33 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:55:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:29:52 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 17:29:53 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_9 Rank[13/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[11/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[10/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[9/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[14/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[5/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[8/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to 
/job_data/checkpoints/checkpoint_9 Rank[15/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[12/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[4/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[7/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[3/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[1/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[6/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[2/16] 06/24/2025 17:29:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[0/16] 06/24/2025 17:29:54 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_9/model.safetensors Rank[0/16] 06/24/2025 17:29:55 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_9/optimizer.bin Rank[0/16] 06/24/2025 17:29:55 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_9/scheduler.bin Rank[0/16] 06/24/2025 17:29:55 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_9/sampler.bin Rank[0/16] 06/24/2025 17:29:55 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_9/random_states_0.pkl Rank[0/16] 06/24/2025 17:29:55 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_9/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 17:29:55 INFO 
checkpoint.py:110 | Save checkpoint at the end of step 39999 to /job_data/checkpoints/checkpoint_9 Rank[0/16] 06/24/2025 17:29:57 INFO stats.py:394 | Epoch[291] completed. Training Speed: 289.00 samples/sec across all devices. Epoch Time: 60.68 sec. Average Epoch Time: 60.68 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 6:55:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:30:06 INFO stats.py:314 | Epoch[292] Step[20] GlobalStep[40024] Training Speed: 427.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:55:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:30:08 INFO loss_tracker.py:84 | Epoch[292/NA] Step[24] GlobalStep[40028/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:30:17 INFO stats.py:314 | Epoch[292] Step[45] GlobalStep[40049] Training Speed: 432.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:55:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:30:18 INFO loss_tracker.py:84 | Epoch[292/NA] Step[49] GlobalStep[40053/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:30:27 INFO stats.py:314 | Epoch[292] Step[70] GlobalStep[40074] Training Speed: 411.33 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:55:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:30:29 INFO loss_tracker.py:84 | Epoch[292/NA] Step[74] GlobalStep[40078/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:30:37 INFO stats.py:314 | Epoch[292] Step[95] GlobalStep[40099] Training Speed: 406.47 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:55:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:30:39 INFO loss_tracker.py:84 | Epoch[292/NA] Step[99] GlobalStep[40103/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:30:47 INFO stats.py:314 | Epoch[292] Step[120] GlobalStep[40124] Training Speed: 433.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:55:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:30:49 INFO loss_tracker.py:84 | Epoch[292/NA] Step[124] GlobalStep[40128/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:30:54 INFO stats.py:394 | Epoch[292] completed. Training Speed: 308.09 samples/sec across all devices. Epoch Time: 56.92 sec. Average Epoch Time: 56.92 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:54:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:30:58 INFO stats.py:314 | Epoch[293] Step[8] GlobalStep[40149] Training Speed: 433.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:54:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:31:05 INFO loss_tracker.py:84 | Epoch[293/NA] Step[24] GlobalStep[40165/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:31:09 INFO stats.py:314 | Epoch[293] Step[33] GlobalStep[40174] Training Speed: 420.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:54:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:31:15 INFO loss_tracker.py:84 | Epoch[293/NA] Step[49] GlobalStep[40190/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0128] total_loss[0.0173] Rank[0/16] 06/24/2025 17:31:19 INFO stats.py:314 | Epoch[293] Step[58] GlobalStep[40199] Training Speed: 429.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:54:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:31:25 INFO loss_tracker.py:84 | Epoch[293/NA] Step[74] GlobalStep[40215/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:31:29 INFO stats.py:314 | Epoch[293] Step[83] GlobalStep[40224] Training Speed: 431.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:54:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:31:36 INFO loss_tracker.py:84 | Epoch[293/NA] Step[99] GlobalStep[40240/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 17:31:39 INFO stats.py:314 | Epoch[293] Step[108] GlobalStep[40249] Training Speed: 436.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:54:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:31:45 INFO loss_tracker.py:84 | Epoch[293/NA] Step[124] GlobalStep[40265/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:31:49 INFO stats.py:314 | Epoch[293] Step[133] GlobalStep[40274] Training Speed: 450.19 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:53:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:31:50 INFO stats.py:394 | Epoch[293] completed. Training Speed: 310.33 samples/sec across all devices. Epoch Time: 56.51 sec. Average Epoch Time: 56.51 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:53:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:00 INFO stats.py:314 | Epoch[294] Step[21] GlobalStep[40299] Training Speed: 410.37 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:53:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:32:02 INFO loss_tracker.py:84 | Epoch[294/NA] Step[24] GlobalStep[40302/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:32:11 INFO stats.py:314 | Epoch[294] Step[46] GlobalStep[40324] Training Speed: 434.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:53:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:12 INFO loss_tracker.py:84 | Epoch[294/NA] Step[49] GlobalStep[40327/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:32:21 INFO stats.py:314 | Epoch[294] Step[71] GlobalStep[40349] Training Speed: 424.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:53:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:23 INFO loss_tracker.py:84 | Epoch[294/NA] Step[74] GlobalStep[40352/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:32:31 INFO stats.py:314 | Epoch[294] Step[96] GlobalStep[40374] Training Speed: 431.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:53:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:33 INFO loss_tracker.py:84 | Epoch[294/NA] Step[99] GlobalStep[40377/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:32:42 INFO stats.py:314 | Epoch[294] Step[121] GlobalStep[40399] Training Speed: 451.46 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:53:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:43 INFO loss_tracker.py:84 | Epoch[294/NA] Step[124] GlobalStep[40402/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:32:47 INFO stats.py:394 | Epoch[294] completed. 
Training Speed: 307.86 samples/sec across all devices. Epoch Time: 56.96 sec. Average Epoch Time: 56.96 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:52:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:52 INFO stats.py:314 | Epoch[295] Step[9] GlobalStep[40424] Training Speed: 422.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:52:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:32:58 INFO loss_tracker.py:84 | Epoch[295/NA] Step[24] GlobalStep[40439/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:33:02 INFO stats.py:314 | Epoch[295] Step[34] GlobalStep[40449] Training Speed: 428.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:52:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:33:08 INFO loss_tracker.py:84 | Epoch[295/NA] Step[49] GlobalStep[40464/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:33:13 INFO stats.py:314 | Epoch[295] Step[59] GlobalStep[40474] Training Speed: 422.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:52:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:33:19 INFO loss_tracker.py:84 | Epoch[295/NA] Step[74] GlobalStep[40489/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:33:23 INFO stats.py:314 | Epoch[295] Step[84] GlobalStep[40499] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:52:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:33:29 INFO loss_tracker.py:84 | Epoch[295/NA] Step[99] GlobalStep[40514/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:33:33 INFO stats.py:314 | Epoch[295] Step[109] GlobalStep[40524] Training Speed: 418.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:52:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:33:39 INFO loss_tracker.py:84 | Epoch[295/NA] Step[124] GlobalStep[40539/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:33:43 INFO stats.py:314 | Epoch[295] Step[134] GlobalStep[40549] Training Speed: 453.45 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:52:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:33:44 INFO stats.py:394 | Epoch[295] completed. Training Speed: 309.59 samples/sec across all devices. Epoch Time: 56.64 sec. Average Epoch Time: 56.64 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:52:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:33:54 INFO stats.py:314 | Epoch[296] Step[22] GlobalStep[40574] Training Speed: 406.42 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:51:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:33:55 INFO loss_tracker.py:84 | Epoch[296/NA] Step[24] GlobalStep[40576/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:34:04 INFO stats.py:314 | Epoch[296] Step[47] GlobalStep[40599] Training Speed: 432.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:51:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:34:05 INFO loss_tracker.py:84 | Epoch[296/NA] Step[49] GlobalStep[40601/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:34:14 INFO stats.py:314 | Epoch[296] Step[72] GlobalStep[40624] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:51:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:34:15 INFO loss_tracker.py:84 | Epoch[296/NA] Step[74] GlobalStep[40626/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:34:25 INFO stats.py:314 | Epoch[296] Step[97] GlobalStep[40649] Training Speed: 437.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:51:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:34:26 INFO loss_tracker.py:84 | Epoch[296/NA] Step[99] GlobalStep[40651/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:34:35 INFO stats.py:314 | Epoch[296] Step[122] GlobalStep[40674] Training Speed: 452.91 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:51:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:34:36 INFO loss_tracker.py:84 | Epoch[296/NA] Step[124] GlobalStep[40676/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:34:40 INFO stats.py:394 | Epoch[296] completed. Training Speed: 309.39 samples/sec across all devices. Epoch Time: 56.68 sec. Average Epoch Time: 56.68 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:51:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:34:46 INFO stats.py:314 | Epoch[297] Step[10] GlobalStep[40699] Training Speed: 434.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:51:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:34:52 INFO loss_tracker.py:84 | Epoch[297/NA] Step[24] GlobalStep[40713/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:34:56 INFO stats.py:314 | Epoch[297] Step[35] GlobalStep[40724] Training Speed: 428.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:50:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:02 INFO loss_tracker.py:84 | Epoch[297/NA] Step[49] GlobalStep[40738/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:35:06 INFO stats.py:314 | Epoch[297] Step[60] GlobalStep[40749] Training Speed: 426.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:50:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:12 INFO loss_tracker.py:84 | Epoch[297/NA] Step[74] GlobalStep[40763/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:35:16 INFO stats.py:314 | Epoch[297] Step[85] GlobalStep[40774] Training Speed: 432.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:50:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:22 INFO loss_tracker.py:84 | Epoch[297/NA] Step[99] GlobalStep[40788/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:35:27 INFO stats.py:314 | Epoch[297] Step[110] GlobalStep[40799] Training Speed: 435.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:50:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:35:32 INFO loss_tracker.py:84 | Epoch[297/NA] Step[124] GlobalStep[40813/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:35:36 INFO stats.py:314 | Epoch[297] Step[135] GlobalStep[40824] Training Speed: 447.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:50:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:36 INFO stats.py:394 | Epoch[297] completed. Training Speed: 314.27 samples/sec across all devices. Epoch Time: 55.80 sec. Average Epoch Time: 55.80 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:50:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:47 INFO stats.py:314 | Epoch[298] Step[23] GlobalStep[40849] Training Speed: 430.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:49:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:47 INFO loss_tracker.py:84 | Epoch[298/NA] Step[24] GlobalStep[40850/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:35:57 INFO stats.py:314 | Epoch[298] Step[48] GlobalStep[40874] Training Speed: 427.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:49:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:35:58 INFO loss_tracker.py:84 | Epoch[298/NA] Step[49] GlobalStep[40875/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:36:07 INFO stats.py:314 | Epoch[298] Step[73] GlobalStep[40899] Training Speed: 425.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:49:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:36:08 INFO loss_tracker.py:84 | Epoch[298/NA] Step[74] GlobalStep[40900/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:36:18 INFO stats.py:314 | Epoch[298] Step[98] GlobalStep[40924] Training Speed: 260.76 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 6:49:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:36:18 INFO loss_tracker.py:84 | Epoch[298/NA] Step[99] GlobalStep[40925/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:36:28 INFO stats.py:314 | Epoch[298] Step[123] GlobalStep[40949] Training Speed: 448.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:49:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:36:28 INFO loss_tracker.py:84 | Epoch[298/NA] Step[124] GlobalStep[40950/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:36:32 INFO stats.py:394 | Epoch[298] completed. Training Speed: 312.39 samples/sec across all devices. Epoch Time: 56.14 sec. Average Epoch Time: 56.14 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:49:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:36:39 INFO stats.py:314 | Epoch[299] Step[11] GlobalStep[40974] Training Speed: 230.85 samples/sec across all devices. Average Step Time: 0.55 sec. Estimated Remaining Time: 6:49:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:36:44 INFO loss_tracker.py:84 | Epoch[299/NA] Step[24] GlobalStep[40987/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:36:49 INFO stats.py:314 | Epoch[299] Step[36] GlobalStep[40999] Training Speed: 410.40 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:48:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:36:54 INFO loss_tracker.py:84 | Epoch[299/NA] Step[49] GlobalStep[41012/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:36:59 INFO stats.py:314 | Epoch[299] Step[61] GlobalStep[41024] Training Speed: 426.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:48:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:37:04 INFO loss_tracker.py:84 | Epoch[299/NA] Step[74] GlobalStep[41037/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:37:09 INFO stats.py:314 | Epoch[299] Step[86] GlobalStep[41049] Training Speed: 433.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:48:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:37:15 INFO loss_tracker.py:84 | Epoch[299/NA] Step[99] GlobalStep[41062/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:37:20 INFO stats.py:314 | Epoch[299] Step[111] GlobalStep[41074] Training Speed: 434.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:48:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:37:25 INFO loss_tracker.py:84 | Epoch[299/NA] Step[124] GlobalStep[41087/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:37:29 INFO stats.py:314 | Epoch[299] Step[136] GlobalStep[41099] Training Speed: 450.73 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:48:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:37:29 INFO stats.py:394 | Epoch[299] completed. Training Speed: 308.27 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:48:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:37:41 INFO stats.py:314 | Epoch[300] Step[24] GlobalStep[41124] Training Speed: 439.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:48:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:37:41 INFO loss_tracker.py:84 | Epoch[300/NA] Step[24] GlobalStep[41124/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:37:50 INFO stats.py:314 | Epoch[300] Step[49] GlobalStep[41149] Training Speed: 420.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:47:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:37:50 INFO loss_tracker.py:84 | Epoch[300/NA] Step[49] GlobalStep[41149/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:38:01 INFO stats.py:314 | Epoch[300] Step[74] GlobalStep[41174] Training Speed: 434.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:47:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:38:01 INFO loss_tracker.py:84 | Epoch[300/NA] Step[74] GlobalStep[41174/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:38:10 INFO stats.py:314 | Epoch[300] Step[99] GlobalStep[41199] Training Speed: 425.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:47:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:38:10 INFO loss_tracker.py:84 | Epoch[300/NA] Step[99] GlobalStep[41199/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:38:21 INFO stats.py:314 | Epoch[300] Step[124] GlobalStep[41224] Training Speed: 450.45 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:47:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:38:21 INFO loss_tracker.py:84 | Epoch[300/NA] Step[124] GlobalStep[41224/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:38:25 INFO stats.py:394 | Epoch[300] completed. Training Speed: 313.29 samples/sec across all devices. Epoch Time: 55.97 sec. Average Epoch Time: 55.97 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:47:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:38:32 INFO stats.py:314 | Epoch[301] Step[12] GlobalStep[41249] Training Speed: 433.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:47:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:38:37 INFO loss_tracker.py:84 | Epoch[301/NA] Step[24] GlobalStep[41261/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:38:42 INFO stats.py:314 | Epoch[301] Step[37] GlobalStep[41274] Training Speed: 430.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:46:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:38:47 INFO loss_tracker.py:84 | Epoch[301/NA] Step[49] GlobalStep[41286/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:38:52 INFO stats.py:314 | Epoch[301] Step[62] GlobalStep[41299] Training Speed: 435.40 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:46:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:38:57 INFO loss_tracker.py:84 | Epoch[301/NA] Step[74] GlobalStep[41311/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0127] total_loss[0.0184] Rank[0/16] 06/24/2025 17:39:02 INFO stats.py:314 | Epoch[301] Step[87] GlobalStep[41324] Training Speed: 428.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:46:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:39:07 INFO loss_tracker.py:84 | Epoch[301/NA] Step[99] GlobalStep[41336/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:39:13 INFO stats.py:314 | Epoch[301] Step[112] GlobalStep[41349] Training Speed: 432.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:46:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:39:17 INFO loss_tracker.py:84 | Epoch[301/NA] Step[124] GlobalStep[41361/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:39:22 INFO stats.py:394 | Epoch[301] completed. Training Speed: 310.79 samples/sec across all devices. Epoch Time: 56.42 sec. Average Epoch Time: 56.42 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:46:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:39:23 INFO stats.py:314 | Epoch[302] Step[0] GlobalStep[41374] Training Speed: 375.91 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 6:46:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:39:33 INFO loss_tracker.py:84 | Epoch[302/NA] Step[24] GlobalStep[41398/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:39:33 INFO stats.py:314 | Epoch[302] Step[25] GlobalStep[41399] Training Speed: 408.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:46:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:39:43 INFO loss_tracker.py:84 | Epoch[302/NA] Step[49] GlobalStep[41423/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:39:43 INFO stats.py:314 | Epoch[302] Step[50] GlobalStep[41424] Training Speed: 432.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:45:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:39:54 INFO loss_tracker.py:84 | Epoch[302/NA] Step[74] GlobalStep[41448/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:39:54 INFO stats.py:314 | Epoch[302] Step[75] GlobalStep[41449] Training Speed: 423.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:45:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:40:04 INFO loss_tracker.py:84 | Epoch[302/NA] Step[99] GlobalStep[41473/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:40:04 INFO stats.py:314 | Epoch[302] Step[100] GlobalStep[41474] Training Speed: 430.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:45:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:40:14 INFO loss_tracker.py:84 | Epoch[302/NA] Step[124] GlobalStep[41498/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:40:14 INFO stats.py:314 | Epoch[302] Step[125] GlobalStep[41499] Training Speed: 438.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:45:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:40:18 INFO stats.py:394 | Epoch[302] completed. Training Speed: 312.20 samples/sec across all devices. Epoch Time: 56.17 sec. Average Epoch Time: 56.17 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:45:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:40:25 INFO stats.py:314 | Epoch[303] Step[13] GlobalStep[41524] Training Speed: 429.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:45:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:40:29 INFO loss_tracker.py:84 | Epoch[303/NA] Step[24] GlobalStep[41535/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:40:35 INFO stats.py:314 | Epoch[303] Step[38] GlobalStep[41549] Training Speed: 432.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:45:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:40:40 INFO loss_tracker.py:84 | Epoch[303/NA] Step[49] GlobalStep[41560/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:40:45 INFO stats.py:314 | Epoch[303] Step[63] GlobalStep[41574] Training Speed: 434.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:44:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:40:50 INFO loss_tracker.py:84 | Epoch[303/NA] Step[74] GlobalStep[41585/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:40:55 INFO stats.py:314 | Epoch[303] Step[88] GlobalStep[41599] Training Speed: 434.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:44:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:41:00 INFO loss_tracker.py:84 | Epoch[303/NA] Step[99] GlobalStep[41610/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:41:05 INFO stats.py:314 | Epoch[303] Step[113] GlobalStep[41624] Training Speed: 432.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:44:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:41:10 INFO loss_tracker.py:84 | Epoch[303/NA] Step[124] GlobalStep[41635/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:41:14 INFO stats.py:394 | Epoch[303] completed. Training Speed: 310.72 samples/sec across all devices. Epoch Time: 56.44 sec. Average Epoch Time: 56.44 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:44:18. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 17:41:16 INFO stats.py:314 | Epoch[304] Step[1] GlobalStep[41649] Training Speed: 432.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:44:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:41:25 INFO loss_tracker.py:84 | Epoch[304/NA] Step[24] GlobalStep[41672/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:41:26 INFO stats.py:314 | Epoch[304] Step[26] GlobalStep[41674] Training Speed: 432.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:44:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:41:35 INFO loss_tracker.py:84 | Epoch[304/NA] Step[49] GlobalStep[41697/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:41:36 INFO stats.py:314 | Epoch[304] Step[51] GlobalStep[41699] Training Speed: 426.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:43:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:41:46 INFO loss_tracker.py:84 | Epoch[304/NA] Step[74] GlobalStep[41722/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:41:46 INFO stats.py:314 | Epoch[304] Step[76] GlobalStep[41724] Training Speed: 430.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:43:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:41:56 INFO loss_tracker.py:84 | Epoch[304/NA] Step[99] GlobalStep[41747/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 17:41:57 INFO stats.py:314 | Epoch[304] Step[101] GlobalStep[41749] Training Speed: 430.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:43:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:42:06 INFO loss_tracker.py:84 | Epoch[304/NA] Step[124] GlobalStep[41772/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:42:07 INFO stats.py:314 | Epoch[304] Step[126] GlobalStep[41774] Training Speed: 448.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:43:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:42:10 INFO stats.py:394 | Epoch[304] completed. Training Speed: 312.14 samples/sec across all devices. Epoch Time: 56.18 sec. Average Epoch Time: 56.18 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:43:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:42:17 INFO stats.py:314 | Epoch[305] Step[14] GlobalStep[41799] Training Speed: 434.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:43:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:42:22 INFO loss_tracker.py:84 | Epoch[305/NA] Step[24] GlobalStep[41809/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:42:28 INFO stats.py:314 | Epoch[305] Step[39] GlobalStep[41824] Training Speed: 427.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:43:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:42:32 INFO loss_tracker.py:84 | Epoch[305/NA] Step[49] GlobalStep[41834/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:42:38 INFO stats.py:314 | Epoch[305] Step[64] GlobalStep[41849] Training Speed: 428.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:42:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:42:42 INFO loss_tracker.py:84 | Epoch[305/NA] Step[74] GlobalStep[41859/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:42:48 INFO stats.py:314 | Epoch[305] Step[89] GlobalStep[41874] Training Speed: 430.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:42:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:42:53 INFO loss_tracker.py:84 | Epoch[305/NA] Step[99] GlobalStep[41884/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:42:59 INFO stats.py:314 | Epoch[305] Step[114] GlobalStep[41899] Training Speed: 436.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:42:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:43:02 INFO loss_tracker.py:84 | Epoch[305/NA] Step[124] GlobalStep[41909/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:43:07 INFO stats.py:394 | Epoch[305] completed. Training Speed: 311.50 samples/sec across all devices. Epoch Time: 56.30 sec. Average Epoch Time: 56.30 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:42:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:43:09 INFO stats.py:314 | Epoch[306] Step[2] GlobalStep[41924] Training Speed: 424.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:42:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:43:18 INFO loss_tracker.py:84 | Epoch[306/NA] Step[24] GlobalStep[41946/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:43:19 INFO stats.py:314 | Epoch[306] Step[27] GlobalStep[41949] Training Speed: 434.77 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:42:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:43:28 INFO loss_tracker.py:84 | Epoch[306/NA] Step[49] GlobalStep[41971/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:43:29 INFO stats.py:314 | Epoch[306] Step[52] GlobalStep[41974] Training Speed: 430.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:42:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:43:38 INFO loss_tracker.py:84 | Epoch[306/NA] Step[74] GlobalStep[41996/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0128] total_loss[0.0172] Rank[0/16] 06/24/2025 17:43:40 INFO stats.py:314 | Epoch[306] Step[77] GlobalStep[41999] Training Speed: 438.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:41:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:43:49 INFO loss_tracker.py:84 | Epoch[306/NA] Step[99] GlobalStep[42021/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:43:50 INFO stats.py:314 | Epoch[306] Step[102] GlobalStep[42024] Training Speed: 432.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:41:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:43:59 INFO loss_tracker.py:84 | Epoch[306/NA] Step[124] GlobalStep[42046/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:44:00 INFO stats.py:314 | Epoch[306] Step[127] GlobalStep[42049] Training Speed: 448.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:41:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:44:03 INFO stats.py:394 | Epoch[306] completed. Training Speed: 310.66 samples/sec across all devices. Epoch Time: 56.45 sec. Average Epoch Time: 56.45 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:41:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:44:11 INFO stats.py:314 | Epoch[307] Step[15] GlobalStep[42074] Training Speed: 432.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:41:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:44:14 INFO loss_tracker.py:84 | Epoch[307/NA] Step[24] GlobalStep[42083/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:44:20 INFO stats.py:314 | Epoch[307] Step[40] GlobalStep[42099] Training Speed: 434.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:41:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:44:24 INFO loss_tracker.py:84 | Epoch[307/NA] Step[49] GlobalStep[42108/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 17:44:31 INFO stats.py:314 | Epoch[307] Step[65] GlobalStep[42124] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:40:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:44:34 INFO loss_tracker.py:84 | Epoch[307/NA] Step[74] GlobalStep[42133/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:44:41 INFO stats.py:314 | Epoch[307] Step[90] GlobalStep[42149] Training Speed: 432.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:40:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:44:44 INFO loss_tracker.py:84 | Epoch[307/NA] Step[99] GlobalStep[42158/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 17:44:51 INFO stats.py:314 | Epoch[307] Step[115] GlobalStep[42174] Training Speed: 434.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:40:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:44:54 INFO loss_tracker.py:84 | Epoch[307/NA] Step[124] GlobalStep[42183/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:44:59 INFO stats.py:394 | Epoch[307] completed. Training Speed: 316.11 samples/sec across all devices. Epoch Time: 55.48 sec. Average Epoch Time: 55.48 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 6:40:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:45:01 INFO stats.py:314 | Epoch[308] Step[3] GlobalStep[42199] Training Speed: 432.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:40:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:45:10 INFO loss_tracker.py:84 | Epoch[308/NA] Step[24] GlobalStep[42220/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:45:11 INFO stats.py:314 | Epoch[308] Step[28] GlobalStep[42224] Training Speed: 433.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:40:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:45:20 INFO loss_tracker.py:84 | Epoch[308/NA] Step[49] GlobalStep[42245/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 17:45:21 INFO stats.py:314 | Epoch[308] Step[53] GlobalStep[42249] Training Speed: 424.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:40:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:45:30 INFO loss_tracker.py:84 | Epoch[308/NA] Step[74] GlobalStep[42270/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:45:32 INFO stats.py:314 | Epoch[308] Step[78] GlobalStep[42274] Training Speed: 433.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:39:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:45:40 INFO loss_tracker.py:84 | Epoch[308/NA] Step[99] GlobalStep[42295/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:45:42 INFO stats.py:314 | Epoch[308] Step[103] GlobalStep[42299] Training Speed: 433.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:39:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:45:51 INFO loss_tracker.py:84 | Epoch[308/NA] Step[124] GlobalStep[42320/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:45:52 INFO stats.py:314 | Epoch[308] Step[128] GlobalStep[42324] Training Speed: 446.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:39:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:45:55 INFO stats.py:394 | Epoch[308] completed. Training Speed: 311.11 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:39:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:46:03 INFO stats.py:314 | Epoch[309] Step[16] GlobalStep[42349] Training Speed: 436.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:39:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:46:06 INFO loss_tracker.py:84 | Epoch[309/NA] Step[24] GlobalStep[42357/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:46:13 INFO stats.py:314 | Epoch[309] Step[41] GlobalStep[42374] Training Speed: 428.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:39:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:46:17 INFO loss_tracker.py:84 | Epoch[309/NA] Step[49] GlobalStep[42382/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0179] Rank[0/16] 06/24/2025 17:46:24 INFO stats.py:314 | Epoch[309] Step[66] GlobalStep[42399] Training Speed: 410.39 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:39:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:46:28 INFO loss_tracker.py:84 | Epoch[309/NA] Step[74] GlobalStep[42407/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:46:34 INFO stats.py:314 | Epoch[309] Step[91] GlobalStep[42424] Training Speed: 405.30 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:38:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:46:38 INFO loss_tracker.py:84 | Epoch[309/NA] Step[99] GlobalStep[42432/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:46:45 INFO stats.py:314 | Epoch[309] Step[116] GlobalStep[42449] Training Speed: 441.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:38:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:46:48 INFO loss_tracker.py:84 | Epoch[309/NA] Step[124] GlobalStep[42457/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:46:52 INFO stats.py:394 | Epoch[309] completed. Training Speed: 304.99 samples/sec across all devices. Epoch Time: 57.50 sec. Average Epoch Time: 57.50 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:38:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:46:56 INFO stats.py:314 | Epoch[310] Step[4] GlobalStep[42474] Training Speed: 437.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:38:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:47:04 INFO loss_tracker.py:84 | Epoch[310/NA] Step[24] GlobalStep[42494/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:47:06 INFO stats.py:314 | Epoch[310] Step[29] GlobalStep[42499] Training Speed: 436.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:38:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:47:15 INFO loss_tracker.py:84 | Epoch[310/NA] Step[49] GlobalStep[42519/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 17:47:17 INFO stats.py:314 | Epoch[310] Step[54] GlobalStep[42524] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:38:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:47:25 INFO loss_tracker.py:84 | Epoch[310/NA] Step[74] GlobalStep[42544/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0127] total_loss[0.0165] Rank[0/16] 06/24/2025 17:47:27 INFO stats.py:314 | Epoch[310] Step[79] GlobalStep[42549] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:38:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:47:35 INFO loss_tracker.py:84 | Epoch[310/NA] Step[99] GlobalStep[42569/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:47:37 INFO stats.py:314 | Epoch[310] Step[104] GlobalStep[42574] Training Speed: 433.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:37:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:47:45 INFO loss_tracker.py:84 | Epoch[310/NA] Step[124] GlobalStep[42594/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0127] total_loss[0.0164] Rank[0/16] 06/24/2025 17:47:47 INFO stats.py:314 | Epoch[310] Step[129] GlobalStep[42599] Training Speed: 452.33 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:37:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:47:49 INFO stats.py:394 | Epoch[310] completed. Training Speed: 308.65 samples/sec across all devices. Epoch Time: 56.81 sec. Average Epoch Time: 56.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:37:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:47:57 INFO stats.py:314 | Epoch[311] Step[17] GlobalStep[42624] Training Speed: 429.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:37:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:48:00 INFO loss_tracker.py:84 | Epoch[311/NA] Step[24] GlobalStep[42631/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:48:08 INFO stats.py:314 | Epoch[311] Step[42] GlobalStep[42649] Training Speed: 434.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:37:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:48:11 INFO loss_tracker.py:84 | Epoch[311/NA] Step[49] GlobalStep[42656/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:48:18 INFO stats.py:314 | Epoch[311] Step[67] GlobalStep[42674] Training Speed: 436.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:37:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:48:21 INFO loss_tracker.py:84 | Epoch[311/NA] Step[74] GlobalStep[42681/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 17:48:28 INFO stats.py:314 | Epoch[311] Step[92] GlobalStep[42699] Training Speed: 442.05 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:36:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:48:31 INFO loss_tracker.py:84 | Epoch[311/NA] Step[99] GlobalStep[42706/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:48:38 INFO stats.py:314 | Epoch[311] Step[117] GlobalStep[42724] Training Speed: 424.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:36:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:48:41 INFO loss_tracker.py:84 | Epoch[311/NA] Step[124] GlobalStep[42731/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:48:46 INFO stats.py:394 | Epoch[311] completed. Training Speed: 310.66 samples/sec across all devices. Epoch Time: 56.45 sec. Average Epoch Time: 56.45 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:36:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:48:49 INFO stats.py:314 | Epoch[312] Step[5] GlobalStep[42749] Training Speed: 432.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:36:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:48:57 INFO loss_tracker.py:84 | Epoch[312/NA] Step[24] GlobalStep[42768/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:48:59 INFO stats.py:314 | Epoch[312] Step[30] GlobalStep[42774] Training Speed: 435.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:36:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:49:07 INFO loss_tracker.py:84 | Epoch[312/NA] Step[49] GlobalStep[42793/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:49:09 INFO stats.py:314 | Epoch[312] Step[55] GlobalStep[42799] Training Speed: 425.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:36:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:49:17 INFO loss_tracker.py:84 | Epoch[312/NA] Step[74] GlobalStep[42818/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:49:19 INFO stats.py:314 | Epoch[312] Step[80] GlobalStep[42824] Training Speed: 431.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:36:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:49:27 INFO loss_tracker.py:84 | Epoch[312/NA] Step[99] GlobalStep[42843/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:49:30 INFO stats.py:314 | Epoch[312] Step[105] GlobalStep[42849] Training Speed: 427.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:35:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:49:37 INFO loss_tracker.py:84 | Epoch[312/NA] Step[124] GlobalStep[42868/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 17:49:39 INFO stats.py:314 | Epoch[312] Step[130] GlobalStep[42874] Training Speed: 448.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:35:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:49:41 INFO stats.py:394 | Epoch[312] completed. Training Speed: 315.34 samples/sec across all devices. Epoch Time: 55.61 sec. Average Epoch Time: 55.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:35:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:49:50 INFO stats.py:314 | Epoch[313] Step[18] GlobalStep[42899] Training Speed: 429.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:35:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:49:52 INFO loss_tracker.py:84 | Epoch[313/NA] Step[24] GlobalStep[42905/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:50:00 INFO stats.py:314 | Epoch[313] Step[43] GlobalStep[42924] Training Speed: 431.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:35:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:50:03 INFO loss_tracker.py:84 | Epoch[313/NA] Step[49] GlobalStep[42930/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0127] total_loss[0.0164] Rank[0/16] 06/24/2025 17:50:11 INFO stats.py:314 | Epoch[313] Step[68] GlobalStep[42949] Training Speed: 419.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:35:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:50:13 INFO loss_tracker.py:84 | Epoch[313/NA] Step[74] GlobalStep[42955/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:50:21 INFO stats.py:314 | Epoch[313] Step[93] GlobalStep[42974] Training Speed: 427.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:34:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:50:23 INFO loss_tracker.py:84 | Epoch[313/NA] Step[99] GlobalStep[42980/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 17:50:31 INFO stats.py:314 | Epoch[313] Step[118] GlobalStep[42999] Training Speed: 426.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:34:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:50:33 INFO loss_tracker.py:84 | Epoch[313/NA] Step[124] GlobalStep[43005/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:50:37 INFO stats.py:394 | Epoch[313] completed. Training Speed: 312.99 samples/sec across all devices. Epoch Time: 56.03 sec. Average Epoch Time: 56.03 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:34:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:50:41 INFO stats.py:314 | Epoch[314] Step[6] GlobalStep[43024] Training Speed: 432.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:34:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:50:48 INFO loss_tracker.py:84 | Epoch[314/NA] Step[24] GlobalStep[43042/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 17:50:51 INFO stats.py:314 | Epoch[314] Step[31] GlobalStep[43049] Training Speed: 400.05 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:34:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:50:59 INFO loss_tracker.py:84 | Epoch[314/NA] Step[49] GlobalStep[43067/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:51:02 INFO stats.py:314 | Epoch[314] Step[56] GlobalStep[43074] Training Speed: 431.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:34:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:51:09 INFO loss_tracker.py:84 | Epoch[314/NA] Step[74] GlobalStep[43092/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 17:51:12 INFO stats.py:314 | Epoch[314] Step[81] GlobalStep[43099] Training Speed: 423.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:34:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:51:19 INFO loss_tracker.py:84 | Epoch[314/NA] Step[99] GlobalStep[43117/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:51:22 INFO stats.py:314 | Epoch[314] Step[106] GlobalStep[43124] Training Speed: 425.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:33:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:51:30 INFO loss_tracker.py:84 | Epoch[314/NA] Step[124] GlobalStep[43142/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 17:51:32 INFO stats.py:314 | Epoch[314] Step[131] GlobalStep[43149] Training Speed: 447.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:33:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:51:34 INFO stats.py:394 | Epoch[314] completed. Training Speed: 310.95 samples/sec across all devices. Epoch Time: 56.40 sec. Average Epoch Time: 56.40 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:33:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:51:43 INFO stats.py:314 | Epoch[315] Step[19] GlobalStep[43174] Training Speed: 432.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:33:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:51:45 INFO loss_tracker.py:84 | Epoch[315/NA] Step[24] GlobalStep[43179/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 17:51:53 INFO stats.py:314 | Epoch[315] Step[44] GlobalStep[43199] Training Speed: 433.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:33:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:51:56 INFO loss_tracker.py:84 | Epoch[315/NA] Step[49] GlobalStep[43204/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 17:52:04 INFO stats.py:314 | Epoch[315] Step[69] GlobalStep[43224] Training Speed: 423.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:33:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:52:06 INFO loss_tracker.py:84 | Epoch[315/NA] Step[74] GlobalStep[43229/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:52:14 INFO stats.py:314 | Epoch[315] Step[94] GlobalStep[43249] Training Speed: 433.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:33:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:52:16 INFO loss_tracker.py:84 | Epoch[315/NA] Step[99] GlobalStep[43254/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:52:24 INFO stats.py:314 | Epoch[315] Step[119] GlobalStep[43274] Training Speed: 422.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:32:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:52:26 INFO loss_tracker.py:84 | Epoch[315/NA] Step[124] GlobalStep[43279/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0128] total_loss[0.0178] Rank[0/16] 06/24/2025 17:52:30 INFO stats.py:394 | Epoch[315] completed. Training Speed: 309.56 samples/sec across all devices. Epoch Time: 56.65 sec. Average Epoch Time: 56.65 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:32:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:52:34 INFO stats.py:314 | Epoch[316] Step[7] GlobalStep[43299] Training Speed: 400.10 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:32:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:52:42 INFO loss_tracker.py:84 | Epoch[316/NA] Step[24] GlobalStep[43316/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:52:45 INFO stats.py:314 | Epoch[316] Step[32] GlobalStep[43324] Training Speed: 429.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:32:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:52:52 INFO loss_tracker.py:84 | Epoch[316/NA] Step[49] GlobalStep[43341/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 17:52:55 INFO stats.py:314 | Epoch[316] Step[57] GlobalStep[43349] Training Speed: 434.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:32:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:53:02 INFO loss_tracker.py:84 | Epoch[316/NA] Step[74] GlobalStep[43366/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:53:05 INFO stats.py:314 | Epoch[316] Step[82] GlobalStep[43374] Training Speed: 429.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:32:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:53:12 INFO loss_tracker.py:84 | Epoch[316/NA] Step[99] GlobalStep[43391/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:53:15 INFO stats.py:314 | Epoch[316] Step[107] GlobalStep[43399] Training Speed: 427.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:32:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:53:22 INFO loss_tracker.py:84 | Epoch[316/NA] Step[124] GlobalStep[43416/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:53:25 INFO stats.py:314 | Epoch[316] Step[132] GlobalStep[43424] Training Speed: 451.20 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:31:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:53:27 INFO stats.py:394 | Epoch[316] completed. Training Speed: 311.81 samples/sec across all devices. Epoch Time: 56.24 sec. Average Epoch Time: 56.24 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:31:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:53:36 INFO stats.py:314 | Epoch[317] Step[20] GlobalStep[43449] Training Speed: 429.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:31:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:53:38 INFO loss_tracker.py:84 | Epoch[317/NA] Step[24] GlobalStep[43453/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 17:53:46 INFO stats.py:314 | Epoch[317] Step[45] GlobalStep[43474] Training Speed: 422.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:31:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:53:48 INFO loss_tracker.py:84 | Epoch[317/NA] Step[49] GlobalStep[43478/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:53:56 INFO stats.py:314 | Epoch[317] Step[70] GlobalStep[43499] Training Speed: 436.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:31:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:53:58 INFO loss_tracker.py:84 | Epoch[317/NA] Step[74] GlobalStep[43503/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 17:54:07 INFO stats.py:314 | Epoch[317] Step[95] GlobalStep[43524] Training Speed: 433.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:31:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:54:09 INFO loss_tracker.py:84 | Epoch[317/NA] Step[99] GlobalStep[43528/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:54:17 INFO stats.py:314 | Epoch[317] Step[120] GlobalStep[43549] Training Speed: 448.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:30:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:54:19 INFO loss_tracker.py:84 | Epoch[317/NA] Step[124] GlobalStep[43553/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 17:54:23 INFO stats.py:394 | Epoch[317] completed. Training Speed: 312.04 samples/sec across all devices. Epoch Time: 56.20 sec. Average Epoch Time: 56.20 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:30:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:54:28 INFO stats.py:314 | Epoch[318] Step[8] GlobalStep[43574] Training Speed: 436.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:30:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:54:34 INFO loss_tracker.py:84 | Epoch[318/NA] Step[24] GlobalStep[43590/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:54:38 INFO stats.py:314 | Epoch[318] Step[33] GlobalStep[43599] Training Speed: 421.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:30:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:54:45 INFO loss_tracker.py:84 | Epoch[318/NA] Step[49] GlobalStep[43615/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 17:54:49 INFO stats.py:314 | Epoch[318] Step[58] GlobalStep[43624] Training Speed: 432.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:30:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:54:55 INFO loss_tracker.py:84 | Epoch[318/NA] Step[74] GlobalStep[43640/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:54:59 INFO stats.py:314 | Epoch[318] Step[83] GlobalStep[43649] Training Speed: 429.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:30:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:55:06 INFO loss_tracker.py:84 | Epoch[318/NA] Step[99] GlobalStep[43665/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:55:09 INFO stats.py:314 | Epoch[318] Step[108] GlobalStep[43674] Training Speed: 430.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:30:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:55:15 INFO loss_tracker.py:84 | Epoch[318/NA] Step[124] GlobalStep[43690/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:55:19 INFO stats.py:314 | Epoch[318] Step[133] GlobalStep[43699] Training Speed: 449.87 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:29:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:55:20 INFO stats.py:394 | Epoch[318] completed. Training Speed: 307.76 samples/sec across all devices. Epoch Time: 56.98 sec. Average Epoch Time: 56.98 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:29:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:55:30 INFO stats.py:314 | Epoch[319] Step[21] GlobalStep[43724] Training Speed: 432.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:29:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:55:31 INFO loss_tracker.py:84 | Epoch[319/NA] Step[24] GlobalStep[43727/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:55:40 INFO stats.py:314 | Epoch[319] Step[46] GlobalStep[43749] Training Speed: 422.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:29:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:55:41 INFO loss_tracker.py:84 | Epoch[319/NA] Step[49] GlobalStep[43752/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 17:55:51 INFO stats.py:314 | Epoch[319] Step[71] GlobalStep[43774] Training Speed: 421.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:29:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:55:52 INFO loss_tracker.py:84 | Epoch[319/NA] Step[74] GlobalStep[43777/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 17:56:01 INFO stats.py:314 | Epoch[319] Step[96] GlobalStep[43799] Training Speed: 434.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:29:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:56:02 INFO loss_tracker.py:84 | Epoch[319/NA] Step[99] GlobalStep[43802/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0128] total_loss[0.0175] Rank[0/16] 06/24/2025 17:56:11 INFO stats.py:314 | Epoch[319] Step[121] GlobalStep[43824] Training Speed: 447.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:29:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:56:12 INFO loss_tracker.py:84 | Epoch[319/NA] Step[124] GlobalStep[43827/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:56:16 INFO stats.py:394 | Epoch[319] completed. Training Speed: 309.16 samples/sec across all devices. Epoch Time: 56.72 sec. Average Epoch Time: 56.72 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:28:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:56:21 INFO stats.py:314 | Epoch[320] Step[9] GlobalStep[43849] Training Speed: 421.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:28:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:56:28 INFO loss_tracker.py:84 | Epoch[320/NA] Step[24] GlobalStep[43864/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 17:56:32 INFO stats.py:314 | Epoch[320] Step[34] GlobalStep[43874] Training Speed: 405.48 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:28:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:56:38 INFO loss_tracker.py:84 | Epoch[320/NA] Step[49] GlobalStep[43889/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0128] total_loss[0.0168] Rank[0/16] 06/24/2025 17:56:41 INFO stats.py:314 | Epoch[320] Step[59] GlobalStep[43899] Training Speed: 431.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:28:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:56:48 INFO loss_tracker.py:84 | Epoch[320/NA] Step[74] GlobalStep[43914/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 17:56:52 INFO stats.py:314 | Epoch[320] Step[84] GlobalStep[43924] Training Speed: 433.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:28:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:56:58 INFO loss_tracker.py:84 | Epoch[320/NA] Step[99] GlobalStep[43939/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:57:02 INFO stats.py:314 | Epoch[320] Step[109] GlobalStep[43949] Training Speed: 422.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:28:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:57:09 INFO loss_tracker.py:84 | Epoch[320/NA] Step[124] GlobalStep[43964/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:57:12 INFO stats.py:314 | Epoch[320] Step[134] GlobalStep[43974] Training Speed: 451.29 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:27:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:57:13 INFO stats.py:394 | Epoch[320] completed. Training Speed: 309.63 samples/sec across all devices. Epoch Time: 56.64 sec. Average Epoch Time: 56.64 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:27:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:57:24 INFO stats.py:314 | Epoch[321] Step[22] GlobalStep[43999] Training Speed: 433.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:27:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:57:24 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 17:57:24 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_10 Rank[13/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[8/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[3/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[11/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[1/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[15/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[7/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[5/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[12/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[4/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[10/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10
Rank[6/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[9/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[2/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[14/16] 06/24/2025 17:57:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[0/16] 06/24/2025 17:57:25 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_10/model.safetensors Rank[0/16] 06/24/2025 17:57:26 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_10/optimizer.bin Rank[0/16] 06/24/2025 17:57:26 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_10/scheduler.bin Rank[0/16] 06/24/2025 17:57:26 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_10/sampler.bin Rank[0/16] 06/24/2025 17:57:26 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_10/random_states_0.pkl Rank[0/16] 06/24/2025 17:57:26 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_10/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 17:57:27 INFO checkpoint.py:110 | Save checkpoint at the end of step 43999 to /job_data/checkpoints/checkpoint_10 Rank[0/16] 06/24/2025 17:57:27 INFO loss_tracker.py:84 | Epoch[321/NA] Step[24] GlobalStep[44001/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:57:37 INFO stats.py:314 | Epoch[321] Step[47] GlobalStep[44024] Training Speed: 427.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:27:43. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 17:57:38 INFO loss_tracker.py:84 | Epoch[321/NA] Step[49] GlobalStep[44026/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 17:57:47 INFO stats.py:314 | Epoch[321] Step[72] GlobalStep[44049] Training Speed: 434.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:27:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:57:48 INFO loss_tracker.py:84 | Epoch[321/NA] Step[74] GlobalStep[44051/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:57:57 INFO stats.py:314 | Epoch[321] Step[97] GlobalStep[44074] Training Speed: 434.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:27:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:57:58 INFO loss_tracker.py:84 | Epoch[321/NA] Step[99] GlobalStep[44076/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 17:58:08 INFO stats.py:314 | Epoch[321] Step[122] GlobalStep[44099] Training Speed: 449.15 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:27:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:58:08 INFO loss_tracker.py:84 | Epoch[321/NA] Step[124] GlobalStep[44101/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:58:13 INFO stats.py:394 | Epoch[321] completed. Training Speed: 293.52 samples/sec across all devices. Epoch Time: 59.74 sec. Average Epoch Time: 59.74 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 6:27:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:58:18 INFO stats.py:314 | Epoch[322] Step[10] GlobalStep[44124] Training Speed: 412.85 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:27:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:58:24 INFO loss_tracker.py:84 | Epoch[322/NA] Step[24] GlobalStep[44138/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:58:29 INFO stats.py:314 | Epoch[322] Step[35] GlobalStep[44149] Training Speed: 258.83 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 6:26:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:58:34 INFO loss_tracker.py:84 | Epoch[322/NA] Step[49] GlobalStep[44163/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:58:39 INFO stats.py:314 | Epoch[322] Step[60] GlobalStep[44174] Training Speed: 437.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:26:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:58:44 INFO loss_tracker.py:84 | Epoch[322/NA] Step[74] GlobalStep[44188/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 17:58:49 INFO stats.py:314 | Epoch[322] Step[85] GlobalStep[44199] Training Speed: 244.11 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 6:26:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:58:55 INFO loss_tracker.py:84 | Epoch[322/NA] Step[99] GlobalStep[44213/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 17:58:59 INFO stats.py:314 | Epoch[322] Step[110] GlobalStep[44224] Training Speed: 433.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:26:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:59:05 INFO loss_tracker.py:84 | Epoch[322/NA] Step[124] GlobalStep[44238/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 17:59:09 INFO stats.py:314 | Epoch[322] Step[135] GlobalStep[44249] Training Speed: 454.49 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:26:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:59:09 INFO stats.py:394 | Epoch[322] completed. Training Speed: 310.61 samples/sec across all devices. Epoch Time: 56.46 sec. Average Epoch Time: 56.46 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:26:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:59:20 INFO stats.py:314 | Epoch[323] Step[23] GlobalStep[44274] Training Speed: 426.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:25:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:59:20 INFO loss_tracker.py:84 | Epoch[323/NA] Step[24] GlobalStep[44275/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 17:59:30 INFO stats.py:314 | Epoch[323] Step[48] GlobalStep[44299] Training Speed: 424.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:25:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:59:31 INFO loss_tracker.py:84 | Epoch[323/NA] Step[49] GlobalStep[44300/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 17:59:40 INFO stats.py:314 | Epoch[323] Step[73] GlobalStep[44324] Training Speed: 428.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:25:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 17:59:40 INFO loss_tracker.py:84 | Epoch[323/NA] Step[74] GlobalStep[44325/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 17:59:50 INFO stats.py:314 | Epoch[323] Step[98] GlobalStep[44349] Training Speed: 414.23 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:25:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 17:59:51 INFO loss_tracker.py:84 | Epoch[323/NA] Step[99] GlobalStep[44350/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:00:00 INFO stats.py:314 | Epoch[323] Step[123] GlobalStep[44374] Training Speed: 453.19 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:25:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:00:00 INFO loss_tracker.py:84 | Epoch[323/NA] Step[124] GlobalStep[44375/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:00:05 INFO stats.py:394 | Epoch[323] completed. Training Speed: 315.80 samples/sec across all devices. Epoch Time: 55.53 sec. Average Epoch Time: 55.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:25:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:00:11 INFO stats.py:314 | Epoch[324] Step[11] GlobalStep[44399] Training Speed: 433.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:25:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:00:17 INFO loss_tracker.py:84 | Epoch[324/NA] Step[24] GlobalStep[44412/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:00:21 INFO stats.py:314 | Epoch[324] Step[36] GlobalStep[44424] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:24:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:00:26 INFO loss_tracker.py:84 | Epoch[324/NA] Step[49] GlobalStep[44437/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:00:31 INFO stats.py:314 | Epoch[324] Step[61] GlobalStep[44449] Training Speed: 420.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:24:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:00:36 INFO loss_tracker.py:84 | Epoch[324/NA] Step[74] GlobalStep[44462/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:00:41 INFO stats.py:314 | Epoch[324] Step[86] GlobalStep[44474] Training Speed: 433.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:24:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:00:46 INFO loss_tracker.py:84 | Epoch[324/NA] Step[99] GlobalStep[44487/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:00:51 INFO stats.py:314 | Epoch[324] Step[111] GlobalStep[44499] Training Speed: 421.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:24:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:00:57 INFO loss_tracker.py:84 | Epoch[324/NA] Step[124] GlobalStep[44512/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:01:01 INFO stats.py:314 | Epoch[324] Step[136] GlobalStep[44524] Training Speed: 451.64 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:24:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:01:01 INFO stats.py:394 | Epoch[324] completed. Training Speed: 312.61 samples/sec across all devices. Epoch Time: 56.10 sec. Average Epoch Time: 56.10 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:24:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:01:12 INFO stats.py:314 | Epoch[325] Step[24] GlobalStep[44549] Training Speed: 435.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:24:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:01:12 INFO loss_tracker.py:84 | Epoch[325/NA] Step[24] GlobalStep[44549/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:01:22 INFO stats.py:314 | Epoch[325] Step[49] GlobalStep[44574] Training Speed: 432.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:23:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:01:22 INFO loss_tracker.py:84 | Epoch[325/NA] Step[49] GlobalStep[44574/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:01:33 INFO stats.py:314 | Epoch[325] Step[74] GlobalStep[44599] Training Speed: 424.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:23:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:01:33 INFO loss_tracker.py:84 | Epoch[325/NA] Step[74] GlobalStep[44599/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:01:43 INFO stats.py:314 | Epoch[325] Step[99] GlobalStep[44624] Training Speed: 431.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:23:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:01:43 INFO loss_tracker.py:84 | Epoch[325/NA] Step[99] GlobalStep[44624/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:01:54 INFO stats.py:314 | Epoch[325] Step[124] GlobalStep[44649] Training Speed: 452.72 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:23:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:01:54 INFO loss_tracker.py:84 | Epoch[325/NA] Step[124] GlobalStep[44649/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:01:58 INFO stats.py:394 | Epoch[325] completed. Training Speed: 307.55 samples/sec across all devices. Epoch Time: 57.02 sec. Average Epoch Time: 57.02 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:23:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:02:04 INFO stats.py:314 | Epoch[326] Step[12] GlobalStep[44674] Training Speed: 406.90 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:23:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:02:09 INFO loss_tracker.py:84 | Epoch[326/NA] Step[24] GlobalStep[44686/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:02:15 INFO stats.py:314 | Epoch[326] Step[37] GlobalStep[44699] Training Speed: 420.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:22:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:02:20 INFO loss_tracker.py:84 | Epoch[326/NA] Step[49] GlobalStep[44711/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:02:25 INFO stats.py:314 | Epoch[326] Step[62] GlobalStep[44724] Training Speed: 423.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:22:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:02:30 INFO loss_tracker.py:84 | Epoch[326/NA] Step[74] GlobalStep[44736/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:02:36 INFO stats.py:314 | Epoch[326] Step[87] GlobalStep[44749] Training Speed: 427.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:22:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:02:41 INFO loss_tracker.py:84 | Epoch[326/NA] Step[99] GlobalStep[44761/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 18:02:46 INFO stats.py:314 | Epoch[326] Step[112] GlobalStep[44774] Training Speed: 434.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:22:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:02:51 INFO loss_tracker.py:84 | Epoch[326/NA] Step[124] GlobalStep[44786/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:02:55 INFO stats.py:394 | Epoch[326] completed. Training Speed: 306.59 samples/sec across all devices. Epoch Time: 57.20 sec. Average Epoch Time: 57.20 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:22:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:02:56 INFO stats.py:314 | Epoch[327] Step[0] GlobalStep[44799] Training Speed: 352.52 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 6:22:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:03:06 INFO loss_tracker.py:84 | Epoch[327/NA] Step[24] GlobalStep[44823/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:03:06 INFO stats.py:314 | Epoch[327] Step[25] GlobalStep[44824] Training Speed: 423.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:22:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:03:17 INFO loss_tracker.py:84 | Epoch[327/NA] Step[49] GlobalStep[44848/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:03:17 INFO stats.py:314 | Epoch[327] Step[50] GlobalStep[44849] Training Speed: 422.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:21:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:03:26 INFO loss_tracker.py:84 | Epoch[327/NA] Step[74] GlobalStep[44873/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:03:27 INFO stats.py:314 | Epoch[327] Step[75] GlobalStep[44874] Training Speed: 424.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:21:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:03:37 INFO loss_tracker.py:84 | Epoch[327/NA] Step[99] GlobalStep[44898/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:03:37 INFO stats.py:314 | Epoch[327] Step[100] GlobalStep[44899] Training Speed: 410.98 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:21:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:03:47 INFO loss_tracker.py:84 | Epoch[327/NA] Step[124] GlobalStep[44923/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:03:47 INFO stats.py:314 | Epoch[327] Step[125] GlobalStep[44924] Training Speed: 421.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:21:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:03:51 INFO stats.py:394 | Epoch[327] completed. Training Speed: 313.20 samples/sec across all devices. Epoch Time: 55.99 sec. Average Epoch Time: 55.99 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:21:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:03:58 INFO stats.py:314 | Epoch[328] Step[13] GlobalStep[44949] Training Speed: 433.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:21:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:04:02 INFO loss_tracker.py:84 | Epoch[328/NA] Step[24] GlobalStep[44960/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:04:08 INFO stats.py:314 | Epoch[328] Step[38] GlobalStep[44974] Training Speed: 426.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:21:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:04:13 INFO loss_tracker.py:84 | Epoch[328/NA] Step[49] GlobalStep[44985/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:04:19 INFO stats.py:314 | Epoch[328] Step[63] GlobalStep[44999] Training Speed: 424.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:20:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:04:23 INFO loss_tracker.py:84 | Epoch[328/NA] Step[74] GlobalStep[45010/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:04:29 INFO stats.py:314 | Epoch[328] Step[88] GlobalStep[45024] Training Speed: 422.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:20:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:04:33 INFO loss_tracker.py:84 | Epoch[328/NA] Step[99] GlobalStep[45035/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:04:39 INFO stats.py:314 | Epoch[328] Step[113] GlobalStep[45049] Training Speed: 432.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:20:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:04:43 INFO loss_tracker.py:84 | Epoch[328/NA] Step[124] GlobalStep[45060/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 18:04:47 INFO stats.py:394 | Epoch[328] completed. Training Speed: 312.35 samples/sec across all devices. Epoch Time: 56.14 sec. Average Epoch Time: 56.14 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:20:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:04:49 INFO stats.py:314 | Epoch[329] Step[1] GlobalStep[45074] Training Speed: 428.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:20:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:04:59 INFO loss_tracker.py:84 | Epoch[329/NA] Step[24] GlobalStep[45097/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:05:00 INFO stats.py:314 | Epoch[329] Step[26] GlobalStep[45099] Training Speed: 429.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:20:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:05:09 INFO loss_tracker.py:84 | Epoch[329/NA] Step[49] GlobalStep[45122/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:05:10 INFO stats.py:314 | Epoch[329] Step[51] GlobalStep[45124] Training Speed: 429.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:20:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:05:19 INFO loss_tracker.py:84 | Epoch[329/NA] Step[74] GlobalStep[45147/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:05:20 INFO stats.py:314 | Epoch[329] Step[76] GlobalStep[45149] Training Speed: 434.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:19:49. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 18:05:29 INFO loss_tracker.py:84 | Epoch[329/NA] Step[99] GlobalStep[45172/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:05:30 INFO stats.py:314 | Epoch[329] Step[101] GlobalStep[45174] Training Speed: 432.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:19:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:05:40 INFO loss_tracker.py:84 | Epoch[329/NA] Step[124] GlobalStep[45197/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:05:40 INFO stats.py:314 | Epoch[329] Step[126] GlobalStep[45199] Training Speed: 449.90 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:19:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:05:44 INFO stats.py:394 | Epoch[329] completed. Training Speed: 311.12 samples/sec across all devices. Epoch Time: 56.36 sec. Average Epoch Time: 56.36 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:19:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:05:50 INFO stats.py:314 | Epoch[330] Step[14] GlobalStep[45224] Training Speed: 432.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:19:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:05:55 INFO loss_tracker.py:84 | Epoch[330/NA] Step[24] GlobalStep[45234/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:06:01 INFO stats.py:314 | Epoch[330] Step[39] GlobalStep[45249] Training Speed: 434.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:19:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:06:05 INFO loss_tracker.py:84 | Epoch[330/NA] Step[49] GlobalStep[45259/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:06:11 INFO stats.py:314 | Epoch[330] Step[64] GlobalStep[45274] Training Speed: 435.85 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:18:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:06:15 INFO loss_tracker.py:84 | Epoch[330/NA] Step[74] GlobalStep[45284/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:06:21 INFO stats.py:314 | Epoch[330] Step[89] GlobalStep[45299] Training Speed: 431.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:18:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:06:26 INFO loss_tracker.py:84 | Epoch[330/NA] Step[99] GlobalStep[45309/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:06:32 INFO stats.py:314 | Epoch[330] Step[114] GlobalStep[45324] Training Speed: 437.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:18:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:06:35 INFO loss_tracker.py:84 | Epoch[330/NA] Step[124] GlobalStep[45334/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:06:40 INFO stats.py:394 | Epoch[330] completed. Training Speed: 312.64 samples/sec across all devices. Epoch Time: 56.09 sec. Average Epoch Time: 56.09 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:18:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:06:42 INFO stats.py:314 | Epoch[331] Step[2] GlobalStep[45349] Training Speed: 435.44 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:18:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:06:51 INFO loss_tracker.py:84 | Epoch[331/NA] Step[24] GlobalStep[45371/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:06:53 INFO stats.py:314 | Epoch[331] Step[27] GlobalStep[45374] Training Speed: 264.33 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 6:18:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:02 INFO loss_tracker.py:84 | Epoch[331/NA] Step[49] GlobalStep[45396/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:07:03 INFO stats.py:314 | Epoch[331] Step[52] GlobalStep[45399] Training Speed: 431.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:18:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:12 INFO loss_tracker.py:84 | Epoch[331/NA] Step[74] GlobalStep[45421/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:07:14 INFO stats.py:314 | Epoch[331] Step[77] GlobalStep[45424] Training Speed: 427.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:17:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:23 INFO loss_tracker.py:84 | Epoch[331/NA] Step[99] GlobalStep[45446/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:07:24 INFO stats.py:314 | Epoch[331] Step[102] GlobalStep[45449] Training Speed: 434.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:17:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:07:33 INFO loss_tracker.py:84 | Epoch[331/NA] Step[124] GlobalStep[45471/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:07:34 INFO stats.py:314 | Epoch[331] Step[127] GlobalStep[45474] Training Speed: 450.83 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:17:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:38 INFO stats.py:394 | Epoch[331] completed. Training Speed: 303.67 samples/sec across all devices. Epoch Time: 57.75 sec. Average Epoch Time: 57.75 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:17:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:45 INFO stats.py:314 | Epoch[332] Step[15] GlobalStep[45499] Training Speed: 433.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:17:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:49 INFO loss_tracker.py:84 | Epoch[332/NA] Step[24] GlobalStep[45508/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:07:56 INFO stats.py:314 | Epoch[332] Step[40] GlobalStep[45524] Training Speed: 431.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:17:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:07:59 INFO loss_tracker.py:84 | Epoch[332/NA] Step[49] GlobalStep[45533/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:08:06 INFO stats.py:314 | Epoch[332] Step[65] GlobalStep[45549] Training Speed: 434.84 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:17:03. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:08:09 INFO loss_tracker.py:84 | Epoch[332/NA] Step[74] GlobalStep[45558/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 18:08:16 INFO stats.py:314 | Epoch[332] Step[90] GlobalStep[45574] Training Speed: 434.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:16:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:08:19 INFO loss_tracker.py:84 | Epoch[332/NA] Step[99] GlobalStep[45583/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 18:08:26 INFO stats.py:314 | Epoch[332] Step[115] GlobalStep[45599] Training Speed: 404.80 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:16:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:08:30 INFO loss_tracker.py:84 | Epoch[332/NA] Step[124] GlobalStep[45608/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 18:08:34 INFO stats.py:394 | Epoch[332] completed. Training Speed: 310.87 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:16:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:08:36 INFO stats.py:314 | Epoch[333] Step[3] GlobalStep[45624] Training Speed: 419.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:16:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:08:45 INFO loss_tracker.py:84 | Epoch[333/NA] Step[24] GlobalStep[45645/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:08:47 INFO stats.py:314 | Epoch[333] Step[28] GlobalStep[45649] Training Speed: 435.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:16:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:08:56 INFO loss_tracker.py:84 | Epoch[333/NA] Step[49] GlobalStep[45670/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:08:57 INFO stats.py:314 | Epoch[333] Step[53] GlobalStep[45674] Training Speed: 434.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:16:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:09:06 INFO loss_tracker.py:84 | Epoch[333/NA] Step[74] GlobalStep[45695/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:09:07 INFO stats.py:314 | Epoch[333] Step[78] GlobalStep[45699] Training Speed: 431.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:16:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:09:16 INFO loss_tracker.py:84 | Epoch[333/NA] Step[99] GlobalStep[45720/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:09:18 INFO stats.py:314 | Epoch[333] Step[103] GlobalStep[45724] Training Speed: 421.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:15:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:09:27 INFO loss_tracker.py:84 | Epoch[333/NA] Step[124] GlobalStep[45745/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:09:28 INFO stats.py:314 | Epoch[333] Step[128] GlobalStep[45749] Training Speed: 447.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:15:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:09:31 INFO stats.py:394 | Epoch[333] completed. Training Speed: 308.87 samples/sec across all devices. Epoch Time: 56.77 sec. Average Epoch Time: 56.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:15:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:09:39 INFO stats.py:314 | Epoch[334] Step[16] GlobalStep[45774] Training Speed: 434.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:15:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:09:42 INFO loss_tracker.py:84 | Epoch[334/NA] Step[24] GlobalStep[45782/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:09:49 INFO stats.py:314 | Epoch[334] Step[41] GlobalStep[45799] Training Speed: 433.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:15:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:09:53 INFO loss_tracker.py:84 | Epoch[334/NA] Step[49] GlobalStep[45807/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:10:00 INFO stats.py:314 | Epoch[334] Step[66] GlobalStep[45824] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:15:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:10:03 INFO loss_tracker.py:84 | Epoch[334/NA] Step[74] GlobalStep[45832/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:10:10 INFO stats.py:314 | Epoch[334] Step[91] GlobalStep[45849] Training Speed: 429.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:14:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:10:13 INFO loss_tracker.py:84 | Epoch[334/NA] Step[99] GlobalStep[45857/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:10:20 INFO stats.py:314 | Epoch[334] Step[116] GlobalStep[45874] Training Speed: 432.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:14:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:10:23 INFO loss_tracker.py:84 | Epoch[334/NA] Step[124] GlobalStep[45882/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:10:28 INFO stats.py:394 | Epoch[334] completed. Training Speed: 308.50 samples/sec across all devices. Epoch Time: 56.84 sec. Average Epoch Time: 56.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:14:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:10:31 INFO stats.py:314 | Epoch[335] Step[4] GlobalStep[45899] Training Speed: 430.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:14:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:10:39 INFO loss_tracker.py:84 | Epoch[335/NA] Step[24] GlobalStep[45919/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:10:41 INFO stats.py:314 | Epoch[335] Step[29] GlobalStep[45924] Training Speed: 406.29 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:14:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:10:49 INFO loss_tracker.py:84 | Epoch[335/NA] Step[49] GlobalStep[45944/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:10:51 INFO stats.py:314 | Epoch[335] Step[54] GlobalStep[45949] Training Speed: 434.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:14:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:11:00 INFO loss_tracker.py:84 | Epoch[335/NA] Step[74] GlobalStep[45969/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:11:02 INFO stats.py:314 | Epoch[335] Step[79] GlobalStep[45974] Training Speed: 425.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:14:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:11:10 INFO loss_tracker.py:84 | Epoch[335/NA] Step[99] GlobalStep[45994/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:11:12 INFO stats.py:314 | Epoch[335] Step[104] GlobalStep[45999] Training Speed: 418.33 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:13:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:11:20 INFO loss_tracker.py:84 | Epoch[335/NA] Step[124] GlobalStep[46019/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:11:22 INFO stats.py:314 | Epoch[335] Step[129] GlobalStep[46024] Training Speed: 447.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:13:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:11:25 INFO stats.py:394 | Epoch[335] completed. Training Speed: 306.53 samples/sec across all devices. Epoch Time: 57.21 sec. Average Epoch Time: 57.21 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:13:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:11:33 INFO stats.py:314 | Epoch[336] Step[17] GlobalStep[46049] Training Speed: 429.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:13:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:11:36 INFO loss_tracker.py:84 | Epoch[336/NA] Step[24] GlobalStep[46056/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:11:43 INFO stats.py:314 | Epoch[336] Step[42] GlobalStep[46074] Training Speed: 426.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:13:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:11:46 INFO loss_tracker.py:84 | Epoch[336/NA] Step[49] GlobalStep[46081/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:11:53 INFO stats.py:314 | Epoch[336] Step[67] GlobalStep[46099] Training Speed: 434.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:13:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:11:56 INFO loss_tracker.py:84 | Epoch[336/NA] Step[74] GlobalStep[46106/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:12:04 INFO stats.py:314 | Epoch[336] Step[92] GlobalStep[46124] Training Speed: 432.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:13:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:12:07 INFO loss_tracker.py:84 | Epoch[336/NA] Step[99] GlobalStep[46131/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:12:14 INFO stats.py:314 | Epoch[336] Step[117] GlobalStep[46149] Training Speed: 435.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:12:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:12:17 INFO loss_tracker.py:84 | Epoch[336/NA] Step[124] GlobalStep[46156/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:12:21 INFO stats.py:394 | Epoch[336] completed. Training Speed: 310.47 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:12:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:12:25 INFO stats.py:314 | Epoch[337] Step[5] GlobalStep[46174] Training Speed: 399.48 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:12:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:12:32 INFO loss_tracker.py:84 | Epoch[337/NA] Step[24] GlobalStep[46193/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:12:35 INFO stats.py:314 | Epoch[337] Step[30] GlobalStep[46199] Training Speed: 433.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:12:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:12:42 INFO loss_tracker.py:84 | Epoch[337/NA] Step[49] GlobalStep[46218/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:12:45 INFO stats.py:314 | Epoch[337] Step[55] GlobalStep[46224] Training Speed: 434.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:12:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:12:52 INFO loss_tracker.py:84 | Epoch[337/NA] Step[74] GlobalStep[46243/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 18:12:55 INFO stats.py:314 | Epoch[337] Step[80] GlobalStep[46249] Training Speed: 428.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:12:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:13:03 INFO loss_tracker.py:84 | Epoch[337/NA] Step[99] GlobalStep[46268/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:13:05 INFO stats.py:314 | Epoch[337] Step[105] GlobalStep[46274] Training Speed: 434.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:12:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:13:12 INFO loss_tracker.py:84 | Epoch[337/NA] Step[124] GlobalStep[46293/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:13:14 INFO stats.py:314 | Epoch[337] Step[130] GlobalStep[46299] Training Speed: 449.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:11:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:13:17 INFO stats.py:394 | Epoch[337] completed. Training Speed: 316.55 samples/sec across all devices. Epoch Time: 55.40 sec. Average Epoch Time: 55.40 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 6:11:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:13:26 INFO stats.py:314 | Epoch[338] Step[18] GlobalStep[46324] Training Speed: 427.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:11:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:13:28 INFO loss_tracker.py:84 | Epoch[338/NA] Step[24] GlobalStep[46330/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 18:13:36 INFO stats.py:314 | Epoch[338] Step[43] GlobalStep[46349] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:11:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:13:38 INFO loss_tracker.py:84 | Epoch[338/NA] Step[49] GlobalStep[46355/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:13:46 INFO stats.py:314 | Epoch[338] Step[68] GlobalStep[46374] Training Speed: 433.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:11:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:13:48 INFO loss_tracker.py:84 | Epoch[338/NA] Step[74] GlobalStep[46380/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:13:56 INFO stats.py:314 | Epoch[338] Step[93] GlobalStep[46399] Training Speed: 412.78 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:11:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:13:59 INFO loss_tracker.py:84 | Epoch[338/NA] Step[99] GlobalStep[46405/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:14:06 INFO stats.py:314 | Epoch[338] Step[118] GlobalStep[46424] Training Speed: 432.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:10:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:14:08 INFO loss_tracker.py:84 | Epoch[338/NA] Step[124] GlobalStep[46430/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:14:12 INFO stats.py:394 | Epoch[338] completed. Training Speed: 315.53 samples/sec across all devices. Epoch Time: 55.58 sec. Average Epoch Time: 55.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:10:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:14:16 INFO stats.py:314 | Epoch[339] Step[6] GlobalStep[46449] Training Speed: 437.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:10:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:14:24 INFO loss_tracker.py:84 | Epoch[339/NA] Step[24] GlobalStep[46467/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:14:26 INFO stats.py:314 | Epoch[339] Step[31] GlobalStep[46474] Training Speed: 431.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:10:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:14:34 INFO loss_tracker.py:84 | Epoch[339/NA] Step[49] GlobalStep[46492/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:14:37 INFO stats.py:314 | Epoch[339] Step[56] GlobalStep[46499] Training Speed: 427.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:10:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:14:45 INFO loss_tracker.py:84 | Epoch[339/NA] Step[74] GlobalStep[46517/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:14:47 INFO stats.py:314 | Epoch[339] Step[81] GlobalStep[46524] Training Speed: 434.20 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:10:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:14:55 INFO loss_tracker.py:84 | Epoch[339/NA] Step[99] GlobalStep[46542/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 18:14:58 INFO stats.py:314 | Epoch[339] Step[106] GlobalStep[46549] Training Speed: 435.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:10:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:15:05 INFO loss_tracker.py:84 | Epoch[339/NA] Step[124] GlobalStep[46567/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:15:07 INFO stats.py:314 | Epoch[339] Step[131] GlobalStep[46574] Training Speed: 433.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:09:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:15:09 INFO stats.py:394 | Epoch[339] completed. Training Speed: 309.22 samples/sec across all devices. Epoch Time: 56.71 sec. Average Epoch Time: 56.71 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:09:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:15:18 INFO stats.py:314 | Epoch[340] Step[19] GlobalStep[46599] Training Speed: 429.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:09:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:15:20 INFO loss_tracker.py:84 | Epoch[340/NA] Step[24] GlobalStep[46604/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:15:28 INFO stats.py:314 | Epoch[340] Step[44] GlobalStep[46624] Training Speed: 434.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:09:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:15:30 INFO loss_tracker.py:84 | Epoch[340/NA] Step[49] GlobalStep[46629/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:15:39 INFO stats.py:314 | Epoch[340] Step[69] GlobalStep[46649] Training Speed: 435.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:09:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:15:41 INFO loss_tracker.py:84 | Epoch[340/NA] Step[74] GlobalStep[46654/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0127] total_loss[0.0165] Rank[0/16] 06/24/2025 18:15:48 INFO stats.py:314 | Epoch[340] Step[94] GlobalStep[46674] Training Speed: 437.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:09:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:15:51 INFO loss_tracker.py:84 | Epoch[340/NA] Step[99] GlobalStep[46679/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0182] Rank[0/16] 06/24/2025 18:15:59 INFO stats.py:314 | Epoch[340] Step[119] GlobalStep[46699] Training Speed: 437.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:09:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:16:01 INFO loss_tracker.py:84 | Epoch[340/NA] Step[124] GlobalStep[46704/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:16:05 INFO stats.py:394 | Epoch[340] completed. Training Speed: 311.07 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:08:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:16:09 INFO stats.py:314 | Epoch[341] Step[7] GlobalStep[46724] Training Speed: 435.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:08:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:16:17 INFO loss_tracker.py:84 | Epoch[341/NA] Step[24] GlobalStep[46741/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:16:20 INFO stats.py:314 | Epoch[341] Step[32] GlobalStep[46749] Training Speed: 433.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:08:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:16:27 INFO loss_tracker.py:84 | Epoch[341/NA] Step[49] GlobalStep[46766/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:16:30 INFO stats.py:314 | Epoch[341] Step[57] GlobalStep[46774] Training Speed: 432.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:08:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:16:37 INFO loss_tracker.py:84 | Epoch[341/NA] Step[74] GlobalStep[46791/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:16:40 INFO stats.py:314 | Epoch[341] Step[82] GlobalStep[46799] Training Speed: 436.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:08:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:16:47 INFO loss_tracker.py:84 | Epoch[341/NA] Step[99] GlobalStep[46816/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:16:50 INFO stats.py:314 | Epoch[341] Step[107] GlobalStep[46824] Training Speed: 431.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:08:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:16:57 INFO loss_tracker.py:84 | Epoch[341/NA] Step[124] GlobalStep[46841/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:17:00 INFO stats.py:314 | Epoch[341] Step[132] GlobalStep[46849] Training Speed: 451.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:07:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:17:02 INFO stats.py:394 | Epoch[341] completed. Training Speed: 311.41 samples/sec across all devices. Epoch Time: 56.31 sec. Average Epoch Time: 56.31 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:07:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:17:11 INFO stats.py:314 | Epoch[342] Step[20] GlobalStep[46874] Training Speed: 436.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:07:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:17:13 INFO loss_tracker.py:84 | Epoch[342/NA] Step[24] GlobalStep[46878/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:17:21 INFO stats.py:314 | Epoch[342] Step[45] GlobalStep[46899] Training Speed: 423.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:07:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:17:23 INFO loss_tracker.py:84 | Epoch[342/NA] Step[49] GlobalStep[46903/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 18:17:32 INFO stats.py:314 | Epoch[342] Step[70] GlobalStep[46924] Training Speed: 406.27 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 6:07:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:17:33 INFO loss_tracker.py:84 | Epoch[342/NA] Step[74] GlobalStep[46928/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:17:42 INFO stats.py:314 | Epoch[342] Step[95] GlobalStep[46949] Training Speed: 435.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:07:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:17:43 INFO loss_tracker.py:84 | Epoch[342/NA] Step[99] GlobalStep[46953/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:17:52 INFO stats.py:314 | Epoch[342] Step[120] GlobalStep[46974] Training Speed: 446.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:07:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:17:53 INFO loss_tracker.py:84 | Epoch[342/NA] Step[124] GlobalStep[46978/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:17:57 INFO stats.py:394 | Epoch[342] completed. Training Speed: 314.38 samples/sec across all devices. Epoch Time: 55.78 sec. Average Epoch Time: 55.78 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:06:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:18:02 INFO stats.py:314 | Epoch[343] Step[8] GlobalStep[46999] Training Speed: 432.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:06:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:18:09 INFO loss_tracker.py:84 | Epoch[343/NA] Step[24] GlobalStep[47015/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:18:12 INFO stats.py:314 | Epoch[343] Step[33] GlobalStep[47024] Training Speed: 425.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:06:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:18:19 INFO loss_tracker.py:84 | Epoch[343/NA] Step[49] GlobalStep[47040/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:18:23 INFO stats.py:314 | Epoch[343] Step[58] GlobalStep[47049] Training Speed: 436.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:06:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:18:29 INFO loss_tracker.py:84 | Epoch[343/NA] Step[74] GlobalStep[47065/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:18:33 INFO stats.py:314 | Epoch[343] Step[83] GlobalStep[47074] Training Speed: 425.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:06:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:18:40 INFO loss_tracker.py:84 | Epoch[343/NA] Step[99] GlobalStep[47090/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0057] loss_depth[0.0126] total_loss[0.0184] Rank[0/16] 06/24/2025 18:18:43 INFO stats.py:314 | Epoch[343] Step[108] GlobalStep[47099] Training Speed: 433.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:06:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:18:49 INFO loss_tracker.py:84 | Epoch[343/NA] Step[124] GlobalStep[47115/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 18:18:53 INFO stats.py:314 | Epoch[343] Step[133] GlobalStep[47124] Training Speed: 449.18 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:06:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:18:54 INFO stats.py:394 | Epoch[343] completed. Training Speed: 310.09 samples/sec across all devices. Epoch Time: 56.55 sec. Average Epoch Time: 56.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:05:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:19:04 INFO stats.py:314 | Epoch[344] Step[21] GlobalStep[47149] Training Speed: 431.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:05:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:19:05 INFO loss_tracker.py:84 | Epoch[344/NA] Step[24] GlobalStep[47152/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:19:15 INFO stats.py:314 | Epoch[344] Step[46] GlobalStep[47174] Training Speed: 431.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:05:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:19:16 INFO loss_tracker.py:84 | Epoch[344/NA] Step[49] GlobalStep[47177/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:19:25 INFO stats.py:314 | Epoch[344] Step[71] GlobalStep[47199] Training Speed: 434.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:05:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:19:26 INFO loss_tracker.py:84 | Epoch[344/NA] Step[74] GlobalStep[47202/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:19:35 INFO stats.py:314 | Epoch[344] Step[96] GlobalStep[47224] Training Speed: 425.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:05:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:19:37 INFO loss_tracker.py:84 | Epoch[344/NA] Step[99] GlobalStep[47227/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 18:19:45 INFO stats.py:314 | Epoch[344] Step[121] GlobalStep[47249] Training Speed: 451.24 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:05:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:19:47 INFO loss_tracker.py:84 | Epoch[344/NA] Step[124] GlobalStep[47252/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:19:51 INFO stats.py:394 | Epoch[344] completed. Training Speed: 308.58 samples/sec across all devices. Epoch Time: 56.83 sec. Average Epoch Time: 56.83 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:05:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:19:56 INFO stats.py:314 | Epoch[345] Step[9] GlobalStep[47274] Training Speed: 429.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:04:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:20:02 INFO loss_tracker.py:84 | Epoch[345/NA] Step[24] GlobalStep[47289/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:20:06 INFO stats.py:314 | Epoch[345] Step[34] GlobalStep[47299] Training Speed: 424.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:04:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:20:12 INFO loss_tracker.py:84 | Epoch[345/NA] Step[49] GlobalStep[47314/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:20:16 INFO stats.py:314 | Epoch[345] Step[59] GlobalStep[47324] Training Speed: 434.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:04:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:20:22 INFO loss_tracker.py:84 | Epoch[345/NA] Step[74] GlobalStep[47339/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:20:27 INFO stats.py:314 | Epoch[345] Step[84] GlobalStep[47349] Training Speed: 432.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:04:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:20:34 INFO loss_tracker.py:84 | Epoch[345/NA] Step[99] GlobalStep[47364/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:20:38 INFO stats.py:314 | Epoch[345] Step[109] GlobalStep[47374] Training Speed: 434.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:04:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:20:44 INFO loss_tracker.py:84 | Epoch[345/NA] Step[124] GlobalStep[47389/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:20:47 INFO stats.py:314 | Epoch[345] Step[134] GlobalStep[47399] Training Speed: 451.97 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:04:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:20:48 INFO stats.py:394 | Epoch[345] completed. Training Speed: 306.82 samples/sec across all devices. Epoch Time: 57.15 sec. Average Epoch Time: 57.15 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:04:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:20:59 INFO stats.py:314 | Epoch[346] Step[22] GlobalStep[47424] Training Speed: 435.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:03:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:21:00 INFO loss_tracker.py:84 | Epoch[346/NA] Step[24] GlobalStep[47426/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:21:09 INFO stats.py:314 | Epoch[346] Step[47] GlobalStep[47449] Training Speed: 433.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:03:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:21:10 INFO loss_tracker.py:84 | Epoch[346/NA] Step[49] GlobalStep[47451/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:21:19 INFO stats.py:314 | Epoch[346] Step[72] GlobalStep[47474] Training Speed: 441.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:03:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:21:20 INFO loss_tracker.py:84 | Epoch[346/NA] Step[74] GlobalStep[47476/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 18:21:30 INFO stats.py:314 | Epoch[346] Step[97] GlobalStep[47499] Training Speed: 424.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:03:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:21:30 INFO loss_tracker.py:84 | Epoch[346/NA] Step[99] GlobalStep[47501/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0127] total_loss[0.0165] Rank[0/16] 06/24/2025 18:21:40 INFO stats.py:314 | Epoch[346] Step[122] GlobalStep[47524] Training Speed: 450.11 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:03:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:21:40 INFO loss_tracker.py:84 | Epoch[346/NA] Step[124] GlobalStep[47526/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:21:45 INFO stats.py:394 | Epoch[346] completed. Training Speed: 308.84 samples/sec across all devices. Epoch Time: 56.78 sec. Average Epoch Time: 56.78 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:03:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:21:50 INFO stats.py:314 | Epoch[347] Step[10] GlobalStep[47549] Training Speed: 431.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:03:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:21:56 INFO loss_tracker.py:84 | Epoch[347/NA] Step[24] GlobalStep[47563/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:22:01 INFO stats.py:314 | Epoch[347] Step[35] GlobalStep[47574] Training Speed: 433.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:02:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:22:06 INFO loss_tracker.py:84 | Epoch[347/NA] Step[49] GlobalStep[47588/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:22:11 INFO stats.py:314 | Epoch[347] Step[60] GlobalStep[47599] Training Speed: 432.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:02:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:22:17 INFO loss_tracker.py:84 | Epoch[347/NA] Step[74] GlobalStep[47613/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:22:21 INFO stats.py:314 | Epoch[347] Step[85] GlobalStep[47624] Training Speed: 430.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:02:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:22:27 INFO loss_tracker.py:84 | Epoch[347/NA] Step[99] GlobalStep[47638/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 18:22:31 INFO stats.py:314 | Epoch[347] Step[110] GlobalStep[47649] Training Speed: 435.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:02:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:22:37 INFO loss_tracker.py:84 | Epoch[347/NA] Step[124] GlobalStep[47663/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:22:41 INFO stats.py:314 | Epoch[347] Step[135] GlobalStep[47674] Training Speed: 452.41 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:02:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:22:41 INFO stats.py:394 | Epoch[347] completed. Training Speed: 309.16 samples/sec across all devices. Epoch Time: 56.72 sec. Average Epoch Time: 56.72 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:02:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:22:52 INFO stats.py:314 | Epoch[348] Step[23] GlobalStep[47699] Training Speed: 421.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:02:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:22:53 INFO loss_tracker.py:84 | Epoch[348/NA] Step[24] GlobalStep[47700/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:23:03 INFO stats.py:314 | Epoch[348] Step[48] GlobalStep[47724] Training Speed: 433.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:01:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:23:03 INFO loss_tracker.py:84 | Epoch[348/NA] Step[49] GlobalStep[47725/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:23:13 INFO stats.py:314 | Epoch[348] Step[73] GlobalStep[47749] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 6:01:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:23:14 INFO loss_tracker.py:84 | Epoch[348/NA] Step[74] GlobalStep[47750/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 18:23:24 INFO stats.py:314 | Epoch[348] Step[98] GlobalStep[47774] Training Speed: 432.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:01:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:23:24 INFO loss_tracker.py:84 | Epoch[348/NA] Step[99] GlobalStep[47775/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:23:33 INFO stats.py:314 | Epoch[348] Step[123] GlobalStep[47799] Training Speed: 453.96 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:01:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:23:34 INFO loss_tracker.py:84 | Epoch[348/NA] Step[124] GlobalStep[47800/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:23:38 INFO stats.py:394 | Epoch[348] completed. Training Speed: 308.31 samples/sec across all devices. Epoch Time: 56.88 sec. Average Epoch Time: 56.88 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 6:01:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:23:44 INFO stats.py:314 | Epoch[349] Step[11] GlobalStep[47824] Training Speed: 431.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:01:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:23:50 INFO loss_tracker.py:84 | Epoch[349/NA] Step[24] GlobalStep[47837/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:23:55 INFO stats.py:314 | Epoch[349] Step[36] GlobalStep[47849] Training Speed: 421.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:01:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:00 INFO loss_tracker.py:84 | Epoch[349/NA] Step[49] GlobalStep[47862/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:24:05 INFO stats.py:314 | Epoch[349] Step[61] GlobalStep[47874] Training Speed: 432.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:00:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:10 INFO loss_tracker.py:84 | Epoch[349/NA] Step[74] GlobalStep[47887/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 18:24:15 INFO stats.py:314 | Epoch[349] Step[86] GlobalStep[47899] Training Speed: 433.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:00:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:21 INFO loss_tracker.py:84 | Epoch[349/NA] Step[99] GlobalStep[47912/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:24:26 INFO stats.py:314 | Epoch[349] Step[111] GlobalStep[47924] Training Speed: 421.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 6:00:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:24:31 INFO loss_tracker.py:84 | Epoch[349/NA] Step[124] GlobalStep[47937/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:24:35 INFO stats.py:314 | Epoch[349] Step[136] GlobalStep[47949] Training Speed: 451.72 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 6:00:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:35 INFO stats.py:394 | Epoch[349] completed. Training Speed: 309.70 samples/sec across all devices. Epoch Time: 56.62 sec. Average Epoch Time: 56.62 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 6:00:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:46 INFO stats.py:314 | Epoch[350] Step[24] GlobalStep[47974] Training Speed: 406.60 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 6:00:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:46 INFO loss_tracker.py:84 | Epoch[350/NA] Step[24] GlobalStep[47974/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:24:56 INFO stats.py:314 | Epoch[350] Step[49] GlobalStep[47999] Training Speed: 430.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:59:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:24:56 INFO loss_tracker.py:84 | Epoch[350/NA] Step[49] GlobalStep[47999/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0057] loss_depth[0.0126] total_loss[0.0183] Rank[0/16] 06/24/2025 18:24:57 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 18:24:57 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_11 Rank[8/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[14/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[5/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[13/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[6/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[2/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[3/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[10/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[1/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[12/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[7/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[4/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[11/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[9/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11
Rank[15/16] 06/24/2025 18:24:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[0/16] 06/24/2025 18:24:59 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_11/model.safetensors Rank[0/16] 06/24/2025 18:25:00 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_11/optimizer.bin Rank[0/16] 06/24/2025 18:25:00 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_11/scheduler.bin Rank[0/16] 06/24/2025 18:25:00 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_11/sampler.bin Rank[0/16] 06/24/2025 18:25:00 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_11/random_states_0.pkl Rank[0/16] 06/24/2025 18:25:00 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_11/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 18:25:00 INFO checkpoint.py:110 | Save checkpoint at the end of step 47999 to /job_data/checkpoints/checkpoint_11 Rank[0/16] 06/24/2025 18:25:10 INFO stats.py:314 | Epoch[350] Step[74] GlobalStep[48024] Training Speed: 424.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:59:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:25:10 INFO loss_tracker.py:84 | Epoch[350/NA] Step[74] GlobalStep[48024/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:25:21 INFO stats.py:314 | Epoch[350] Step[99] GlobalStep[48049] Training Speed: 433.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:59:40. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 18:25:21 INFO loss_tracker.py:84 | Epoch[350/NA] Step[99] GlobalStep[48049/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:25:31 INFO stats.py:314 | Epoch[350] Step[124] GlobalStep[48074] Training Speed: 452.76 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:59:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:25:31 INFO loss_tracker.py:84 | Epoch[350/NA] Step[124] GlobalStep[48074/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:25:35 INFO stats.py:394 | Epoch[350] completed. Training Speed: 289.57 samples/sec across all devices. Epoch Time: 60.56 sec. Average Epoch Time: 60.56 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 5:59:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:25:42 INFO stats.py:314 | Epoch[351] Step[12] GlobalStep[48099] Training Speed: 435.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:59:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:25:47 INFO loss_tracker.py:84 | Epoch[351/NA] Step[24] GlobalStep[48111/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:25:52 INFO stats.py:314 | Epoch[351] Step[37] GlobalStep[48124] Training Speed: 434.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:59:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:25:57 INFO loss_tracker.py:84 | Epoch[351/NA] Step[49] GlobalStep[48136/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0180] Rank[0/16] 06/24/2025 18:26:02 INFO stats.py:314 | Epoch[351] Step[62] GlobalStep[48149] Training Speed: 431.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:58:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:26:07 INFO loss_tracker.py:84 | Epoch[351/NA] Step[74] GlobalStep[48161/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:26:12 INFO stats.py:314 | Epoch[351] Step[87] GlobalStep[48174] Training Speed: 435.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:58:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:26:17 INFO loss_tracker.py:84 | Epoch[351/NA] Step[99] GlobalStep[48186/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:26:23 INFO stats.py:314 | Epoch[351] Step[112] GlobalStep[48199] Training Speed: 431.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:58:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:26:27 INFO loss_tracker.py:84 | Epoch[351/NA] Step[124] GlobalStep[48211/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:26:31 INFO stats.py:394 | Epoch[351] completed. Training Speed: 313.07 samples/sec across all devices. Epoch Time: 56.01 sec. Average Epoch Time: 56.01 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:58:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:26:32 INFO stats.py:314 | Epoch[352] Step[0] GlobalStep[48224] Training Speed: 356.09 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 5:58:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:26:43 INFO loss_tracker.py:84 | Epoch[352/NA] Step[24] GlobalStep[48248/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:26:43 INFO stats.py:314 | Epoch[352] Step[25] GlobalStep[48249] Training Speed: 404.37 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:58:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:26:53 INFO loss_tracker.py:84 | Epoch[352/NA] Step[49] GlobalStep[48273/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:26:53 INFO stats.py:314 | Epoch[352] Step[50] GlobalStep[48274] Training Speed: 431.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:58:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:27:03 INFO loss_tracker.py:84 | Epoch[352/NA] Step[74] GlobalStep[48298/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0179] Rank[0/16] 06/24/2025 18:27:04 INFO stats.py:314 | Epoch[352] Step[75] GlobalStep[48299] Training Speed: 411.46 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:57:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:27:13 INFO loss_tracker.py:84 | Epoch[352/NA] Step[99] GlobalStep[48323/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0127] total_loss[0.0184] Rank[0/16] 06/24/2025 18:27:14 INFO stats.py:314 | Epoch[352] Step[100] GlobalStep[48324] Training Speed: 413.77 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:57:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:27:23 INFO loss_tracker.py:84 | Epoch[352/NA] Step[124] GlobalStep[48348/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:27:24 INFO stats.py:314 | Epoch[352] Step[125] GlobalStep[48349] Training Speed: 420.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:57:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:27:28 INFO stats.py:394 | Epoch[352] completed. Training Speed: 311.61 samples/sec across all devices. Epoch Time: 56.28 sec. Average Epoch Time: 56.28 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:57:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:27:34 INFO stats.py:314 | Epoch[353] Step[13] GlobalStep[48374] Training Speed: 434.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:57:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:27:39 INFO loss_tracker.py:84 | Epoch[353/NA] Step[24] GlobalStep[48385/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:27:45 INFO stats.py:314 | Epoch[353] Step[38] GlobalStep[48399] Training Speed: 426.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:57:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:27:49 INFO loss_tracker.py:84 | Epoch[353/NA] Step[49] GlobalStep[48410/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:27:55 INFO stats.py:314 | Epoch[353] Step[63] GlobalStep[48424] Training Speed: 436.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:57:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:28:00 INFO loss_tracker.py:84 | Epoch[353/NA] Step[74] GlobalStep[48435/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:28:05 INFO stats.py:314 | Epoch[353] Step[88] GlobalStep[48449] Training Speed: 426.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:56:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:28:10 INFO loss_tracker.py:84 | Epoch[353/NA] Step[99] GlobalStep[48460/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:28:15 INFO stats.py:314 | Epoch[353] Step[113] GlobalStep[48474] Training Speed: 421.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:56:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:28:20 INFO loss_tracker.py:84 | Epoch[353/NA] Step[124] GlobalStep[48485/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:28:24 INFO stats.py:394 | Epoch[353] completed. Training Speed: 309.30 samples/sec across all devices. Epoch Time: 56.70 sec. Average Epoch Time: 56.70 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:56:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:28:26 INFO stats.py:314 | Epoch[354] Step[1] GlobalStep[48499] Training Speed: 432.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:56:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:28:36 INFO loss_tracker.py:84 | Epoch[354/NA] Step[24] GlobalStep[48522/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:28:37 INFO stats.py:314 | Epoch[354] Step[26] GlobalStep[48524] Training Speed: 423.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:56:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:28:46 INFO loss_tracker.py:84 | Epoch[354/NA] Step[49] GlobalStep[48547/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 18:28:47 INFO stats.py:314 | Epoch[354] Step[51] GlobalStep[48549] Training Speed: 427.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:56:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:28:56 INFO loss_tracker.py:84 | Epoch[354/NA] Step[74] GlobalStep[48572/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:28:57 INFO stats.py:314 | Epoch[354] Step[76] GlobalStep[48574] Training Speed: 432.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:56:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:29:06 INFO loss_tracker.py:84 | Epoch[354/NA] Step[99] GlobalStep[48597/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:29:07 INFO stats.py:314 | Epoch[354] Step[101] GlobalStep[48599] Training Speed: 400.71 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:55:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:29:16 INFO loss_tracker.py:84 | Epoch[354/NA] Step[124] GlobalStep[48622/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:29:17 INFO stats.py:314 | Epoch[354] Step[126] GlobalStep[48624] Training Speed: 451.03 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:55:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:29:21 INFO stats.py:394 | Epoch[354] completed. Training Speed: 311.87 samples/sec across all devices. Epoch Time: 56.23 sec. Average Epoch Time: 56.23 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:55:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:29:28 INFO stats.py:314 | Epoch[355] Step[14] GlobalStep[48649] Training Speed: 437.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:55:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:29:32 INFO loss_tracker.py:84 | Epoch[355/NA] Step[24] GlobalStep[48659/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:29:39 INFO stats.py:314 | Epoch[355] Step[39] GlobalStep[48674] Training Speed: 424.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:55:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:29:43 INFO loss_tracker.py:84 | Epoch[355/NA] Step[49] GlobalStep[48684/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 18:29:49 INFO stats.py:314 | Epoch[355] Step[64] GlobalStep[48699] Training Speed: 432.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:55:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:29:53 INFO loss_tracker.py:84 | Epoch[355/NA] Step[74] GlobalStep[48709/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 18:29:59 INFO stats.py:314 | Epoch[355] Step[89] GlobalStep[48724] Training Speed: 431.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:54:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:30:03 INFO loss_tracker.py:84 | Epoch[355/NA] Step[99] GlobalStep[48734/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:30:09 INFO stats.py:314 | Epoch[355] Step[114] GlobalStep[48749] Training Speed: 423.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:54:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:30:13 INFO loss_tracker.py:84 | Epoch[355/NA] Step[124] GlobalStep[48759/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:30:17 INFO stats.py:394 | Epoch[355] completed. Training Speed: 310.34 samples/sec across all devices. Epoch Time: 56.51 sec. Average Epoch Time: 56.51 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:54:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:30:19 INFO stats.py:314 | Epoch[356] Step[2] GlobalStep[48774] Training Speed: 430.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:54:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:30:29 INFO loss_tracker.py:84 | Epoch[356/NA] Step[24] GlobalStep[48796/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:30:30 INFO stats.py:314 | Epoch[356] Step[27] GlobalStep[48799] Training Speed: 430.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:54:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:30:39 INFO loss_tracker.py:84 | Epoch[356/NA] Step[49] GlobalStep[48821/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:30:40 INFO stats.py:314 | Epoch[356] Step[52] GlobalStep[48824] Training Speed: 432.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:54:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:30:49 INFO loss_tracker.py:84 | Epoch[356/NA] Step[74] GlobalStep[48846/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 18:30:50 INFO stats.py:314 | Epoch[356] Step[77] GlobalStep[48849] Training Speed: 429.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:54:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:30:59 INFO loss_tracker.py:84 | Epoch[356/NA] Step[99] GlobalStep[48871/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:31:00 INFO stats.py:314 | Epoch[356] Step[102] GlobalStep[48874] Training Speed: 429.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:53:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:31:09 INFO loss_tracker.py:84 | Epoch[356/NA] Step[124] GlobalStep[48896/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:31:10 INFO stats.py:314 | Epoch[356] Step[127] GlobalStep[48899] Training Speed: 448.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:53:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:31:13 INFO stats.py:394 | Epoch[356] completed. Training Speed: 315.75 samples/sec across all devices. Epoch Time: 55.54 sec. Average Epoch Time: 55.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:53:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:31:20 INFO stats.py:314 | Epoch[357] Step[15] GlobalStep[48924] Training Speed: 435.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:53:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:31:24 INFO loss_tracker.py:84 | Epoch[357/NA] Step[24] GlobalStep[48933/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:31:30 INFO stats.py:314 | Epoch[357] Step[40] GlobalStep[48949] Training Speed: 429.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:53:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:31:34 INFO loss_tracker.py:84 | Epoch[357/NA] Step[49] GlobalStep[48958/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:31:40 INFO stats.py:314 | Epoch[357] Step[65] GlobalStep[48974] Training Speed: 434.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:53:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:31:44 INFO loss_tracker.py:84 | Epoch[357/NA] Step[74] GlobalStep[48983/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:31:50 INFO stats.py:314 | Epoch[357] Step[90] GlobalStep[48999] Training Speed: 421.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:53:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:31:54 INFO loss_tracker.py:84 | Epoch[357/NA] Step[99] GlobalStep[49008/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 18:32:00 INFO stats.py:314 | Epoch[357] Step[115] GlobalStep[49024] Training Speed: 429.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:52:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:32:04 INFO loss_tracker.py:84 | Epoch[357/NA] Step[124] GlobalStep[49033/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 18:32:08 INFO stats.py:394 | Epoch[357] completed. Training Speed: 318.27 samples/sec across all devices. Epoch Time: 55.10 sec. Average Epoch Time: 55.10 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 5:52:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:32:10 INFO stats.py:314 | Epoch[358] Step[3] GlobalStep[49049] Training Speed: 443.74 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:52:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:32:20 INFO loss_tracker.py:84 | Epoch[358/NA] Step[24] GlobalStep[49070/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:32:21 INFO stats.py:314 | Epoch[358] Step[28] GlobalStep[49074] Training Speed: 425.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:52:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:32:29 INFO loss_tracker.py:84 | Epoch[358/NA] Step[49] GlobalStep[49095/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:32:31 INFO stats.py:314 | Epoch[358] Step[53] GlobalStep[49099] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:52:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:32:40 INFO loss_tracker.py:84 | Epoch[358/NA] Step[74] GlobalStep[49120/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:32:41 INFO stats.py:314 | Epoch[358] Step[78] GlobalStep[49124] Training Speed: 430.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:52:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:32:50 INFO loss_tracker.py:84 | Epoch[358/NA] Step[99] GlobalStep[49145/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:32:52 INFO stats.py:314 | Epoch[358] Step[103] GlobalStep[49149] Training Speed: 426.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:51:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:33:00 INFO loss_tracker.py:84 | Epoch[358/NA] Step[124] GlobalStep[49170/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0127] total_loss[0.0165] Rank[0/16] 06/24/2025 18:33:01 INFO stats.py:314 | Epoch[358] Step[128] GlobalStep[49174] Training Speed: 446.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:51:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:33:04 INFO stats.py:394 | Epoch[358] completed. Training Speed: 311.94 samples/sec across all devices. Epoch Time: 56.22 sec. Average Epoch Time: 56.22 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:51:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:33:12 INFO stats.py:314 | Epoch[359] Step[16] GlobalStep[49199] Training Speed: 436.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:51:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:33:16 INFO loss_tracker.py:84 | Epoch[359/NA] Step[24] GlobalStep[49207/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:33:22 INFO stats.py:314 | Epoch[359] Step[41] GlobalStep[49224] Training Speed: 429.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:51:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:33:25 INFO loss_tracker.py:84 | Epoch[359/NA] Step[49] GlobalStep[49232/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:33:33 INFO stats.py:314 | Epoch[359] Step[66] GlobalStep[49249] Training Speed: 422.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:51:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:33:36 INFO loss_tracker.py:84 | Epoch[359/NA] Step[74] GlobalStep[49257/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:33:43 INFO stats.py:314 | Epoch[359] Step[91] GlobalStep[49274] Training Speed: 421.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:51:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:33:46 INFO loss_tracker.py:84 | Epoch[359/NA] Step[99] GlobalStep[49282/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:33:53 INFO stats.py:314 | Epoch[359] Step[116] GlobalStep[49299] Training Speed: 433.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:50:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:33:56 INFO loss_tracker.py:84 | Epoch[359/NA] Step[124] GlobalStep[49307/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0179] Rank[0/16] 06/24/2025 18:34:00 INFO stats.py:394 | Epoch[359] completed. Training Speed: 310.81 samples/sec across all devices. Epoch Time: 56.42 sec. Average Epoch Time: 56.42 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:50:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:34:03 INFO stats.py:314 | Epoch[360] Step[4] GlobalStep[49324] Training Speed: 446.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:50:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:34:12 INFO loss_tracker.py:84 | Epoch[360/NA] Step[24] GlobalStep[49344/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:34:14 INFO stats.py:314 | Epoch[360] Step[29] GlobalStep[49349] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:50:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:34:22 INFO loss_tracker.py:84 | Epoch[360/NA] Step[49] GlobalStep[49369/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 18:34:24 INFO stats.py:314 | Epoch[360] Step[54] GlobalStep[49374] Training Speed: 433.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:50:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:34:33 INFO loss_tracker.py:84 | Epoch[360/NA] Step[74] GlobalStep[49394/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 18:34:35 INFO stats.py:314 | Epoch[360] Step[79] GlobalStep[49399] Training Speed: 430.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:50:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:34:42 INFO loss_tracker.py:84 | Epoch[360/NA] Step[99] GlobalStep[49419/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 18:34:45 INFO stats.py:314 | Epoch[360] Step[104] GlobalStep[49424] Training Speed: 237.59 samples/sec across all devices. Average Step Time: 0.54 sec. Estimated Remaining Time: 5:50:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:34:53 INFO loss_tracker.py:84 | Epoch[360/NA] Step[124] GlobalStep[49444/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 18:34:55 INFO stats.py:314 | Epoch[360] Step[129] GlobalStep[49449] Training Speed: 448.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:49:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:34:57 INFO stats.py:394 | Epoch[360] completed. Training Speed: 307.92 samples/sec across all devices. Epoch Time: 56.95 sec. Average Epoch Time: 56.95 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:49:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:35:05 INFO stats.py:314 | Epoch[361] Step[17] GlobalStep[49474] Training Speed: 252.82 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 5:49:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:35:08 INFO loss_tracker.py:84 | Epoch[361/NA] Step[24] GlobalStep[49481/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:35:16 INFO stats.py:314 | Epoch[361] Step[42] GlobalStep[49499] Training Speed: 431.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:49:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:35:19 INFO loss_tracker.py:84 | Epoch[361/NA] Step[49] GlobalStep[49506/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:35:26 INFO stats.py:314 | Epoch[361] Step[67] GlobalStep[49524] Training Speed: 427.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:49:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:35:29 INFO loss_tracker.py:84 | Epoch[361/NA] Step[74] GlobalStep[49531/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:35:36 INFO stats.py:314 | Epoch[361] Step[92] GlobalStep[49549] Training Speed: 428.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:49:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:35:39 INFO loss_tracker.py:84 | Epoch[361/NA] Step[99] GlobalStep[49556/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 18:35:46 INFO stats.py:314 | Epoch[361] Step[117] GlobalStep[49574] Training Speed: 433.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:48:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:35:49 INFO loss_tracker.py:84 | Epoch[361/NA] Step[124] GlobalStep[49581/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:35:54 INFO stats.py:394 | Epoch[361] completed. Training Speed: 312.66 samples/sec across all devices. Epoch Time: 56.09 sec. Average Epoch Time: 56.09 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:48:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:35:57 INFO stats.py:314 | Epoch[362] Step[5] GlobalStep[49599] Training Speed: 432.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:48:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:36:04 INFO loss_tracker.py:84 | Epoch[362/NA] Step[24] GlobalStep[49618/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:36:07 INFO stats.py:314 | Epoch[362] Step[30] GlobalStep[49624] Training Speed: 428.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:48:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:36:15 INFO loss_tracker.py:84 | Epoch[362/NA] Step[49] GlobalStep[49643/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:36:17 INFO stats.py:314 | Epoch[362] Step[55] GlobalStep[49649] Training Speed: 432.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:48:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:36:25 INFO loss_tracker.py:84 | Epoch[362/NA] Step[74] GlobalStep[49668/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:36:27 INFO stats.py:314 | Epoch[362] Step[80] GlobalStep[49674] Training Speed: 432.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:48:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:36:35 INFO loss_tracker.py:84 | Epoch[362/NA] Step[99] GlobalStep[49693/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:36:38 INFO stats.py:314 | Epoch[362] Step[105] GlobalStep[49699] Training Speed: 429.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:48:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:36:45 INFO loss_tracker.py:84 | Epoch[362/NA] Step[124] GlobalStep[49718/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:36:48 INFO stats.py:314 | Epoch[362] Step[130] GlobalStep[49724] Training Speed: 445.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:47:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:36:50 INFO stats.py:394 | Epoch[362] completed. Training Speed: 312.60 samples/sec across all devices. Epoch Time: 56.10 sec. Average Epoch Time: 56.10 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:47:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:36:58 INFO stats.py:314 | Epoch[363] Step[18] GlobalStep[49749] Training Speed: 425.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:47:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:37:00 INFO loss_tracker.py:84 | Epoch[363/NA] Step[24] GlobalStep[49755/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:37:08 INFO stats.py:314 | Epoch[363] Step[43] GlobalStep[49774] Training Speed: 433.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:47:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:37:11 INFO loss_tracker.py:84 | Epoch[363/NA] Step[49] GlobalStep[49780/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:37:18 INFO stats.py:314 | Epoch[363] Step[68] GlobalStep[49799] Training Speed: 429.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:47:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:37:20 INFO loss_tracker.py:84 | Epoch[363/NA] Step[74] GlobalStep[49805/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 18:37:28 INFO stats.py:314 | Epoch[363] Step[93] GlobalStep[49824] Training Speed: 431.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:47:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:37:31 INFO loss_tracker.py:84 | Epoch[363/NA] Step[99] GlobalStep[49830/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:37:38 INFO stats.py:314 | Epoch[363] Step[118] GlobalStep[49849] Training Speed: 429.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:47:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:37:40 INFO loss_tracker.py:84 | Epoch[363/NA] Step[124] GlobalStep[49855/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:37:45 INFO stats.py:394 | Epoch[363] completed. Training Speed: 319.06 samples/sec across all devices. Epoch Time: 54.96 sec. Average Epoch Time: 54.96 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 5:46:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:37:49 INFO stats.py:314 | Epoch[364] Step[6] GlobalStep[49874] Training Speed: 434.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:46:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:37:56 INFO loss_tracker.py:84 | Epoch[364/NA] Step[24] GlobalStep[49892/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 18:37:58 INFO stats.py:314 | Epoch[364] Step[31] GlobalStep[49899] Training Speed: 435.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:46:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:38:06 INFO loss_tracker.py:84 | Epoch[364/NA] Step[49] GlobalStep[49917/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:38:09 INFO stats.py:314 | Epoch[364] Step[56] GlobalStep[49924] Training Speed: 404.18 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:46:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:38:16 INFO loss_tracker.py:84 | Epoch[364/NA] Step[74] GlobalStep[49942/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 18:38:19 INFO stats.py:314 | Epoch[364] Step[81] GlobalStep[49949] Training Speed: 413.13 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:46:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:38:26 INFO loss_tracker.py:84 | Epoch[364/NA] Step[99] GlobalStep[49967/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:38:29 INFO stats.py:314 | Epoch[364] Step[106] GlobalStep[49974] Training Speed: 431.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:46:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:38:36 INFO loss_tracker.py:84 | Epoch[364/NA] Step[124] GlobalStep[49992/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:38:39 INFO stats.py:314 | Epoch[364] Step[131] GlobalStep[49999] Training Speed: 450.28 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:45:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:38:41 INFO stats.py:394 | Epoch[364] completed. Training Speed: 313.43 samples/sec across all devices. Epoch Time: 55.95 sec. Average Epoch Time: 55.95 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:45:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:38:50 INFO stats.py:314 | Epoch[365] Step[19] GlobalStep[50024] Training Speed: 433.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:45:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:38:52 INFO loss_tracker.py:84 | Epoch[365/NA] Step[24] GlobalStep[50029/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:39:00 INFO stats.py:314 | Epoch[365] Step[44] GlobalStep[50049] Training Speed: 422.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:45:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:39:03 INFO loss_tracker.py:84 | Epoch[365/NA] Step[49] GlobalStep[50054/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:39:11 INFO stats.py:314 | Epoch[365] Step[69] GlobalStep[50074] Training Speed: 429.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:45:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:39:13 INFO loss_tracker.py:84 | Epoch[365/NA] Step[74] GlobalStep[50079/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:39:21 INFO stats.py:314 | Epoch[365] Step[94] GlobalStep[50099] Training Speed: 427.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:45:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:39:23 INFO loss_tracker.py:84 | Epoch[365/NA] Step[99] GlobalStep[50104/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 18:39:31 INFO stats.py:314 | Epoch[365] Step[119] GlobalStep[50124] Training Speed: 410.14 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:45:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:39:33 INFO loss_tracker.py:84 | Epoch[365/NA] Step[124] GlobalStep[50129/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:39:37 INFO stats.py:394 | Epoch[365] completed. Training Speed: 310.20 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:44:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:39:41 INFO stats.py:314 | Epoch[366] Step[7] GlobalStep[50149] Training Speed: 434.84 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:44:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:39:49 INFO loss_tracker.py:84 | Epoch[366/NA] Step[24] GlobalStep[50166/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:39:52 INFO stats.py:314 | Epoch[366] Step[32] GlobalStep[50174] Training Speed: 428.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:44:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:39:59 INFO loss_tracker.py:84 | Epoch[366/NA] Step[49] GlobalStep[50191/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 18:40:02 INFO stats.py:314 | Epoch[366] Step[57] GlobalStep[50199] Training Speed: 426.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:44:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:40:09 INFO loss_tracker.py:84 | Epoch[366/NA] Step[74] GlobalStep[50216/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:40:12 INFO stats.py:314 | Epoch[366] Step[82] GlobalStep[50224] Training Speed: 434.40 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:44:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:40:19 INFO loss_tracker.py:84 | Epoch[366/NA] Step[99] GlobalStep[50241/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:40:23 INFO stats.py:314 | Epoch[366] Step[107] GlobalStep[50249] Training Speed: 431.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:44:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:40:29 INFO loss_tracker.py:84 | Epoch[366/NA] Step[124] GlobalStep[50266/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:40:32 INFO stats.py:314 | Epoch[366] Step[132] GlobalStep[50274] Training Speed: 447.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:44:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:40:34 INFO stats.py:394 | Epoch[366] completed. Training Speed: 310.39 samples/sec across all devices. Epoch Time: 56.50 sec. Average Epoch Time: 56.50 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:44:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:40:43 INFO stats.py:314 | Epoch[367] Step[20] GlobalStep[50299] Training Speed: 423.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:43:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:40:45 INFO loss_tracker.py:84 | Epoch[367/NA] Step[24] GlobalStep[50303/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:40:54 INFO stats.py:314 | Epoch[367] Step[45] GlobalStep[50324] Training Speed: 433.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:43:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:40:55 INFO loss_tracker.py:84 | Epoch[367/NA] Step[49] GlobalStep[50328/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:41:04 INFO stats.py:314 | Epoch[367] Step[70] GlobalStep[50349] Training Speed: 432.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:43:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:41:06 INFO loss_tracker.py:84 | Epoch[367/NA] Step[74] GlobalStep[50353/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:41:14 INFO stats.py:314 | Epoch[367] Step[95] GlobalStep[50374] Training Speed: 435.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:43:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:41:16 INFO loss_tracker.py:84 | Epoch[367/NA] Step[99] GlobalStep[50378/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:41:24 INFO stats.py:314 | Epoch[367] Step[120] GlobalStep[50399] Training Speed: 450.19 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:43:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:41:26 INFO loss_tracker.py:84 | Epoch[367/NA] Step[124] GlobalStep[50403/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:41:30 INFO stats.py:394 | Epoch[367] completed. Training Speed: 310.41 samples/sec across all devices. Epoch Time: 56.49 sec. Average Epoch Time: 56.49 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:43:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:41:35 INFO stats.py:314 | Epoch[368] Step[8] GlobalStep[50424] Training Speed: 448.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:43:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:41:42 INFO loss_tracker.py:84 | Epoch[368/NA] Step[24] GlobalStep[50440/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:41:45 INFO stats.py:314 | Epoch[368] Step[33] GlobalStep[50449] Training Speed: 428.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:42:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:41:51 INFO loss_tracker.py:84 | Epoch[368/NA] Step[49] GlobalStep[50465/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 18:41:55 INFO stats.py:314 | Epoch[368] Step[58] GlobalStep[50474] Training Speed: 432.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:42:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:42:02 INFO loss_tracker.py:84 | Epoch[368/NA] Step[74] GlobalStep[50490/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 18:42:05 INFO stats.py:314 | Epoch[368] Step[83] GlobalStep[50499] Training Speed: 423.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:42:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:42:12 INFO loss_tracker.py:84 | Epoch[368/NA] Step[99] GlobalStep[50515/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:42:16 INFO stats.py:314 | Epoch[368] Step[108] GlobalStep[50524] Training Speed: 430.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:42:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:42:22 INFO loss_tracker.py:84 | Epoch[368/NA] Step[124] GlobalStep[50540/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 18:42:25 INFO stats.py:314 | Epoch[368] Step[133] GlobalStep[50549] Training Speed: 449.69 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:42:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:42:26 INFO stats.py:394 | Epoch[368] completed. Training Speed: 311.97 samples/sec across all devices. Epoch Time: 56.21 sec. Average Epoch Time: 56.21 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:42:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:42:36 INFO stats.py:314 | Epoch[369] Step[21] GlobalStep[50574] Training Speed: 433.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:41:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:42:38 INFO loss_tracker.py:84 | Epoch[369/NA] Step[24] GlobalStep[50577/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:42:46 INFO stats.py:314 | Epoch[369] Step[46] GlobalStep[50599] Training Speed: 433.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:41:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:42:48 INFO loss_tracker.py:84 | Epoch[369/NA] Step[49] GlobalStep[50602/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 18:42:56 INFO stats.py:314 | Epoch[369] Step[71] GlobalStep[50624] Training Speed: 427.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:41:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:42:58 INFO loss_tracker.py:84 | Epoch[369/NA] Step[74] GlobalStep[50627/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:43:07 INFO stats.py:314 | Epoch[369] Step[96] GlobalStep[50649] Training Speed: 402.76 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:41:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:43:08 INFO loss_tracker.py:84 | Epoch[369/NA] Step[99] GlobalStep[50652/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:43:17 INFO stats.py:314 | Epoch[369] Step[121] GlobalStep[50674] Training Speed: 449.34 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:41:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:43:18 INFO loss_tracker.py:84 | Epoch[369/NA] Step[124] GlobalStep[50677/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:43:22 INFO stats.py:394 | Epoch[369] completed. Training Speed: 314.29 samples/sec across all devices. Epoch Time: 55.80 sec. Average Epoch Time: 55.80 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:41:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:43:27 INFO stats.py:314 | Epoch[370] Step[9] GlobalStep[50699] Training Speed: 427.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:41:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:43:33 INFO loss_tracker.py:84 | Epoch[370/NA] Step[24] GlobalStep[50714/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:43:38 INFO stats.py:314 | Epoch[370] Step[34] GlobalStep[50724] Training Speed: 433.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:40:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:43:44 INFO loss_tracker.py:84 | Epoch[370/NA] Step[49] GlobalStep[50739/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:43:48 INFO stats.py:314 | Epoch[370] Step[59] GlobalStep[50749] Training Speed: 425.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:40:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:43:54 INFO loss_tracker.py:84 | Epoch[370/NA] Step[74] GlobalStep[50764/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:43:59 INFO stats.py:314 | Epoch[370] Step[84] GlobalStep[50774] Training Speed: 428.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:40:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:44:05 INFO loss_tracker.py:84 | Epoch[370/NA] Step[99] GlobalStep[50789/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:44:09 INFO stats.py:314 | Epoch[370] Step[109] GlobalStep[50799] Training Speed: 433.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:40:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:44:15 INFO loss_tracker.py:84 | Epoch[370/NA] Step[124] GlobalStep[50814/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:44:19 INFO stats.py:314 | Epoch[370] Step[134] GlobalStep[50824] Training Speed: 450.85 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:40:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:44:19 INFO stats.py:394 | Epoch[370] completed. Training Speed: 306.43 samples/sec across all devices. Epoch Time: 57.23 sec. Average Epoch Time: 57.23 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:40:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:44:29 INFO stats.py:314 | Epoch[371] Step[22] GlobalStep[50849] Training Speed: 431.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:40:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:44:30 INFO loss_tracker.py:84 | Epoch[371/NA] Step[24] GlobalStep[50851/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:44:40 INFO stats.py:314 | Epoch[371] Step[47] GlobalStep[50874] Training Speed: 427.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:39:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:44:41 INFO loss_tracker.py:84 | Epoch[371/NA] Step[49] GlobalStep[50876/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:44:50 INFO stats.py:314 | Epoch[371] Step[72] GlobalStep[50899] Training Speed: 432.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:39:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:44:51 INFO loss_tracker.py:84 | Epoch[371/NA] Step[74] GlobalStep[50901/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:45:00 INFO stats.py:314 | Epoch[371] Step[97] GlobalStep[50924] Training Speed: 430.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:39:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:45:01 INFO loss_tracker.py:84 | Epoch[371/NA] Step[99] GlobalStep[50926/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:45:10 INFO stats.py:314 | Epoch[371] Step[122] GlobalStep[50949] Training Speed: 452.92 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:39:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:45:11 INFO loss_tracker.py:84 | Epoch[371/NA] Step[124] GlobalStep[50951/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:45:15 INFO stats.py:394 | Epoch[371] completed. Training Speed: 315.52 samples/sec across all devices. Epoch Time: 55.58 sec. Average Epoch Time: 55.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:39:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:45:20 INFO stats.py:314 | Epoch[372] Step[10] GlobalStep[50974] Training Speed: 432.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:39:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:45:26 INFO loss_tracker.py:84 | Epoch[372/NA] Step[24] GlobalStep[50988/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:45:30 INFO stats.py:314 | Epoch[372] Step[35] GlobalStep[50999] Training Speed: 428.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:38:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:45:36 INFO loss_tracker.py:84 | Epoch[372/NA] Step[49] GlobalStep[51013/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:45:41 INFO stats.py:314 | Epoch[372] Step[60] GlobalStep[51024] Training Speed: 418.18 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:38:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:45:46 INFO loss_tracker.py:84 | Epoch[372/NA] Step[74] GlobalStep[51038/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:45:51 INFO stats.py:314 | Epoch[372] Step[85] GlobalStep[51049] Training Speed: 417.07 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:38:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:45:56 INFO loss_tracker.py:84 | Epoch[372/NA] Step[99] GlobalStep[51063/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:46:01 INFO stats.py:314 | Epoch[372] Step[110] GlobalStep[51074] Training Speed: 428.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:38:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:46:06 INFO loss_tracker.py:84 | Epoch[372/NA] Step[124] GlobalStep[51088/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:46:10 INFO stats.py:314 | Epoch[372] Step[135] GlobalStep[51099] Training Speed: 451.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:38:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:46:10 INFO stats.py:394 | Epoch[372] completed. Training Speed: 315.86 samples/sec across all devices. Epoch Time: 55.52 sec. Average Epoch Time: 55.52 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:38:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:46:21 INFO stats.py:314 | Epoch[373] Step[23] GlobalStep[51124] Training Speed: 429.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:38:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:46:22 INFO loss_tracker.py:84 | Epoch[373/NA] Step[24] GlobalStep[51125/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 18:46:32 INFO stats.py:314 | Epoch[373] Step[48] GlobalStep[51149] Training Speed: 424.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:37:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:46:33 INFO loss_tracker.py:84 | Epoch[373/NA] Step[49] GlobalStep[51150/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:46:42 INFO stats.py:314 | Epoch[373] Step[73] GlobalStep[51174] Training Speed: 431.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:37:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:46:43 INFO loss_tracker.py:84 | Epoch[373/NA] Step[74] GlobalStep[51175/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0127] total_loss[0.0165] Rank[0/16] 06/24/2025 18:46:52 INFO stats.py:314 | Epoch[373] Step[98] GlobalStep[51199] Training Speed: 428.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:37:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:46:53 INFO loss_tracker.py:84 | Epoch[373/NA] Step[99] GlobalStep[51200/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:47:03 INFO stats.py:314 | Epoch[373] Step[123] GlobalStep[51224] Training Speed: 453.11 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:37:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:47:03 INFO loss_tracker.py:84 | Epoch[373/NA] Step[124] GlobalStep[51225/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:47:07 INFO stats.py:394 | Epoch[373] completed. Training Speed: 308.49 samples/sec across all devices. Epoch Time: 56.84 sec. Average Epoch Time: 56.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:37:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:47:13 INFO stats.py:314 | Epoch[374] Step[11] GlobalStep[51249] Training Speed: 434.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:37:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:47:19 INFO loss_tracker.py:84 | Epoch[374/NA] Step[24] GlobalStep[51262/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:47:23 INFO stats.py:314 | Epoch[374] Step[36] GlobalStep[51274] Training Speed: 428.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:37:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:47:29 INFO loss_tracker.py:84 | Epoch[374/NA] Step[49] GlobalStep[51287/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 18:47:34 INFO stats.py:314 | Epoch[374] Step[61] GlobalStep[51299] Training Speed: 431.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:36:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:47:39 INFO loss_tracker.py:84 | Epoch[374/NA] Step[74] GlobalStep[51312/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:47:44 INFO stats.py:314 | Epoch[374] Step[86] GlobalStep[51324] Training Speed: 419.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:36:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:47:49 INFO loss_tracker.py:84 | Epoch[374/NA] Step[99] GlobalStep[51337/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 18:47:54 INFO stats.py:314 | Epoch[374] Step[111] GlobalStep[51349] Training Speed: 432.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:36:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:47:59 INFO loss_tracker.py:84 | Epoch[374/NA] Step[124] GlobalStep[51362/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:48:03 INFO stats.py:314 | Epoch[374] Step[136] GlobalStep[51374] Training Speed: 449.56 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:36:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:48:03 INFO stats.py:394 | Epoch[374] completed. Training Speed: 312.17 samples/sec across all devices. Epoch Time: 56.17 sec. Average Epoch Time: 56.17 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:36:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:48:15 INFO stats.py:314 | Epoch[375] Step[24] GlobalStep[51399] Training Speed: 409.81 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:36:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:48:15 INFO loss_tracker.py:84 | Epoch[375/NA] Step[24] GlobalStep[51399/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:48:25 INFO stats.py:314 | Epoch[375] Step[49] GlobalStep[51424] Training Speed: 429.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:36:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:48:26 INFO loss_tracker.py:84 | Epoch[375/NA] Step[49] GlobalStep[51424/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:48:36 INFO stats.py:314 | Epoch[375] Step[74] GlobalStep[51449] Training Speed: 431.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:35:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:48:36 INFO loss_tracker.py:84 | Epoch[375/NA] Step[74] GlobalStep[51449/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:48:46 INFO stats.py:314 | Epoch[375] Step[99] GlobalStep[51474] Training Speed: 431.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:35:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:48:46 INFO loss_tracker.py:84 | Epoch[375/NA] Step[99] GlobalStep[51474/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 18:48:56 INFO stats.py:314 | Epoch[375] Step[124] GlobalStep[51499] Training Speed: 455.53 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:35:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:48:56 INFO loss_tracker.py:84 | Epoch[375/NA] Step[124] GlobalStep[51499/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:49:01 INFO stats.py:394 | Epoch[375] completed. Training Speed: 306.07 samples/sec across all devices. Epoch Time: 57.29 sec. Average Epoch Time: 57.29 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:35:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:49:07 INFO stats.py:314 | Epoch[376] Step[12] GlobalStep[51524] Training Speed: 425.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:35:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:49:12 INFO loss_tracker.py:84 | Epoch[376/NA] Step[24] GlobalStep[51536/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:49:18 INFO stats.py:314 | Epoch[376] Step[37] GlobalStep[51549] Training Speed: 434.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:35:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:49:23 INFO loss_tracker.py:84 | Epoch[376/NA] Step[49] GlobalStep[51561/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:49:28 INFO stats.py:314 | Epoch[376] Step[62] GlobalStep[51574] Training Speed: 427.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:34:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:49:33 INFO loss_tracker.py:84 | Epoch[376/NA] Step[74] GlobalStep[51586/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:49:38 INFO stats.py:314 | Epoch[376] Step[87] GlobalStep[51599] Training Speed: 434.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:34:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:49:43 INFO loss_tracker.py:84 | Epoch[376/NA] Step[99] GlobalStep[51611/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:49:48 INFO stats.py:314 | Epoch[376] Step[112] GlobalStep[51624] Training Speed: 426.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:34:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:49:53 INFO loss_tracker.py:84 | Epoch[376/NA] Step[124] GlobalStep[51636/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:49:57 INFO stats.py:394 | Epoch[376] completed. Training Speed: 310.52 samples/sec across all devices. Epoch Time: 56.47 sec. Average Epoch Time: 56.47 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:34:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:49:58 INFO stats.py:314 | Epoch[377] Step[0] GlobalStep[51649] Training Speed: 359.74 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 5:34:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:50:09 INFO loss_tracker.py:84 | Epoch[377/NA] Step[24] GlobalStep[51673/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:50:09 INFO stats.py:314 | Epoch[377] Step[25] GlobalStep[51674] Training Speed: 404.12 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:34:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:50:19 INFO loss_tracker.py:84 | Epoch[377/NA] Step[49] GlobalStep[51698/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:50:19 INFO stats.py:314 | Epoch[377] Step[50] GlobalStep[51699] Training Speed: 423.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:34:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:50:29 INFO loss_tracker.py:84 | Epoch[377/NA] Step[74] GlobalStep[51723/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 18:50:29 INFO stats.py:314 | Epoch[377] Step[75] GlobalStep[51724] Training Speed: 423.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:33:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:50:40 INFO loss_tracker.py:84 | Epoch[377/NA] Step[99] GlobalStep[51748/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:50:40 INFO stats.py:314 | Epoch[377] Step[100] GlobalStep[51749] Training Speed: 419.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:33:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:50:50 INFO loss_tracker.py:84 | Epoch[377/NA] Step[124] GlobalStep[51773/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 18:50:50 INFO stats.py:314 | Epoch[377] Step[125] GlobalStep[51774] Training Speed: 395.33 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:33:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:50:55 INFO stats.py:394 | Epoch[377] completed. Training Speed: 305.63 samples/sec across all devices. Epoch Time: 57.38 sec. Average Epoch Time: 57.38 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:33:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:51:01 INFO stats.py:314 | Epoch[378] Step[13] GlobalStep[51799] Training Speed: 434.83 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:33:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:51:06 INFO loss_tracker.py:84 | Epoch[378/NA] Step[24] GlobalStep[51810/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:51:12 INFO stats.py:314 | Epoch[378] Step[38] GlobalStep[51824] Training Speed: 435.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:33:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:51:17 INFO loss_tracker.py:84 | Epoch[378/NA] Step[49] GlobalStep[51835/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:51:22 INFO stats.py:314 | Epoch[378] Step[63] GlobalStep[51849] Training Speed: 406.22 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:33:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:51:27 INFO loss_tracker.py:84 | Epoch[378/NA] Step[74] GlobalStep[51860/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:51:32 INFO stats.py:314 | Epoch[378] Step[88] GlobalStep[51874] Training Speed: 428.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:32:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:51:37 INFO loss_tracker.py:84 | Epoch[378/NA] Step[99] GlobalStep[51885/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0127] total_loss[0.0181] Rank[0/16] 06/24/2025 18:51:43 INFO stats.py:314 | Epoch[378] Step[113] GlobalStep[51899] Training Speed: 432.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:32:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:51:47 INFO loss_tracker.py:84 | Epoch[378/NA] Step[124] GlobalStep[51910/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:51:51 INFO stats.py:394 | Epoch[378] completed. Training Speed: 308.78 samples/sec across all devices. Epoch Time: 56.79 sec. Average Epoch Time: 56.79 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:32:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:51:53 INFO stats.py:314 | Epoch[379] Step[1] GlobalStep[51924] Training Speed: 430.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:32:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:52:03 INFO loss_tracker.py:84 | Epoch[379/NA] Step[24] GlobalStep[51947/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:52:03 INFO stats.py:314 | Epoch[379] Step[26] GlobalStep[51949] Training Speed: 432.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:32:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:52:12 INFO loss_tracker.py:84 | Epoch[379/NA] Step[49] GlobalStep[51972/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 18:52:13 INFO stats.py:314 | Epoch[379] Step[51] GlobalStep[51974] Training Speed: 434.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:32:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:52:23 INFO loss_tracker.py:84 | Epoch[379/NA] Step[74] GlobalStep[51997/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:52:23 INFO stats.py:314 | Epoch[379] Step[76] GlobalStep[51999] Training Speed: 417.40 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:32:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:52:24 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 18:52:24 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_12 
Rank[5/16] 06/24/2025 18:52:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[2/16] 06/24/2025 18:52:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[6/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[12/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[7/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[8/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[3/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[14/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[1/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[13/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[9/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[4/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[11/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[10/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 Rank[15/16] 06/24/2025 18:52:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 
Rank[0/16] 06/24/2025 18:52:25 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_12/model.safetensors Rank[0/16] 06/24/2025 18:52:26 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_12/optimizer.bin Rank[0/16] 06/24/2025 18:52:26 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_12/scheduler.bin Rank[0/16] 06/24/2025 18:52:26 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_12/sampler.bin Rank[0/16] 06/24/2025 18:52:26 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_12/random_states_0.pkl Rank[0/16] 06/24/2025 18:52:26 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_12/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 18:52:26 INFO checkpoint.py:110 | Save checkpoint at the end of step 51999 to /job_data/checkpoints/checkpoint_12 
Rank[0/16] 06/24/2025 18:52:35 INFO loss_tracker.py:84 | Epoch[379/NA] Step[99] GlobalStep[52022/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:52:36 INFO stats.py:314 | Epoch[379] Step[101] GlobalStep[52024] Training Speed: 240.15 samples/sec across all devices. Average Step Time: 0.53 sec. Estimated Remaining Time: 5:31:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:52:46 INFO loss_tracker.py:84 | Epoch[379/NA] Step[124] GlobalStep[52047/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:52:46 INFO stats.py:314 | Epoch[379] Step[126] GlobalStep[52049] Training Speed: 451.04 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:31:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:52:50 INFO stats.py:394 | Epoch[379] completed. Training Speed: 298.54 samples/sec across all devices. Epoch Time: 58.74 sec. Average Epoch Time: 58.74 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 5:31:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:52:57 INFO stats.py:314 | Epoch[380] Step[14] GlobalStep[52074] Training Speed: 434.42 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:31:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:53:02 INFO loss_tracker.py:84 | Epoch[380/NA] Step[24] GlobalStep[52084/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:53:08 INFO stats.py:314 | Epoch[380] Step[39] GlobalStep[52099] Training Speed: 426.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:31:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:53:12 INFO loss_tracker.py:84 | Epoch[380/NA] Step[49] GlobalStep[52109/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:53:18 INFO stats.py:314 | Epoch[380] Step[64] GlobalStep[52124] Training Speed: 430.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:31:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:53:22 INFO loss_tracker.py:84 | Epoch[380/NA] Step[74] GlobalStep[52134/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 18:53:28 INFO stats.py:314 | Epoch[380] Step[89] GlobalStep[52149] Training Speed: 431.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:31:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:53:32 INFO loss_tracker.py:84 | Epoch[380/NA] Step[99] GlobalStep[52159/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 18:53:38 INFO stats.py:314 | Epoch[380] Step[114] GlobalStep[52174] Training Speed: 432.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:30:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:53:42 INFO loss_tracker.py:84 | Epoch[380/NA] Step[124] GlobalStep[52184/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:53:46 INFO stats.py:394 | Epoch[380] completed. Training Speed: 311.50 samples/sec across all devices. Epoch Time: 56.29 sec. Average Epoch Time: 56.29 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:30:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:53:49 INFO stats.py:314 | Epoch[381] Step[2] GlobalStep[52199] Training Speed: 435.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:30:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:53:57 INFO loss_tracker.py:84 | Epoch[381/NA] Step[24] GlobalStep[52221/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:53:59 INFO stats.py:314 | Epoch[381] Step[27] GlobalStep[52224] Training Speed: 436.77 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:30:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:54:08 INFO loss_tracker.py:84 | Epoch[381/NA] Step[49] GlobalStep[52246/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:54:09 INFO stats.py:314 | Epoch[381] Step[52] GlobalStep[52249] Training Speed: 422.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:30:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:54:18 INFO loss_tracker.py:84 | Epoch[381/NA] Step[74] GlobalStep[52271/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:54:20 INFO stats.py:314 | Epoch[381] Step[77] GlobalStep[52274] Training Speed: 435.73 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:30:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:54:29 INFO loss_tracker.py:84 | Epoch[381/NA] Step[99] GlobalStep[52296/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 18:54:30 INFO stats.py:314 | Epoch[381] Step[102] GlobalStep[52299] Training Speed: 432.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:30:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:54:39 INFO loss_tracker.py:84 | Epoch[381/NA] Step[124] GlobalStep[52321/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:54:40 INFO stats.py:314 | Epoch[381] Step[127] GlobalStep[52324] Training Speed: 454.95 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:29:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:54:43 INFO stats.py:394 | Epoch[381] completed. Training Speed: 307.83 samples/sec across all devices. Epoch Time: 56.97 sec. Average Epoch Time: 56.97 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:29:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:54:51 INFO stats.py:314 | Epoch[382] Step[15] GlobalStep[52349] Training Speed: 424.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:29:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:54:54 INFO loss_tracker.py:84 | Epoch[382/NA] Step[24] GlobalStep[52358/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:55:01 INFO stats.py:314 | Epoch[382] Step[40] GlobalStep[52374] Training Speed: 407.34 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:29:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:55:05 INFO loss_tracker.py:84 | Epoch[382/NA] Step[49] GlobalStep[52383/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 18:55:11 INFO stats.py:314 | Epoch[382] Step[65] GlobalStep[52399] Training Speed: 426.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:29:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:55:15 INFO loss_tracker.py:84 | Epoch[382/NA] Step[74] GlobalStep[52408/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 18:55:21 INFO stats.py:314 | Epoch[382] Step[90] GlobalStep[52424] Training Speed: 430.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:29:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:55:25 INFO loss_tracker.py:84 | Epoch[382/NA] Step[99] GlobalStep[52433/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 18:55:32 INFO stats.py:314 | Epoch[382] Step[115] GlobalStep[52449] Training Speed: 431.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:28:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:55:35 INFO loss_tracker.py:84 | Epoch[382/NA] Step[124] GlobalStep[52458/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 18:55:39 INFO stats.py:394 | Epoch[382] completed. Training Speed: 312.70 samples/sec across all devices. Epoch Time: 56.08 sec. Average Epoch Time: 56.08 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:28:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:55:42 INFO stats.py:314 | Epoch[383] Step[3] GlobalStep[52474] Training Speed: 429.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:28:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:55:50 INFO loss_tracker.py:84 | Epoch[383/NA] Step[24] GlobalStep[52495/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 18:55:52 INFO stats.py:314 | Epoch[383] Step[28] GlobalStep[52499] Training Speed: 425.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:28:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:56:01 INFO loss_tracker.py:84 | Epoch[383/NA] Step[49] GlobalStep[52520/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 18:56:02 INFO stats.py:314 | Epoch[383] Step[53] GlobalStep[52524] Training Speed: 438.66 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:28:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:56:11 INFO loss_tracker.py:84 | Epoch[383/NA] Step[74] GlobalStep[52545/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 18:56:12 INFO stats.py:314 | Epoch[383] Step[78] GlobalStep[52549] Training Speed: 433.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:28:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:56:21 INFO loss_tracker.py:84 | Epoch[383/NA] Step[99] GlobalStep[52570/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:56:23 INFO stats.py:314 | Epoch[383] Step[103] GlobalStep[52574] Training Speed: 432.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:28:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:56:31 INFO loss_tracker.py:84 | Epoch[383/NA] Step[124] GlobalStep[52595/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 18:56:33 INFO stats.py:314 | Epoch[383] Step[128] GlobalStep[52599] Training Speed: 448.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:27:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:56:35 INFO stats.py:394 | Epoch[383] completed. Training Speed: 313.39 samples/sec across all devices. Epoch Time: 55.96 sec. Average Epoch Time: 55.96 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:27:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:56:43 INFO stats.py:314 | Epoch[384] Step[16] GlobalStep[52624] Training Speed: 401.31 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:27:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:56:46 INFO loss_tracker.py:84 | Epoch[384/NA] Step[24] GlobalStep[52632/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 18:56:53 INFO stats.py:314 | Epoch[384] Step[41] GlobalStep[52649] Training Speed: 424.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:27:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:56:57 INFO loss_tracker.py:84 | Epoch[384/NA] Step[49] GlobalStep[52657/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:57:04 INFO stats.py:314 | Epoch[384] Step[66] GlobalStep[52674] Training Speed: 434.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:27:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:57:07 INFO loss_tracker.py:84 | Epoch[384/NA] Step[74] GlobalStep[52682/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:57:14 INFO stats.py:314 | Epoch[384] Step[91] GlobalStep[52699] Training Speed: 423.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:27:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:57:17 INFO loss_tracker.py:84 | Epoch[384/NA] Step[99] GlobalStep[52707/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 18:57:24 INFO stats.py:314 | Epoch[384] Step[116] GlobalStep[52724] Training Speed: 433.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:27:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:57:27 INFO loss_tracker.py:84 | Epoch[384/NA] Step[124] GlobalStep[52732/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 18:57:32 INFO stats.py:394 | Epoch[384] completed. Training Speed: 310.64 samples/sec across all devices. Epoch Time: 56.45 sec. Average Epoch Time: 56.45 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:26:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:57:35 INFO stats.py:314 | Epoch[385] Step[4] GlobalStep[52749] Training Speed: 431.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:26:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:57:43 INFO loss_tracker.py:84 | Epoch[385/NA] Step[24] GlobalStep[52769/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:57:45 INFO stats.py:314 | Epoch[385] Step[29] GlobalStep[52774] Training Speed: 420.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:26:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:57:54 INFO loss_tracker.py:84 | Epoch[385/NA] Step[49] GlobalStep[52794/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0127] total_loss[0.0163] Rank[0/16] 06/24/2025 18:57:56 INFO stats.py:314 | Epoch[385] Step[54] GlobalStep[52799] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:26:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:58:04 INFO loss_tracker.py:84 | Epoch[385/NA] Step[74] GlobalStep[52819/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 18:58:06 INFO stats.py:314 | Epoch[385] Step[79] GlobalStep[52824] Training Speed: 433.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:26:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:58:14 INFO loss_tracker.py:84 | Epoch[385/NA] Step[99] GlobalStep[52844/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:58:16 INFO stats.py:314 | Epoch[385] Step[104] GlobalStep[52849] Training Speed: 430.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:26:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:58:24 INFO loss_tracker.py:84 | Epoch[385/NA] Step[124] GlobalStep[52869/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 18:58:26 INFO stats.py:314 | Epoch[385] Step[129] GlobalStep[52874] Training Speed: 447.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:25:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:58:29 INFO stats.py:394 | Epoch[385] completed. Training Speed: 307.55 samples/sec across all devices. Epoch Time: 57.02 sec. Average Epoch Time: 57.02 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:25:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:58:37 INFO stats.py:314 | Epoch[386] Step[17] GlobalStep[52899] Training Speed: 428.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:25:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:58:40 INFO loss_tracker.py:84 | Epoch[386/NA] Step[24] GlobalStep[52906/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 18:58:48 INFO stats.py:314 | Epoch[386] Step[42] GlobalStep[52924] Training Speed: 434.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:25:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:58:51 INFO loss_tracker.py:84 | Epoch[386/NA] Step[49] GlobalStep[52931/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 18:58:58 INFO stats.py:314 | Epoch[386] Step[67] GlobalStep[52949] Training Speed: 428.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:25:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:59:01 INFO loss_tracker.py:84 | Epoch[386/NA] Step[74] GlobalStep[52956/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 18:59:08 INFO stats.py:314 | Epoch[386] Step[92] GlobalStep[52974] Training Speed: 433.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:25:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:59:11 INFO loss_tracker.py:84 | Epoch[386/NA] Step[99] GlobalStep[52981/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 18:59:18 INFO stats.py:314 | Epoch[386] Step[117] GlobalStep[52999] Training Speed: 433.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:25:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:59:21 INFO loss_tracker.py:84 | Epoch[386/NA] Step[124] GlobalStep[53006/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 18:59:26 INFO stats.py:394 | Epoch[386] completed. Training Speed: 309.38 samples/sec across all devices. Epoch Time: 56.68 sec. Average Epoch Time: 56.68 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:24:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:59:29 INFO stats.py:314 | Epoch[387] Step[5] GlobalStep[53024] Training Speed: 428.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:24:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 18:59:37 INFO loss_tracker.py:84 | Epoch[387/NA] Step[24] GlobalStep[53043/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 18:59:39 INFO stats.py:314 | Epoch[387] Step[30] GlobalStep[53049] Training Speed: 408.93 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:24:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:59:48 INFO loss_tracker.py:84 | Epoch[387/NA] Step[49] GlobalStep[53068/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 18:59:50 INFO stats.py:314 | Epoch[387] Step[55] GlobalStep[53074] Training Speed: 429.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:24:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 18:59:57 INFO loss_tracker.py:84 | Epoch[387/NA] Step[74] GlobalStep[53093/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 19:00:00 INFO stats.py:314 | Epoch[387] Step[80] GlobalStep[53099] Training Speed: 262.61 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 5:24:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:00:08 INFO loss_tracker.py:84 | Epoch[387/NA] Step[99] GlobalStep[53118/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:00:10 INFO stats.py:314 | Epoch[387] Step[105] GlobalStep[53124] Training Speed: 431.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:24:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:00:18 INFO loss_tracker.py:84 | Epoch[387/NA] Step[124] GlobalStep[53143/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 19:00:20 INFO stats.py:314 | Epoch[387] Step[130] GlobalStep[53149] Training Speed: 448.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:24:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:00:22 INFO stats.py:394 | Epoch[387] completed. Training Speed: 310.20 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:24:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:00:31 INFO stats.py:314 | Epoch[388] Step[18] GlobalStep[53174] Training Speed: 434.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:23:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:00:33 INFO loss_tracker.py:84 | Epoch[388/NA] Step[24] GlobalStep[53180/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 19:00:41 INFO stats.py:314 | Epoch[388] Step[43] GlobalStep[53199] Training Speed: 430.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:23:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:00:44 INFO loss_tracker.py:84 | Epoch[388/NA] Step[49] GlobalStep[53205/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 19:00:51 INFO stats.py:314 | Epoch[388] Step[68] GlobalStep[53224] Training Speed: 434.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:23:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:00:54 INFO loss_tracker.py:84 | Epoch[388/NA] Step[74] GlobalStep[53230/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:01:02 INFO stats.py:314 | Epoch[388] Step[93] GlobalStep[53249] Training Speed: 426.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:23:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:01:04 INFO loss_tracker.py:84 | Epoch[388/NA] Step[99] GlobalStep[53255/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0126] total_loss[0.0182] Rank[0/16] 06/24/2025 19:01:12 INFO stats.py:314 | Epoch[388] Step[118] GlobalStep[53274] Training Speed: 433.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:23:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:01:14 INFO loss_tracker.py:84 | Epoch[388/NA] Step[124] GlobalStep[53280/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 19:01:18 INFO stats.py:394 | Epoch[388] completed. Training Speed: 311.84 samples/sec across all devices. Epoch Time: 56.23 sec. Average Epoch Time: 56.23 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:23:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:01:22 INFO stats.py:314 | Epoch[389] Step[6] GlobalStep[53299] Training Speed: 426.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:23:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:01:30 INFO loss_tracker.py:84 | Epoch[389/NA] Step[24] GlobalStep[53317/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:01:32 INFO stats.py:314 | Epoch[389] Step[31] GlobalStep[53324] Training Speed: 424.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:22:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:01:40 INFO loss_tracker.py:84 | Epoch[389/NA] Step[49] GlobalStep[53342/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 19:01:43 INFO stats.py:314 | Epoch[389] Step[56] GlobalStep[53349] Training Speed: 424.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:22:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:01:50 INFO loss_tracker.py:84 | Epoch[389/NA] Step[74] GlobalStep[53367/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:01:53 INFO stats.py:314 | Epoch[389] Step[81] GlobalStep[53374] Training Speed: 430.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:22:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:01 INFO loss_tracker.py:84 | Epoch[389/NA] Step[99] GlobalStep[53392/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:02:03 INFO stats.py:314 | Epoch[389] Step[106] GlobalStep[53399] Training Speed: 430.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:22:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:10 INFO loss_tracker.py:84 | Epoch[389/NA] Step[124] GlobalStep[53417/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:02:13 INFO stats.py:314 | Epoch[389] Step[131] GlobalStep[53424] Training Speed: 449.88 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:22:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:15 INFO stats.py:394 | Epoch[389] completed. Training Speed: 311.10 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:22:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:02:24 INFO stats.py:314 | Epoch[390] Step[19] GlobalStep[53449] Training Speed: 433.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:22:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:27 INFO loss_tracker.py:84 | Epoch[390/NA] Step[24] GlobalStep[53454/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 19:02:35 INFO stats.py:314 | Epoch[390] Step[44] GlobalStep[53474] Training Speed: 433.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:21:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:37 INFO loss_tracker.py:84 | Epoch[390/NA] Step[49] GlobalStep[53479/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:02:45 INFO stats.py:314 | Epoch[390] Step[69] GlobalStep[53499] Training Speed: 426.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:21:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:47 INFO loss_tracker.py:84 | Epoch[390/NA] Step[74] GlobalStep[53504/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:02:55 INFO stats.py:314 | Epoch[390] Step[94] GlobalStep[53524] Training Speed: 435.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:21:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:02:57 INFO loss_tracker.py:84 | Epoch[390/NA] Step[99] GlobalStep[53529/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 19:03:06 INFO stats.py:314 | Epoch[390] Step[119] GlobalStep[53549] Training Speed: 435.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:21:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:03:07 INFO loss_tracker.py:84 | Epoch[390/NA] Step[124] GlobalStep[53554/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 19:03:12 INFO stats.py:394 | Epoch[390] completed. Training Speed: 307.59 samples/sec across all devices. Epoch Time: 57.01 sec. Average Epoch Time: 57.01 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:21:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:03:16 INFO stats.py:314 | Epoch[391] Step[7] GlobalStep[53574] Training Speed: 446.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:21:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:03:23 INFO loss_tracker.py:84 | Epoch[391/NA] Step[24] GlobalStep[53591/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 19:03:26 INFO stats.py:314 | Epoch[391] Step[32] GlobalStep[53599] Training Speed: 430.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:20:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:03:34 INFO loss_tracker.py:84 | Epoch[391/NA] Step[49] GlobalStep[53616/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 19:03:37 INFO stats.py:314 | Epoch[391] Step[57] GlobalStep[53624] Training Speed: 431.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:20:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:03:44 INFO loss_tracker.py:84 | Epoch[391/NA] Step[74] GlobalStep[53641/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 19:03:47 INFO stats.py:314 | Epoch[391] Step[82] GlobalStep[53649] Training Speed: 422.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:20:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:03:54 INFO loss_tracker.py:84 | Epoch[391/NA] Step[99] GlobalStep[53666/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 19:03:57 INFO stats.py:314 | Epoch[391] Step[107] GlobalStep[53674] Training Speed: 434.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:20:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:04:04 INFO loss_tracker.py:84 | Epoch[391/NA] Step[124] GlobalStep[53691/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 19:04:07 INFO stats.py:314 | Epoch[391] Step[132] GlobalStep[53699] Training Speed: 448.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:20:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:04:08 INFO stats.py:394 | Epoch[391] completed. Training Speed: 310.89 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:20:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:04:18 INFO stats.py:314 | Epoch[392] Step[20] GlobalStep[53724] Training Speed: 430.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:20:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:04:20 INFO loss_tracker.py:84 | Epoch[392/NA] Step[24] GlobalStep[53728/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:04:28 INFO stats.py:314 | Epoch[392] Step[45] GlobalStep[53749] Training Speed: 432.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:19:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:04:30 INFO loss_tracker.py:84 | Epoch[392/NA] Step[49] GlobalStep[53753/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:04:39 INFO stats.py:314 | Epoch[392] Step[70] GlobalStep[53774] Training Speed: 434.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:19:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:04:41 INFO loss_tracker.py:84 | Epoch[392/NA] Step[74] GlobalStep[53778/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:04:49 INFO stats.py:314 | Epoch[392] Step[95] GlobalStep[53799] Training Speed: 436.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:19:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:04:51 INFO loss_tracker.py:84 | Epoch[392/NA] Step[99] GlobalStep[53803/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:04:59 INFO stats.py:314 | Epoch[392] Step[120] GlobalStep[53824] Training Speed: 453.25 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:19:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:05:01 INFO loss_tracker.py:84 | Epoch[392/NA] Step[124] GlobalStep[53828/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 19:05:05 INFO stats.py:394 | Epoch[392] completed. Training Speed: 306.55 samples/sec across all devices. Epoch Time: 57.20 sec. Average Epoch Time: 57.20 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:19:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:05:10 INFO stats.py:314 | Epoch[393] Step[8] GlobalStep[53849] Training Speed: 401.38 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:19:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:05:17 INFO loss_tracker.py:84 | Epoch[393/NA] Step[24] GlobalStep[53865/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 19:05:20 INFO stats.py:314 | Epoch[393] Step[33] GlobalStep[53874] Training Speed: 429.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:19:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:05:27 INFO loss_tracker.py:84 | Epoch[393/NA] Step[49] GlobalStep[53890/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:05:31 INFO stats.py:314 | Epoch[393] Step[58] GlobalStep[53899] Training Speed: 424.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:18:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:05:37 INFO loss_tracker.py:84 | Epoch[393/NA] Step[74] GlobalStep[53915/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:05:41 INFO stats.py:314 | Epoch[393] Step[83] GlobalStep[53924] Training Speed: 425.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:18:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:05:48 INFO loss_tracker.py:84 | Epoch[393/NA] Step[99] GlobalStep[53940/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 19:05:51 INFO stats.py:314 | Epoch[393] Step[108] GlobalStep[53949] Training Speed: 433.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:18:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:05:57 INFO loss_tracker.py:84 | Epoch[393/NA] Step[124] GlobalStep[53965/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:06:00 INFO stats.py:314 | Epoch[393] Step[133] GlobalStep[53974] Training Speed: 451.26 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:18:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:06:01 INFO stats.py:394 | Epoch[393] completed. Training Speed: 312.74 samples/sec across all devices. Epoch Time: 56.07 sec. Average Epoch Time: 56.07 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:18:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:06:11 INFO stats.py:314 | Epoch[394] Step[21] GlobalStep[53999] Training Speed: 430.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:18:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:06:12 INFO loss_tracker.py:84 | Epoch[394/NA] Step[24] GlobalStep[54002/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:06:21 INFO stats.py:314 | Epoch[394] Step[46] GlobalStep[54024] Training Speed: 432.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:18:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:06:23 INFO loss_tracker.py:84 | Epoch[394/NA] Step[49] GlobalStep[54027/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:06:31 INFO stats.py:314 | Epoch[394] Step[71] GlobalStep[54049] Training Speed: 431.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:17:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:06:33 INFO loss_tracker.py:84 | Epoch[394/NA] Step[74] GlobalStep[54052/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:06:42 INFO stats.py:314 | Epoch[394] Step[96] GlobalStep[54074] Training Speed: 433.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:17:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:06:44 INFO loss_tracker.py:84 | Epoch[394/NA] Step[99] GlobalStep[54077/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 19:06:52 INFO stats.py:314 | Epoch[394] Step[121] GlobalStep[54099] Training Speed: 449.30 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:17:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:06:54 INFO loss_tracker.py:84 | Epoch[394/NA] Step[124] GlobalStep[54102/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:06:58 INFO stats.py:394 | Epoch[394] completed. Training Speed: 310.19 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:17:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:07:03 INFO stats.py:314 | Epoch[395] Step[9] GlobalStep[54124] Training Speed: 435.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:17:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:07:09 INFO loss_tracker.py:84 | Epoch[395/NA] Step[24] GlobalStep[54139/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:07:13 INFO stats.py:314 | Epoch[395] Step[34] GlobalStep[54149] Training Speed: 433.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:17:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:07:19 INFO loss_tracker.py:84 | Epoch[395/NA] Step[49] GlobalStep[54164/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:07:24 INFO stats.py:314 | Epoch[395] Step[59] GlobalStep[54174] Training Speed: 430.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:16:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:07:30 INFO loss_tracker.py:84 | Epoch[395/NA] Step[74] GlobalStep[54189/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 19:07:34 INFO stats.py:314 | Epoch[395] Step[84] GlobalStep[54199] Training Speed: 434.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:16:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:07:40 INFO loss_tracker.py:84 | Epoch[395/NA] Step[99] GlobalStep[54214/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0127] total_loss[0.0164] Rank[0/16] 06/24/2025 19:07:44 INFO stats.py:314 | Epoch[395] Step[109] GlobalStep[54224] Training Speed: 428.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:16:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:07:50 INFO loss_tracker.py:84 | Epoch[395/NA] Step[124] GlobalStep[54239/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:07:54 INFO stats.py:314 | Epoch[395] Step[134] GlobalStep[54249] Training Speed: 448.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:16:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:07:54 INFO stats.py:394 | Epoch[395] completed. Training Speed: 310.20 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:16:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:08:05 INFO stats.py:314 | Epoch[396] Step[22] GlobalStep[54274] Training Speed: 436.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:16:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:08:06 INFO loss_tracker.py:84 | Epoch[396/NA] Step[24] GlobalStep[54276/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:08:15 INFO stats.py:314 | Epoch[396] Step[47] GlobalStep[54299] Training Speed: 434.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:16:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:08:16 INFO loss_tracker.py:84 | Epoch[396/NA] Step[49] GlobalStep[54301/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:08:25 INFO stats.py:314 | Epoch[396] Step[72] GlobalStep[54324] Training Speed: 429.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:15:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:08:26 INFO loss_tracker.py:84 | Epoch[396/NA] Step[74] GlobalStep[54326/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:08:35 INFO stats.py:314 | Epoch[396] Step[97] GlobalStep[54349] Training Speed: 427.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:15:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:08:36 INFO loss_tracker.py:84 | Epoch[396/NA] Step[99] GlobalStep[54351/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:08:46 INFO stats.py:314 | Epoch[396] Step[122] GlobalStep[54374] Training Speed: 451.74 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:15:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:08:46 INFO loss_tracker.py:84 | Epoch[396/NA] Step[124] GlobalStep[54376/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:08:51 INFO stats.py:394 | Epoch[396] completed. Training Speed: 310.60 samples/sec across all devices. Epoch Time: 56.46 sec. Average Epoch Time: 56.46 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:15:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:08:56 INFO stats.py:314 | Epoch[397] Step[10] GlobalStep[54399] Training Speed: 432.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:15:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:02 INFO loss_tracker.py:84 | Epoch[397/NA] Step[24] GlobalStep[54413/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:09:07 INFO stats.py:314 | Epoch[397] Step[35] GlobalStep[54424] Training Speed: 429.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:15:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:12 INFO loss_tracker.py:84 | Epoch[397/NA] Step[49] GlobalStep[54438/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:09:17 INFO stats.py:314 | Epoch[397] Step[60] GlobalStep[54449] Training Speed: 434.69 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:15:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:23 INFO loss_tracker.py:84 | Epoch[397/NA] Step[74] GlobalStep[54463/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 19:09:27 INFO stats.py:314 | Epoch[397] Step[85] GlobalStep[54474] Training Speed: 427.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:14:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:09:33 INFO loss_tracker.py:84 | Epoch[397/NA] Step[99] GlobalStep[54488/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:09:37 INFO stats.py:314 | Epoch[397] Step[110] GlobalStep[54499] Training Speed: 430.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:14:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:43 INFO loss_tracker.py:84 | Epoch[397/NA] Step[124] GlobalStep[54513/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0127] total_loss[0.0175] Rank[0/16] 06/24/2025 19:09:47 INFO stats.py:314 | Epoch[397] Step[135] GlobalStep[54524] Training Speed: 449.44 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:14:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:47 INFO stats.py:394 | Epoch[397] completed. Training Speed: 311.66 samples/sec across all devices. Epoch Time: 56.27 sec. Average Epoch Time: 56.27 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:14:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:58 INFO stats.py:314 | Epoch[398] Step[23] GlobalStep[54549] Training Speed: 426.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:14:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:09:59 INFO loss_tracker.py:84 | Epoch[398/NA] Step[24] GlobalStep[54550/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:10:09 INFO stats.py:314 | Epoch[398] Step[48] GlobalStep[54574] Training Speed: 401.76 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:14:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:10:09 INFO loss_tracker.py:84 | Epoch[398/NA] Step[49] GlobalStep[54575/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:10:19 INFO stats.py:314 | Epoch[398] Step[73] GlobalStep[54599] Training Speed: 432.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:14:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:10:20 INFO loss_tracker.py:84 | Epoch[398/NA] Step[74] GlobalStep[54600/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:10:30 INFO stats.py:314 | Epoch[398] Step[98] GlobalStep[54624] Training Speed: 433.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:13:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:10:30 INFO loss_tracker.py:84 | Epoch[398/NA] Step[99] GlobalStep[54625/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:10:40 INFO stats.py:314 | Epoch[398] Step[123] GlobalStep[54649] Training Speed: 455.90 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:13:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:10:41 INFO loss_tracker.py:84 | Epoch[398/NA] Step[124] GlobalStep[54650/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 19:10:45 INFO stats.py:394 | Epoch[398] completed. Training Speed: 302.55 samples/sec across all devices. Epoch Time: 57.96 sec. Average Epoch Time: 57.96 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:13:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:10:50 INFO stats.py:314 | Epoch[399] Step[11] GlobalStep[54674] Training Speed: 430.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:13:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:10:56 INFO loss_tracker.py:84 | Epoch[399/NA] Step[24] GlobalStep[54687/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:11:01 INFO stats.py:314 | Epoch[399] Step[36] GlobalStep[54699] Training Speed: 429.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:13:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:11:06 INFO loss_tracker.py:84 | Epoch[399/NA] Step[49] GlobalStep[54712/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:11:11 INFO stats.py:314 | Epoch[399] Step[61] GlobalStep[54724] Training Speed: 434.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:13:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:11:16 INFO loss_tracker.py:84 | Epoch[399/NA] Step[74] GlobalStep[54737/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:11:21 INFO stats.py:314 | Epoch[399] Step[86] GlobalStep[54749] Training Speed: 432.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:12:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:11:27 INFO loss_tracker.py:84 | Epoch[399/NA] Step[99] GlobalStep[54762/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 19:11:31 INFO stats.py:314 | Epoch[399] Step[111] GlobalStep[54774] Training Speed: 420.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:12:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:11:36 INFO loss_tracker.py:84 | Epoch[399/NA] Step[124] GlobalStep[54787/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:11:41 INFO stats.py:314 | Epoch[399] Step[136] GlobalStep[54799] Training Speed: 449.58 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:12:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:11:41 INFO stats.py:394 | Epoch[399] completed. Training Speed: 313.38 samples/sec across all devices. Epoch Time: 55.96 sec. Average Epoch Time: 55.96 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:12:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:11:52 INFO stats.py:314 | Epoch[400] Step[24] GlobalStep[54824] Training Speed: 403.54 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:12:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:11:52 INFO loss_tracker.py:84 | Epoch[400/NA] Step[24] GlobalStep[54824/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:12:02 INFO stats.py:314 | Epoch[400] Step[49] GlobalStep[54849] Training Speed: 427.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:12:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:12:02 INFO loss_tracker.py:84 | Epoch[400/NA] Step[49] GlobalStep[54849/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 19:12:12 INFO stats.py:314 | Epoch[400] Step[74] GlobalStep[54874] Training Speed: 429.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:12:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:12:12 INFO loss_tracker.py:84 | Epoch[400/NA] Step[74] GlobalStep[54874/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:12:23 INFO stats.py:314 | Epoch[400] Step[99] GlobalStep[54899] Training Speed: 429.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:11:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:12:23 INFO loss_tracker.py:84 | Epoch[400/NA] Step[99] GlobalStep[54899/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:12:32 INFO stats.py:314 | Epoch[400] Step[124] GlobalStep[54924] Training Speed: 450.74 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 5:11:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:12:33 INFO loss_tracker.py:84 | Epoch[400/NA] Step[124] GlobalStep[54924/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 19:12:37 INFO stats.py:394 | Epoch[400] completed. Training Speed: 315.67 samples/sec across all devices. Epoch Time: 55.55 sec. Average Epoch Time: 55.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:11:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:12:43 INFO stats.py:314 | Epoch[401] Step[12] GlobalStep[54949] Training Speed: 433.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:11:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:12:47 INFO loss_tracker.py:84 | Epoch[401/NA] Step[24] GlobalStep[54961/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:12:53 INFO stats.py:314 | Epoch[401] Step[37] GlobalStep[54974] Training Speed: 435.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:11:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:12:58 INFO loss_tracker.py:84 | Epoch[401/NA] Step[49] GlobalStep[54986/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:13:03 INFO stats.py:314 | Epoch[401] Step[62] GlobalStep[54999] Training Speed: 432.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:11:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:13:08 INFO loss_tracker.py:84 | Epoch[401/NA] Step[74] GlobalStep[55011/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:13:13 INFO stats.py:314 | Epoch[401] Step[87] GlobalStep[55024] Training Speed: 429.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:11:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:13:18 INFO loss_tracker.py:84 | Epoch[401/NA] Step[99] GlobalStep[55036/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 19:13:23 INFO stats.py:314 | Epoch[401] Step[112] GlobalStep[55049] Training Speed: 435.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:10:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:13:28 INFO loss_tracker.py:84 | Epoch[401/NA] Step[124] GlobalStep[55061/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 19:13:32 INFO stats.py:394 | Epoch[401] completed. Training Speed: 316.72 samples/sec across all devices. Epoch Time: 55.37 sec. Average Epoch Time: 55.37 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 5:10:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:13:33 INFO stats.py:314 | Epoch[402] Step[0] GlobalStep[55074] Training Speed: 357.13 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 5:10:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:13:43 INFO loss_tracker.py:84 | Epoch[402/NA] Step[24] GlobalStep[55098/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:13:43 INFO stats.py:314 | Epoch[402] Step[25] GlobalStep[55099] Training Speed: 428.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:10:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:13:53 INFO loss_tracker.py:84 | Epoch[402/NA] Step[49] GlobalStep[55123/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:13:54 INFO stats.py:314 | Epoch[402] Step[50] GlobalStep[55124] Training Speed: 424.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:10:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:14:03 INFO loss_tracker.py:84 | Epoch[402/NA] Step[74] GlobalStep[55148/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:14:03 INFO stats.py:314 | Epoch[402] Step[75] GlobalStep[55149] Training Speed: 415.48 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:10:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:14:14 INFO loss_tracker.py:84 | Epoch[402/NA] Step[99] GlobalStep[55173/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:14:14 INFO stats.py:314 | Epoch[402] Step[100] GlobalStep[55174] Training Speed: 431.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:09:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:14:23 INFO loss_tracker.py:84 | Epoch[402/NA] Step[124] GlobalStep[55198/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:14:24 INFO stats.py:314 | Epoch[402] Step[125] GlobalStep[55199] Training Speed: 435.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:09:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:14:27 INFO stats.py:394 | Epoch[402] completed. Training Speed: 315.83 samples/sec across all devices. Epoch Time: 55.52 sec. Average Epoch Time: 55.52 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:09:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:14:35 INFO stats.py:314 | Epoch[403] Step[13] GlobalStep[55224] Training Speed: 437.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:09:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:14:39 INFO loss_tracker.py:84 | Epoch[403/NA] Step[24] GlobalStep[55235/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:14:45 INFO stats.py:314 | Epoch[403] Step[38] GlobalStep[55249] Training Speed: 421.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:09:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:14:49 INFO loss_tracker.py:84 | Epoch[403/NA] Step[49] GlobalStep[55260/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:14:55 INFO stats.py:314 | Epoch[403] Step[63] GlobalStep[55274] Training Speed: 432.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:09:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:15:00 INFO loss_tracker.py:84 | Epoch[403/NA] Step[74] GlobalStep[55285/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:15:05 INFO stats.py:314 | Epoch[403] Step[88] GlobalStep[55299] Training Speed: 421.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:09:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:15:10 INFO loss_tracker.py:84 | Epoch[403/NA] Step[99] GlobalStep[55310/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:15:16 INFO stats.py:314 | Epoch[403] Step[113] GlobalStep[55324] Training Speed: 428.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:08:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:15:20 INFO loss_tracker.py:84 | Epoch[403/NA] Step[124] GlobalStep[55335/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:15:25 INFO stats.py:394 | Epoch[403] completed. Training Speed: 307.01 samples/sec across all devices. Epoch Time: 57.12 sec. Average Epoch Time: 57.12 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:08:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:15:27 INFO stats.py:314 | Epoch[404] Step[1] GlobalStep[55349] Training Speed: 400.72 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:08:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:15:36 INFO loss_tracker.py:84 | Epoch[404/NA] Step[24] GlobalStep[55372/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:15:37 INFO stats.py:314 | Epoch[404] Step[26] GlobalStep[55374] Training Speed: 421.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:08:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:15:47 INFO loss_tracker.py:84 | Epoch[404/NA] Step[49] GlobalStep[55397/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:15:47 INFO stats.py:314 | Epoch[404] Step[51] GlobalStep[55399] Training Speed: 412.88 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:08:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:15:57 INFO loss_tracker.py:84 | Epoch[404/NA] Step[74] GlobalStep[55422/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 19:15:58 INFO stats.py:314 | Epoch[404] Step[76] GlobalStep[55424] Training Speed: 429.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:08:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:16:07 INFO loss_tracker.py:84 | Epoch[404/NA] Step[99] GlobalStep[55447/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:16:08 INFO stats.py:314 | Epoch[404] Step[101] GlobalStep[55449] Training Speed: 431.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:08:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:16:17 INFO loss_tracker.py:84 | Epoch[404/NA] Step[124] GlobalStep[55472/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:16:18 INFO stats.py:314 | Epoch[404] Step[126] GlobalStep[55474] Training Speed: 448.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:07:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:16:22 INFO stats.py:394 | Epoch[404] completed. Training Speed: 308.27 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 5:07:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:16:28 INFO stats.py:314 | Epoch[405] Step[14] GlobalStep[55499] Training Speed: 422.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:07:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:16:33 INFO loss_tracker.py:84 | Epoch[405/NA] Step[24] GlobalStep[55509/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:16:39 INFO stats.py:314 | Epoch[405] Step[39] GlobalStep[55524] Training Speed: 429.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:07:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:16:43 INFO loss_tracker.py:84 | Epoch[405/NA] Step[49] GlobalStep[55534/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:16:49 INFO stats.py:314 | Epoch[405] Step[64] GlobalStep[55549] Training Speed: 426.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:07:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:16:53 INFO loss_tracker.py:84 | Epoch[405/NA] Step[74] GlobalStep[55559/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:17:00 INFO stats.py:314 | Epoch[405] Step[89] GlobalStep[55574] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:07:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:17:04 INFO loss_tracker.py:84 | Epoch[405/NA] Step[99] GlobalStep[55584/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:17:10 INFO stats.py:314 | Epoch[405] Step[114] GlobalStep[55599] Training Speed: 417.20 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:07:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:17:14 INFO loss_tracker.py:84 | Epoch[405/NA] Step[124] GlobalStep[55609/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:17:18 INFO stats.py:394 | Epoch[405] completed. Training Speed: 309.23 samples/sec across all devices. Epoch Time: 56.71 sec. Average Epoch Time: 56.71 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:06:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:17:20 INFO stats.py:314 | Epoch[406] Step[2] GlobalStep[55624] Training Speed: 429.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:06:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:17:30 INFO loss_tracker.py:84 | Epoch[406/NA] Step[24] GlobalStep[55646/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:17:31 INFO stats.py:314 | Epoch[406] Step[27] GlobalStep[55649] Training Speed: 430.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:06:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:17:40 INFO loss_tracker.py:84 | Epoch[406/NA] Step[49] GlobalStep[55671/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0127] total_loss[0.0174] Rank[0/16] 06/24/2025 19:17:41 INFO stats.py:314 | Epoch[406] Step[52] GlobalStep[55674] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:06:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:17:50 INFO loss_tracker.py:84 | Epoch[406/NA] Step[74] GlobalStep[55696/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 19:17:51 INFO stats.py:314 | Epoch[406] Step[77] GlobalStep[55699] Training Speed: 422.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:06:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:18:00 INFO loss_tracker.py:84 | Epoch[406/NA] Step[99] GlobalStep[55721/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:18:01 INFO stats.py:314 | Epoch[406] Step[102] GlobalStep[55724] Training Speed: 426.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:06:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:18:11 INFO loss_tracker.py:84 | Epoch[406/NA] Step[124] GlobalStep[55746/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 19:18:12 INFO stats.py:314 | Epoch[406] Step[127] GlobalStep[55749] Training Speed: 426.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:05:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:18:15 INFO stats.py:394 | Epoch[406] completed. Training Speed: 309.62 samples/sec across all devices. Epoch Time: 56.64 sec. Average Epoch Time: 56.64 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:05:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:18:22 INFO stats.py:314 | Epoch[407] Step[15] GlobalStep[55774] Training Speed: 421.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:05:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:18:26 INFO loss_tracker.py:84 | Epoch[407/NA] Step[24] GlobalStep[55783/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:18:32 INFO stats.py:314 | Epoch[407] Step[40] GlobalStep[55799] Training Speed: 422.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:05:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:18:36 INFO loss_tracker.py:84 | Epoch[407/NA] Step[49] GlobalStep[55808/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0173] Rank[0/16] 06/24/2025 19:18:43 INFO stats.py:314 | Epoch[407] Step[65] GlobalStep[55824] Training Speed: 429.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:05:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:18:46 INFO loss_tracker.py:84 | Epoch[407/NA] Step[74] GlobalStep[55833/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:18:53 INFO stats.py:314 | Epoch[407] Step[90] GlobalStep[55849] Training Speed: 420.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:05:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:18:57 INFO loss_tracker.py:84 | Epoch[407/NA] Step[99] GlobalStep[55858/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:19:03 INFO stats.py:314 | Epoch[407] Step[115] GlobalStep[55874] Training Speed: 409.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:05:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:19:06 INFO loss_tracker.py:84 | Epoch[407/NA] Step[124] GlobalStep[55883/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:19:11 INFO stats.py:394 | Epoch[407] completed. Training Speed: 313.85 samples/sec across all devices. Epoch Time: 55.87 sec. Average Epoch Time: 55.87 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:04:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:19:13 INFO stats.py:314 | Epoch[408] Step[3] GlobalStep[55899] Training Speed: 428.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:04:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:19:22 INFO loss_tracker.py:84 | Epoch[408/NA] Step[24] GlobalStep[55920/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:19:24 INFO stats.py:314 | Epoch[408] Step[28] GlobalStep[55924] Training Speed: 429.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:04:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:19:33 INFO loss_tracker.py:84 | Epoch[408/NA] Step[49] GlobalStep[55945/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 19:19:34 INFO stats.py:314 | Epoch[408] Step[53] GlobalStep[55949] Training Speed: 428.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:04:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:19:42 INFO loss_tracker.py:84 | Epoch[408/NA] Step[74] GlobalStep[55970/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:19:44 INFO stats.py:314 | Epoch[408] Step[78] GlobalStep[55974] Training Speed: 426.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:04:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:19:53 INFO loss_tracker.py:84 | Epoch[408/NA] Step[99] GlobalStep[55995/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:19:54 INFO stats.py:314 | Epoch[408] Step[103] GlobalStep[55999] Training Speed: 403.62 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 5:04:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:19:54 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 19:19:56 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_13 Rank[3/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[11/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[10/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[5/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[15/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[7/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[1/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[8/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[13/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[4/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[6/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[14/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[2/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[12/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13
Rank[9/16] 06/24/2025 19:19:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[0/16] 06/24/2025 19:19:57 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_13/model.safetensors Rank[0/16] 06/24/2025 19:19:58 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_13/optimizer.bin Rank[0/16] 06/24/2025 19:19:58 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_13/scheduler.bin Rank[0/16] 06/24/2025 19:19:58 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_13/sampler.bin Rank[0/16] 06/24/2025 19:19:58 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_13/random_states_0.pkl Rank[0/16] 06/24/2025 19:19:58 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_13/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 19:19:58 INFO checkpoint.py:110 | Save checkpoint at the end of step 55999 to /job_data/checkpoints/checkpoint_13 Rank[0/16] 06/24/2025 19:20:06 INFO loss_tracker.py:84 | Epoch[408/NA] Step[124] GlobalStep[56020/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 19:20:07 INFO stats.py:314 | Epoch[408] Step[128] GlobalStep[56024] Training Speed: 435.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:04:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:20:10 INFO stats.py:394 | Epoch[408] completed. Training Speed: 294.18 samples/sec across all devices. Epoch Time: 59.61 sec. Average Epoch Time: 59.61 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 5:04:03. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:20:18 INFO stats.py:314 | Epoch[409] Step[16] GlobalStep[56049] Training Speed: 428.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:03:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:20:22 INFO loss_tracker.py:84 | Epoch[409/NA] Step[24] GlobalStep[56057/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:20:29 INFO stats.py:314 | Epoch[409] Step[41] GlobalStep[56074] Training Speed: 429.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:03:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:20:32 INFO loss_tracker.py:84 | Epoch[409/NA] Step[49] GlobalStep[56082/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:20:39 INFO stats.py:314 | Epoch[409] Step[66] GlobalStep[56099] Training Speed: 432.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:03:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:20:42 INFO loss_tracker.py:84 | Epoch[409/NA] Step[74] GlobalStep[56107/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:20:49 INFO stats.py:314 | Epoch[409] Step[91] GlobalStep[56124] Training Speed: 430.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:03:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:20:52 INFO loss_tracker.py:84 | Epoch[409/NA] Step[99] GlobalStep[56132/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 19:20:59 INFO stats.py:314 | Epoch[409] Step[116] GlobalStep[56149] Training Speed: 423.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:03:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:21:02 INFO loss_tracker.py:84 | Epoch[409/NA] Step[124] GlobalStep[56157/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:21:06 INFO stats.py:394 | Epoch[409] completed. Training Speed: 314.02 samples/sec across all devices. Epoch Time: 55.84 sec. Average Epoch Time: 55.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:03:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:21:10 INFO stats.py:314 | Epoch[410] Step[4] GlobalStep[56174] Training Speed: 426.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:03:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:21:17 INFO loss_tracker.py:84 | Epoch[410/NA] Step[24] GlobalStep[56194/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 19:21:19 INFO stats.py:314 | Epoch[410] Step[29] GlobalStep[56199] Training Speed: 419.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:02:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:21:27 INFO loss_tracker.py:84 | Epoch[410/NA] Step[49] GlobalStep[56219/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:21:30 INFO stats.py:314 | Epoch[410] Step[54] GlobalStep[56224] Training Speed: 431.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:02:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:21:38 INFO loss_tracker.py:84 | Epoch[410/NA] Step[74] GlobalStep[56244/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:21:40 INFO stats.py:314 | Epoch[410] Step[79] GlobalStep[56249] Training Speed: 430.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:02:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:21:48 INFO loss_tracker.py:84 | Epoch[410/NA] Step[99] GlobalStep[56269/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 19:21:50 INFO stats.py:314 | Epoch[410] Step[104] GlobalStep[56274] Training Speed: 427.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:02:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:21:58 INFO loss_tracker.py:84 | Epoch[410/NA] Step[124] GlobalStep[56294/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0127] total_loss[0.0182] Rank[0/16] 06/24/2025 19:22:00 INFO stats.py:314 | Epoch[410] Step[129] GlobalStep[56299] Training Speed: 447.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:02:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:22:02 INFO stats.py:394 | Epoch[410] completed. Training Speed: 312.56 samples/sec across all devices. Epoch Time: 56.10 sec. Average Epoch Time: 56.10 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:02:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:22:11 INFO stats.py:314 | Epoch[411] Step[17] GlobalStep[56324] Training Speed: 411.49 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:02:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:22:14 INFO loss_tracker.py:84 | Epoch[411/NA] Step[24] GlobalStep[56331/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:22:21 INFO stats.py:314 | Epoch[411] Step[42] GlobalStep[56349] Training Speed: 427.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:01:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:22:24 INFO loss_tracker.py:84 | Epoch[411/NA] Step[49] GlobalStep[56356/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:22:32 INFO stats.py:314 | Epoch[411] Step[67] GlobalStep[56374] Training Speed: 426.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:01:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:22:34 INFO loss_tracker.py:84 | Epoch[411/NA] Step[74] GlobalStep[56381/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:22:42 INFO stats.py:314 | Epoch[411] Step[92] GlobalStep[56399] Training Speed: 418.68 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 5:01:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:22:45 INFO loss_tracker.py:84 | Epoch[411/NA] Step[99] GlobalStep[56406/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:22:52 INFO stats.py:314 | Epoch[411] Step[117] GlobalStep[56424] Training Speed: 427.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:01:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:22:55 INFO loss_tracker.py:84 | Epoch[411/NA] Step[124] GlobalStep[56431/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:22:59 INFO stats.py:394 | Epoch[411] completed. Training Speed: 308.94 samples/sec across all devices. Epoch Time: 56.76 sec. Average Epoch Time: 56.76 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:01:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:23:02 INFO stats.py:314 | Epoch[412] Step[5] GlobalStep[56449] Training Speed: 427.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:01:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:23:10 INFO loss_tracker.py:84 | Epoch[412/NA] Step[24] GlobalStep[56468/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:23:12 INFO stats.py:314 | Epoch[412] Step[30] GlobalStep[56474] Training Speed: 432.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:00:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:23:21 INFO loss_tracker.py:84 | Epoch[412/NA] Step[49] GlobalStep[56493/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:23:23 INFO stats.py:314 | Epoch[412] Step[55] GlobalStep[56499] Training Speed: 423.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:00:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:23:31 INFO loss_tracker.py:84 | Epoch[412/NA] Step[74] GlobalStep[56518/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:23:33 INFO stats.py:314 | Epoch[412] Step[80] GlobalStep[56524] Training Speed: 431.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:00:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:23:41 INFO loss_tracker.py:84 | Epoch[412/NA] Step[99] GlobalStep[56543/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:23:44 INFO stats.py:314 | Epoch[412] Step[105] GlobalStep[56549] Training Speed: 429.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 5:00:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:23:51 INFO loss_tracker.py:84 | Epoch[412/NA] Step[124] GlobalStep[56568/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:23:53 INFO stats.py:314 | Epoch[412] Step[130] GlobalStep[56574] Training Speed: 448.74 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:00:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:23:55 INFO stats.py:394 | Epoch[412] completed. Training Speed: 311.42 samples/sec across all devices. Epoch Time: 56.31 sec. Average Epoch Time: 56.31 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 5:00:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:24:04 INFO stats.py:314 | Epoch[413] Step[18] GlobalStep[56599] Training Speed: 436.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 5:00:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:24:07 INFO loss_tracker.py:84 | Epoch[413/NA] Step[24] GlobalStep[56605/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0127] total_loss[0.0171] Rank[0/16] 06/24/2025 19:24:15 INFO stats.py:314 | Epoch[413] Step[43] GlobalStep[56624] Training Speed: 429.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:59:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:24:18 INFO loss_tracker.py:84 | Epoch[413/NA] Step[49] GlobalStep[56630/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 19:24:25 INFO stats.py:314 | Epoch[413] Step[68] GlobalStep[56649] Training Speed: 429.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:59:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:24:28 INFO loss_tracker.py:84 | Epoch[413/NA] Step[74] GlobalStep[56655/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:24:36 INFO stats.py:314 | Epoch[413] Step[93] GlobalStep[56674] Training Speed: 435.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:59:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:24:38 INFO loss_tracker.py:84 | Epoch[413/NA] Step[99] GlobalStep[56680/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:24:46 INFO stats.py:314 | Epoch[413] Step[118] GlobalStep[56699] Training Speed: 434.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:59:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:24:48 INFO loss_tracker.py:84 | Epoch[413/NA] Step[124] GlobalStep[56705/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:24:53 INFO stats.py:394 | Epoch[413] completed. Training Speed: 305.03 samples/sec across all devices. Epoch Time: 57.49 sec. Average Epoch Time: 57.49 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:59:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:24:57 INFO stats.py:314 | Epoch[414] Step[6] GlobalStep[56724] Training Speed: 433.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:59:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:25:04 INFO loss_tracker.py:84 | Epoch[414/NA] Step[24] GlobalStep[56742/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:25:07 INFO stats.py:314 | Epoch[414] Step[31] GlobalStep[56749] Training Speed: 420.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:59:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:25:14 INFO loss_tracker.py:84 | Epoch[414/NA] Step[49] GlobalStep[56767/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:25:17 INFO stats.py:314 | Epoch[414] Step[56] GlobalStep[56774] Training Speed: 429.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:58:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:25:25 INFO loss_tracker.py:84 | Epoch[414/NA] Step[74] GlobalStep[56792/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:25:27 INFO stats.py:314 | Epoch[414] Step[81] GlobalStep[56799] Training Speed: 445.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:58:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:25:35 INFO loss_tracker.py:84 | Epoch[414/NA] Step[99] GlobalStep[56817/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:25:38 INFO stats.py:314 | Epoch[414] Step[106] GlobalStep[56824] Training Speed: 434.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:58:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:25:45 INFO loss_tracker.py:84 | Epoch[414/NA] Step[124] GlobalStep[56842/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0127] total_loss[0.0172] Rank[0/16] 06/24/2025 19:25:47 INFO stats.py:314 | Epoch[414] Step[131] GlobalStep[56849] Training Speed: 451.01 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:58:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:25:49 INFO stats.py:394 | Epoch[414] completed. Training Speed: 311.94 samples/sec across all devices. Epoch Time: 56.22 sec. Average Epoch Time: 56.22 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:58:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:25:58 INFO stats.py:314 | Epoch[415] Step[19] GlobalStep[56874] Training Speed: 436.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:58:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:26:01 INFO loss_tracker.py:84 | Epoch[415/NA] Step[24] GlobalStep[56879/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:26:09 INFO stats.py:314 | Epoch[415] Step[44] GlobalStep[56899] Training Speed: 433.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:58:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:26:11 INFO loss_tracker.py:84 | Epoch[415/NA] Step[49] GlobalStep[56904/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:26:19 INFO stats.py:314 | Epoch[415] Step[69] GlobalStep[56924] Training Speed: 433.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:57:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:26:21 INFO loss_tracker.py:84 | Epoch[415/NA] Step[74] GlobalStep[56929/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 19:26:29 INFO stats.py:314 | Epoch[415] Step[94] GlobalStep[56949] Training Speed: 430.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:57:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:26:32 INFO loss_tracker.py:84 | Epoch[415/NA] Step[99] GlobalStep[56954/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:26:40 INFO stats.py:314 | Epoch[415] Step[119] GlobalStep[56974] Training Speed: 427.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:57:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:26:42 INFO loss_tracker.py:84 | Epoch[415/NA] Step[124] GlobalStep[56979/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:26:46 INFO stats.py:394 | Epoch[415] completed. Training Speed: 307.89 samples/sec across all devices. Epoch Time: 56.96 sec. Average Epoch Time: 56.96 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:57:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:26:50 INFO stats.py:314 | Epoch[416] Step[7] GlobalStep[56999] Training Speed: 436.44 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:57:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:26:57 INFO loss_tracker.py:84 | Epoch[416/NA] Step[24] GlobalStep[57016/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 19:27:00 INFO stats.py:314 | Epoch[416] Step[32] GlobalStep[57024] Training Speed: 432.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:57:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:27:08 INFO loss_tracker.py:84 | Epoch[416/NA] Step[49] GlobalStep[57041/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:27:11 INFO stats.py:314 | Epoch[416] Step[57] GlobalStep[57049] Training Speed: 430.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:57:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:27:18 INFO loss_tracker.py:84 | Epoch[416/NA] Step[74] GlobalStep[57066/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:27:21 INFO stats.py:314 | Epoch[416] Step[82] GlobalStep[57074] Training Speed: 427.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:56:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:27:28 INFO loss_tracker.py:84 | Epoch[416/NA] Step[99] GlobalStep[57091/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:27:31 INFO stats.py:314 | Epoch[416] Step[107] GlobalStep[57099] Training Speed: 263.03 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 4:56:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:27:38 INFO loss_tracker.py:84 | Epoch[416/NA] Step[124] GlobalStep[57116/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:27:40 INFO stats.py:314 | Epoch[416] Step[132] GlobalStep[57124] Training Speed: 450.13 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:56:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:27:42 INFO stats.py:394 | Epoch[416] completed. Training Speed: 313.17 samples/sec across all devices. Epoch Time: 56.00 sec. Average Epoch Time: 56.00 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:56:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:27:51 INFO stats.py:314 | Epoch[417] Step[20] GlobalStep[57149] Training Speed: 434.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:56:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:27:53 INFO loss_tracker.py:84 | Epoch[417/NA] Step[24] GlobalStep[57153/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:28:02 INFO stats.py:314 | Epoch[417] Step[45] GlobalStep[57174] Training Speed: 429.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:56:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:28:04 INFO loss_tracker.py:84 | Epoch[417/NA] Step[49] GlobalStep[57178/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:28:12 INFO stats.py:314 | Epoch[417] Step[70] GlobalStep[57199] Training Speed: 411.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:55:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:28:14 INFO loss_tracker.py:84 | Epoch[417/NA] Step[74] GlobalStep[57203/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:28:23 INFO stats.py:314 | Epoch[417] Step[95] GlobalStep[57224] Training Speed: 434.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:55:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:28:25 INFO loss_tracker.py:84 | Epoch[417/NA] Step[99] GlobalStep[57228/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0127] total_loss[0.0176] Rank[0/16] 06/24/2025 19:28:33 INFO stats.py:314 | Epoch[417] Step[120] GlobalStep[57249] Training Speed: 447.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:55:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:28:34 INFO loss_tracker.py:84 | Epoch[417/NA] Step[124] GlobalStep[57253/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:28:39 INFO stats.py:394 | Epoch[417] completed. Training Speed: 308.61 samples/sec across all devices. Epoch Time: 56.82 sec. Average Epoch Time: 56.82 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:55:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:28:43 INFO stats.py:314 | Epoch[418] Step[8] GlobalStep[57274] Training Speed: 433.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:55:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:28:50 INFO loss_tracker.py:84 | Epoch[418/NA] Step[24] GlobalStep[57290/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:28:54 INFO stats.py:314 | Epoch[418] Step[33] GlobalStep[57299] Training Speed: 430.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:55:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:00 INFO loss_tracker.py:84 | Epoch[418/NA] Step[49] GlobalStep[57315/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0127] total_loss[0.0179] Rank[0/16] 06/24/2025 19:29:04 INFO stats.py:314 | Epoch[418] Step[58] GlobalStep[57324] Training Speed: 432.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:55:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:10 INFO loss_tracker.py:84 | Epoch[418/NA] Step[74] GlobalStep[57340/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:29:14 INFO stats.py:314 | Epoch[418] Step[83] GlobalStep[57349] Training Speed: 423.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:54:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:21 INFO loss_tracker.py:84 | Epoch[418/NA] Step[99] GlobalStep[57365/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 19:29:24 INFO stats.py:314 | Epoch[418] Step[108] GlobalStep[57374] Training Speed: 431.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:54:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:29:30 INFO loss_tracker.py:84 | Epoch[418/NA] Step[124] GlobalStep[57390/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0127] total_loss[0.0168] Rank[0/16] 06/24/2025 19:29:34 INFO stats.py:314 | Epoch[418] Step[133] GlobalStep[57399] Training Speed: 447.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:54:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:35 INFO stats.py:394 | Epoch[418] completed. Training Speed: 314.06 samples/sec across all devices. Epoch Time: 55.84 sec. Average Epoch Time: 55.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:54:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:45 INFO stats.py:314 | Epoch[419] Step[21] GlobalStep[57424] Training Speed: 429.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:54:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:46 INFO loss_tracker.py:84 | Epoch[419/NA] Step[24] GlobalStep[57427/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:29:55 INFO stats.py:314 | Epoch[419] Step[46] GlobalStep[57449] Training Speed: 429.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:54:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:29:56 INFO loss_tracker.py:84 | Epoch[419/NA] Step[49] GlobalStep[57452/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:30:05 INFO stats.py:314 | Epoch[419] Step[71] GlobalStep[57474] Training Speed: 432.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:54:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:30:06 INFO loss_tracker.py:84 | Epoch[419/NA] Step[74] GlobalStep[57477/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:30:15 INFO stats.py:314 | Epoch[419] Step[96] GlobalStep[57499] Training Speed: 426.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:53:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:30:17 INFO loss_tracker.py:84 | Epoch[419/NA] Step[99] GlobalStep[57502/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:30:26 INFO stats.py:314 | Epoch[419] Step[121] GlobalStep[57524] Training Speed: 454.05 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:53:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:30:27 INFO loss_tracker.py:84 | Epoch[419/NA] Step[124] GlobalStep[57527/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:30:31 INFO stats.py:394 | Epoch[419] completed. Training Speed: 312.24 samples/sec across all devices. Epoch Time: 56.16 sec. Average Epoch Time: 56.16 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:53:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:30:36 INFO stats.py:314 | Epoch[420] Step[9] GlobalStep[57549] Training Speed: 428.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:53:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:30:42 INFO loss_tracker.py:84 | Epoch[420/NA] Step[24] GlobalStep[57564/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:30:46 INFO stats.py:314 | Epoch[420] Step[34] GlobalStep[57574] Training Speed: 436.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:53:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:30:52 INFO loss_tracker.py:84 | Epoch[420/NA] Step[49] GlobalStep[57589/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 19:30:56 INFO stats.py:314 | Epoch[420] Step[59] GlobalStep[57599] Training Speed: 431.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:53:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:03 INFO loss_tracker.py:84 | Epoch[420/NA] Step[74] GlobalStep[57614/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:31:07 INFO stats.py:314 | Epoch[420] Step[84] GlobalStep[57624] Training Speed: 429.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:52:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:13 INFO loss_tracker.py:84 | Epoch[420/NA] Step[99] GlobalStep[57639/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:31:17 INFO stats.py:314 | Epoch[420] Step[109] GlobalStep[57649] Training Speed: 437.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:52:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:23 INFO loss_tracker.py:84 | Epoch[420/NA] Step[124] GlobalStep[57664/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 19:31:26 INFO stats.py:314 | Epoch[420] Step[134] GlobalStep[57674] Training Speed: 452.40 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:52:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:27 INFO stats.py:394 | Epoch[420] completed. Training Speed: 311.75 samples/sec across all devices. Epoch Time: 56.25 sec. Average Epoch Time: 56.25 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:52:37. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:31:37 INFO stats.py:314 | Epoch[421] Step[22] GlobalStep[57699] Training Speed: 430.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:52:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:38 INFO loss_tracker.py:84 | Epoch[421/NA] Step[24] GlobalStep[57701/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:31:48 INFO stats.py:314 | Epoch[421] Step[47] GlobalStep[57724] Training Speed: 423.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:52:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:48 INFO loss_tracker.py:84 | Epoch[421/NA] Step[49] GlobalStep[57726/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:31:58 INFO stats.py:314 | Epoch[421] Step[72] GlobalStep[57749] Training Speed: 431.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:52:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:31:59 INFO loss_tracker.py:84 | Epoch[421/NA] Step[74] GlobalStep[57751/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:32:08 INFO stats.py:314 | Epoch[421] Step[97] GlobalStep[57774] Training Speed: 425.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:51:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:32:09 INFO loss_tracker.py:84 | Epoch[421/NA] Step[99] GlobalStep[57776/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:32:18 INFO stats.py:314 | Epoch[421] Step[122] GlobalStep[57799] Training Speed: 453.70 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:51:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:32:19 INFO loss_tracker.py:84 | Epoch[421/NA] Step[124] GlobalStep[57801/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:32:23 INFO stats.py:394 | Epoch[421] completed. Training Speed: 312.64 samples/sec across all devices. Epoch Time: 56.09 sec. Average Epoch Time: 56.09 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:51:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:32:29 INFO stats.py:314 | Epoch[422] Step[10] GlobalStep[57824] Training Speed: 437.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:51:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:32:34 INFO loss_tracker.py:84 | Epoch[422/NA] Step[24] GlobalStep[57838/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 19:32:39 INFO stats.py:314 | Epoch[422] Step[35] GlobalStep[57849] Training Speed: 431.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:51:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:32:45 INFO loss_tracker.py:84 | Epoch[422/NA] Step[49] GlobalStep[57863/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 19:32:49 INFO stats.py:314 | Epoch[422] Step[60] GlobalStep[57874] Training Speed: 425.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:51:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:32:55 INFO loss_tracker.py:84 | Epoch[422/NA] Step[74] GlobalStep[57888/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:32:59 INFO stats.py:314 | Epoch[422] Step[85] GlobalStep[57899] Training Speed: 434.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:51:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:33:05 INFO loss_tracker.py:84 | Epoch[422/NA] Step[99] GlobalStep[57913/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:33:09 INFO stats.py:314 | Epoch[422] Step[110] GlobalStep[57924] Training Speed: 267.69 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 4:50:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:33:14 INFO loss_tracker.py:84 | Epoch[422/NA] Step[124] GlobalStep[57938/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:33:18 INFO stats.py:314 | Epoch[422] Step[135] GlobalStep[57949] Training Speed: 450.70 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:50:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:33:19 INFO stats.py:394 | Epoch[422] completed. Training Speed: 315.02 samples/sec across all devices. Epoch Time: 55.67 sec. Average Epoch Time: 55.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:50:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:33:30 INFO stats.py:314 | Epoch[423] Step[23] GlobalStep[57974] Training Speed: 436.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:50:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:33:30 INFO loss_tracker.py:84 | Epoch[423/NA] Step[24] GlobalStep[57975/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:33:40 INFO stats.py:314 | Epoch[423] Step[48] GlobalStep[57999] Training Speed: 432.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:50:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:33:40 INFO loss_tracker.py:84 | Epoch[423/NA] Step[49] GlobalStep[58000/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:33:50 INFO stats.py:314 | Epoch[423] Step[73] GlobalStep[58024] Training Speed: 260.48 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 4:50:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:33:51 INFO loss_tracker.py:84 | Epoch[423/NA] Step[74] GlobalStep[58025/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:34:00 INFO stats.py:314 | Epoch[423] Step[98] GlobalStep[58049] Training Speed: 433.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:50:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:34:01 INFO loss_tracker.py:84 | Epoch[423/NA] Step[99] GlobalStep[58050/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:34:11 INFO stats.py:314 | Epoch[423] Step[123] GlobalStep[58074] Training Speed: 453.61 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:49:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:34:11 INFO loss_tracker.py:84 | Epoch[423/NA] Step[124] GlobalStep[58075/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:34:15 INFO stats.py:394 | Epoch[423] completed. Training Speed: 311.38 samples/sec across all devices. Epoch Time: 56.32 sec. Average Epoch Time: 56.32 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:49:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:34:21 INFO stats.py:314 | Epoch[424] Step[11] GlobalStep[58099] Training Speed: 431.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:49:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:34:26 INFO loss_tracker.py:84 | Epoch[424/NA] Step[24] GlobalStep[58112/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:34:31 INFO stats.py:314 | Epoch[424] Step[36] GlobalStep[58124] Training Speed: 426.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:49:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:34:37 INFO loss_tracker.py:84 | Epoch[424/NA] Step[49] GlobalStep[58137/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 19:34:42 INFO stats.py:314 | Epoch[424] Step[61] GlobalStep[58149] Training Speed: 433.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:49:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:34:47 INFO loss_tracker.py:84 | Epoch[424/NA] Step[74] GlobalStep[58162/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:34:52 INFO stats.py:314 | Epoch[424] Step[86] GlobalStep[58174] Training Speed: 422.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:49:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:34:57 INFO loss_tracker.py:84 | Epoch[424/NA] Step[99] GlobalStep[58187/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:35:02 INFO stats.py:314 | Epoch[424] Step[111] GlobalStep[58199] Training Speed: 424.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:48:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:35:07 INFO loss_tracker.py:84 | Epoch[424/NA] Step[124] GlobalStep[58212/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:35:12 INFO stats.py:314 | Epoch[424] Step[136] GlobalStep[58224] Training Speed: 449.16 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:48:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:35:12 INFO stats.py:394 | Epoch[424] completed. Training Speed: 310.23 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:48:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:35:23 INFO stats.py:314 | Epoch[425] Step[24] GlobalStep[58249] Training Speed: 433.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:48:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:35:23 INFO loss_tracker.py:84 | Epoch[425/NA] Step[24] GlobalStep[58249/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:35:33 INFO stats.py:314 | Epoch[425] Step[49] GlobalStep[58274] Training Speed: 431.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:48:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:35:34 INFO loss_tracker.py:84 | Epoch[425/NA] Step[49] GlobalStep[58274/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:35:43 INFO stats.py:314 | Epoch[425] Step[74] GlobalStep[58299] Training Speed: 429.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:48:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:35:43 INFO loss_tracker.py:84 | Epoch[425/NA] Step[74] GlobalStep[58299/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:35:54 INFO stats.py:314 | Epoch[425] Step[99] GlobalStep[58324] Training Speed: 430.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:48:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:35:54 INFO loss_tracker.py:84 | Epoch[425/NA] Step[99] GlobalStep[58324/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0179] Rank[0/16] 06/24/2025 19:36:03 INFO stats.py:314 | Epoch[425] Step[124] GlobalStep[58349] Training Speed: 450.36 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:47:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:36:03 INFO loss_tracker.py:84 | Epoch[425/NA] Step[124] GlobalStep[58349/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:36:07 INFO stats.py:394 | Epoch[425] completed. Training Speed: 315.08 samples/sec across all devices. Epoch Time: 55.66 sec. Average Epoch Time: 55.66 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:47:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:36:14 INFO stats.py:314 | Epoch[426] Step[12] GlobalStep[58374] Training Speed: 423.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:47:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:36:19 INFO loss_tracker.py:84 | Epoch[426/NA] Step[24] GlobalStep[58386/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:36:24 INFO stats.py:314 | Epoch[426] Step[37] GlobalStep[58399] Training Speed: 426.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:47:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:36:30 INFO loss_tracker.py:84 | Epoch[426/NA] Step[49] GlobalStep[58411/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:36:35 INFO stats.py:314 | Epoch[426] Step[62] GlobalStep[58424] Training Speed: 395.86 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:47:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:36:40 INFO loss_tracker.py:84 | Epoch[426/NA] Step[74] GlobalStep[58436/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:36:45 INFO stats.py:314 | Epoch[426] Step[87] GlobalStep[58449] Training Speed: 419.58 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:47:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:36:51 INFO loss_tracker.py:84 | Epoch[426/NA] Step[99] GlobalStep[58461/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:36:56 INFO stats.py:314 | Epoch[426] Step[112] GlobalStep[58474] Training Speed: 425.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:47:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:37:00 INFO loss_tracker.py:84 | Epoch[426/NA] Step[124] GlobalStep[58486/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:37:05 INFO stats.py:394 | Epoch[426] completed. Training Speed: 306.74 samples/sec across all devices. Epoch Time: 57.17 sec. Average Epoch Time: 57.17 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:46:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:37:05 INFO stats.py:314 | Epoch[427] Step[0] GlobalStep[58499] Training Speed: 355.24 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 4:46:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:37:16 INFO loss_tracker.py:84 | Epoch[427/NA] Step[24] GlobalStep[58523/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:37:16 INFO stats.py:314 | Epoch[427] Step[25] GlobalStep[58524] Training Speed: 405.70 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:46:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:37:26 INFO loss_tracker.py:84 | Epoch[427/NA] Step[49] GlobalStep[58548/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:37:26 INFO stats.py:314 | Epoch[427] Step[50] GlobalStep[58549] Training Speed: 423.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:46:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:37:36 INFO loss_tracker.py:84 | Epoch[427/NA] Step[74] GlobalStep[58573/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:37:36 INFO stats.py:314 | Epoch[427] Step[75] GlobalStep[58574] Training Speed: 428.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:46:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:37:46 INFO loss_tracker.py:84 | Epoch[427/NA] Step[99] GlobalStep[58598/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 19:37:46 INFO stats.py:314 | Epoch[427] Step[100] GlobalStep[58599] Training Speed: 415.17 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:46:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:37:55 INFO loss_tracker.py:84 | Epoch[427/NA] Step[124] GlobalStep[58623/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:37:56 INFO stats.py:314 | Epoch[427] Step[125] GlobalStep[58624] Training Speed: 414.32 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:46:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:00 INFO stats.py:394 | Epoch[427] completed. Training Speed: 318.41 samples/sec across all devices. Epoch Time: 55.07 sec. Average Epoch Time: 55.07 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 4:45:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:07 INFO stats.py:314 | Epoch[428] Step[13] GlobalStep[58649] Training Speed: 429.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:45:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:12 INFO loss_tracker.py:84 | Epoch[428/NA] Step[24] GlobalStep[58660/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:38:17 INFO stats.py:314 | Epoch[428] Step[38] GlobalStep[58674] Training Speed: 429.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:45:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:22 INFO loss_tracker.py:84 | Epoch[428/NA] Step[49] GlobalStep[58685/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 19:38:28 INFO stats.py:314 | Epoch[428] Step[63] GlobalStep[58699] Training Speed: 424.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:45:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:38:33 INFO loss_tracker.py:84 | Epoch[428/NA] Step[74] GlobalStep[58710/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:38:38 INFO stats.py:314 | Epoch[428] Step[88] GlobalStep[58724] Training Speed: 426.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:45:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:42 INFO loss_tracker.py:84 | Epoch[428/NA] Step[99] GlobalStep[58735/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:38:49 INFO stats.py:314 | Epoch[428] Step[113] GlobalStep[58749] Training Speed: 432.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:45:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:53 INFO loss_tracker.py:84 | Epoch[428/NA] Step[124] GlobalStep[58760/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:38:57 INFO stats.py:394 | Epoch[428] completed. Training Speed: 305.77 samples/sec across all devices. Epoch Time: 57.35 sec. Average Epoch Time: 57.35 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:44:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:38:59 INFO stats.py:314 | Epoch[429] Step[1] GlobalStep[58774] Training Speed: 432.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:44:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:39:09 INFO loss_tracker.py:84 | Epoch[429/NA] Step[24] GlobalStep[58797/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:39:10 INFO stats.py:314 | Epoch[429] Step[26] GlobalStep[58799] Training Speed: 403.36 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:44:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:39:19 INFO loss_tracker.py:84 | Epoch[429/NA] Step[49] GlobalStep[58822/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:39:19 INFO stats.py:314 | Epoch[429] Step[51] GlobalStep[58824] Training Speed: 422.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:44:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:39:30 INFO loss_tracker.py:84 | Epoch[429/NA] Step[74] GlobalStep[58847/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:39:31 INFO stats.py:314 | Epoch[429] Step[76] GlobalStep[58849] Training Speed: 427.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:44:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:39:39 INFO loss_tracker.py:84 | Epoch[429/NA] Step[99] GlobalStep[58872/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:39:40 INFO stats.py:314 | Epoch[429] Step[101] GlobalStep[58874] Training Speed: 368.50 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 4:44:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:39:51 INFO loss_tracker.py:84 | Epoch[429/NA] Step[124] GlobalStep[58897/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:39:51 INFO stats.py:314 | Epoch[429] Step[126] GlobalStep[58899] Training Speed: 449.83 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:44:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:39:55 INFO stats.py:394 | Epoch[429] completed. Training Speed: 303.99 samples/sec across all devices. Epoch Time: 57.69 sec. Average Epoch Time: 57.69 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:44:03. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:40:01 INFO stats.py:314 | Epoch[430] Step[14] GlobalStep[58924] Training Speed: 431.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:43:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:40:06 INFO loss_tracker.py:84 | Epoch[430/NA] Step[24] GlobalStep[58934/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:40:12 INFO stats.py:314 | Epoch[430] Step[39] GlobalStep[58949] Training Speed: 431.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:43:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:40:16 INFO loss_tracker.py:84 | Epoch[430/NA] Step[49] GlobalStep[58959/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 19:40:22 INFO stats.py:314 | Epoch[430] Step[64] GlobalStep[58974] Training Speed: 418.80 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:43:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:40:26 INFO loss_tracker.py:84 | Epoch[430/NA] Step[74] GlobalStep[58984/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:40:32 INFO stats.py:314 | Epoch[430] Step[89] GlobalStep[58999] Training Speed: 427.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:43:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:40:36 INFO loss_tracker.py:84 | Epoch[430/NA] Step[99] GlobalStep[59009/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:40:42 INFO stats.py:314 | Epoch[430] Step[114] GlobalStep[59024] Training Speed: 427.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:43:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:40:47 INFO loss_tracker.py:84 | Epoch[430/NA] Step[124] GlobalStep[59034/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:40:51 INFO stats.py:394 | Epoch[430] completed. Training Speed: 310.89 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:43:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:40:53 INFO stats.py:314 | Epoch[431] Step[2] GlobalStep[59049] Training Speed: 416.17 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:43:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:02 INFO loss_tracker.py:84 | Epoch[431/NA] Step[24] GlobalStep[59071/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:41:03 INFO stats.py:314 | Epoch[431] Step[27] GlobalStep[59074] Training Speed: 428.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:42:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:12 INFO loss_tracker.py:84 | Epoch[431/NA] Step[49] GlobalStep[59096/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:41:13 INFO stats.py:314 | Epoch[431] Step[52] GlobalStep[59099] Training Speed: 430.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:42:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:22 INFO loss_tracker.py:84 | Epoch[431/NA] Step[74] GlobalStep[59121/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:41:23 INFO stats.py:314 | Epoch[431] Step[77] GlobalStep[59124] Training Speed: 433.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:42:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:41:32 INFO loss_tracker.py:84 | Epoch[431/NA] Step[99] GlobalStep[59146/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:41:34 INFO stats.py:314 | Epoch[431] Step[102] GlobalStep[59149] Training Speed: 410.54 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:42:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:42 INFO loss_tracker.py:84 | Epoch[431/NA] Step[124] GlobalStep[59171/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:41:43 INFO stats.py:314 | Epoch[431] Step[127] GlobalStep[59174] Training Speed: 438.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:42:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:47 INFO stats.py:394 | Epoch[431] completed. Training Speed: 314.55 samples/sec across all devices. Epoch Time: 55.75 sec. Average Epoch Time: 55.75 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:42:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:54 INFO stats.py:314 | Epoch[432] Step[15] GlobalStep[59199] Training Speed: 410.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:42:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:41:58 INFO loss_tracker.py:84 | Epoch[432/NA] Step[24] GlobalStep[59208/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:42:05 INFO stats.py:314 | Epoch[432] Step[40] GlobalStep[59224] Training Speed: 248.04 samples/sec across all devices. Average Step Time: 0.52 sec. Estimated Remaining Time: 4:41:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:42:08 INFO loss_tracker.py:84 | Epoch[432/NA] Step[49] GlobalStep[59233/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:42:14 INFO stats.py:314 | Epoch[432] Step[65] GlobalStep[59249] Training Speed: 422.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:41:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:42:18 INFO loss_tracker.py:84 | Epoch[432/NA] Step[74] GlobalStep[59258/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:42:25 INFO stats.py:314 | Epoch[432] Step[90] GlobalStep[59274] Training Speed: 434.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:41:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:42:28 INFO loss_tracker.py:84 | Epoch[432/NA] Step[99] GlobalStep[59283/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:42:34 INFO stats.py:314 | Epoch[432] Step[115] GlobalStep[59299] Training Speed: 433.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:41:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:42:38 INFO loss_tracker.py:84 | Epoch[432/NA] Step[124] GlobalStep[59308/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 19:42:42 INFO stats.py:394 | Epoch[432] completed. Training Speed: 317.02 samples/sec across all devices. Epoch Time: 55.32 sec. Average Epoch Time: 55.32 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 4:41:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:42:45 INFO stats.py:314 | Epoch[433] Step[3] GlobalStep[59324] Training Speed: 425.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:41:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:42:54 INFO loss_tracker.py:84 | Epoch[433/NA] Step[24] GlobalStep[59345/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:42:55 INFO stats.py:314 | Epoch[433] Step[28] GlobalStep[59349] Training Speed: 431.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:40:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:43:05 INFO loss_tracker.py:84 | Epoch[433/NA] Step[49] GlobalStep[59370/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:43:06 INFO stats.py:314 | Epoch[433] Step[53] GlobalStep[59374] Training Speed: 421.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:40:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:43:15 INFO loss_tracker.py:84 | Epoch[433/NA] Step[74] GlobalStep[59395/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:43:16 INFO stats.py:314 | Epoch[433] Step[78] GlobalStep[59399] Training Speed: 431.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:40:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:43:25 INFO loss_tracker.py:84 | Epoch[433/NA] Step[99] GlobalStep[59420/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:43:27 INFO stats.py:314 | Epoch[433] Step[103] GlobalStep[59424] Training Speed: 432.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:40:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:43:35 INFO loss_tracker.py:84 | Epoch[433/NA] Step[124] GlobalStep[59445/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:43:36 INFO stats.py:314 | Epoch[433] Step[128] GlobalStep[59449] Training Speed: 453.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:40:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:43:39 INFO stats.py:394 | Epoch[433] completed. Training Speed: 308.16 samples/sec across all devices. Epoch Time: 56.91 sec. Average Epoch Time: 56.91 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:40:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:43:47 INFO stats.py:314 | Epoch[434] Step[16] GlobalStep[59474] Training Speed: 430.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:40:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:43:51 INFO loss_tracker.py:84 | Epoch[434/NA] Step[24] GlobalStep[59482/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 19:43:57 INFO stats.py:314 | Epoch[434] Step[41] GlobalStep[59499] Training Speed: 391.33 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 4:39:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:44:01 INFO loss_tracker.py:84 | Epoch[434/NA] Step[49] GlobalStep[59507/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 19:44:08 INFO stats.py:314 | Epoch[434] Step[66] GlobalStep[59524] Training Speed: 428.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:39:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:44:11 INFO loss_tracker.py:84 | Epoch[434/NA] Step[74] GlobalStep[59532/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:44:18 INFO stats.py:314 | Epoch[434] Step[91] GlobalStep[59549] Training Speed: 428.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:39:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:44:21 INFO loss_tracker.py:84 | Epoch[434/NA] Step[99] GlobalStep[59557/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:44:28 INFO stats.py:314 | Epoch[434] Step[116] GlobalStep[59574] Training Speed: 427.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:39:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:44:31 INFO loss_tracker.py:84 | Epoch[434/NA] Step[124] GlobalStep[59582/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:44:36 INFO stats.py:394 | Epoch[434] completed. Training Speed: 310.21 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:39:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:44:38 INFO stats.py:314 | Epoch[435] Step[4] GlobalStep[59599] Training Speed: 427.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:39:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:44:47 INFO loss_tracker.py:84 | Epoch[435/NA] Step[24] GlobalStep[59619/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 19:44:49 INFO stats.py:314 | Epoch[435] Step[29] GlobalStep[59624] Training Speed: 430.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:39:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:44:57 INFO loss_tracker.py:84 | Epoch[435/NA] Step[49] GlobalStep[59644/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:44:59 INFO stats.py:314 | Epoch[435] Step[54] GlobalStep[59649] Training Speed: 423.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:38:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:45:08 INFO loss_tracker.py:84 | Epoch[435/NA] Step[74] GlobalStep[59669/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:45:10 INFO stats.py:314 | Epoch[435] Step[79] GlobalStep[59674] Training Speed: 381.76 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 4:38:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:45:18 INFO loss_tracker.py:84 | Epoch[435/NA] Step[99] GlobalStep[59694/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:45:20 INFO stats.py:314 | Epoch[435] Step[104] GlobalStep[59699] Training Speed: 420.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:38:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:45:28 INFO loss_tracker.py:84 | Epoch[435/NA] Step[124] GlobalStep[59719/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:45:30 INFO stats.py:314 | Epoch[435] Step[129] GlobalStep[59724] Training Speed: 449.59 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:38:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:45:32 INFO stats.py:394 | Epoch[435] completed. Training Speed: 310.87 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:38:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:45:40 INFO stats.py:314 | Epoch[436] Step[17] GlobalStep[59749] Training Speed: 429.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:38:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:45:43 INFO loss_tracker.py:84 | Epoch[436/NA] Step[24] GlobalStep[59756/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 19:45:50 INFO stats.py:314 | Epoch[436] Step[42] GlobalStep[59774] Training Speed: 431.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:38:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:45:53 INFO loss_tracker.py:84 | Epoch[436/NA] Step[49] GlobalStep[59781/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:46:00 INFO stats.py:314 | Epoch[436] Step[67] GlobalStep[59799] Training Speed: 418.43 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:37:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:46:03 INFO loss_tracker.py:84 | Epoch[436/NA] Step[74] GlobalStep[59806/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:46:11 INFO stats.py:314 | Epoch[436] Step[92] GlobalStep[59824] Training Speed: 430.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:37:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:46:14 INFO loss_tracker.py:84 | Epoch[436/NA] Step[99] GlobalStep[59831/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:46:21 INFO stats.py:314 | Epoch[436] Step[117] GlobalStep[59849] Training Speed: 424.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:37:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:46:23 INFO loss_tracker.py:84 | Epoch[436/NA] Step[124] GlobalStep[59856/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 19:46:28 INFO stats.py:394 | Epoch[436] completed. Training Speed: 311.71 samples/sec across all devices. Epoch Time: 56.26 sec. Average Epoch Time: 56.26 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:37:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:46:32 INFO stats.py:314 | Epoch[437] Step[5] GlobalStep[59874] Training Speed: 426.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:37:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:46:39 INFO loss_tracker.py:84 | Epoch[437/NA] Step[24] GlobalStep[59893/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:46:42 INFO stats.py:314 | Epoch[437] Step[30] GlobalStep[59899] Training Speed: 434.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:37:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:46:50 INFO loss_tracker.py:84 | Epoch[437/NA] Step[49] GlobalStep[59918/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:46:52 INFO stats.py:314 | Epoch[437] Step[55] GlobalStep[59924] Training Speed: 400.16 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:36:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:46:59 INFO loss_tracker.py:84 | Epoch[437/NA] Step[74] GlobalStep[59943/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0127] total_loss[0.0177] Rank[0/16] 06/24/2025 19:47:02 INFO stats.py:314 | Epoch[437] Step[80] GlobalStep[59949] Training Speed: 394.65 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:36:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:47:10 INFO loss_tracker.py:84 | Epoch[437/NA] Step[99] GlobalStep[59968/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0167] Rank[0/16] 06/24/2025 19:47:12 INFO stats.py:314 | Epoch[437] Step[105] GlobalStep[59974] Training Speed: 425.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:36:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:47:20 INFO loss_tracker.py:84 | Epoch[437/NA] Step[124] GlobalStep[59993/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:47:22 INFO stats.py:314 | Epoch[437] Step[130] GlobalStep[59999] Training Speed: 436.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:36:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:47:23 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 19:47:23 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_14 Rank[2/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[12/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[6/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[7/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[13/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[1/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[15/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14
Rank[5/16] 06/24/2025 19:47:23 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[14/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[3/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[9/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[4/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[8/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[11/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[10/16] 06/24/2025 19:47:24 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[0/16] 06/24/2025 19:47:24 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_14/model.safetensors Rank[0/16] 06/24/2025 19:47:25 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_14/optimizer.bin Rank[0/16] 06/24/2025 19:47:25 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_14/scheduler.bin Rank[0/16] 06/24/2025 19:47:25 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_14/sampler.bin Rank[0/16] 06/24/2025 19:47:25 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_14/random_states_0.pkl Rank[0/16] 06/24/2025 19:47:25 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_14/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 19:47:25 INFO checkpoint.py:110 | Save checkpoint at the end of step 59999 to /job_data/checkpoints/checkpoint_14 Rank[0/16] 06/24/2025 19:47:27 INFO stats.py:394 | Epoch[437] completed. Training Speed: 296.17 samples/sec across all devices. Epoch Time: 59.21 sec. Average Epoch Time: 59.21 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 4:36:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:47:36 INFO stats.py:314 | Epoch[438] Step[18] GlobalStep[60024] Training Speed: 428.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:36:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:47:39 INFO loss_tracker.py:84 | Epoch[438/NA] Step[24] GlobalStep[60030/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:47:47 INFO stats.py:314 | Epoch[438] Step[43] GlobalStep[60049] Training Speed: 421.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:36:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:47:49 INFO loss_tracker.py:84 | Epoch[438/NA] Step[49] GlobalStep[60055/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:47:57 INFO stats.py:314 | Epoch[438] Step[68] GlobalStep[60074] Training Speed: 426.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:35:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:48:00 INFO loss_tracker.py:84 | Epoch[438/NA] Step[74] GlobalStep[60080/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:48:08 INFO stats.py:314 | Epoch[438] Step[93] GlobalStep[60099] Training Speed: 426.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:35:49. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 19:48:11 INFO loss_tracker.py:84 | Epoch[438/NA] Step[99] GlobalStep[60105/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:48:18 INFO stats.py:314 | Epoch[438] Step[118] GlobalStep[60124] Training Speed: 430.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:35:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:48:20 INFO loss_tracker.py:84 | Epoch[438/NA] Step[124] GlobalStep[60130/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 19:48:25 INFO stats.py:394 | Epoch[438] completed. Training Speed: 306.91 samples/sec across all devices. Epoch Time: 57.14 sec. Average Epoch Time: 57.14 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:35:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:48:28 INFO stats.py:314 | Epoch[439] Step[6] GlobalStep[60149] Training Speed: 433.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:35:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:48:36 INFO loss_tracker.py:84 | Epoch[439/NA] Step[24] GlobalStep[60167/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:48:39 INFO stats.py:314 | Epoch[439] Step[31] GlobalStep[60174] Training Speed: 432.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:35:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:48:47 INFO loss_tracker.py:84 | Epoch[439/NA] Step[49] GlobalStep[60192/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 19:48:50 INFO stats.py:314 | Epoch[439] Step[56] GlobalStep[60199] Training Speed: 426.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:35:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:48:57 INFO loss_tracker.py:84 | Epoch[439/NA] Step[74] GlobalStep[60217/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:49:00 INFO stats.py:314 | Epoch[439] Step[81] GlobalStep[60224] Training Speed: 400.30 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:34:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:49:08 INFO loss_tracker.py:84 | Epoch[439/NA] Step[99] GlobalStep[60242/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:49:10 INFO stats.py:314 | Epoch[439] Step[106] GlobalStep[60249] Training Speed: 419.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:34:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:49:18 INFO loss_tracker.py:84 | Epoch[439/NA] Step[124] GlobalStep[60267/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:49:20 INFO stats.py:314 | Epoch[439] Step[131] GlobalStep[60274] Training Speed: 444.29 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:34:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:49:22 INFO stats.py:394 | Epoch[439] completed. Training Speed: 305.59 samples/sec across all devices. Epoch Time: 57.38 sec. Average Epoch Time: 57.38 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:34:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:49:32 INFO stats.py:314 | Epoch[440] Step[19] GlobalStep[60299] Training Speed: 436.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:34:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:49:34 INFO loss_tracker.py:84 | Epoch[440/NA] Step[24] GlobalStep[60304/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:49:42 INFO stats.py:314 | Epoch[440] Step[44] GlobalStep[60324] Training Speed: 432.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:34:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:49:44 INFO loss_tracker.py:84 | Epoch[440/NA] Step[49] GlobalStep[60329/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:49:52 INFO stats.py:314 | Epoch[440] Step[69] GlobalStep[60349] Training Speed: 420.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:34:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:49:55 INFO loss_tracker.py:84 | Epoch[440/NA] Step[74] GlobalStep[60354/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:50:02 INFO stats.py:314 | Epoch[440] Step[94] GlobalStep[60374] Training Speed: 392.34 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 4:33:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:50:05 INFO loss_tracker.py:84 | Epoch[440/NA] Step[99] GlobalStep[60379/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 19:50:13 INFO stats.py:314 | Epoch[440] Step[119] GlobalStep[60399] Training Speed: 426.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:33:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:50:15 INFO loss_tracker.py:84 | Epoch[440/NA] Step[124] GlobalStep[60404/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 19:50:20 INFO stats.py:394 | Epoch[440] completed. Training Speed: 303.13 samples/sec across all devices. Epoch Time: 57.85 sec. Average Epoch Time: 57.85 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:33:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:50:24 INFO stats.py:314 | Epoch[441] Step[7] GlobalStep[60424] Training Speed: 429.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:33:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:50:32 INFO loss_tracker.py:84 | Epoch[441/NA] Step[24] GlobalStep[60441/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:50:35 INFO stats.py:314 | Epoch[441] Step[32] GlobalStep[60449] Training Speed: 437.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:33:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:50:42 INFO loss_tracker.py:84 | Epoch[441/NA] Step[49] GlobalStep[60466/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:50:45 INFO stats.py:314 | Epoch[441] Step[57] GlobalStep[60474] Training Speed: 434.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:33:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:50:52 INFO loss_tracker.py:84 | Epoch[441/NA] Step[74] GlobalStep[60491/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:50:55 INFO stats.py:314 | Epoch[441] Step[82] GlobalStep[60499] Training Speed: 427.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:33:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 19:51:02 INFO loss_tracker.py:84 | Epoch[441/NA] Step[99] GlobalStep[60516/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 19:51:05 INFO stats.py:314 | Epoch[441] Step[107] GlobalStep[60524] Training Speed: 242.96 samples/sec across all devices. Average Step Time: 0.53 sec. Estimated Remaining Time: 4:32:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:51:12 INFO loss_tracker.py:84 | Epoch[441/NA] Step[124] GlobalStep[60541/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:51:15 INFO stats.py:314 | Epoch[441] Step[132] GlobalStep[60549] Training Speed: 440.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:32:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:51:16 INFO stats.py:394 | Epoch[441] completed. Training Speed: 310.31 samples/sec across all devices. Epoch Time: 56.51 sec. Average Epoch Time: 56.51 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:32:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:51:26 INFO stats.py:314 | Epoch[442] Step[20] GlobalStep[60574] Training Speed: 433.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:32:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:51:28 INFO loss_tracker.py:84 | Epoch[442/NA] Step[24] GlobalStep[60578/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:51:37 INFO stats.py:314 | Epoch[442] Step[45] GlobalStep[60599] Training Speed: 427.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:32:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:51:39 INFO loss_tracker.py:84 | Epoch[442/NA] Step[49] GlobalStep[60603/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 19:51:47 INFO stats.py:314 | Epoch[442] Step[70] GlobalStep[60624] Training Speed: 411.62 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:32:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:51:48 INFO loss_tracker.py:84 | Epoch[442/NA] Step[74] GlobalStep[60628/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:51:57 INFO stats.py:314 | Epoch[442] Step[95] GlobalStep[60649] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:32:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:51:58 INFO loss_tracker.py:84 | Epoch[442/NA] Step[99] GlobalStep[60653/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 19:52:06 INFO stats.py:314 | Epoch[442] Step[120] GlobalStep[60674] Training Speed: 438.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:31:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:52:08 INFO loss_tracker.py:84 | Epoch[442/NA] Step[124] GlobalStep[60678/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:52:12 INFO stats.py:394 | Epoch[442] completed. Training Speed: 312.10 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:31:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:52:17 INFO stats.py:314 | Epoch[443] Step[8] GlobalStep[60699] Training Speed: 433.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:31:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:52:24 INFO loss_tracker.py:84 | Epoch[443/NA] Step[24] GlobalStep[60715/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:52:27 INFO stats.py:314 | Epoch[443] Step[33] GlobalStep[60724] Training Speed: 419.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:31:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:52:35 INFO loss_tracker.py:84 | Epoch[443/NA] Step[49] GlobalStep[60740/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:52:38 INFO stats.py:314 | Epoch[443] Step[58] GlobalStep[60749] Training Speed: 428.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:31:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:52:45 INFO loss_tracker.py:84 | Epoch[443/NA] Step[74] GlobalStep[60765/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 19:52:49 INFO stats.py:314 | Epoch[443] Step[83] GlobalStep[60774] Training Speed: 430.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:31:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:52:55 INFO loss_tracker.py:84 | Epoch[443/NA] Step[99] GlobalStep[60790/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:52:59 INFO stats.py:314 | Epoch[443] Step[108] GlobalStep[60799] Training Speed: 432.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:30:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:53:05 INFO loss_tracker.py:84 | Epoch[443/NA] Step[124] GlobalStep[60815/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:53:08 INFO stats.py:314 | Epoch[443] Step[133] GlobalStep[60824] Training Speed: 450.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:30:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:53:09 INFO stats.py:394 | Epoch[443] completed. Training Speed: 310.31 samples/sec across all devices. Epoch Time: 56.51 sec. Average Epoch Time: 56.51 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:30:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:53:19 INFO stats.py:314 | Epoch[444] Step[21] GlobalStep[60849] Training Speed: 432.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:30:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:53:20 INFO loss_tracker.py:84 | Epoch[444/NA] Step[24] GlobalStep[60852/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:53:29 INFO stats.py:314 | Epoch[444] Step[46] GlobalStep[60874] Training Speed: 430.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:30:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:53:31 INFO loss_tracker.py:84 | Epoch[444/NA] Step[49] GlobalStep[60877/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 19:53:40 INFO stats.py:314 | Epoch[444] Step[71] GlobalStep[60899] Training Speed: 417.89 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:30:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:53:41 INFO loss_tracker.py:84 | Epoch[444/NA] Step[74] GlobalStep[60902/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:53:50 INFO stats.py:314 | Epoch[444] Step[96] GlobalStep[60924] Training Speed: 424.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:30:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:53:52 INFO loss_tracker.py:84 | Epoch[444/NA] Step[99] GlobalStep[60927/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:54:00 INFO stats.py:314 | Epoch[444] Step[121] GlobalStep[60949] Training Speed: 452.71 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:29:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:54:01 INFO loss_tracker.py:84 | Epoch[444/NA] Step[124] GlobalStep[60952/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 19:54:05 INFO stats.py:394 | Epoch[444] completed. Training Speed: 311.73 samples/sec across all devices. Epoch Time: 56.25 sec. Average Epoch Time: 56.25 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:29:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:54:11 INFO stats.py:314 | Epoch[445] Step[9] GlobalStep[60974] Training Speed: 434.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:29:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:54:17 INFO loss_tracker.py:84 | Epoch[445/NA] Step[24] GlobalStep[60989/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 19:54:21 INFO stats.py:314 | Epoch[445] Step[34] GlobalStep[60999] Training Speed: 433.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:29:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:54:27 INFO loss_tracker.py:84 | Epoch[445/NA] Step[49] GlobalStep[61014/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:54:32 INFO stats.py:314 | Epoch[445] Step[59] GlobalStep[61024] Training Speed: 424.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:29:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:54:37 INFO loss_tracker.py:84 | Epoch[445/NA] Step[74] GlobalStep[61039/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 19:54:42 INFO stats.py:314 | Epoch[445] Step[84] GlobalStep[61049] Training Speed: 429.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:29:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:54:48 INFO loss_tracker.py:84 | Epoch[445/NA] Step[99] GlobalStep[61064/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:54:52 INFO stats.py:314 | Epoch[445] Step[109] GlobalStep[61074] Training Speed: 404.99 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:29:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:54:58 INFO loss_tracker.py:84 | Epoch[445/NA] Step[124] GlobalStep[61089/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:55:01 INFO stats.py:314 | Epoch[445] Step[134] GlobalStep[61099] Training Speed: 452.12 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:28:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:55:02 INFO stats.py:394 | Epoch[445] completed. Training Speed: 308.64 samples/sec across all devices. Epoch Time: 56.82 sec. Average Epoch Time: 56.82 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:28:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:55:13 INFO stats.py:314 | Epoch[446] Step[22] GlobalStep[61124] Training Speed: 433.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:28:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:55:14 INFO loss_tracker.py:84 | Epoch[446/NA] Step[24] GlobalStep[61126/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:55:23 INFO stats.py:314 | Epoch[446] Step[47] GlobalStep[61149] Training Speed: 435.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:28:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:55:23 INFO loss_tracker.py:84 | Epoch[446/NA] Step[49] GlobalStep[61151/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 19:55:33 INFO stats.py:314 | Epoch[446] Step[72] GlobalStep[61174] Training Speed: 419.01 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:28:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:55:34 INFO loss_tracker.py:84 | Epoch[446/NA] Step[74] GlobalStep[61176/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 19:55:43 INFO stats.py:314 | Epoch[446] Step[97] GlobalStep[61199] Training Speed: 430.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:28:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:55:44 INFO loss_tracker.py:84 | Epoch[446/NA] Step[99] GlobalStep[61201/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:55:54 INFO stats.py:314 | Epoch[446] Step[122] GlobalStep[61224] Training Speed: 442.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:28:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:55:54 INFO loss_tracker.py:84 | Epoch[446/NA] Step[124] GlobalStep[61226/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:55:59 INFO stats.py:394 | Epoch[446] completed. Training Speed: 310.10 samples/sec across all devices. Epoch Time: 56.55 sec. Average Epoch Time: 56.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:27:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:56:04 INFO stats.py:314 | Epoch[447] Step[10] GlobalStep[61249] Training Speed: 429.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:27:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:56:10 INFO loss_tracker.py:84 | Epoch[447/NA] Step[24] GlobalStep[61263/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:56:15 INFO stats.py:314 | Epoch[447] Step[35] GlobalStep[61274] Training Speed: 399.38 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:27:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:56:20 INFO loss_tracker.py:84 | Epoch[447/NA] Step[49] GlobalStep[61288/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:56:25 INFO stats.py:314 | Epoch[447] Step[60] GlobalStep[61299] Training Speed: 413.05 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:27:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:56:32 INFO loss_tracker.py:84 | Epoch[447/NA] Step[74] GlobalStep[61313/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 19:56:36 INFO stats.py:314 | Epoch[447] Step[85] GlobalStep[61324] Training Speed: 427.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:27:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:56:41 INFO loss_tracker.py:84 | Epoch[447/NA] Step[99] GlobalStep[61338/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:56:46 INFO stats.py:314 | Epoch[447] Step[110] GlobalStep[61349] Training Speed: 427.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:27:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:56:52 INFO loss_tracker.py:84 | Epoch[447/NA] Step[124] GlobalStep[61363/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0175] Rank[0/16] 06/24/2025 19:56:56 INFO stats.py:314 | Epoch[447] Step[135] GlobalStep[61374] Training Speed: 452.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:26:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:56:56 INFO stats.py:394 | Epoch[447] completed. Training Speed: 306.31 samples/sec across all devices. Epoch Time: 57.25 sec. Average Epoch Time: 57.25 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:26:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:57:07 INFO stats.py:314 | Epoch[448] Step[23] GlobalStep[61399] Training Speed: 434.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:26:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:57:07 INFO loss_tracker.py:84 | Epoch[448/NA] Step[24] GlobalStep[61400/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:57:17 INFO stats.py:314 | Epoch[448] Step[48] GlobalStep[61424] Training Speed: 425.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:26:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:57:17 INFO loss_tracker.py:84 | Epoch[448/NA] Step[49] GlobalStep[61425/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 19:57:27 INFO stats.py:314 | Epoch[448] Step[73] GlobalStep[61449] Training Speed: 427.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:26:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:57:28 INFO loss_tracker.py:84 | Epoch[448/NA] Step[74] GlobalStep[61450/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0126] total_loss[0.0159] Rank[0/16] 06/24/2025 19:57:38 INFO stats.py:314 | Epoch[448] Step[98] GlobalStep[61474] Training Speed: 425.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:26:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:57:38 INFO loss_tracker.py:84 | Epoch[448/NA] Step[99] GlobalStep[61475/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 19:57:48 INFO stats.py:314 | Epoch[448] Step[123] GlobalStep[61499] Training Speed: 438.69 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:26:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:57:49 INFO loss_tracker.py:84 | Epoch[448/NA] Step[124] GlobalStep[61500/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:57:53 INFO stats.py:394 | Epoch[448] completed. Training Speed: 306.37 samples/sec across all devices. Epoch Time: 57.24 sec. Average Epoch Time: 57.24 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:26:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:57:59 INFO stats.py:314 | Epoch[449] Step[11] GlobalStep[61524] Training Speed: 394.82 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:25:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:58:04 INFO loss_tracker.py:84 | Epoch[449/NA] Step[24] GlobalStep[61537/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 19:58:10 INFO stats.py:314 | Epoch[449] Step[36] GlobalStep[61549] Training Speed: 432.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:25:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:58:14 INFO loss_tracker.py:84 | Epoch[449/NA] Step[49] GlobalStep[61562/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:58:19 INFO stats.py:314 | Epoch[449] Step[61] GlobalStep[61574] Training Speed: 426.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:25:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:58:25 INFO loss_tracker.py:84 | Epoch[449/NA] Step[74] GlobalStep[61587/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:58:30 INFO stats.py:314 | Epoch[449] Step[86] GlobalStep[61599] Training Speed: 429.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:25:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:58:35 INFO loss_tracker.py:84 | Epoch[449/NA] Step[99] GlobalStep[61612/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:58:40 INFO stats.py:314 | Epoch[449] Step[111] GlobalStep[61624] Training Speed: 425.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:25:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:58:45 INFO loss_tracker.py:84 | Epoch[449/NA] Step[124] GlobalStep[61637/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 19:58:50 INFO stats.py:314 | Epoch[449] Step[136] GlobalStep[61649] Training Speed: 437.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:25:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:58:50 INFO stats.py:394 | Epoch[449] completed. Training Speed: 307.90 samples/sec across all devices. Epoch Time: 56.95 sec. Average Epoch Time: 56.95 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:25:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:01 INFO stats.py:314 | Epoch[450] Step[24] GlobalStep[61674] Training Speed: 435.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:24:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:01 INFO loss_tracker.py:84 | Epoch[450/NA] Step[24] GlobalStep[61674/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 19:59:12 INFO stats.py:314 | Epoch[450] Step[49] GlobalStep[61699] Training Speed: 429.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:24:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:12 INFO loss_tracker.py:84 | Epoch[450/NA] Step[49] GlobalStep[61699/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 19:59:21 INFO stats.py:314 | Epoch[450] Step[74] GlobalStep[61724] Training Speed: 432.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:24:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 19:59:21 INFO loss_tracker.py:84 | Epoch[450/NA] Step[74] GlobalStep[61724/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 19:59:31 INFO stats.py:314 | Epoch[450] Step[99] GlobalStep[61749] Training Speed: 423.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:24:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:32 INFO loss_tracker.py:84 | Epoch[450/NA] Step[99] GlobalStep[61749/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 19:59:42 INFO stats.py:314 | Epoch[450] Step[124] GlobalStep[61774] Training Speed: 439.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:24:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:42 INFO loss_tracker.py:84 | Epoch[450/NA] Step[124] GlobalStep[61774/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 19:59:46 INFO stats.py:394 | Epoch[450] completed. Training Speed: 311.61 samples/sec across all devices. Epoch Time: 56.28 sec. Average Epoch Time: 56.28 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:24:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:53 INFO stats.py:314 | Epoch[451] Step[12] GlobalStep[61799] Training Speed: 425.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:24:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 19:59:57 INFO loss_tracker.py:84 | Epoch[451/NA] Step[24] GlobalStep[61811/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0127] total_loss[0.0169] Rank[0/16] 06/24/2025 20:00:03 INFO stats.py:314 | Epoch[451] Step[37] GlobalStep[61824] Training Speed: 422.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:23:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:00:08 INFO loss_tracker.py:84 | Epoch[451/NA] Step[49] GlobalStep[61836/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:00:13 INFO stats.py:314 | Epoch[451] Step[62] GlobalStep[61849] Training Speed: 435.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:23:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:00:18 INFO loss_tracker.py:84 | Epoch[451/NA] Step[74] GlobalStep[61861/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:00:24 INFO stats.py:314 | Epoch[451] Step[87] GlobalStep[61874] Training Speed: 438.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:23:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:00:29 INFO loss_tracker.py:84 | Epoch[451/NA] Step[99] GlobalStep[61886/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0127] total_loss[0.0178] Rank[0/16] 06/24/2025 20:00:34 INFO stats.py:314 | Epoch[451] Step[112] GlobalStep[61899] Training Speed: 427.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:23:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:00:38 INFO loss_tracker.py:84 | Epoch[451/NA] Step[124] GlobalStep[61911/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:00:43 INFO stats.py:394 | Epoch[451] completed. Training Speed: 310.68 samples/sec across all devices. Epoch Time: 56.44 sec. Average Epoch Time: 56.44 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:23:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:00:44 INFO stats.py:314 | Epoch[452] Step[0] GlobalStep[61924] Training Speed: 356.95 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 4:23:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:00:54 INFO loss_tracker.py:84 | Epoch[452/NA] Step[24] GlobalStep[61948/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:00:54 INFO stats.py:314 | Epoch[452] Step[25] GlobalStep[61949] Training Speed: 414.82 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:23:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:01:05 INFO loss_tracker.py:84 | Epoch[452/NA] Step[49] GlobalStep[61973/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:01:05 INFO stats.py:314 | Epoch[452] Step[50] GlobalStep[61974] Training Speed: 375.27 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 4:22:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:01:15 INFO loss_tracker.py:84 | Epoch[452/NA] Step[74] GlobalStep[61998/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:01:15 INFO stats.py:314 | Epoch[452] Step[75] GlobalStep[61999] Training Speed: 417.87 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:22:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:01:26 INFO loss_tracker.py:84 | Epoch[452/NA] Step[99] GlobalStep[62023/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:01:26 INFO stats.py:314 | Epoch[452] Step[100] GlobalStep[62024] Training Speed: 417.92 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:22:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:01:35 INFO loss_tracker.py:84 | Epoch[452/NA] Step[124] GlobalStep[62048/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:01:36 INFO stats.py:314 | Epoch[452] Step[125] GlobalStep[62049] Training Speed: 419.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:22:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:01:40 INFO stats.py:394 | Epoch[452] completed. Training Speed: 308.54 samples/sec across all devices. Epoch Time: 56.83 sec. Average Epoch Time: 56.83 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:22:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:01:47 INFO stats.py:314 | Epoch[453] Step[13] GlobalStep[62074] Training Speed: 432.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:22:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:01:51 INFO loss_tracker.py:84 | Epoch[453/NA] Step[24] GlobalStep[62085/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:01:57 INFO stats.py:314 | Epoch[453] Step[38] GlobalStep[62099] Training Speed: 425.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:21:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:02:01 INFO loss_tracker.py:84 | Epoch[453/NA] Step[49] GlobalStep[62110/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0060] loss_depth[0.0126] total_loss[0.0187] Rank[0/16] 06/24/2025 20:02:07 INFO stats.py:314 | Epoch[453] Step[63] GlobalStep[62124] Training Speed: 420.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:21:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:02:11 INFO loss_tracker.py:84 | Epoch[453/NA] Step[74] GlobalStep[62135/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:02:16 INFO stats.py:314 | Epoch[453] Step[88] GlobalStep[62149] Training Speed: 435.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:21:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:02:22 INFO loss_tracker.py:84 | Epoch[453/NA] Step[99] GlobalStep[62160/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:02:27 INFO stats.py:314 | Epoch[453] Step[113] GlobalStep[62174] Training Speed: 414.56 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:21:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:02:32 INFO loss_tracker.py:84 | Epoch[453/NA] Step[124] GlobalStep[62185/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:02:36 INFO stats.py:394 | Epoch[453] completed. Training Speed: 311.40 samples/sec across all devices. Epoch Time: 56.31 sec. Average Epoch Time: 56.31 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:21:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:02:38 INFO stats.py:314 | Epoch[454] Step[1] GlobalStep[62199] Training Speed: 219.63 samples/sec across all devices. Average Step Time: 0.58 sec. Estimated Remaining Time: 4:21:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:02:47 INFO loss_tracker.py:84 | Epoch[454/NA] Step[24] GlobalStep[62222/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:02:48 INFO stats.py:314 | Epoch[454] Step[26] GlobalStep[62224] Training Speed: 423.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:21:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:02:58 INFO loss_tracker.py:84 | Epoch[454/NA] Step[49] GlobalStep[62247/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:02:58 INFO stats.py:314 | Epoch[454] Step[51] GlobalStep[62249] Training Speed: 425.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:20:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:03:08 INFO loss_tracker.py:84 | Epoch[454/NA] Step[74] GlobalStep[62272/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:03:09 INFO stats.py:314 | Epoch[454] Step[76] GlobalStep[62274] Training Speed: 430.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:20:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:03:18 INFO loss_tracker.py:84 | Epoch[454/NA] Step[99] GlobalStep[62297/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:03:18 INFO stats.py:314 | Epoch[454] Step[101] GlobalStep[62299] Training Speed: 431.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:20:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:03:28 INFO loss_tracker.py:84 | Epoch[454/NA] Step[124] GlobalStep[62322/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:03:29 INFO stats.py:314 | Epoch[454] Step[126] GlobalStep[62324] Training Speed: 435.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:20:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:03:32 INFO stats.py:394 | Epoch[454] completed. Training Speed: 310.79 samples/sec across all devices. Epoch Time: 56.42 sec. Average Epoch Time: 56.42 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:20:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:03:40 INFO stats.py:314 | Epoch[455] Step[14] GlobalStep[62349] Training Speed: 395.12 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:20:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:03:44 INFO loss_tracker.py:84 | Epoch[455/NA] Step[24] GlobalStep[62359/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:03:51 INFO stats.py:314 | Epoch[455] Step[39] GlobalStep[62374] Training Speed: 430.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:20:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:03:54 INFO loss_tracker.py:84 | Epoch[455/NA] Step[49] GlobalStep[62384/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:04:01 INFO stats.py:314 | Epoch[455] Step[64] GlobalStep[62399] Training Speed: 430.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:19:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:04:05 INFO loss_tracker.py:84 | Epoch[455/NA] Step[74] GlobalStep[62409/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:04:11 INFO stats.py:314 | Epoch[455] Step[89] GlobalStep[62424] Training Speed: 419.01 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:19:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:04:15 INFO loss_tracker.py:84 | Epoch[455/NA] Step[99] GlobalStep[62434/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:04:21 INFO stats.py:314 | Epoch[455] Step[114] GlobalStep[62449] Training Speed: 430.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:19:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:04:25 INFO loss_tracker.py:84 | Epoch[455/NA] Step[124] GlobalStep[62459/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:04:29 INFO stats.py:394 | Epoch[455] completed. Training Speed: 308.66 samples/sec across all devices. Epoch Time: 56.81 sec. Average Epoch Time: 56.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:19:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:04:32 INFO stats.py:314 | Epoch[456] Step[2] GlobalStep[62474] Training Speed: 430.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:19:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:04:41 INFO loss_tracker.py:84 | Epoch[456/NA] Step[24] GlobalStep[62496/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:04:42 INFO stats.py:314 | Epoch[456] Step[27] GlobalStep[62499] Training Speed: 427.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:19:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:04:51 INFO loss_tracker.py:84 | Epoch[456/NA] Step[49] GlobalStep[62521/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0125] total_loss[0.0178] Rank[0/16] 06/24/2025 20:04:52 INFO stats.py:314 | Epoch[456] Step[52] GlobalStep[62524] Training Speed: 427.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:19:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:05:02 INFO loss_tracker.py:84 | Epoch[456/NA] Step[74] GlobalStep[62546/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:05:03 INFO stats.py:314 | Epoch[456] Step[77] GlobalStep[62549] Training Speed: 424.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:18:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:05:12 INFO loss_tracker.py:84 | Epoch[456/NA] Step[99] GlobalStep[62571/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:05:13 INFO stats.py:314 | Epoch[456] Step[102] GlobalStep[62574] Training Speed: 424.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:18:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:05:22 INFO loss_tracker.py:84 | Epoch[456/NA] Step[124] GlobalStep[62596/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:05:24 INFO stats.py:314 | Epoch[456] Step[127] GlobalStep[62599] Training Speed: 446.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:18:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:05:27 INFO stats.py:394 | Epoch[456] completed. Training Speed: 304.65 samples/sec across all devices. Epoch Time: 57.56 sec. Average Epoch Time: 57.56 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:18:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:05:34 INFO stats.py:314 | Epoch[457] Step[15] GlobalStep[62624] Training Speed: 436.84 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:18:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:05:38 INFO loss_tracker.py:84 | Epoch[457/NA] Step[24] GlobalStep[62633/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:05:45 INFO stats.py:314 | Epoch[457] Step[40] GlobalStep[62649] Training Speed: 424.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:18:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:05:49 INFO loss_tracker.py:84 | Epoch[457/NA] Step[49] GlobalStep[62658/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:05:55 INFO stats.py:314 | Epoch[457] Step[65] GlobalStep[62674] Training Speed: 425.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:18:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:05:59 INFO loss_tracker.py:84 | Epoch[457/NA] Step[74] GlobalStep[62683/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:06:06 INFO stats.py:314 | Epoch[457] Step[90] GlobalStep[62699] Training Speed: 427.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:17:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:06:10 INFO loss_tracker.py:84 | Epoch[457/NA] Step[99] GlobalStep[62708/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:06:16 INFO stats.py:314 | Epoch[457] Step[115] GlobalStep[62724] Training Speed: 431.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:17:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:06:19 INFO loss_tracker.py:84 | Epoch[457/NA] Step[124] GlobalStep[62733/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:06:24 INFO stats.py:394 | Epoch[457] completed. Training Speed: 306.66 samples/sec across all devices. Epoch Time: 57.18 sec. Average Epoch Time: 57.18 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:17:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:06:27 INFO stats.py:314 | Epoch[458] Step[3] GlobalStep[62749] Training Speed: 430.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:17:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:06:36 INFO loss_tracker.py:84 | Epoch[458/NA] Step[24] GlobalStep[62770/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:06:37 INFO stats.py:314 | Epoch[458] Step[28] GlobalStep[62774] Training Speed: 413.76 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:17:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:06:46 INFO loss_tracker.py:84 | Epoch[458/NA] Step[49] GlobalStep[62795/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:06:48 INFO stats.py:314 | Epoch[458] Step[53] GlobalStep[62799] Training Speed: 426.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:17:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:06:56 INFO loss_tracker.py:84 | Epoch[458/NA] Step[74] GlobalStep[62820/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:06:57 INFO stats.py:314 | Epoch[458] Step[78] GlobalStep[62824] Training Speed: 431.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:16:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:07:06 INFO loss_tracker.py:84 | Epoch[458/NA] Step[99] GlobalStep[62845/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:07:08 INFO stats.py:314 | Epoch[458] Step[103] GlobalStep[62849] Training Speed: 436.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:16:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:07:16 INFO loss_tracker.py:84 | Epoch[458/NA] Step[124] GlobalStep[62870/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 20:07:18 INFO stats.py:314 | Epoch[458] Step[128] GlobalStep[62874] Training Speed: 450.16 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:16:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:07:20 INFO stats.py:394 | Epoch[458] completed. Training Speed: 310.92 samples/sec across all devices. Epoch Time: 56.40 sec. Average Epoch Time: 56.40 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:16:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:07:29 INFO stats.py:314 | Epoch[459] Step[16] GlobalStep[62899] Training Speed: 428.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:16:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:07:32 INFO loss_tracker.py:84 | Epoch[459/NA] Step[24] GlobalStep[62907/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:07:39 INFO stats.py:314 | Epoch[459] Step[41] GlobalStep[62924] Training Speed: 422.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:16:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:07:42 INFO loss_tracker.py:84 | Epoch[459/NA] Step[49] GlobalStep[62932/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:07:49 INFO stats.py:314 | Epoch[459] Step[66] GlobalStep[62949] Training Speed: 430.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:16:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:07:53 INFO loss_tracker.py:84 | Epoch[459/NA] Step[74] GlobalStep[62957/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 20:08:00 INFO stats.py:314 | Epoch[459] Step[91] GlobalStep[62974] Training Speed: 424.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:15:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:08:03 INFO loss_tracker.py:84 | Epoch[459/NA] Step[99] GlobalStep[62982/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 20:08:10 INFO stats.py:314 | Epoch[459] Step[116] GlobalStep[62999] Training Speed: 434.05 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:15:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:08:14 INFO loss_tracker.py:84 | Epoch[459/NA] Step[124] GlobalStep[63007/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 20:08:18 INFO stats.py:394 | Epoch[459] completed. Training Speed: 305.89 samples/sec across all devices. Epoch Time: 57.33 sec. Average Epoch Time: 57.33 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:15:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:08:21 INFO stats.py:314 | Epoch[460] Step[4] GlobalStep[63024] Training Speed: 430.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:15:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:08:29 INFO loss_tracker.py:84 | Epoch[460/NA] Step[24] GlobalStep[63044/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 20:08:31 INFO stats.py:314 | Epoch[460] Step[29] GlobalStep[63049] Training Speed: 430.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:15:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:08:40 INFO loss_tracker.py:84 | Epoch[460/NA] Step[49] GlobalStep[63069/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:08:42 INFO stats.py:314 | Epoch[460] Step[54] GlobalStep[63074] Training Speed: 427.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:15:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:08:50 INFO loss_tracker.py:84 | Epoch[460/NA] Step[74] GlobalStep[63094/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:08:52 INFO stats.py:314 | Epoch[460] Step[79] GlobalStep[63099] Training Speed: 420.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:15:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:00 INFO loss_tracker.py:84 | Epoch[460/NA] Step[99] GlobalStep[63119/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:09:02 INFO stats.py:314 | Epoch[460] Step[104] GlobalStep[63124] Training Speed: 427.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:14:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:11 INFO loss_tracker.py:84 | Epoch[460/NA] Step[124] GlobalStep[63144/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:09:12 INFO stats.py:314 | Epoch[460] Step[129] GlobalStep[63149] Training Speed: 426.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:14:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:15 INFO stats.py:394 | Epoch[460] completed. Training Speed: 307.23 samples/sec across all devices. Epoch Time: 57.08 sec. Average Epoch Time: 57.08 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:14:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:09:23 INFO stats.py:314 | Epoch[461] Step[17] GlobalStep[63174] Training Speed: 436.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:14:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:26 INFO loss_tracker.py:84 | Epoch[461/NA] Step[24] GlobalStep[63181/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:09:34 INFO stats.py:314 | Epoch[461] Step[42] GlobalStep[63199] Training Speed: 421.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:14:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:37 INFO loss_tracker.py:84 | Epoch[461/NA] Step[49] GlobalStep[63206/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:09:44 INFO stats.py:314 | Epoch[461] Step[67] GlobalStep[63224] Training Speed: 425.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:14:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:47 INFO loss_tracker.py:84 | Epoch[461/NA] Step[74] GlobalStep[63231/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:09:54 INFO stats.py:314 | Epoch[461] Step[92] GlobalStep[63249] Training Speed: 427.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:14:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:09:57 INFO loss_tracker.py:84 | Epoch[461/NA] Step[99] GlobalStep[63256/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:10:05 INFO stats.py:314 | Epoch[461] Step[117] GlobalStep[63274] Training Speed: 424.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:13:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:10:07 INFO loss_tracker.py:84 | Epoch[461/NA] Step[124] GlobalStep[63281/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:10:12 INFO stats.py:394 | Epoch[461] completed. Training Speed: 307.37 samples/sec across all devices. Epoch Time: 57.05 sec. Average Epoch Time: 57.05 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:13:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:10:15 INFO stats.py:314 | Epoch[462] Step[5] GlobalStep[63299] Training Speed: 424.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:13:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:10:23 INFO loss_tracker.py:84 | Epoch[462/NA] Step[24] GlobalStep[63318/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:10:26 INFO stats.py:314 | Epoch[462] Step[30] GlobalStep[63324] Training Speed: 412.34 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:13:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:10:34 INFO loss_tracker.py:84 | Epoch[462/NA] Step[49] GlobalStep[63343/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:10:37 INFO stats.py:314 | Epoch[462] Step[55] GlobalStep[63349] Training Speed: 426.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:13:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:10:44 INFO loss_tracker.py:84 | Epoch[462/NA] Step[74] GlobalStep[63368/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:10:47 INFO stats.py:314 | Epoch[462] Step[80] GlobalStep[63374] Training Speed: 421.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:13:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:10:55 INFO loss_tracker.py:84 | Epoch[462/NA] Step[99] GlobalStep[63393/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 20:10:57 INFO stats.py:314 | Epoch[462] Step[105] GlobalStep[63399] Training Speed: 434.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:13:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:11:05 INFO loss_tracker.py:84 | Epoch[462/NA] Step[124] GlobalStep[63418/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:11:07 INFO stats.py:314 | Epoch[462] Step[130] GlobalStep[63424] Training Speed: 446.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:12:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:11:09 INFO stats.py:394 | Epoch[462] completed. Training Speed: 305.87 samples/sec across all devices. Epoch Time: 57.33 sec. Average Epoch Time: 57.33 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:12:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:11:18 INFO stats.py:314 | Epoch[463] Step[18] GlobalStep[63449] Training Speed: 434.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:12:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:11:20 INFO loss_tracker.py:84 | Epoch[463/NA] Step[24] GlobalStep[63455/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:11:28 INFO stats.py:314 | Epoch[463] Step[43] GlobalStep[63474] Training Speed: 426.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:12:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:11:31 INFO loss_tracker.py:84 | Epoch[463/NA] Step[49] GlobalStep[63480/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:11:38 INFO stats.py:314 | Epoch[463] Step[68] GlobalStep[63499] Training Speed: 430.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:12:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:11:41 INFO loss_tracker.py:84 | Epoch[463/NA] Step[74] GlobalStep[63505/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:11:49 INFO stats.py:314 | Epoch[463] Step[93] GlobalStep[63524] Training Speed: 428.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:12:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:11:51 INFO loss_tracker.py:84 | Epoch[463/NA] Step[99] GlobalStep[63530/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:11:59 INFO stats.py:314 | Epoch[463] Step[118] GlobalStep[63549] Training Speed: 431.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:11:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:12:01 INFO loss_tracker.py:84 | Epoch[463/NA] Step[124] GlobalStep[63555/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:12:06 INFO stats.py:394 | Epoch[463] completed. Training Speed: 310.26 samples/sec across all devices. Epoch Time: 56.52 sec. Average Epoch Time: 56.52 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:11:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:12:10 INFO stats.py:314 | Epoch[464] Step[6] GlobalStep[63574] Training Speed: 423.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:11:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:12:17 INFO loss_tracker.py:84 | Epoch[464/NA] Step[24] GlobalStep[63592/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:12:19 INFO stats.py:314 | Epoch[464] Step[31] GlobalStep[63599] Training Speed: 423.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:11:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:12:27 INFO loss_tracker.py:84 | Epoch[464/NA] Step[49] GlobalStep[63617/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:12:30 INFO stats.py:314 | Epoch[464] Step[56] GlobalStep[63624] Training Speed: 434.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:11:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:12:37 INFO loss_tracker.py:84 | Epoch[464/NA] Step[74] GlobalStep[63642/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:12:40 INFO stats.py:314 | Epoch[464] Step[81] GlobalStep[63649] Training Speed: 423.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:11:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:12:47 INFO loss_tracker.py:84 | Epoch[464/NA] Step[99] GlobalStep[63667/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:12:50 INFO stats.py:314 | Epoch[464] Step[106] GlobalStep[63674] Training Speed: 426.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:11:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:12:57 INFO loss_tracker.py:84 | Epoch[464/NA] Step[124] GlobalStep[63692/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:13:00 INFO stats.py:314 | Epoch[464] Step[131] GlobalStep[63699] Training Speed: 423.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:10:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:13:02 INFO stats.py:394 | Epoch[464] completed. Training Speed: 312.09 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:10:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:13:11 INFO stats.py:314 | Epoch[465] Step[19] GlobalStep[63724] Training Speed: 437.58 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:10:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:13:13 INFO loss_tracker.py:84 | Epoch[465/NA] Step[24] GlobalStep[63729/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:13:22 INFO stats.py:314 | Epoch[465] Step[44] GlobalStep[63749] Training Speed: 426.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:10:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:13:24 INFO loss_tracker.py:84 | Epoch[465/NA] Step[49] GlobalStep[63754/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 20:13:32 INFO stats.py:314 | Epoch[465] Step[69] GlobalStep[63774] Training Speed: 405.78 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:10:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:13:34 INFO loss_tracker.py:84 | Epoch[465/NA] Step[74] GlobalStep[63779/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:13:42 INFO stats.py:314 | Epoch[465] Step[94] GlobalStep[63799] Training Speed: 434.29 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:10:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:13:44 INFO loss_tracker.py:84 | Epoch[465/NA] Step[99] GlobalStep[63804/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:13:52 INFO stats.py:314 | Epoch[465] Step[119] GlobalStep[63824] Training Speed: 424.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:10:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:13:54 INFO loss_tracker.py:84 | Epoch[465/NA] Step[124] GlobalStep[63829/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:13:58 INFO stats.py:394 | Epoch[465] completed. Training Speed: 309.44 samples/sec across all devices. Epoch Time: 56.67 sec. Average Epoch Time: 56.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:09:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:14:03 INFO stats.py:314 | Epoch[466] Step[7] GlobalStep[63849] Training Speed: 422.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:09:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:14:10 INFO loss_tracker.py:84 | Epoch[466/NA] Step[24] GlobalStep[63866/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:14:13 INFO stats.py:314 | Epoch[466] Step[32] GlobalStep[63874] Training Speed: 425.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:09:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:14:20 INFO loss_tracker.py:84 | Epoch[466/NA] Step[49] GlobalStep[63891/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:14:23 INFO stats.py:314 | Epoch[466] Step[57] GlobalStep[63899] Training Speed: 410.37 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:09:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:14:30 INFO loss_tracker.py:84 | Epoch[466/NA] Step[74] GlobalStep[63916/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0057] loss_depth[0.0126] total_loss[0.0183] Rank[0/16] 06/24/2025 20:14:34 INFO stats.py:314 | Epoch[466] Step[82] GlobalStep[63924] Training Speed: 434.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:09:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:14:41 INFO loss_tracker.py:84 | Epoch[466/NA] Step[99] GlobalStep[63941/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0182] Rank[0/16] 06/24/2025 20:14:44 INFO stats.py:314 | Epoch[466] Step[107] GlobalStep[63949] Training Speed: 434.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:09:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:14:51 INFO loss_tracker.py:84 | Epoch[466/NA] Step[124] GlobalStep[63966/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 20:14:54 INFO stats.py:314 | Epoch[466] Step[132] GlobalStep[63974] Training Speed: 437.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:09:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:14:55 INFO stats.py:394 | Epoch[466] completed. Training Speed: 310.71 samples/sec across all devices. Epoch Time: 56.44 sec. Average Epoch Time: 56.44 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:08:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:15:04 INFO stats.py:314 | Epoch[467] Step[20] GlobalStep[63999] Training Speed: 425.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:08:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:15:05 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 20:15:06 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_15 Rank[2/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[6/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[4/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[7/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[11/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[3/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[1/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[15/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[5/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[10/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[14/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15
Rank[8/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[13/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[12/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[9/16] 06/24/2025 20:15:06 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[0/16] 06/24/2025 20:15:07 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_15/model.safetensors Rank[0/16] 06/24/2025 20:15:08 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_15/optimizer.bin Rank[0/16] 06/24/2025 20:15:08 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_15/scheduler.bin Rank[0/16] 06/24/2025 20:15:08 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_15/sampler.bin Rank[0/16] 06/24/2025 20:15:08 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_15/random_states_0.pkl Rank[0/16] 06/24/2025 20:15:08 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_15/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 20:15:08 INFO checkpoint.py:110 | Save checkpoint at the end of step 63999 to /job_data/checkpoints/checkpoint_15 Rank[0/16] 06/24/2025 20:15:09 INFO loss_tracker.py:84 | Epoch[467/NA] Step[24] GlobalStep[64003/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:15:18 INFO stats.py:314 | Epoch[467] Step[45] GlobalStep[64024] Training Speed: 427.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:08:42. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:15:19 INFO loss_tracker.py:84 | Epoch[467/NA] Step[49] GlobalStep[64028/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:15:28 INFO stats.py:314 | Epoch[467] Step[70] GlobalStep[64049] Training Speed: 429.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:08:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:15:30 INFO loss_tracker.py:84 | Epoch[467/NA] Step[74] GlobalStep[64053/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 20:15:38 INFO stats.py:314 | Epoch[467] Step[95] GlobalStep[64074] Training Speed: 437.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:08:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:15:40 INFO loss_tracker.py:84 | Epoch[467/NA] Step[99] GlobalStep[64078/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:15:48 INFO stats.py:314 | Epoch[467] Step[120] GlobalStep[64099] Training Speed: 448.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:08:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:15:50 INFO loss_tracker.py:84 | Epoch[467/NA] Step[124] GlobalStep[64103/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:15:54 INFO stats.py:394 | Epoch[467] completed. Training Speed: 295.59 samples/sec across all devices. Epoch Time: 59.33 sec. Average Epoch Time: 59.33 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 4:08:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:15:59 INFO stats.py:314 | Epoch[468] Step[8] GlobalStep[64124] Training Speed: 431.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:08:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:16:06 INFO loss_tracker.py:84 | Epoch[468/NA] Step[24] GlobalStep[64140/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168]
Rank[0/16] 06/24/2025 20:16:10 INFO stats.py:314 | Epoch[468] Step[33] GlobalStep[64149] Training Speed: 423.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:07:50. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:16:17 INFO loss_tracker.py:84 | Epoch[468/NA] Step[49] GlobalStep[64165/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:16:20 INFO stats.py:314 | Epoch[468] Step[58] GlobalStep[64174] Training Speed: 427.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:07:40. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:16:27 INFO loss_tracker.py:84 | Epoch[468/NA] Step[74] GlobalStep[64190/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:16:30 INFO stats.py:314 | Epoch[468] Step[83] GlobalStep[64199] Training Speed: 435.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:07:29. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:16:37 INFO loss_tracker.py:84 | Epoch[468/NA] Step[99] GlobalStep[64215/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:16:40 INFO stats.py:314 | Epoch[468] Step[108] GlobalStep[64224] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:07:19. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:16:47 INFO loss_tracker.py:84 | Epoch[468/NA] Step[124] GlobalStep[64240/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 20:16:50 INFO stats.py:314 | Epoch[468] Step[133] GlobalStep[64249] Training Speed: 452.73 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:07:08. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:16:51 INFO stats.py:394 | Epoch[468] completed. Training Speed: 306.72 samples/sec across all devices. Epoch Time: 57.17 sec. Average Epoch Time: 57.17 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:07:07. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:01 INFO stats.py:314 | Epoch[469] Step[21] GlobalStep[64274] Training Speed: 427.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:06:58. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:02 INFO loss_tracker.py:84 | Epoch[469/NA] Step[24] GlobalStep[64277/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0164]
Rank[0/16] 06/24/2025 20:17:12 INFO stats.py:314 | Epoch[469] Step[46] GlobalStep[64299] Training Speed: 428.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:06:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:13 INFO loss_tracker.py:84 | Epoch[469/NA] Step[49] GlobalStep[64302/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:17:22 INFO stats.py:314 | Epoch[469] Step[71] GlobalStep[64324] Training Speed: 425.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:06:37. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:24 INFO loss_tracker.py:84 | Epoch[469/NA] Step[74] GlobalStep[64327/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164]
Rank[0/16] 06/24/2025 20:17:33 INFO stats.py:314 | Epoch[469] Step[96] GlobalStep[64349] Training Speed: 426.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:06:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:34 INFO loss_tracker.py:84 | Epoch[469/NA] Step[99] GlobalStep[64352/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 20:17:43 INFO stats.py:314 | Epoch[469] Step[121] GlobalStep[64374] Training Speed: 456.19 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:06:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:44 INFO loss_tracker.py:84 | Epoch[469/NA] Step[124] GlobalStep[64377/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:17:48 INFO stats.py:394 | Epoch[469] completed. Training Speed: 310.13 samples/sec across all devices. Epoch Time: 56.54 sec. Average Epoch Time: 56.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:06:10. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:53 INFO stats.py:314 | Epoch[470] Step[9] GlobalStep[64399] Training Speed: 428.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:06:06. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:17:59 INFO loss_tracker.py:84 | Epoch[470/NA] Step[24] GlobalStep[64414/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160]
Rank[0/16] 06/24/2025 20:18:03 INFO stats.py:314 | Epoch[470] Step[34] GlobalStep[64424] Training Speed: 435.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:05:56. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:10 INFO loss_tracker.py:84 | Epoch[470/NA] Step[49] GlobalStep[64439/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:18:14 INFO stats.py:314 | Epoch[470] Step[59] GlobalStep[64449] Training Speed: 429.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:05:45. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:20 INFO loss_tracker.py:84 | Epoch[470/NA] Step[74] GlobalStep[64464/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:18:24 INFO stats.py:314 | Epoch[470] Step[84] GlobalStep[64474] Training Speed: 428.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:05:35. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:31 INFO loss_tracker.py:84 | Epoch[470/NA] Step[99] GlobalStep[64489/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:18:35 INFO stats.py:314 | Epoch[470] Step[109] GlobalStep[64499] Training Speed: 423.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:05:25. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:41 INFO loss_tracker.py:84 | Epoch[470/NA] Step[124] GlobalStep[64514/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168]
Rank[0/16] 06/24/2025 20:18:45 INFO stats.py:314 | Epoch[470] Step[134] GlobalStep[64524] Training Speed: 432.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:05:14. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:45 INFO stats.py:394 | Epoch[470] completed. Training Speed: 305.51 samples/sec across all devices. Epoch Time: 57.40 sec. Average Epoch Time: 57.40 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:05:13. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:56 INFO stats.py:314 | Epoch[471] Step[22] GlobalStep[64549] Training Speed: 432.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:05:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:18:57 INFO loss_tracker.py:84 | Epoch[471/NA] Step[24] GlobalStep[64551/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168]
Rank[0/16] 06/24/2025 20:19:06 INFO stats.py:314 | Epoch[471] Step[47] GlobalStep[64574] Training Speed: 432.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:04:54. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:19:07 INFO loss_tracker.py:84 | Epoch[471/NA] Step[49] GlobalStep[64576/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162]
Rank[0/16] 06/24/2025 20:19:17 INFO stats.py:314 | Epoch[471] Step[72] GlobalStep[64599] Training Speed: 431.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:04:44. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:19:18 INFO loss_tracker.py:84 | Epoch[471/NA] Step[74] GlobalStep[64601/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:19:27 INFO stats.py:314 | Epoch[471] Step[97] GlobalStep[64624] Training Speed: 427.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:04:33. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:19:28 INFO loss_tracker.py:84 | Epoch[471/NA] Step[99] GlobalStep[64626/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174]
Rank[0/16] 06/24/2025 20:19:38 INFO stats.py:314 | Epoch[471] Step[122] GlobalStep[64649] Training Speed: 454.19 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:04:23. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:19:38 INFO loss_tracker.py:84 | Epoch[471/NA] Step[124] GlobalStep[64651/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:19:43 INFO stats.py:394 | Epoch[471] completed. Training Speed: 306.69 samples/sec across all devices. Epoch Time: 57.18 sec. Average Epoch Time: 57.18 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:04:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:19:48 INFO stats.py:314 | Epoch[472] Step[10] GlobalStep[64674] Training Speed: 434.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:04:13. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:19:54 INFO loss_tracker.py:84 | Epoch[472/NA] Step[24] GlobalStep[64688/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163]
Rank[0/16] 06/24/2025 20:19:58 INFO stats.py:314 | Epoch[472] Step[35] GlobalStep[64699] Training Speed: 417.74 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:04:02. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:04 INFO loss_tracker.py:84 | Epoch[472/NA] Step[49] GlobalStep[64713/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:20:09 INFO stats.py:314 | Epoch[472] Step[60] GlobalStep[64724] Training Speed: 432.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:03:52. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:14 INFO loss_tracker.py:84 | Epoch[472/NA] Step[74] GlobalStep[64738/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:20:19 INFO stats.py:314 | Epoch[472] Step[85] GlobalStep[64749] Training Speed: 426.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:03:42. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:25 INFO loss_tracker.py:84 | Epoch[472/NA] Step[99] GlobalStep[64763/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:20:30 INFO stats.py:314 | Epoch[472] Step[110] GlobalStep[64774] Training Speed: 428.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:03:31. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:35 INFO loss_tracker.py:84 | Epoch[472/NA] Step[124] GlobalStep[64788/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:20:39 INFO stats.py:314 | Epoch[472] Step[135] GlobalStep[64799] Training Speed: 455.34 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 4:03:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:40 INFO stats.py:394 | Epoch[472] completed. Training Speed: 306.18 samples/sec across all devices. Epoch Time: 57.27 sec. Average Epoch Time: 57.27 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:03:20. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:51 INFO stats.py:314 | Epoch[473] Step[23] GlobalStep[64824] Training Speed: 443.33 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:03:11. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:20:51 INFO loss_tracker.py:84 | Epoch[473/NA] Step[24] GlobalStep[64825/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:21:01 INFO stats.py:314 | Epoch[473] Step[48] GlobalStep[64849] Training Speed: 439.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:03:00. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:02 INFO loss_tracker.py:84 | Epoch[473/NA] Step[49] GlobalStep[64850/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 20:21:12 INFO stats.py:314 | Epoch[473] Step[73] GlobalStep[64874] Training Speed: 416.32 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:02:50. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:12 INFO loss_tracker.py:84 | Epoch[473/NA] Step[74] GlobalStep[64875/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:21:22 INFO stats.py:314 | Epoch[473] Step[98] GlobalStep[64899] Training Speed: 425.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:02:40. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:23 INFO loss_tracker.py:84 | Epoch[473/NA] Step[99] GlobalStep[64900/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0177]
Rank[0/16] 06/24/2025 20:21:32 INFO stats.py:314 | Epoch[473] Step[123] GlobalStep[64924] Training Speed: 443.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:02:29. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:33 INFO loss_tracker.py:84 | Epoch[473/NA] Step[124] GlobalStep[64925/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 20:21:37 INFO stats.py:394 | Epoch[473] completed. Training Speed: 306.96 samples/sec across all devices. Epoch Time: 57.13 sec. Average Epoch Time: 57.13 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 4:02:23. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:43 INFO stats.py:314 | Epoch[474] Step[11] GlobalStep[64949] Training Speed: 432.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:02:19. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:48 INFO loss_tracker.py:84 | Epoch[474/NA] Step[24] GlobalStep[64962/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164]
Rank[0/16] 06/24/2025 20:21:53 INFO stats.py:314 | Epoch[474] Step[36] GlobalStep[64974] Training Speed: 432.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:02:09. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:21:59 INFO loss_tracker.py:84 | Epoch[474/NA] Step[49] GlobalStep[64987/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168]
Rank[0/16] 06/24/2025 20:22:04 INFO stats.py:314 | Epoch[474] Step[61] GlobalStep[64999] Training Speed: 440.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:01:58. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:09 INFO loss_tracker.py:84 | Epoch[474/NA] Step[74] GlobalStep[65012/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173]
Rank[0/16] 06/24/2025 20:22:14 INFO stats.py:314 | Epoch[474] Step[86] GlobalStep[65024] Training Speed: 410.60 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 4:01:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:19 INFO loss_tracker.py:84 | Epoch[474/NA] Step[99] GlobalStep[65037/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:22:24 INFO stats.py:314 | Epoch[474] Step[111] GlobalStep[65049] Training Speed: 428.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:01:37. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:29 INFO loss_tracker.py:84 | Epoch[474/NA] Step[124] GlobalStep[65062/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:22:34 INFO stats.py:314 | Epoch[474] Step[136] GlobalStep[65074] Training Speed: 434.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:01:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:34 INFO stats.py:394 | Epoch[474] completed. Training Speed: 308.66 samples/sec across all devices. Epoch Time: 56.81 sec. Average Epoch Time: 56.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:01:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:45 INFO stats.py:314 | Epoch[475] Step[24] GlobalStep[65099] Training Speed: 436.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:01:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:45 INFO loss_tracker.py:84 | Epoch[475/NA] Step[24] GlobalStep[65099/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174]
Rank[0/16] 06/24/2025 20:22:55 INFO stats.py:314 | Epoch[475] Step[49] GlobalStep[65124] Training Speed: 433.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:01:06. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:22:55 INFO loss_tracker.py:84 | Epoch[475/NA] Step[49] GlobalStep[65124/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:23:05 INFO stats.py:314 | Epoch[475] Step[74] GlobalStep[65149] Training Speed: 434.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:00:56. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:23:05 INFO loss_tracker.py:84 | Epoch[475/NA] Step[74] GlobalStep[65149/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:23:15 INFO stats.py:314 | Epoch[475] Step[99] GlobalStep[65174] Training Speed: 431.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:00:45. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:23:15 INFO loss_tracker.py:84 | Epoch[475/NA] Step[99] GlobalStep[65174/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:23:25 INFO stats.py:314 | Epoch[475] Step[124] GlobalStep[65199] Training Speed: 439.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 4:00:35. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:23:25 INFO loss_tracker.py:84 | Epoch[475/NA] Step[124] GlobalStep[65199/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:23:30 INFO stats.py:394 | Epoch[475] completed. Training Speed: 313.97 samples/sec across all devices. Epoch Time: 55.85 sec. Average Epoch Time: 55.85 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 4:00:29. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:23:36 INFO stats.py:314 | Epoch[476] Step[12] GlobalStep[65224] Training Speed: 404.49 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 4:00:24. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:23:41 INFO loss_tracker.py:84 | Epoch[476/NA] Step[24] GlobalStep[65236/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:23:46 INFO stats.py:314 | Epoch[476] Step[37] GlobalStep[65249] Training Speed: 430.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:00:14. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:23:51 INFO loss_tracker.py:84 | Epoch[476/NA] Step[49] GlobalStep[65261/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:23:57 INFO stats.py:314 | Epoch[476] Step[62] GlobalStep[65274] Training Speed: 428.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 4:00:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:02 INFO loss_tracker.py:84 | Epoch[476/NA] Step[74] GlobalStep[65286/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 20:24:07 INFO stats.py:314 | Epoch[476] Step[87] GlobalStep[65299] Training Speed: 434.44 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:59:53. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:12 INFO loss_tracker.py:84 | Epoch[476/NA] Step[99] GlobalStep[65311/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 20:24:17 INFO stats.py:314 | Epoch[476] Step[112] GlobalStep[65324] Training Speed: 439.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:59:43. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:21 INFO loss_tracker.py:84 | Epoch[476/NA] Step[124] GlobalStep[65336/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:24:26 INFO stats.py:394 | Epoch[476] completed. Training Speed: 310.87 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:59:32. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:27 INFO stats.py:314 | Epoch[477] Step[0] GlobalStep[65349] Training Speed: 352.69 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 3:59:32. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:37 INFO loss_tracker.py:84 | Epoch[477/NA] Step[24] GlobalStep[65373/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 20:24:38 INFO stats.py:314 | Epoch[477] Step[25] GlobalStep[65374] Training Speed: 420.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:59:22. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:48 INFO loss_tracker.py:84 | Epoch[477/NA] Step[49] GlobalStep[65398/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173]
Rank[0/16] 06/24/2025 20:24:48 INFO stats.py:314 | Epoch[477] Step[50] GlobalStep[65399] Training Speed: 421.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:59:12. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:24:58 INFO loss_tracker.py:84 | Epoch[477/NA] Step[74] GlobalStep[65423/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164]
Rank[0/16] 06/24/2025 20:24:59 INFO stats.py:314 | Epoch[477] Step[75] GlobalStep[65424] Training Speed: 415.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:59:01. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:08 INFO loss_tracker.py:84 | Epoch[477/NA] Step[99] GlobalStep[65448/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0180]
Rank[0/16] 06/24/2025 20:25:09 INFO stats.py:314 | Epoch[477] Step[100] GlobalStep[65449] Training Speed: 432.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:58:51. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:18 INFO loss_tracker.py:84 | Epoch[477/NA] Step[124] GlobalStep[65473/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163]
Rank[0/16] 06/24/2025 20:25:19 INFO stats.py:314 | Epoch[477] Step[125] GlobalStep[65474] Training Speed: 424.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:58:40. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:23 INFO stats.py:394 | Epoch[477] completed. Training Speed: 310.30 samples/sec across all devices. Epoch Time: 56.51 sec. Average Epoch Time: 56.51 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:58:35. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:29 INFO stats.py:314 | Epoch[478] Step[13] GlobalStep[65499] Training Speed: 431.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:58:30. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:34 INFO loss_tracker.py:84 | Epoch[478/NA] Step[24] GlobalStep[65510/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0165]
Rank[0/16] 06/24/2025 20:25:39 INFO stats.py:314 | Epoch[478] Step[38] GlobalStep[65524] Training Speed: 425.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:58:19. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:44 INFO loss_tracker.py:84 | Epoch[478/NA] Step[49] GlobalStep[65535/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177]
Rank[0/16] 06/24/2025 20:25:50 INFO stats.py:314 | Epoch[478] Step[63] GlobalStep[65549] Training Speed: 431.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:58:09. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:25:55 INFO loss_tracker.py:84 | Epoch[478/NA] Step[74] GlobalStep[65560/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:26:00 INFO stats.py:314 | Epoch[478] Step[88] GlobalStep[65574] Training Speed: 435.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:57:59. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:26:05 INFO loss_tracker.py:84 | Epoch[478/NA] Step[99] GlobalStep[65585/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168]
Rank[0/16] 06/24/2025 20:26:10 INFO stats.py:314 | Epoch[478] Step[113] GlobalStep[65599] Training Speed: 423.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:57:48. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:26:15 INFO loss_tracker.py:84 | Epoch[478/NA] Step[124] GlobalStep[65610/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:26:19 INFO stats.py:394 | Epoch[478] completed. Training Speed: 311.08 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:57:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:26:21 INFO stats.py:314 | Epoch[479] Step[1] GlobalStep[65624] Training Speed: 430.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:57:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:26:30 INFO loss_tracker.py:84 | Epoch[479/NA] Step[24] GlobalStep[65647/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 20:26:31 INFO stats.py:314 | Epoch[479] Step[26] GlobalStep[65649] Training Speed: 433.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:57:27. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:26:40 INFO loss_tracker.py:84 | Epoch[479/NA] Step[49] GlobalStep[65672/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:26:41 INFO stats.py:314 | Epoch[479] Step[51] GlobalStep[65674] Training Speed: 433.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:57:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:26:51 INFO loss_tracker.py:84 | Epoch[479/NA] Step[74] GlobalStep[65697/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169]
Rank[0/16] 06/24/2025 20:26:51 INFO stats.py:314 | Epoch[479] Step[76] GlobalStep[65699] Training Speed: 424.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:57:07. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:02 INFO loss_tracker.py:84 | Epoch[479/NA] Step[99] GlobalStep[65722/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 20:27:02 INFO stats.py:314 | Epoch[479] Step[101] GlobalStep[65724] Training Speed: 426.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:56:56. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:11 INFO loss_tracker.py:84 | Epoch[479/NA] Step[124] GlobalStep[65747/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:27:12 INFO stats.py:314 | Epoch[479] Step[126] GlobalStep[65749] Training Speed: 435.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:56:46. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:16 INFO stats.py:394 | Epoch[479] completed. Training Speed: 307.63 samples/sec across all devices. Epoch Time: 57.00 sec. Average Epoch Time: 57.00 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:56:41. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:23 INFO stats.py:314 | Epoch[480] Step[14] GlobalStep[65774] Training Speed: 431.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:56:36. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:27 INFO loss_tracker.py:84 | Epoch[480/NA] Step[24] GlobalStep[65784/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:27:34 INFO stats.py:314 | Epoch[480] Step[39] GlobalStep[65799] Training Speed: 438.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:56:26. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:38 INFO loss_tracker.py:84 | Epoch[480/NA] Step[49] GlobalStep[65809/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 20:27:43 INFO stats.py:314 | Epoch[480] Step[64] GlobalStep[65824] Training Speed: 438.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:56:15. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:47 INFO loss_tracker.py:84 | Epoch[480/NA] Step[74] GlobalStep[65834/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 20:27:54 INFO stats.py:314 | Epoch[480] Step[89] GlobalStep[65849] Training Speed: 408.27 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:56:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:27:58 INFO loss_tracker.py:84 | Epoch[480/NA] Step[99] GlobalStep[65859/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:28:04 INFO stats.py:314 | Epoch[480] Step[114] GlobalStep[65874] Training Speed: 420.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:55:54. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:28:08 INFO loss_tracker.py:84 | Epoch[480/NA] Step[124] GlobalStep[65884/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173]
Rank[0/16] 06/24/2025 20:28:12 INFO stats.py:394 | Epoch[480] completed. Training Speed: 310.45 samples/sec across all devices. Epoch Time: 56.49 sec. Average Epoch Time: 56.49 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:55:44. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:28:15 INFO stats.py:314 | Epoch[481] Step[2] GlobalStep[65899] Training Speed: 419.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:55:44. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:28:24 INFO loss_tracker.py:84 | Epoch[481/NA] Step[24] GlobalStep[65921/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170]
Rank[0/16] 06/24/2025 20:28:25 INFO stats.py:314 | Epoch[481] Step[27] GlobalStep[65924] Training Speed: 434.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:55:33. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:28:34 INFO loss_tracker.py:84 | Epoch[481/NA] Step[49] GlobalStep[65946/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 20:28:35 INFO stats.py:314 | Epoch[481] Step[52] GlobalStep[65949] Training Speed: 432.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:55:23. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:28:44 INFO loss_tracker.py:84 | Epoch[481/NA] Step[74] GlobalStep[65971/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174]
Rank[0/16] 06/24/2025 20:28:46 INFO stats.py:314 | Epoch[481] Step[77] GlobalStep[65974] Training Speed: 433.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:55:13. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:28:55 INFO loss_tracker.py:84 | Epoch[481/NA] Step[99] GlobalStep[65996/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 20:28:56 INFO stats.py:314 | Epoch[481] Step[102] GlobalStep[65999] Training Speed: 428.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:55:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:29:05 INFO loss_tracker.py:84 | Epoch[481/NA] Step[124] GlobalStep[66021/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:29:06 INFO stats.py:314 | Epoch[481] Step[127] GlobalStep[66024] Training Speed: 440.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:54:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:29:09 INFO stats.py:394 | Epoch[481] completed. Training Speed: 308.60 samples/sec across all devices. Epoch Time: 56.82 sec. Average Epoch Time: 56.82 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:54:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:29:17 INFO stats.py:314 | Epoch[482] Step[15] GlobalStep[66049] Training Speed: 423.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:54:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:29:21 INFO loss_tracker.py:84 | Epoch[482/NA] Step[24] GlobalStep[66058/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:29:27 INFO stats.py:314 | Epoch[482] Step[40] GlobalStep[66074] Training Speed: 429.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:54:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:29:31 INFO loss_tracker.py:84 | Epoch[482/NA] Step[49] GlobalStep[66083/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0126] total_loss[0.0158] Rank[0/16] 06/24/2025 20:29:37 INFO stats.py:314 | Epoch[482] Step[65] GlobalStep[66099] Training Speed: 420.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:54:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:29:42 INFO loss_tracker.py:84 | Epoch[482/NA] Step[74] GlobalStep[66108/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:29:48 INFO stats.py:314 | Epoch[482] Step[90] GlobalStep[66124] Training Speed: 426.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:54:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:29:52 INFO loss_tracker.py:84 | Epoch[482/NA] Step[99] GlobalStep[66133/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:29:58 INFO stats.py:314 | Epoch[482] Step[115] GlobalStep[66149] Training Speed: 407.52 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:54:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:30:02 INFO loss_tracker.py:84 | Epoch[482/NA] Step[124] GlobalStep[66158/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:30:06 INFO stats.py:394 | Epoch[482] completed. Training Speed: 306.35 samples/sec across all devices. Epoch Time: 57.24 sec. Average Epoch Time: 57.24 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:53:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:30:09 INFO stats.py:314 | Epoch[483] Step[3] GlobalStep[66174] Training Speed: 435.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:53:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:30:18 INFO loss_tracker.py:84 | Epoch[483/NA] Step[24] GlobalStep[66195/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:30:19 INFO stats.py:314 | Epoch[483] Step[28] GlobalStep[66199] Training Speed: 431.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:53:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:30:28 INFO loss_tracker.py:84 | Epoch[483/NA] Step[49] GlobalStep[66220/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:30:30 INFO stats.py:314 | Epoch[483] Step[53] GlobalStep[66224] Training Speed: 435.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:53:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:30:38 INFO loss_tracker.py:84 | Epoch[483/NA] Step[74] GlobalStep[66245/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:30:40 INFO stats.py:314 | Epoch[483] Step[78] GlobalStep[66249] Training Speed: 431.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:53:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:30:49 INFO loss_tracker.py:84 | Epoch[483/NA] Step[99] GlobalStep[66270/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:30:51 INFO stats.py:314 | Epoch[483] Step[103] GlobalStep[66274] Training Speed: 429.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:53:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:30:59 INFO loss_tracker.py:84 | Epoch[483/NA] Step[124] GlobalStep[66295/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 20:31:00 INFO stats.py:314 | Epoch[483] Step[128] GlobalStep[66299] Training Speed: 437.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:52:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:31:03 INFO stats.py:394 | Epoch[483] completed. Training Speed: 307.59 samples/sec across all devices. Epoch Time: 57.01 sec. Average Epoch Time: 57.01 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:52:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:31:11 INFO stats.py:314 | Epoch[484] Step[16] GlobalStep[66324] Training Speed: 419.20 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:52:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:31:15 INFO loss_tracker.py:84 | Epoch[484/NA] Step[24] GlobalStep[66332/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:31:22 INFO stats.py:314 | Epoch[484] Step[41] GlobalStep[66349] Training Speed: 430.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:52:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:31:25 INFO loss_tracker.py:84 | Epoch[484/NA] Step[49] GlobalStep[66357/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:31:32 INFO stats.py:314 | Epoch[484] Step[66] GlobalStep[66374] Training Speed: 433.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:52:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:31:35 INFO loss_tracker.py:84 | Epoch[484/NA] Step[74] GlobalStep[66382/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:31:42 INFO stats.py:314 | Epoch[484] Step[91] GlobalStep[66399] Training Speed: 428.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:52:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:31:45 INFO loss_tracker.py:84 | Epoch[484/NA] Step[99] GlobalStep[66407/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:31:52 INFO stats.py:314 | Epoch[484] Step[116] GlobalStep[66424] Training Speed: 424.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:52:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:31:55 INFO loss_tracker.py:84 | Epoch[484/NA] Step[124] GlobalStep[66432/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:31:59 INFO stats.py:394 | Epoch[484] completed. Training Speed: 313.97 samples/sec across all devices. Epoch Time: 55.85 sec. Average Epoch Time: 55.85 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:51:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:32:03 INFO stats.py:314 | Epoch[485] Step[4] GlobalStep[66449] Training Speed: 434.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:51:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:32:11 INFO loss_tracker.py:84 | Epoch[485/NA] Step[24] GlobalStep[66469/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:32:13 INFO stats.py:314 | Epoch[485] Step[29] GlobalStep[66474] Training Speed: 424.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:51:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:32:21 INFO loss_tracker.py:84 | Epoch[485/NA] Step[49] GlobalStep[66494/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 20:32:23 INFO stats.py:314 | Epoch[485] Step[54] GlobalStep[66499] Training Speed: 434.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:51:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:32:31 INFO loss_tracker.py:84 | Epoch[485/NA] Step[74] GlobalStep[66519/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:32:33 INFO stats.py:314 | Epoch[485] Step[79] GlobalStep[66524] Training Speed: 433.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:51:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:32:41 INFO loss_tracker.py:84 | Epoch[485/NA] Step[99] GlobalStep[66544/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:32:43 INFO stats.py:314 | Epoch[485] Step[104] GlobalStep[66549] Training Speed: 435.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:51:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:32:51 INFO loss_tracker.py:84 | Epoch[485/NA] Step[124] GlobalStep[66569/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 20:32:52 INFO stats.py:314 | Epoch[485] Step[129] GlobalStep[66574] Training Speed: 448.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:51:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:32:55 INFO stats.py:394 | Epoch[485] completed. Training Speed: 314.41 samples/sec across all devices. Epoch Time: 55.77 sec. Average Epoch Time: 55.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:51:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:33:04 INFO stats.py:314 | Epoch[486] Step[17] GlobalStep[66599] Training Speed: 426.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:50:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:33:07 INFO loss_tracker.py:84 | Epoch[486/NA] Step[24] GlobalStep[66606/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:33:14 INFO stats.py:314 | Epoch[486] Step[42] GlobalStep[66624] Training Speed: 421.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:50:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:33:17 INFO loss_tracker.py:84 | Epoch[486/NA] Step[49] GlobalStep[66631/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:33:25 INFO stats.py:314 | Epoch[486] Step[67] GlobalStep[66649] Training Speed: 428.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:50:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:33:28 INFO loss_tracker.py:84 | Epoch[486/NA] Step[74] GlobalStep[66656/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:33:35 INFO stats.py:314 | Epoch[486] Step[92] GlobalStep[66674] Training Speed: 436.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:50:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:33:38 INFO loss_tracker.py:84 | Epoch[486/NA] Step[99] GlobalStep[66681/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:33:45 INFO stats.py:314 | Epoch[486] Step[117] GlobalStep[66699] Training Speed: 421.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:50:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:33:48 INFO loss_tracker.py:84 | Epoch[486/NA] Step[124] GlobalStep[66706/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 20:33:52 INFO stats.py:394 | Epoch[486] completed. Training Speed: 308.02 samples/sec across all devices. Epoch Time: 56.93 sec. Average Epoch Time: 56.93 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:50:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:33:55 INFO stats.py:314 | Epoch[487] Step[5] GlobalStep[66724] Training Speed: 436.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:50:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:34:03 INFO loss_tracker.py:84 | Epoch[487/NA] Step[24] GlobalStep[66743/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:34:06 INFO stats.py:314 | Epoch[487] Step[30] GlobalStep[66749] Training Speed: 424.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:49:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:34:14 INFO loss_tracker.py:84 | Epoch[487/NA] Step[49] GlobalStep[66768/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:34:16 INFO stats.py:314 | Epoch[487] Step[55] GlobalStep[66774] Training Speed: 434.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:49:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:34:24 INFO loss_tracker.py:84 | Epoch[487/NA] Step[74] GlobalStep[66793/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:34:27 INFO stats.py:314 | Epoch[487] Step[80] GlobalStep[66799] Training Speed: 425.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:49:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:34:35 INFO loss_tracker.py:84 | Epoch[487/NA] Step[99] GlobalStep[66818/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:34:37 INFO stats.py:314 | Epoch[487] Step[105] GlobalStep[66824] Training Speed: 419.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:49:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:34:44 INFO loss_tracker.py:84 | Epoch[487/NA] Step[124] GlobalStep[66843/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:34:47 INFO stats.py:314 | Epoch[487] Step[130] GlobalStep[66849] Training Speed: 452.15 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:49:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:34:49 INFO stats.py:394 | Epoch[487] completed. Training Speed: 310.25 samples/sec across all devices. Epoch Time: 56.52 sec. Average Epoch Time: 56.52 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:49:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:34:57 INFO stats.py:314 | Epoch[488] Step[18] GlobalStep[66874] Training Speed: 435.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:48:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:35:00 INFO loss_tracker.py:84 | Epoch[488/NA] Step[24] GlobalStep[66880/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:35:07 INFO stats.py:314 | Epoch[488] Step[43] GlobalStep[66899] Training Speed: 422.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:48:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:35:10 INFO loss_tracker.py:84 | Epoch[488/NA] Step[49] GlobalStep[66905/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:35:18 INFO stats.py:314 | Epoch[488] Step[68] GlobalStep[66924] Training Speed: 429.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:48:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:35:21 INFO loss_tracker.py:84 | Epoch[488/NA] Step[74] GlobalStep[66930/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:35:28 INFO stats.py:314 | Epoch[488] Step[93] GlobalStep[66949] Training Speed: 411.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:48:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:35:31 INFO loss_tracker.py:84 | Epoch[488/NA] Step[99] GlobalStep[66955/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:35:39 INFO stats.py:314 | Epoch[488] Step[118] GlobalStep[66974] Training Speed: 413.38 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:48:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:35:41 INFO loss_tracker.py:84 | Epoch[488/NA] Step[124] GlobalStep[66980/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 20:35:45 INFO stats.py:394 | Epoch[488] completed. Training Speed: 309.24 samples/sec across all devices. Epoch Time: 56.71 sec. Average Epoch Time: 56.71 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:48:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:35:49 INFO stats.py:314 | Epoch[489] Step[6] GlobalStep[66999] Training Speed: 425.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:48:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:35:57 INFO loss_tracker.py:84 | Epoch[489/NA] Step[24] GlobalStep[67017/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:36:00 INFO stats.py:314 | Epoch[489] Step[31] GlobalStep[67024] Training Speed: 434.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:47:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:36:07 INFO loss_tracker.py:84 | Epoch[489/NA] Step[49] GlobalStep[67042/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:36:10 INFO stats.py:314 | Epoch[489] Step[56] GlobalStep[67049] Training Speed: 429.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:47:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:36:17 INFO loss_tracker.py:84 | Epoch[489/NA] Step[74] GlobalStep[67067/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:36:20 INFO stats.py:314 | Epoch[489] Step[81] GlobalStep[67074] Training Speed: 383.06 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 3:47:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:36:27 INFO loss_tracker.py:84 | Epoch[489/NA] Step[99] GlobalStep[67092/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:36:30 INFO stats.py:314 | Epoch[489] Step[106] GlobalStep[67099] Training Speed: 424.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:47:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:36:38 INFO loss_tracker.py:84 | Epoch[489/NA] Step[124] GlobalStep[67117/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:36:41 INFO stats.py:314 | Epoch[489] Step[131] GlobalStep[67124] Training Speed: 438.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:47:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:36:42 INFO stats.py:394 | Epoch[489] completed. Training Speed: 307.15 samples/sec across all devices. Epoch Time: 57.09 sec. Average Epoch Time: 57.09 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:47:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:36:51 INFO stats.py:314 | Epoch[490] Step[19] GlobalStep[67149] Training Speed: 430.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:47:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:36:53 INFO loss_tracker.py:84 | Epoch[490/NA] Step[24] GlobalStep[67154/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 20:37:02 INFO stats.py:314 | Epoch[490] Step[44] GlobalStep[67174] Training Speed: 436.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:46:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:37:04 INFO loss_tracker.py:84 | Epoch[490/NA] Step[49] GlobalStep[67179/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:37:12 INFO stats.py:314 | Epoch[490] Step[69] GlobalStep[67199] Training Speed: 424.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:46:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:37:14 INFO loss_tracker.py:84 | Epoch[490/NA] Step[74] GlobalStep[67204/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:37:22 INFO stats.py:314 | Epoch[490] Step[94] GlobalStep[67224] Training Speed: 401.06 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:46:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:37:24 INFO loss_tracker.py:84 | Epoch[490/NA] Step[99] GlobalStep[67229/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0126] total_loss[0.0158] Rank[0/16] 06/24/2025 20:37:32 INFO stats.py:314 | Epoch[490] Step[119] GlobalStep[67249] Training Speed: 432.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:46:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:37:34 INFO loss_tracker.py:84 | Epoch[490/NA] Step[124] GlobalStep[67254/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:37:39 INFO stats.py:394 | Epoch[490] completed. Training Speed: 308.16 samples/sec across all devices. Epoch Time: 56.91 sec. Average Epoch Time: 56.91 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:46:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:37:44 INFO stats.py:314 | Epoch[491] Step[7] GlobalStep[67274] Training Speed: 430.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:46:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:37:51 INFO loss_tracker.py:84 | Epoch[491/NA] Step[24] GlobalStep[67291/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:37:54 INFO stats.py:314 | Epoch[491] Step[32] GlobalStep[67299] Training Speed: 418.02 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:46:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:38:01 INFO loss_tracker.py:84 | Epoch[491/NA] Step[49] GlobalStep[67316/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:38:05 INFO stats.py:314 | Epoch[491] Step[57] GlobalStep[67324] Training Speed: 401.89 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:45:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:38:11 INFO loss_tracker.py:84 | Epoch[491/NA] Step[74] GlobalStep[67341/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:38:15 INFO stats.py:314 | Epoch[491] Step[82] GlobalStep[67349] Training Speed: 434.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:45:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:38:22 INFO loss_tracker.py:84 | Epoch[491/NA] Step[99] GlobalStep[67366/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:38:25 INFO stats.py:314 | Epoch[491] Step[107] GlobalStep[67374] Training Speed: 428.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:45:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:38:31 INFO loss_tracker.py:84 | Epoch[491/NA] Step[124] GlobalStep[67391/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:38:34 INFO stats.py:314 | Epoch[491] Step[132] GlobalStep[67399] Training Speed: 430.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:45:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:38:36 INFO stats.py:394 | Epoch[491] completed. Training Speed: 309.63 samples/sec across all devices. Epoch Time: 56.64 sec. Average Epoch Time: 56.64 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:45:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:38:46 INFO stats.py:314 | Epoch[492] Step[20] GlobalStep[67424] Training Speed: 428.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:45:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:38:47 INFO loss_tracker.py:84 | Epoch[492/NA] Step[24] GlobalStep[67428/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:38:56 INFO stats.py:314 | Epoch[492] Step[45] GlobalStep[67449] Training Speed: 426.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:45:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:38:58 INFO loss_tracker.py:84 | Epoch[492/NA] Step[49] GlobalStep[67453/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:39:06 INFO stats.py:314 | Epoch[492] Step[70] GlobalStep[67474] Training Speed: 422.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:44:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:39:08 INFO loss_tracker.py:84 | Epoch[492/NA] Step[74] GlobalStep[67478/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:39:16 INFO stats.py:314 | Epoch[492] Step[95] GlobalStep[67499] Training Speed: 433.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:44:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:39:18 INFO loss_tracker.py:84 | Epoch[492/NA] Step[99] GlobalStep[67503/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:39:27 INFO stats.py:314 | Epoch[492] Step[120] GlobalStep[67524] Training Speed: 451.46 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:44:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:39:28 INFO loss_tracker.py:84 | Epoch[492/NA] Step[124] GlobalStep[67528/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:39:32 INFO stats.py:394 | Epoch[492] completed. Training Speed: 310.61 samples/sec across all devices. Epoch Time: 56.46 sec. Average Epoch Time: 56.46 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:44:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:39:37 INFO stats.py:314 | Epoch[493] Step[8] GlobalStep[67549] Training Speed: 425.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:44:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:39:44 INFO loss_tracker.py:84 | Epoch[493/NA] Step[24] GlobalStep[67565/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:39:47 INFO stats.py:314 | Epoch[493] Step[33] GlobalStep[67574] Training Speed: 420.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:44:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:39:54 INFO loss_tracker.py:84 | Epoch[493/NA] Step[49] GlobalStep[67590/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 20:39:57 INFO stats.py:314 | Epoch[493] Step[58] GlobalStep[67599] Training Speed: 417.75 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:43:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:40:04 INFO loss_tracker.py:84 | Epoch[493/NA] Step[74] GlobalStep[67615/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:40:08 INFO stats.py:314 | Epoch[493] Step[83] GlobalStep[67624] Training Speed: 424.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:43:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:40:15 INFO loss_tracker.py:84 | Epoch[493/NA] Step[99] GlobalStep[67640/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:40:18 INFO stats.py:314 | Epoch[493] Step[108] GlobalStep[67649] Training Speed: 422.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:43:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:40:25 INFO loss_tracker.py:84 | Epoch[493/NA] Step[124] GlobalStep[67665/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 20:40:28 INFO stats.py:314 | Epoch[493] Step[133] GlobalStep[67674] Training Speed: 451.14 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:43:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:40:29 INFO stats.py:394 | Epoch[493] completed. Training Speed: 307.72 samples/sec across all devices. Epoch Time: 56.99 sec. Average Epoch Time: 56.99 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:43:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:40:39 INFO stats.py:314 | Epoch[494] Step[21] GlobalStep[67699] Training Speed: 432.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:43:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:40:41 INFO loss_tracker.py:84 | Epoch[494/NA] Step[24] GlobalStep[67702/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:40:50 INFO stats.py:314 | Epoch[494] Step[46] GlobalStep[67724] Training Speed: 397.56 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:43:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:40:51 INFO loss_tracker.py:84 | Epoch[494/NA] Step[49] GlobalStep[67727/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:41:00 INFO stats.py:314 | Epoch[494] Step[71] GlobalStep[67749] Training Speed: 433.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:42:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:41:01 INFO loss_tracker.py:84 | Epoch[494/NA] Step[74] GlobalStep[67752/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:41:10 INFO stats.py:314 | Epoch[494] Step[96] GlobalStep[67774] Training Speed: 431.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:42:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:41:12 INFO loss_tracker.py:84 | Epoch[494/NA] Step[99] GlobalStep[67777/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:41:21 INFO stats.py:314 | Epoch[494] Step[121] GlobalStep[67799] Training Speed: 438.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:42:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:41:22 INFO loss_tracker.py:84 | Epoch[494/NA] Step[124] GlobalStep[67802/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 20:41:26 INFO stats.py:394 | Epoch[494] completed. Training Speed: 306.92 samples/sec across all devices. Epoch Time: 57.14 sec. Average Epoch Time: 57.14 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:42:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:41:32 INFO stats.py:314 | Epoch[495] Step[9] GlobalStep[67824] Training Speed: 423.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:42:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:41:38 INFO loss_tracker.py:84 | Epoch[495/NA] Step[24] GlobalStep[67839/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 20:41:43 INFO stats.py:314 | Epoch[495] Step[34] GlobalStep[67849] Training Speed: 423.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:42:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:41:49 INFO loss_tracker.py:84 | Epoch[495/NA] Step[49] GlobalStep[67864/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:41:53 INFO stats.py:314 | Epoch[495] Step[59] GlobalStep[67874] Training Speed: 415.94 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:42:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:41:59 INFO loss_tracker.py:84 | Epoch[495/NA] Step[74] GlobalStep[67889/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 20:42:04 INFO stats.py:314 | Epoch[495] Step[84] GlobalStep[67899] Training Speed: 432.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:41:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:42:09 INFO loss_tracker.py:84 | Epoch[495/NA] Step[99] GlobalStep[67914/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:42:13 INFO stats.py:314 | Epoch[495] Step[109] GlobalStep[67924] Training Speed: 434.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:41:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:42:19 INFO loss_tracker.py:84 | Epoch[495/NA] Step[124] GlobalStep[67939/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:42:23 INFO stats.py:314 | Epoch[495] Step[134] GlobalStep[67949] Training Speed: 452.81 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:41:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:42:24 INFO stats.py:394 | Epoch[495] completed. Training Speed: 306.89 samples/sec across all devices. Epoch Time: 57.14 sec. Average Epoch Time: 57.14 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:41:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:42:34 INFO stats.py:314 | Epoch[496] Step[22] GlobalStep[67974] Training Speed: 411.25 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:41:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:42:34 INFO loss_tracker.py:84 | Epoch[496/NA] Step[24] GlobalStep[67976/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:42:44 INFO stats.py:314 | Epoch[496] Step[47] GlobalStep[67999] Training Speed: 421.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:41:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:42:45 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 20:42:45 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_16
Rank[5/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[6/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[7/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[3/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[2/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[1/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[4/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[8/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[14/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[13/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[11/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[12/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[15/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[9/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[10/16] 06/24/2025 20:42:46 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[0/16] 06/24/2025 20:42:47 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_16/model.safetensors
Rank[0/16] 06/24/2025 20:42:48 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_16/optimizer.bin
Rank[0/16] 06/24/2025 20:42:48 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_16/scheduler.bin
Rank[0/16] 06/24/2025 20:42:48 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_16/sampler.bin
Rank[0/16] 06/24/2025 20:42:48 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_16/random_states_0.pkl
Rank[0/16] 06/24/2025 20:42:48 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_16/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 20:42:48 INFO checkpoint.py:110 | Save checkpoint at the end of step 67999 to /job_data/checkpoints/checkpoint_16
Rank[0/16] 06/24/2025 20:42:49 INFO loss_tracker.py:84 | Epoch[496/NA] Step[49] GlobalStep[68001/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 20:42:58 INFO stats.py:314 | Epoch[496] Step[72] GlobalStep[68024] Training Speed: 428.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:41:03. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 20:42:59 INFO loss_tracker.py:84 | Epoch[496/NA] Step[74] GlobalStep[68026/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 20:43:10 INFO stats.py:314 | Epoch[496] Step[97] GlobalStep[68049] Training Speed: 421.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:40:53.
Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:43:10 INFO loss_tracker.py:84 | Epoch[496/NA] Step[99] GlobalStep[68051/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:43:19 INFO stats.py:314 | Epoch[496] Step[122] GlobalStep[68074] Training Speed: 441.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:40:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:43:20 INFO loss_tracker.py:84 | Epoch[496/NA] Step[124] GlobalStep[68076/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:43:25 INFO stats.py:394 | Epoch[496] completed. Training Speed: 285.52 samples/sec across all devices. Epoch Time: 61.42 sec. Average Epoch Time: 61.42 sec. Average Step Time: 0.45 sec. Estimated Remaining Time: 3:40:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:43:31 INFO stats.py:314 | Epoch[497] Step[10] GlobalStep[68099] Training Speed: 416.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:40:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:43:36 INFO loss_tracker.py:84 | Epoch[497/NA] Step[24] GlobalStep[68113/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:43:41 INFO stats.py:314 | Epoch[497] Step[35] GlobalStep[68124] Training Speed: 409.47 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:40:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:43:47 INFO loss_tracker.py:84 | Epoch[497/NA] Step[49] GlobalStep[68138/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 20:43:51 INFO stats.py:314 | Epoch[497] Step[60] GlobalStep[68149] Training Speed: 435.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:40:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:43:57 INFO loss_tracker.py:84 | Epoch[497/NA] Step[74] GlobalStep[68163/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:44:01 INFO stats.py:314 | Epoch[497] Step[85] GlobalStep[68174] Training Speed: 431.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:40:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:44:07 INFO loss_tracker.py:84 | Epoch[497/NA] Step[99] GlobalStep[68188/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:44:11 INFO stats.py:314 | Epoch[497] Step[110] GlobalStep[68199] Training Speed: 431.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:39:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:44:16 INFO loss_tracker.py:84 | Epoch[497/NA] Step[124] GlobalStep[68213/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 20:44:20 INFO stats.py:314 | Epoch[497] Step[135] GlobalStep[68224] Training Speed: 436.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:39:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:44:21 INFO stats.py:394 | Epoch[497] completed. Training Speed: 314.21 samples/sec across all devices. Epoch Time: 55.81 sec. Average Epoch Time: 55.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:39:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:44:32 INFO stats.py:314 | Epoch[498] Step[23] GlobalStep[68249] Training Speed: 434.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:39:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:44:32 INFO loss_tracker.py:84 | Epoch[498/NA] Step[24] GlobalStep[68250/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 20:44:42 INFO stats.py:314 | Epoch[498] Step[48] GlobalStep[68274] Training Speed: 427.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:39:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:44:43 INFO loss_tracker.py:84 | Epoch[498/NA] Step[49] GlobalStep[68275/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:44:52 INFO stats.py:314 | Epoch[498] Step[73] GlobalStep[68299] Training Speed: 430.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:39:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:44:53 INFO loss_tracker.py:84 | Epoch[498/NA] Step[74] GlobalStep[68300/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:45:02 INFO stats.py:314 | Epoch[498] Step[98] GlobalStep[68324] Training Speed: 433.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:38:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:45:03 INFO loss_tracker.py:84 | Epoch[498/NA] Step[99] GlobalStep[68325/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:45:12 INFO stats.py:314 | Epoch[498] Step[123] GlobalStep[68349] Training Speed: 456.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:38:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:45:13 INFO loss_tracker.py:84 | Epoch[498/NA] Step[124] GlobalStep[68350/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:45:17 INFO stats.py:394 | Epoch[498] completed. 
Training Speed: 312.39 samples/sec across all devices. Epoch Time: 56.14 sec. Average Epoch Time: 56.14 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:38:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:45:23 INFO stats.py:314 | Epoch[499] Step[11] GlobalStep[68374] Training Speed: 416.04 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:38:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:45:29 INFO loss_tracker.py:84 | Epoch[499/NA] Step[24] GlobalStep[68387/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:45:34 INFO stats.py:314 | Epoch[499] Step[36] GlobalStep[68399] Training Speed: 429.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:38:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:45:39 INFO loss_tracker.py:84 | Epoch[499/NA] Step[49] GlobalStep[68412/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:45:44 INFO stats.py:314 | Epoch[499] Step[61] GlobalStep[68424] Training Speed: 417.19 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:38:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:45:50 INFO loss_tracker.py:84 | Epoch[499/NA] Step[74] GlobalStep[68437/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:45:55 INFO stats.py:314 | Epoch[499] Step[86] GlobalStep[68449] Training Speed: 402.42 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:38:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:46:00 INFO loss_tracker.py:84 | Epoch[499/NA] Step[99] GlobalStep[68462/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 20:46:05 INFO stats.py:314 | Epoch[499] Step[111] GlobalStep[68474] Training Speed: 413.83 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:37:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:46:10 INFO loss_tracker.py:84 | Epoch[499/NA] Step[124] GlobalStep[68487/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 20:46:15 INFO stats.py:314 | Epoch[499] Step[136] GlobalStep[68499] Training Speed: 453.82 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:37:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:46:15 INFO stats.py:394 | Epoch[499] completed. Training Speed: 304.51 samples/sec across all devices. Epoch Time: 57.59 sec. Average Epoch Time: 57.59 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:37:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:46:26 INFO stats.py:314 | Epoch[500] Step[24] GlobalStep[68524] Training Speed: 440.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:37:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:46:26 INFO loss_tracker.py:84 | Epoch[500/NA] Step[24] GlobalStep[68524/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:46:36 INFO stats.py:314 | Epoch[500] Step[49] GlobalStep[68549] Training Speed: 416.56 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:37:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:46:36 INFO loss_tracker.py:84 | Epoch[500/NA] Step[49] GlobalStep[68549/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:46:46 INFO stats.py:314 | Epoch[500] Step[74] GlobalStep[68574] Training Speed: 412.27 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:37:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:46:46 INFO loss_tracker.py:84 | Epoch[500/NA] Step[74] GlobalStep[68574/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 20:46:57 INFO stats.py:314 | Epoch[500] Step[99] GlobalStep[68599] Training Speed: 411.79 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:37:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:46:57 INFO loss_tracker.py:84 | Epoch[500/NA] Step[99] GlobalStep[68599/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 20:47:08 INFO stats.py:314 | Epoch[500] Step[124] GlobalStep[68624] Training Speed: 448.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:36:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:47:08 INFO loss_tracker.py:84 | Epoch[500/NA] Step[124] GlobalStep[68624/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:47:12 INFO stats.py:394 | Epoch[500] completed. Training Speed: 306.40 samples/sec across all devices. Epoch Time: 57.23 sec. Average Epoch Time: 57.23 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:36:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:47:18 INFO stats.py:314 | Epoch[501] Step[12] GlobalStep[68649] Training Speed: 436.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:36:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:47:23 INFO loss_tracker.py:84 | Epoch[501/NA] Step[24] GlobalStep[68661/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:47:28 INFO stats.py:314 | Epoch[501] Step[37] GlobalStep[68674] Training Speed: 427.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:36:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:47:34 INFO loss_tracker.py:84 | Epoch[501/NA] Step[49] GlobalStep[68686/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:47:38 INFO stats.py:314 | Epoch[501] Step[62] GlobalStep[68699] Training Speed: 413.34 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:36:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:47:44 INFO loss_tracker.py:84 | Epoch[501/NA] Step[74] GlobalStep[68711/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 20:47:49 INFO stats.py:314 | Epoch[501] Step[87] GlobalStep[68724] Training Speed: 427.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:36:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:47:55 INFO loss_tracker.py:84 | Epoch[501/NA] Step[99] GlobalStep[68736/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:48:00 INFO stats.py:314 | Epoch[501] Step[112] GlobalStep[68749] Training Speed: 250.65 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 3:36:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:48:04 INFO loss_tracker.py:84 | Epoch[501/NA] Step[124] GlobalStep[68761/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:48:09 INFO stats.py:394 | Epoch[501] completed. 
Training Speed: 308.18 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:35:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:48:10 INFO stats.py:314 | Epoch[502] Step[0] GlobalStep[68774] Training Speed: 354.04 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 3:35:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:48:20 INFO loss_tracker.py:84 | Epoch[502/NA] Step[24] GlobalStep[68798/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 20:48:20 INFO stats.py:314 | Epoch[502] Step[25] GlobalStep[68799] Training Speed: 407.32 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:35:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:48:31 INFO loss_tracker.py:84 | Epoch[502/NA] Step[49] GlobalStep[68823/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:48:31 INFO stats.py:314 | Epoch[502] Step[50] GlobalStep[68824] Training Speed: 429.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:35:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:48:40 INFO loss_tracker.py:84 | Epoch[502/NA] Step[74] GlobalStep[68848/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:48:41 INFO stats.py:314 | Epoch[502] Step[75] GlobalStep[68849] Training Speed: 419.34 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:35:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:48:51 INFO loss_tracker.py:84 | Epoch[502/NA] Step[99] GlobalStep[68873/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 20:48:52 INFO stats.py:314 | Epoch[502] Step[100] GlobalStep[68874] Training Speed: 408.93 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:35:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:01 INFO loss_tracker.py:84 | Epoch[502/NA] Step[124] GlobalStep[68898/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:49:02 INFO stats.py:314 | Epoch[502] Step[125] GlobalStep[68899] Training Speed: 405.41 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:35:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:06 INFO stats.py:394 | Epoch[502] completed. Training Speed: 305.31 samples/sec across all devices. Epoch Time: 57.44 sec. Average Epoch Time: 57.44 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:34:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:13 INFO stats.py:314 | Epoch[503] Step[13] GlobalStep[68924] Training Speed: 432.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:34:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:18 INFO loss_tracker.py:84 | Epoch[503/NA] Step[24] GlobalStep[68935/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:49:23 INFO stats.py:314 | Epoch[503] Step[38] GlobalStep[68949] Training Speed: 425.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:34:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:49:28 INFO loss_tracker.py:84 | Epoch[503/NA] Step[49] GlobalStep[68960/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:49:34 INFO stats.py:314 | Epoch[503] Step[63] GlobalStep[68974] Training Speed: 442.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:34:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:38 INFO loss_tracker.py:84 | Epoch[503/NA] Step[74] GlobalStep[68985/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:49:44 INFO stats.py:314 | Epoch[503] Step[88] GlobalStep[68999] Training Speed: 416.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:34:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:49 INFO loss_tracker.py:84 | Epoch[503/NA] Step[99] GlobalStep[69010/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:49:54 INFO stats.py:314 | Epoch[503] Step[113] GlobalStep[69024] Training Speed: 424.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:34:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:49:59 INFO loss_tracker.py:84 | Epoch[503/NA] Step[124] GlobalStep[69035/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:50:03 INFO stats.py:394 | Epoch[503] completed. Training Speed: 306.98 samples/sec across all devices. Epoch Time: 57.12 sec. Average Epoch Time: 57.12 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:33:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:50:05 INFO stats.py:314 | Epoch[504] Step[1] GlobalStep[69049] Training Speed: 407.60 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:33:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:50:15 INFO loss_tracker.py:84 | Epoch[504/NA] Step[24] GlobalStep[69072/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:50:16 INFO stats.py:314 | Epoch[504] Step[26] GlobalStep[69074] Training Speed: 428.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:33:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:50:25 INFO loss_tracker.py:84 | Epoch[504/NA] Step[49] GlobalStep[69097/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 20:50:26 INFO stats.py:314 | Epoch[504] Step[51] GlobalStep[69099] Training Speed: 396.49 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:33:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:50:36 INFO loss_tracker.py:84 | Epoch[504/NA] Step[74] GlobalStep[69122/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:50:36 INFO stats.py:314 | Epoch[504] Step[76] GlobalStep[69124] Training Speed: 433.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:33:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:50:46 INFO loss_tracker.py:84 | Epoch[504/NA] Step[99] GlobalStep[69147/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 20:50:46 INFO stats.py:314 | Epoch[504] Step[101] GlobalStep[69149] Training Speed: 430.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:33:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:50:56 INFO loss_tracker.py:84 | Epoch[504/NA] Step[124] GlobalStep[69172/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 20:50:56 INFO stats.py:314 | Epoch[504] Step[126] GlobalStep[69174] Training Speed: 451.48 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:33:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:51:00 INFO stats.py:394 | Epoch[504] completed. Training Speed: 307.03 samples/sec across all devices. Epoch Time: 57.12 sec. Average Epoch Time: 57.12 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:33:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:51:07 INFO stats.py:314 | Epoch[505] Step[14] GlobalStep[69199] Training Speed: 423.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:32:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:51:12 INFO loss_tracker.py:84 | Epoch[505/NA] Step[24] GlobalStep[69209/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:51:18 INFO stats.py:314 | Epoch[505] Step[39] GlobalStep[69224] Training Speed: 259.35 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 3:32:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:51:22 INFO loss_tracker.py:84 | Epoch[505/NA] Step[49] GlobalStep[69234/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:51:29 INFO stats.py:314 | Epoch[505] Step[64] GlobalStep[69249] Training Speed: 434.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:32:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:51:33 INFO loss_tracker.py:84 | Epoch[505/NA] Step[74] GlobalStep[69259/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:51:39 INFO stats.py:314 | Epoch[505] Step[89] GlobalStep[69274] Training Speed: 421.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:32:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:51:43 INFO loss_tracker.py:84 | Epoch[505/NA] Step[99] GlobalStep[69284/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 20:51:49 INFO stats.py:314 | Epoch[505] Step[114] GlobalStep[69299] Training Speed: 435.33 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:32:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:51:53 INFO loss_tracker.py:84 | Epoch[505/NA] Step[124] GlobalStep[69309/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:51:57 INFO stats.py:394 | Epoch[505] completed. Training Speed: 307.13 samples/sec across all devices. Epoch Time: 57.10 sec. Average Epoch Time: 57.10 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:32:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:52:00 INFO stats.py:314 | Epoch[506] Step[2] GlobalStep[69324] Training Speed: 425.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:32:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:52:08 INFO loss_tracker.py:84 | Epoch[506/NA] Step[24] GlobalStep[69346/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 20:52:10 INFO stats.py:314 | Epoch[506] Step[27] GlobalStep[69349] Training Speed: 433.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:31:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:52:19 INFO loss_tracker.py:84 | Epoch[506/NA] Step[49] GlobalStep[69371/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:52:20 INFO stats.py:314 | Epoch[506] Step[52] GlobalStep[69374] Training Speed: 422.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:31:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:52:29 INFO loss_tracker.py:84 | Epoch[506/NA] Step[74] GlobalStep[69396/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:52:30 INFO stats.py:314 | Epoch[506] Step[77] GlobalStep[69399] Training Speed: 417.09 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:31:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:52:40 INFO loss_tracker.py:84 | Epoch[506/NA] Step[99] GlobalStep[69421/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:52:41 INFO stats.py:314 | Epoch[506] Step[102] GlobalStep[69424] Training Speed: 422.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:31:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:52:50 INFO loss_tracker.py:84 | Epoch[506/NA] Step[124] GlobalStep[69446/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:52:51 INFO stats.py:314 | Epoch[506] Step[127] GlobalStep[69449] Training Speed: 450.29 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:31:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:52:54 INFO stats.py:394 | Epoch[506] completed. Training Speed: 308.27 samples/sec across all devices. Epoch Time: 56.88 sec. Average Epoch Time: 56.88 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:31:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:53:02 INFO stats.py:314 | Epoch[507] Step[15] GlobalStep[69474] Training Speed: 437.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:31:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:53:05 INFO loss_tracker.py:84 | Epoch[507/NA] Step[24] GlobalStep[69483/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:53:12 INFO stats.py:314 | Epoch[507] Step[40] GlobalStep[69499] Training Speed: 440.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:30:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:53:16 INFO loss_tracker.py:84 | Epoch[507/NA] Step[49] GlobalStep[69508/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:53:22 INFO stats.py:314 | Epoch[507] Step[65] GlobalStep[69524] Training Speed: 407.49 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:30:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:53:26 INFO loss_tracker.py:84 | Epoch[507/NA] Step[74] GlobalStep[69533/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 20:53:32 INFO stats.py:314 | Epoch[507] Step[90] GlobalStep[69549] Training Speed: 424.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:30:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:53:36 INFO loss_tracker.py:84 | Epoch[507/NA] Step[99] GlobalStep[69558/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 20:53:43 INFO stats.py:314 | Epoch[507] Step[115] GlobalStep[69574] Training Speed: 422.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:30:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:53:46 INFO loss_tracker.py:84 | Epoch[507/NA] Step[124] GlobalStep[69583/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:53:51 INFO stats.py:394 | Epoch[507] completed. Training Speed: 310.60 samples/sec across all devices. Epoch Time: 56.46 sec. Average Epoch Time: 56.46 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:30:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:53:53 INFO stats.py:314 | Epoch[508] Step[3] GlobalStep[69599] Training Speed: 425.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:30:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:02 INFO loss_tracker.py:84 | Epoch[508/NA] Step[24] GlobalStep[69620/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 20:54:04 INFO stats.py:314 | Epoch[508] Step[28] GlobalStep[69624] Training Speed: 420.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:30:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:13 INFO loss_tracker.py:84 | Epoch[508/NA] Step[49] GlobalStep[69645/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:54:15 INFO stats.py:314 | Epoch[508] Step[53] GlobalStep[69649] Training Speed: 430.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:29:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:23 INFO loss_tracker.py:84 | Epoch[508/NA] Step[74] GlobalStep[69670/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:54:24 INFO stats.py:314 | Epoch[508] Step[78] GlobalStep[69674] Training Speed: 435.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:29:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:54:33 INFO loss_tracker.py:84 | Epoch[508/NA] Step[99] GlobalStep[69695/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 20:54:35 INFO stats.py:314 | Epoch[508] Step[103] GlobalStep[69699] Training Speed: 420.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:29:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:43 INFO loss_tracker.py:84 | Epoch[508/NA] Step[124] GlobalStep[69720/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 20:54:45 INFO stats.py:314 | Epoch[508] Step[128] GlobalStep[69724] Training Speed: 436.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:29:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:48 INFO stats.py:394 | Epoch[508] completed. Training Speed: 309.31 samples/sec across all devices. Epoch Time: 56.69 sec. Average Epoch Time: 56.69 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:29:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:55 INFO stats.py:314 | Epoch[509] Step[16] GlobalStep[69749] Training Speed: 427.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:29:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:54:59 INFO loss_tracker.py:84 | Epoch[509/NA] Step[24] GlobalStep[69757/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 20:55:06 INFO stats.py:314 | Epoch[509] Step[41] GlobalStep[69774] Training Speed: 433.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:28:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:55:09 INFO loss_tracker.py:84 | Epoch[509/NA] Step[49] GlobalStep[69782/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 20:55:16 INFO stats.py:314 | Epoch[509] Step[66] GlobalStep[69799] Training Speed: 432.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:28:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:55:19 INFO loss_tracker.py:84 | Epoch[509/NA] Step[74] GlobalStep[69807/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:55:26 INFO stats.py:314 | Epoch[509] Step[91] GlobalStep[69824] Training Speed: 432.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:28:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:55:29 INFO loss_tracker.py:84 | Epoch[509/NA] Step[99] GlobalStep[69832/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:55:36 INFO stats.py:314 | Epoch[509] Step[116] GlobalStep[69849] Training Speed: 251.10 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 3:28:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:55:39 INFO loss_tracker.py:84 | Epoch[509/NA] Step[124] GlobalStep[69857/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 20:55:43 INFO stats.py:394 | Epoch[509] completed. Training Speed: 313.93 samples/sec across all devices. Epoch Time: 55.86 sec. Average Epoch Time: 55.86 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:28:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:55:46 INFO stats.py:314 | Epoch[510] Step[4] GlobalStep[69874] Training Speed: 428.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:28:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:55:55 INFO loss_tracker.py:84 | Epoch[510/NA] Step[24] GlobalStep[69894/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:55:57 INFO stats.py:314 | Epoch[510] Step[29] GlobalStep[69899] Training Speed: 232.98 samples/sec across all devices. Average Step Time: 0.55 sec. Estimated Remaining Time: 3:28:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:56:05 INFO loss_tracker.py:84 | Epoch[510/NA] Step[49] GlobalStep[69919/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 20:56:08 INFO stats.py:314 | Epoch[510] Step[54] GlobalStep[69924] Training Speed: 420.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:27:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:56:16 INFO loss_tracker.py:84 | Epoch[510/NA] Step[74] GlobalStep[69944/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:56:18 INFO stats.py:314 | Epoch[510] Step[79] GlobalStep[69949] Training Speed: 432.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:27:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:56:26 INFO loss_tracker.py:84 | Epoch[510/NA] Step[99] GlobalStep[69969/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:56:28 INFO stats.py:314 | Epoch[510] Step[104] GlobalStep[69974] Training Speed: 435.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:27:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:56:36 INFO loss_tracker.py:84 | Epoch[510/NA] Step[124] GlobalStep[69994/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 20:56:38 INFO stats.py:314 | Epoch[510] Step[129] GlobalStep[69999] Training Speed: 446.20 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:27:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:56:40 INFO stats.py:394 | Epoch[510] completed. Training Speed: 309.14 samples/sec across all devices. Epoch Time: 56.72 sec. Average Epoch Time: 56.72 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:27:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:56:48 INFO stats.py:314 | Epoch[511] Step[17] GlobalStep[70024] Training Speed: 429.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:27:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:56:52 INFO loss_tracker.py:84 | Epoch[511/NA] Step[24] GlobalStep[70031/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 20:56:59 INFO stats.py:314 | Epoch[511] Step[42] GlobalStep[70049] Training Speed: 426.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:27:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:57:02 INFO loss_tracker.py:84 | Epoch[511/NA] Step[49] GlobalStep[70056/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 20:57:09 INFO stats.py:314 | Epoch[511] Step[67] GlobalStep[70074] Training Speed: 429.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:26:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:57:13 INFO loss_tracker.py:84 | Epoch[511/NA] Step[74] GlobalStep[70081/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:57:20 INFO stats.py:314 | Epoch[511] Step[92] GlobalStep[70099] Training Speed: 434.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:26:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:57:23 INFO loss_tracker.py:84 | Epoch[511/NA] Step[99] GlobalStep[70106/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:57:30 INFO stats.py:314 | Epoch[511] Step[117] GlobalStep[70124] Training Speed: 435.87 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:26:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:57:33 INFO loss_tracker.py:84 | Epoch[511/NA] Step[124] GlobalStep[70131/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:57:37 INFO stats.py:394 | Epoch[511] completed. Training Speed: 307.13 samples/sec across all devices. Epoch Time: 57.10 sec. Average Epoch Time: 57.10 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:26:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:57:41 INFO stats.py:314 | Epoch[512] Step[5] GlobalStep[70149] Training Speed: 408.55 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:26:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:57:48 INFO loss_tracker.py:84 | Epoch[512/NA] Step[24] GlobalStep[70168/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 20:57:51 INFO stats.py:314 | Epoch[512] Step[30] GlobalStep[70174] Training Speed: 440.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:26:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:57:59 INFO loss_tracker.py:84 | Epoch[512/NA] Step[49] GlobalStep[70193/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:58:01 INFO stats.py:314 | Epoch[512] Step[55] GlobalStep[70199] Training Speed: 427.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:26:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:58:09 INFO loss_tracker.py:84 | Epoch[512/NA] Step[74] GlobalStep[70218/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 20:58:12 INFO stats.py:314 | Epoch[512] Step[80] GlobalStep[70224] Training Speed: 424.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:25:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:58:20 INFO loss_tracker.py:84 | Epoch[512/NA] Step[99] GlobalStep[70243/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:58:22 INFO stats.py:314 | Epoch[512] Step[105] GlobalStep[70249] Training Speed: 324.16 samples/sec across all devices. Average Step Time: 0.39 sec. Estimated Remaining Time: 3:25:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:58:30 INFO loss_tracker.py:84 | Epoch[512/NA] Step[124] GlobalStep[70268/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 20:58:32 INFO stats.py:314 | Epoch[512] Step[130] GlobalStep[70274] Training Speed: 439.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:25:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:58:35 INFO stats.py:394 | Epoch[512] completed. Training Speed: 305.54 samples/sec across all devices. Epoch Time: 57.39 sec. Average Epoch Time: 57.39 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:25:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:58:43 INFO stats.py:314 | Epoch[513] Step[18] GlobalStep[70299] Training Speed: 425.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:25:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:58:46 INFO loss_tracker.py:84 | Epoch[513/NA] Step[24] GlobalStep[70305/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 20:58:54 INFO stats.py:314 | Epoch[513] Step[43] GlobalStep[70324] Training Speed: 422.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:25:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:58:57 INFO loss_tracker.py:84 | Epoch[513/NA] Step[49] GlobalStep[70330/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 20:59:04 INFO stats.py:314 | Epoch[513] Step[68] GlobalStep[70349] Training Speed: 421.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:24:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:59:07 INFO loss_tracker.py:84 | Epoch[513/NA] Step[74] GlobalStep[70355/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 20:59:15 INFO stats.py:314 | Epoch[513] Step[93] GlobalStep[70374] Training Speed: 432.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:24:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:59:18 INFO loss_tracker.py:84 | Epoch[513/NA] Step[99] GlobalStep[70380/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 20:59:25 INFO stats.py:314 | Epoch[513] Step[118] GlobalStep[70399] Training Speed: 421.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:24:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 20:59:28 INFO loss_tracker.py:84 | Epoch[513/NA] Step[124] GlobalStep[70405/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 20:59:32 INFO stats.py:394 | Epoch[513] completed. Training Speed: 303.14 samples/sec across all devices. Epoch Time: 57.85 sec. Average Epoch Time: 57.85 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:24:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:59:37 INFO stats.py:314 | Epoch[514] Step[6] GlobalStep[70424] Training Speed: 431.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:24:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:59:44 INFO loss_tracker.py:84 | Epoch[514/NA] Step[24] GlobalStep[70442/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 20:59:47 INFO stats.py:314 | Epoch[514] Step[31] GlobalStep[70449] Training Speed: 435.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:24:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 20:59:54 INFO loss_tracker.py:84 | Epoch[514/NA] Step[49] GlobalStep[70467/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 20:59:57 INFO stats.py:314 | Epoch[514] Step[56] GlobalStep[70474] Training Speed: 437.74 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:24:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:00:04 INFO loss_tracker.py:84 | Epoch[514/NA] Step[74] GlobalStep[70492/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:00:07 INFO stats.py:314 | Epoch[514] Step[81] GlobalStep[70499] Training Speed: 435.83 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:23:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:00:15 INFO loss_tracker.py:84 | Epoch[514/NA] Step[99] GlobalStep[70517/99999]: loss_noise_mse[0.0001] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:00:18 INFO stats.py:314 | Epoch[514] Step[106] GlobalStep[70524] Training Speed: 415.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:23:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:00:25 INFO loss_tracker.py:84 | Epoch[514/NA] Step[124] GlobalStep[70542/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:00:28 INFO stats.py:314 | Epoch[514] Step[131] GlobalStep[70549] Training Speed: 414.66 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:23:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:00:30 INFO stats.py:394 | Epoch[514] completed. Training Speed: 304.52 samples/sec across all devices. Epoch Time: 57.59 sec. Average Epoch Time: 57.59 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:23:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:00:39 INFO stats.py:314 | Epoch[515] Step[19] GlobalStep[70574] Training Speed: 432.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:23:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:00:42 INFO loss_tracker.py:84 | Epoch[515/NA] Step[24] GlobalStep[70579/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:00:49 INFO stats.py:314 | Epoch[515] Step[44] GlobalStep[70599] Training Speed: 428.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:23:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:00:52 INFO loss_tracker.py:84 | Epoch[515/NA] Step[49] GlobalStep[70604/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 21:01:00 INFO stats.py:314 | Epoch[515] Step[69] GlobalStep[70624] Training Speed: 430.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:23:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:01:02 INFO loss_tracker.py:84 | Epoch[515/NA] Step[74] GlobalStep[70629/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:01:10 INFO stats.py:314 | Epoch[515] Step[94] GlobalStep[70649] Training Speed: 424.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:22:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:01:12 INFO loss_tracker.py:84 | Epoch[515/NA] Step[99] GlobalStep[70654/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:01:20 INFO stats.py:314 | Epoch[515] Step[119] GlobalStep[70674] Training Speed: 429.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:22:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:01:22 INFO loss_tracker.py:84 | Epoch[515/NA] Step[124] GlobalStep[70679/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 21:01:26 INFO stats.py:394 | Epoch[515] completed. Training Speed: 312.17 samples/sec across all devices. Epoch Time: 56.17 sec. Average Epoch Time: 56.17 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:22:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:01:31 INFO stats.py:314 | Epoch[516] Step[7] GlobalStep[70699] Training Speed: 420.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:22:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:01:38 INFO loss_tracker.py:84 | Epoch[516/NA] Step[24] GlobalStep[70716/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:01:41 INFO stats.py:314 | Epoch[516] Step[32] GlobalStep[70724] Training Speed: 420.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:22:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:01:48 INFO loss_tracker.py:84 | Epoch[516/NA] Step[49] GlobalStep[70741/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:01:51 INFO stats.py:314 | Epoch[516] Step[57] GlobalStep[70749] Training Speed: 429.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:22:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:01:58 INFO loss_tracker.py:84 | Epoch[516/NA] Step[74] GlobalStep[70766/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:02:01 INFO stats.py:314 | Epoch[516] Step[82] GlobalStep[70774] Training Speed: 433.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:22:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:02:08 INFO loss_tracker.py:84 | Epoch[516/NA] Step[99] GlobalStep[70791/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:02:12 INFO stats.py:314 | Epoch[516] Step[107] GlobalStep[70799] Training Speed: 408.08 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:21:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:02:18 INFO loss_tracker.py:84 | Epoch[516/NA] Step[124] GlobalStep[70816/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:02:21 INFO stats.py:314 | Epoch[516] Step[132] GlobalStep[70824] Training Speed: 449.93 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:21:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:02:22 INFO stats.py:394 | Epoch[516] completed. Training Speed: 312.46 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:21:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:02:32 INFO stats.py:314 | Epoch[517] Step[20] GlobalStep[70849] Training Speed: 419.25 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:21:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:02:34 INFO loss_tracker.py:84 | Epoch[517/NA] Step[24] GlobalStep[70853/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 21:02:42 INFO stats.py:314 | Epoch[517] Step[45] GlobalStep[70874] Training Speed: 386.15 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 3:21:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:02:44 INFO loss_tracker.py:84 | Epoch[517/NA] Step[49] GlobalStep[70878/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:02:53 INFO stats.py:314 | Epoch[517] Step[70] GlobalStep[70899] Training Speed: 418.86 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:21:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:02:55 INFO loss_tracker.py:84 | Epoch[517/NA] Step[74] GlobalStep[70903/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:03:03 INFO stats.py:314 | Epoch[517] Step[95] GlobalStep[70924] Training Speed: 439.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:21:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:03:05 INFO loss_tracker.py:84 | Epoch[517/NA] Step[99] GlobalStep[70928/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:03:14 INFO stats.py:314 | Epoch[517] Step[120] GlobalStep[70949] Training Speed: 450.70 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:20:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:03:16 INFO loss_tracker.py:84 | Epoch[517/NA] Step[124] GlobalStep[70953/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:03:20 INFO stats.py:394 | Epoch[517] completed. Training Speed: 304.81 samples/sec across all devices. Epoch Time: 57.53 sec. Average Epoch Time: 57.53 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:20:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:03:24 INFO stats.py:314 | Epoch[518] Step[8] GlobalStep[70974] Training Speed: 434.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:20:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:03:31 INFO loss_tracker.py:84 | Epoch[518/NA] Step[24] GlobalStep[70990/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:03:35 INFO stats.py:314 | Epoch[518] Step[33] GlobalStep[70999] Training Speed: 427.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:20:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:03:41 INFO loss_tracker.py:84 | Epoch[518/NA] Step[49] GlobalStep[71015/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:03:45 INFO stats.py:314 | Epoch[518] Step[58] GlobalStep[71024] Training Speed: 428.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:20:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:03:51 INFO loss_tracker.py:84 | Epoch[518/NA] Step[74] GlobalStep[71040/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:03:55 INFO stats.py:314 | Epoch[518] Step[83] GlobalStep[71049] Training Speed: 415.10 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:20:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:02 INFO loss_tracker.py:84 | Epoch[518/NA] Step[99] GlobalStep[71065/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:04:06 INFO stats.py:314 | Epoch[518] Step[108] GlobalStep[71074] Training Speed: 408.94 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:19:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:12 INFO loss_tracker.py:84 | Epoch[518/NA] Step[124] GlobalStep[71090/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:04:15 INFO stats.py:314 | Epoch[518] Step[133] GlobalStep[71099] Training Speed: 451.08 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:19:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:16 INFO stats.py:394 | Epoch[518] completed. Training Speed: 309.94 samples/sec across all devices. Epoch Time: 56.58 sec. Average Epoch Time: 56.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:19:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:04:26 INFO stats.py:314 | Epoch[519] Step[21] GlobalStep[71124] Training Speed: 435.85 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:19:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:28 INFO loss_tracker.py:84 | Epoch[519/NA] Step[24] GlobalStep[71127/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:04:36 INFO stats.py:314 | Epoch[519] Step[46] GlobalStep[71149] Training Speed: 432.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:19:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:38 INFO loss_tracker.py:84 | Epoch[519/NA] Step[49] GlobalStep[71152/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:04:47 INFO stats.py:314 | Epoch[519] Step[71] GlobalStep[71174] Training Speed: 427.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:19:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:48 INFO loss_tracker.py:84 | Epoch[519/NA] Step[74] GlobalStep[71177/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:04:57 INFO stats.py:314 | Epoch[519] Step[96] GlobalStep[71199] Training Speed: 431.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:19:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:04:58 INFO loss_tracker.py:84 | Epoch[519/NA] Step[99] GlobalStep[71202/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:05:07 INFO stats.py:314 | Epoch[519] Step[121] GlobalStep[71224] Training Speed: 438.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:18:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:05:08 INFO loss_tracker.py:84 | Epoch[519/NA] Step[124] GlobalStep[71227/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 21:05:12 INFO stats.py:394 | Epoch[519] completed. Training Speed: 313.21 samples/sec across all devices. Epoch Time: 55.99 sec. Average Epoch Time: 55.99 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:18:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:05:18 INFO stats.py:314 | Epoch[520] Step[9] GlobalStep[71249] Training Speed: 396.87 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:18:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:05:24 INFO loss_tracker.py:84 | Epoch[520/NA] Step[24] GlobalStep[71264/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:05:28 INFO stats.py:314 | Epoch[520] Step[34] GlobalStep[71274] Training Speed: 425.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:18:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:05:34 INFO loss_tracker.py:84 | Epoch[520/NA] Step[49] GlobalStep[71289/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 21:05:38 INFO stats.py:314 | Epoch[520] Step[59] GlobalStep[71299] Training Speed: 433.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:18:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:05:44 INFO loss_tracker.py:84 | Epoch[520/NA] Step[74] GlobalStep[71314/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0174] Rank[0/16] 06/24/2025 21:05:48 INFO stats.py:314 | Epoch[520] Step[84] GlobalStep[71324] Training Speed: 433.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:18:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:05:54 INFO loss_tracker.py:84 | Epoch[520/NA] Step[99] GlobalStep[71339/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:05:58 INFO stats.py:314 | Epoch[520] Step[109] GlobalStep[71349] Training Speed: 423.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:18:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:06:05 INFO loss_tracker.py:84 | Epoch[520/NA] Step[124] GlobalStep[71364/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 21:06:08 INFO stats.py:314 | Epoch[520] Step[134] GlobalStep[71374] Training Speed: 442.77 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:17:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:06:09 INFO stats.py:394 | Epoch[520] completed. Training Speed: 309.76 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:17:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:06:20 INFO stats.py:314 | Epoch[521] Step[22] GlobalStep[71399] Training Speed: 421.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:17:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:06:21 INFO loss_tracker.py:84 | Epoch[521/NA] Step[24] GlobalStep[71401/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:06:31 INFO stats.py:314 | Epoch[521] Step[47] GlobalStep[71424] Training Speed: 436.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:17:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:06:31 INFO loss_tracker.py:84 | Epoch[521/NA] Step[49] GlobalStep[71426/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:06:41 INFO stats.py:314 | Epoch[521] Step[72] GlobalStep[71449] Training Speed: 412.44 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:17:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:06:41 INFO loss_tracker.py:84 | Epoch[521/NA] Step[74] GlobalStep[71451/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0126] total_loss[0.0158] Rank[0/16] 06/24/2025 21:06:51 INFO stats.py:314 | Epoch[521] Step[97] GlobalStep[71474] Training Speed: 436.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:17:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:06:52 INFO loss_tracker.py:84 | Epoch[521/NA] Step[99] GlobalStep[71476/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:07:01 INFO stats.py:314 | Epoch[521] Step[122] GlobalStep[71499] Training Speed: 434.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:17:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:07:02 INFO loss_tracker.py:84 | Epoch[521/NA] Step[124] GlobalStep[71501/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:07:06 INFO stats.py:394 | Epoch[521] completed. Training Speed: 305.77 samples/sec across all devices. Epoch Time: 57.35 sec. Average Epoch Time: 57.35 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:16:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:07:12 INFO stats.py:314 | Epoch[522] Step[10] GlobalStep[71524] Training Speed: 427.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:16:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:07:18 INFO loss_tracker.py:84 | Epoch[522/NA] Step[24] GlobalStep[71538/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:07:22 INFO stats.py:314 | Epoch[522] Step[35] GlobalStep[71549] Training Speed: 430.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:16:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:07:28 INFO loss_tracker.py:84 | Epoch[522/NA] Step[49] GlobalStep[71563/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:07:33 INFO stats.py:314 | Epoch[522] Step[60] GlobalStep[71574] Training Speed: 432.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:16:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:07:39 INFO loss_tracker.py:84 | Epoch[522/NA] Step[74] GlobalStep[71588/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:07:43 INFO stats.py:314 | Epoch[522] Step[85] GlobalStep[71599] Training Speed: 430.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:16:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:07:49 INFO loss_tracker.py:84 | Epoch[522/NA] Step[99] GlobalStep[71613/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:07:54 INFO stats.py:314 | Epoch[522] Step[110] GlobalStep[71624] Training Speed: 432.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:16:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:07:59 INFO loss_tracker.py:84 | Epoch[522/NA] Step[124] GlobalStep[71638/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 21:08:03 INFO stats.py:314 | Epoch[522] Step[135] GlobalStep[71649] Training Speed: 454.60 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:16:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:08:03 INFO stats.py:394 | Epoch[522] completed. Training Speed: 307.80 samples/sec across all devices. Epoch Time: 56.97 sec. Average Epoch Time: 56.97 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:16:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:08:14 INFO stats.py:314 | Epoch[523] Step[23] GlobalStep[71674] Training Speed: 425.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:15:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:08:15 INFO loss_tracker.py:84 | Epoch[523/NA] Step[24] GlobalStep[71675/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:08:25 INFO stats.py:314 | Epoch[523] Step[48] GlobalStep[71699] Training Speed: 430.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:15:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:08:25 INFO loss_tracker.py:84 | Epoch[523/NA] Step[49] GlobalStep[71700/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:08:35 INFO stats.py:314 | Epoch[523] Step[73] GlobalStep[71724] Training Speed: 428.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:15:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:08:36 INFO loss_tracker.py:84 | Epoch[523/NA] Step[74] GlobalStep[71725/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:08:46 INFO stats.py:314 | Epoch[523] Step[98] GlobalStep[71749] Training Speed: 437.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:15:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:08:46 INFO loss_tracker.py:84 | Epoch[523/NA] Step[99] GlobalStep[71750/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 21:08:56 INFO stats.py:314 | Epoch[523] Step[123] GlobalStep[71774] Training Speed: 453.82 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:15:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:08:56 INFO loss_tracker.py:84 | Epoch[523/NA] Step[124] GlobalStep[71775/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:09:00 INFO stats.py:394 | Epoch[523] completed. Training Speed: 308.18 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:15:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:09:06 INFO stats.py:314 | Epoch[524] Step[11] GlobalStep[71799] Training Speed: 426.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:14:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:09:12 INFO loss_tracker.py:84 | Epoch[524/NA] Step[24] GlobalStep[71812/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:09:16 INFO stats.py:314 | Epoch[524] Step[36] GlobalStep[71824] Training Speed: 425.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:14:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:09:22 INFO loss_tracker.py:84 | Epoch[524/NA] Step[49] GlobalStep[71837/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:09:27 INFO stats.py:314 | Epoch[524] Step[61] GlobalStep[71849] Training Speed: 431.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:14:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:09:32 INFO loss_tracker.py:84 | Epoch[524/NA] Step[74] GlobalStep[71862/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:09:37 INFO stats.py:314 | Epoch[524] Step[86] GlobalStep[71874] Training Speed: 435.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:14:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:09:42 INFO loss_tracker.py:84 | Epoch[524/NA] Step[99] GlobalStep[71887/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:09:47 INFO stats.py:314 | Epoch[524] Step[111] GlobalStep[71899] Training Speed: 414.93 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:14:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:09:52 INFO loss_tracker.py:84 | Epoch[524/NA] Step[124] GlobalStep[71912/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:09:57 INFO stats.py:314 | Epoch[524] Step[136] GlobalStep[71924] Training Speed: 451.16 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:14:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:09:57 INFO stats.py:394 | Epoch[524] completed. Training Speed: 310.49 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:14:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:10:08 INFO stats.py:314 | Epoch[525] Step[24] GlobalStep[71949] Training Speed: 438.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:13:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:10:08 INFO loss_tracker.py:84 | Epoch[525/NA] Step[24] GlobalStep[71949/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:10:18 INFO stats.py:314 | Epoch[525] Step[49] GlobalStep[71974] Training Speed: 433.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:13:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:10:18 INFO loss_tracker.py:84 | Epoch[525/NA] Step[49] GlobalStep[71974/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:10:28 INFO stats.py:314 | Epoch[525] Step[74] GlobalStep[71999] Training Speed: 427.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:13:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:10:29 INFO loss_tracker.py:84 | Epoch[525/NA] Step[74] GlobalStep[71999/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:10:29 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 21:10:30 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_17 Rank[1/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[9/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[10/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[7/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[6/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[11/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[2/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[4/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[8/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[3/16] 06/24/2025 21:10:30 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[14/16] 06/24/2025 21:10:31 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[5/16] 06/24/2025 21:10:31 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[12/16] 06/24/2025 21:10:31 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[13/16] 06/24/2025 21:10:31 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to 
/job_data/checkpoints/checkpoint_17 Rank[15/16] 06/24/2025 21:10:31 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[0/16] 06/24/2025 21:10:31 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_17/model.safetensors Rank[0/16] 06/24/2025 21:10:32 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_17/optimizer.bin Rank[0/16] 06/24/2025 21:10:32 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_17/scheduler.bin Rank[0/16] 06/24/2025 21:10:32 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_17/sampler.bin Rank[0/16] 06/24/2025 21:10:32 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_17/random_states_0.pkl Rank[0/16] 06/24/2025 21:10:32 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_17/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 21:10:32 INFO checkpoint.py:110 | Save checkpoint at the end of step 71999 to /job_data/checkpoints/checkpoint_17 Rank[0/16] 06/24/2025 21:10:42 INFO stats.py:314 | Epoch[525] Step[99] GlobalStep[72024] Training Speed: 411.81 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:13:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:10:42 INFO loss_tracker.py:84 | Epoch[525/NA] Step[99] GlobalStep[72024/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:10:53 INFO stats.py:314 | Epoch[525] Step[124] GlobalStep[72049] Training Speed: 437.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:13:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:10:53 INFO loss_tracker.py:84 | Epoch[525/NA] Step[124] GlobalStep[72049/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:10:57 INFO stats.py:394 | Epoch[525] completed. Training Speed: 289.65 samples/sec across all devices. Epoch Time: 60.54 sec. Average Epoch Time: 60.54 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 3:13:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:11:03 INFO stats.py:314 | Epoch[526] Step[12] GlobalStep[72074] Training Speed: 434.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:13:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:11:08 INFO loss_tracker.py:84 | Epoch[526/NA] Step[24] GlobalStep[72086/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:11:14 INFO stats.py:314 | Epoch[526] Step[37] GlobalStep[72099] Training Speed: 425.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:12:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:11:19 INFO loss_tracker.py:84 | Epoch[526/NA] Step[49] GlobalStep[72111/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:11:24 INFO stats.py:314 | Epoch[526] Step[62] GlobalStep[72124] Training Speed: 425.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:12:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:11:29 INFO loss_tracker.py:84 | Epoch[526/NA] Step[74] GlobalStep[72136/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0125] total_loss[0.0179] Rank[0/16] 06/24/2025 21:11:34 INFO stats.py:314 | Epoch[526] Step[87] GlobalStep[72149] Training Speed: 399.93 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:12:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:11:39 INFO loss_tracker.py:84 | Epoch[526/NA] Step[99] GlobalStep[72161/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:11:44 INFO stats.py:314 | Epoch[526] Step[112] GlobalStep[72174] Training Speed: 435.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:12:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:11:49 INFO loss_tracker.py:84 | Epoch[526/NA] Step[124] GlobalStep[72186/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:11:54 INFO stats.py:394 | Epoch[526] completed. Training Speed: 311.79 samples/sec across all devices. Epoch Time: 56.24 sec. Average Epoch Time: 56.24 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:12:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:11:54 INFO stats.py:314 | Epoch[527] Step[0] GlobalStep[72199] Training Speed: 358.00 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 3:12:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:12:05 INFO loss_tracker.py:84 | Epoch[527/NA] Step[24] GlobalStep[72223/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:12:05 INFO stats.py:314 | Epoch[527] Step[25] GlobalStep[72224] Training Speed: 394.58 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:12:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:12:16 INFO loss_tracker.py:84 | Epoch[527/NA] Step[49] GlobalStep[72248/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:12:16 INFO stats.py:314 | Epoch[527] Step[50] GlobalStep[72249] Training Speed: 427.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:11:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:12:26 INFO loss_tracker.py:84 | Epoch[527/NA] Step[74] GlobalStep[72273/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:12:26 INFO stats.py:314 | Epoch[527] Step[75] GlobalStep[72274] Training Speed: 414.37 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:11:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:12:36 INFO loss_tracker.py:84 | Epoch[527/NA] Step[99] GlobalStep[72298/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:12:36 INFO stats.py:314 | Epoch[527] Step[100] GlobalStep[72299] Training Speed: 428.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:11:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:12:46 INFO loss_tracker.py:84 | Epoch[527/NA] Step[124] GlobalStep[72323/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:12:46 INFO stats.py:314 | Epoch[527] Step[125] GlobalStep[72324] Training Speed: 427.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:11:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:12:50 INFO stats.py:394 | Epoch[527] completed. Training Speed: 309.10 samples/sec across all devices. Epoch Time: 56.73 sec. Average Epoch Time: 56.73 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:11:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:12:58 INFO stats.py:314 | Epoch[528] Step[13] GlobalStep[72349] Training Speed: 400.02 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:11:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:13:02 INFO loss_tracker.py:84 | Epoch[528/NA] Step[24] GlobalStep[72360/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:13:08 INFO stats.py:314 | Epoch[528] Step[38] GlobalStep[72374] Training Speed: 429.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:11:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:13:13 INFO loss_tracker.py:84 | Epoch[528/NA] Step[49] GlobalStep[72385/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:13:19 INFO stats.py:314 | Epoch[528] Step[63] GlobalStep[72399] Training Speed: 411.65 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:10:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:13:23 INFO loss_tracker.py:84 | Epoch[528/NA] Step[74] GlobalStep[72410/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:13:29 INFO stats.py:314 | Epoch[528] Step[88] GlobalStep[72424] Training Speed: 440.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:10:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:13:34 INFO loss_tracker.py:84 | Epoch[528/NA] Step[99] GlobalStep[72435/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0127] total_loss[0.0170] Rank[0/16] 06/24/2025 21:13:40 INFO stats.py:314 | Epoch[528] Step[113] GlobalStep[72449] Training Speed: 437.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:10:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:13:44 INFO loss_tracker.py:84 | Epoch[528/NA] Step[124] GlobalStep[72460/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 21:13:48 INFO stats.py:394 | Epoch[528] completed. 
Training Speed: 303.11 samples/sec across all devices. Epoch Time: 57.85 sec. Average Epoch Time: 57.85 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:10:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:13:50 INFO stats.py:314 | Epoch[529] Step[1] GlobalStep[72474] Training Speed: 425.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:10:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:00 INFO loss_tracker.py:84 | Epoch[529/NA] Step[24] GlobalStep[72497/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:14:01 INFO stats.py:314 | Epoch[529] Step[26] GlobalStep[72499] Training Speed: 420.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:10:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:10 INFO loss_tracker.py:84 | Epoch[529/NA] Step[49] GlobalStep[72522/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:14:11 INFO stats.py:314 | Epoch[529] Step[51] GlobalStep[72524] Training Speed: 418.15 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:09:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:21 INFO loss_tracker.py:84 | Epoch[529/NA] Step[74] GlobalStep[72547/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:14:22 INFO stats.py:314 | Epoch[529] Step[76] GlobalStep[72549] Training Speed: 428.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:09:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:14:31 INFO loss_tracker.py:84 | Epoch[529/NA] Step[99] GlobalStep[72572/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:14:32 INFO stats.py:314 | Epoch[529] Step[101] GlobalStep[72574] Training Speed: 429.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:09:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:42 INFO loss_tracker.py:84 | Epoch[529/NA] Step[124] GlobalStep[72597/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:14:42 INFO stats.py:314 | Epoch[529] Step[126] GlobalStep[72599] Training Speed: 453.04 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:09:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:46 INFO stats.py:394 | Epoch[529] completed. Training Speed: 302.85 samples/sec across all devices. Epoch Time: 57.90 sec. Average Epoch Time: 57.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:09:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:53 INFO stats.py:314 | Epoch[530] Step[14] GlobalStep[72624] Training Speed: 429.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:09:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:14:58 INFO loss_tracker.py:84 | Epoch[530/NA] Step[24] GlobalStep[72634/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:15:04 INFO stats.py:314 | Epoch[530] Step[39] GlobalStep[72649] Training Speed: 430.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:09:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:15:08 INFO loss_tracker.py:84 | Epoch[530/NA] Step[49] GlobalStep[72659/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:15:14 INFO stats.py:314 | Epoch[530] Step[64] GlobalStep[72674] Training Speed: 430.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:08:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:15:19 INFO loss_tracker.py:84 | Epoch[530/NA] Step[74] GlobalStep[72684/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:15:25 INFO stats.py:314 | Epoch[530] Step[89] GlobalStep[72699] Training Speed: 433.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:08:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:15:29 INFO loss_tracker.py:84 | Epoch[530/NA] Step[99] GlobalStep[72709/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:15:35 INFO stats.py:314 | Epoch[530] Step[114] GlobalStep[72724] Training Speed: 422.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:08:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:15:39 INFO loss_tracker.py:84 | Epoch[530/NA] Step[124] GlobalStep[72734/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:15:43 INFO stats.py:394 | Epoch[530] completed. Training Speed: 306.21 samples/sec across all devices. Epoch Time: 57.27 sec. Average Epoch Time: 57.27 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:08:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:15:46 INFO stats.py:314 | Epoch[531] Step[2] GlobalStep[72749] Training Speed: 436.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:08:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:15:54 INFO loss_tracker.py:84 | Epoch[531/NA] Step[24] GlobalStep[72771/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:15:55 INFO stats.py:314 | Epoch[531] Step[27] GlobalStep[72774] Training Speed: 430.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:08:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:16:04 INFO loss_tracker.py:84 | Epoch[531/NA] Step[49] GlobalStep[72796/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:16:05 INFO stats.py:314 | Epoch[531] Step[52] GlobalStep[72799] Training Speed: 394.23 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:08:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:16:15 INFO loss_tracker.py:84 | Epoch[531/NA] Step[74] GlobalStep[72821/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:16:16 INFO stats.py:314 | Epoch[531] Step[77] GlobalStep[72824] Training Speed: 426.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:07:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:16:25 INFO loss_tracker.py:84 | Epoch[531/NA] Step[99] GlobalStep[72846/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:16:26 INFO stats.py:314 | Epoch[531] Step[102] GlobalStep[72849] Training Speed: 428.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:07:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:16:35 INFO loss_tracker.py:84 | Epoch[531/NA] Step[124] GlobalStep[72871/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:16:36 INFO stats.py:314 | Epoch[531] Step[127] GlobalStep[72874] Training Speed: 436.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:07:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:16:40 INFO stats.py:394 | Epoch[531] completed. Training Speed: 311.89 samples/sec across all devices. Epoch Time: 56.23 sec. Average Epoch Time: 56.23 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:07:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:16:47 INFO stats.py:314 | Epoch[532] Step[15] GlobalStep[72899] Training Speed: 440.85 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:07:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:16:51 INFO loss_tracker.py:84 | Epoch[532/NA] Step[24] GlobalStep[72908/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:16:57 INFO stats.py:314 | Epoch[532] Step[40] GlobalStep[72924] Training Speed: 435.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:07:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:17:01 INFO loss_tracker.py:84 | Epoch[532/NA] Step[49] GlobalStep[72933/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:17:08 INFO stats.py:314 | Epoch[532] Step[65] GlobalStep[72949] Training Speed: 431.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:07:03. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:17:11 INFO loss_tracker.py:84 | Epoch[532/NA] Step[74] GlobalStep[72958/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:17:18 INFO stats.py:314 | Epoch[532] Step[90] GlobalStep[72974] Training Speed: 424.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:06:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:17:22 INFO loss_tracker.py:84 | Epoch[532/NA] Step[99] GlobalStep[72983/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:17:28 INFO stats.py:314 | Epoch[532] Step[115] GlobalStep[72999] Training Speed: 429.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:06:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:17:32 INFO loss_tracker.py:84 | Epoch[532/NA] Step[124] GlobalStep[73008/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 21:17:36 INFO stats.py:394 | Epoch[532] completed. Training Speed: 308.20 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:06:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:17:39 INFO stats.py:314 | Epoch[533] Step[3] GlobalStep[73024] Training Speed: 416.95 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:06:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:17:48 INFO loss_tracker.py:84 | Epoch[533/NA] Step[24] GlobalStep[73045/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:17:50 INFO stats.py:314 | Epoch[533] Step[28] GlobalStep[73049] Training Speed: 424.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:06:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:17:58 INFO loss_tracker.py:84 | Epoch[533/NA] Step[49] GlobalStep[73070/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 21:18:00 INFO stats.py:314 | Epoch[533] Step[53] GlobalStep[73074] Training Speed: 432.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:06:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:18:09 INFO loss_tracker.py:84 | Epoch[533/NA] Step[74] GlobalStep[73095/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:18:11 INFO stats.py:314 | Epoch[533] Step[78] GlobalStep[73099] Training Speed: 429.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:06:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:18:19 INFO loss_tracker.py:84 | Epoch[533/NA] Step[99] GlobalStep[73120/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:18:21 INFO stats.py:314 | Epoch[533] Step[103] GlobalStep[73124] Training Speed: 427.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:05:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:18:29 INFO loss_tracker.py:84 | Epoch[533/NA] Step[124] GlobalStep[73145/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:18:31 INFO stats.py:314 | Epoch[533] Step[128] GlobalStep[73149] Training Speed: 416.29 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:05:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:18:34 INFO stats.py:394 | Epoch[533] completed. Training Speed: 305.33 samples/sec across all devices. Epoch Time: 57.43 sec. Average Epoch Time: 57.43 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:05:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:18:42 INFO stats.py:314 | Epoch[534] Step[16] GlobalStep[73174] Training Speed: 418.91 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:05:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:18:45 INFO loss_tracker.py:84 | Epoch[534/NA] Step[24] GlobalStep[73182/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:18:52 INFO stats.py:314 | Epoch[534] Step[41] GlobalStep[73199] Training Speed: 427.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:05:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:18:56 INFO loss_tracker.py:84 | Epoch[534/NA] Step[49] GlobalStep[73207/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:19:02 INFO stats.py:314 | Epoch[534] Step[66] GlobalStep[73224] Training Speed: 434.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:05:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:19:06 INFO loss_tracker.py:84 | Epoch[534/NA] Step[74] GlobalStep[73232/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:19:13 INFO stats.py:314 | Epoch[534] Step[91] GlobalStep[73249] Training Speed: 435.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:04:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:19:16 INFO loss_tracker.py:84 | Epoch[534/NA] Step[99] GlobalStep[73257/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 21:19:23 INFO stats.py:314 | Epoch[534] Step[116] GlobalStep[73274] Training Speed: 433.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:04:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:19:26 INFO loss_tracker.py:84 | Epoch[534/NA] Step[124] GlobalStep[73282/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 21:19:30 INFO stats.py:394 | Epoch[534] completed. Training Speed: 309.65 samples/sec across all devices. Epoch Time: 56.63 sec. Average Epoch Time: 56.63 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:04:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:19:33 INFO stats.py:314 | Epoch[535] Step[4] GlobalStep[73299] Training Speed: 428.97 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:04:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:19:42 INFO loss_tracker.py:84 | Epoch[535/NA] Step[24] GlobalStep[73319/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:19:44 INFO stats.py:314 | Epoch[535] Step[29] GlobalStep[73324] Training Speed: 428.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:04:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:19:53 INFO loss_tracker.py:84 | Epoch[535/NA] Step[49] GlobalStep[73344/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:19:54 INFO stats.py:314 | Epoch[535] Step[54] GlobalStep[73349] Training Speed: 400.61 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:04:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:20:03 INFO loss_tracker.py:84 | Epoch[535/NA] Step[74] GlobalStep[73369/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 21:20:05 INFO stats.py:314 | Epoch[535] Step[79] GlobalStep[73374] Training Speed: 437.80 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:04:07. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:20:13 INFO loss_tracker.py:84 | Epoch[535/NA] Step[99] GlobalStep[73394/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:20:15 INFO stats.py:314 | Epoch[535] Step[104] GlobalStep[73399] Training Speed: 440.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:03:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:20:24 INFO loss_tracker.py:84 | Epoch[535/NA] Step[124] GlobalStep[73419/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:20:25 INFO stats.py:314 | Epoch[535] Step[129] GlobalStep[73424] Training Speed: 450.79 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 3:03:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:20:28 INFO stats.py:394 | Epoch[535] completed. Training Speed: 306.21 samples/sec across all devices. Epoch Time: 57.27 sec. Average Epoch Time: 57.27 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:03:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:20:36 INFO stats.py:314 | Epoch[536] Step[17] GlobalStep[73449] Training Speed: 398.97 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 3:03:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:20:39 INFO loss_tracker.py:84 | Epoch[536/NA] Step[24] GlobalStep[73456/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:20:47 INFO stats.py:314 | Epoch[536] Step[42] GlobalStep[73474] Training Speed: 429.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:03:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:20:50 INFO loss_tracker.py:84 | Epoch[536/NA] Step[49] GlobalStep[73481/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:20:57 INFO stats.py:314 | Epoch[536] Step[67] GlobalStep[73499] Training Speed: 427.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:03:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:21:00 INFO loss_tracker.py:84 | Epoch[536/NA] Step[74] GlobalStep[73506/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:21:07 INFO stats.py:314 | Epoch[536] Step[92] GlobalStep[73524] Training Speed: 409.16 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:03:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:21:11 INFO loss_tracker.py:84 | Epoch[536/NA] Step[99] GlobalStep[73531/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:21:18 INFO stats.py:314 | Epoch[536] Step[117] GlobalStep[73549] Training Speed: 420.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:02:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:21:20 INFO loss_tracker.py:84 | Epoch[536/NA] Step[124] GlobalStep[73556/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:21:25 INFO stats.py:394 | Epoch[536] completed. Training Speed: 306.16 samples/sec across all devices. Epoch Time: 57.28 sec. Average Epoch Time: 57.28 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 3:02:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:21:29 INFO stats.py:314 | Epoch[537] Step[5] GlobalStep[73574] Training Speed: 431.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:02:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:21:36 INFO loss_tracker.py:84 | Epoch[537/NA] Step[24] GlobalStep[73593/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0174] Rank[0/16] 06/24/2025 21:21:39 INFO stats.py:314 | Epoch[537] Step[30] GlobalStep[73599] Training Speed: 409.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 3:02:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:21:47 INFO loss_tracker.py:84 | Epoch[537/NA] Step[49] GlobalStep[73618/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:21:49 INFO stats.py:314 | Epoch[537] Step[55] GlobalStep[73624] Training Speed: 430.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:02:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:21:57 INFO loss_tracker.py:84 | Epoch[537/NA] Step[74] GlobalStep[73643/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 21:21:59 INFO stats.py:314 | Epoch[537] Step[80] GlobalStep[73649] Training Speed: 432.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:02:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:22:07 INFO loss_tracker.py:84 | Epoch[537/NA] Step[99] GlobalStep[73668/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:22:10 INFO stats.py:314 | Epoch[537] Step[105] GlobalStep[73674] Training Speed: 426.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:02:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:22:17 INFO loss_tracker.py:84 | Epoch[537/NA] Step[124] GlobalStep[73693/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:22:19 INFO stats.py:314 | Epoch[537] Step[130] GlobalStep[73699] Training Speed: 437.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:01:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:22:22 INFO stats.py:394 | Epoch[537] completed. Training Speed: 308.87 samples/sec across all devices. Epoch Time: 56.77 sec. Average Epoch Time: 56.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:01:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:22:30 INFO stats.py:314 | Epoch[538] Step[18] GlobalStep[73724] Training Speed: 438.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:01:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:22:33 INFO loss_tracker.py:84 | Epoch[538/NA] Step[24] GlobalStep[73730/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:22:41 INFO stats.py:314 | Epoch[538] Step[43] GlobalStep[73749] Training Speed: 428.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:01:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:22:43 INFO loss_tracker.py:84 | Epoch[538/NA] Step[49] GlobalStep[73755/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:22:51 INFO stats.py:314 | Epoch[538] Step[68] GlobalStep[73774] Training Speed: 435.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:01:21. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:22:53 INFO loss_tracker.py:84 | Epoch[538/NA] Step[74] GlobalStep[73780/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 21:23:01 INFO stats.py:314 | Epoch[538] Step[93] GlobalStep[73799] Training Speed: 428.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:01:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:23:04 INFO loss_tracker.py:84 | Epoch[538/NA] Step[99] GlobalStep[73805/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:23:11 INFO stats.py:314 | Epoch[538] Step[118] GlobalStep[73824] Training Speed: 430.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:01:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:23:14 INFO loss_tracker.py:84 | Epoch[538/NA] Step[124] GlobalStep[73830/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:23:18 INFO stats.py:394 | Epoch[538] completed. Training Speed: 309.75 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 3:00:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:23:22 INFO stats.py:314 | Epoch[539] Step[6] GlobalStep[73849] Training Speed: 419.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:00:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:23:30 INFO loss_tracker.py:84 | Epoch[539/NA] Step[24] GlobalStep[73867/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:23:33 INFO stats.py:314 | Epoch[539] Step[31] GlobalStep[73874] Training Speed: 435.40 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:00:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:23:40 INFO loss_tracker.py:84 | Epoch[539/NA] Step[49] GlobalStep[73892/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 21:23:43 INFO stats.py:314 | Epoch[539] Step[56] GlobalStep[73899] Training Speed: 434.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:00:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:23:50 INFO loss_tracker.py:84 | Epoch[539/NA] Step[74] GlobalStep[73917/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0126] total_loss[0.0159] Rank[0/16] 06/24/2025 21:23:53 INFO stats.py:314 | Epoch[539] Step[81] GlobalStep[73924] Training Speed: 434.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 3:00:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:01 INFO loss_tracker.py:84 | Epoch[539/NA] Step[99] GlobalStep[73942/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 21:24:03 INFO stats.py:314 | Epoch[539] Step[106] GlobalStep[73949] Training Speed: 423.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 3:00:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:11 INFO loss_tracker.py:84 | Epoch[539/NA] Step[124] GlobalStep[73967/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:24:13 INFO stats.py:314 | Epoch[539] Step[131] GlobalStep[73974] Training Speed: 440.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:59:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:15 INFO stats.py:394 | Epoch[539] completed. Training Speed: 308.68 samples/sec across all devices. Epoch Time: 56.81 sec. Average Epoch Time: 56.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:59:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:24:24 INFO stats.py:314 | Epoch[540] Step[19] GlobalStep[73999] Training Speed: 399.42 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:59:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:27 INFO loss_tracker.py:84 | Epoch[540/NA] Step[24] GlobalStep[74004/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:24:35 INFO stats.py:314 | Epoch[540] Step[44] GlobalStep[74024] Training Speed: 423.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:59:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:37 INFO loss_tracker.py:84 | Epoch[540/NA] Step[49] GlobalStep[74029/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:24:45 INFO stats.py:314 | Epoch[540] Step[69] GlobalStep[74049] Training Speed: 432.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:59:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:47 INFO loss_tracker.py:84 | Epoch[540/NA] Step[74] GlobalStep[74054/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:24:55 INFO stats.py:314 | Epoch[540] Step[94] GlobalStep[74074] Training Speed: 434.69 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:59:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:24:57 INFO loss_tracker.py:84 | Epoch[540/NA] Step[99] GlobalStep[74079/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 21:25:05 INFO stats.py:314 | Epoch[540] Step[119] GlobalStep[74099] Training Speed: 437.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:59:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:25:07 INFO loss_tracker.py:84 | Epoch[540/NA] Step[124] GlobalStep[74104/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:25:11 INFO stats.py:394 | Epoch[540] completed. Training Speed: 312.49 samples/sec across all devices. Epoch Time: 56.12 sec. Average Epoch Time: 56.12 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:58:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:25:16 INFO stats.py:314 | Epoch[541] Step[7] GlobalStep[74124] Training Speed: 428.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:58:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:25:23 INFO loss_tracker.py:84 | Epoch[541/NA] Step[24] GlobalStep[74141/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:25:27 INFO stats.py:314 | Epoch[541] Step[32] GlobalStep[74149] Training Speed: 428.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:58:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:25:33 INFO loss_tracker.py:84 | Epoch[541/NA] Step[49] GlobalStep[74166/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:25:37 INFO stats.py:314 | Epoch[541] Step[57] GlobalStep[74174] Training Speed: 427.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:58:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:25:45 INFO loss_tracker.py:84 | Epoch[541/NA] Step[74] GlobalStep[74191/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:25:48 INFO stats.py:314 | Epoch[541] Step[82] GlobalStep[74199] Training Speed: 432.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:58:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:25:54 INFO loss_tracker.py:84 | Epoch[541/NA] Step[99] GlobalStep[74216/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:25:58 INFO stats.py:314 | Epoch[541] Step[107] GlobalStep[74224] Training Speed: 437.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:58:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:26:05 INFO loss_tracker.py:84 | Epoch[541/NA] Step[124] GlobalStep[74241/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 21:26:08 INFO stats.py:314 | Epoch[541] Step[132] GlobalStep[74249] Training Speed: 450.47 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:58:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:26:09 INFO stats.py:394 | Epoch[541] completed. Training Speed: 304.09 samples/sec across all devices. Epoch Time: 57.67 sec. Average Epoch Time: 57.67 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:58:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:26:18 INFO stats.py:314 | Epoch[542] Step[20] GlobalStep[74274] Training Speed: 432.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:57:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:26:20 INFO loss_tracker.py:84 | Epoch[542/NA] Step[24] GlobalStep[74278/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 21:26:29 INFO stats.py:314 | Epoch[542] Step[45] GlobalStep[74299] Training Speed: 254.85 samples/sec across all devices. Average Step Time: 0.50 sec. Estimated Remaining Time: 2:57:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:26:31 INFO loss_tracker.py:84 | Epoch[542/NA] Step[49] GlobalStep[74303/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:26:39 INFO stats.py:314 | Epoch[542] Step[70] GlobalStep[74324] Training Speed: 436.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:57:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:26:41 INFO loss_tracker.py:84 | Epoch[542/NA] Step[74] GlobalStep[74328/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0056] loss_depth[0.0126] total_loss[0.0182] Rank[0/16] 06/24/2025 21:26:50 INFO stats.py:314 | Epoch[542] Step[95] GlobalStep[74349] Training Speed: 425.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:57:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:26:51 INFO loss_tracker.py:84 | Epoch[542/NA] Step[99] GlobalStep[74353/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0126] total_loss[0.0158] Rank[0/16] 06/24/2025 21:27:00 INFO stats.py:314 | Epoch[542] Step[120] GlobalStep[74374] Training Speed: 454.51 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:57:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:27:01 INFO loss_tracker.py:84 | Epoch[542/NA] Step[124] GlobalStep[74378/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:27:06 INFO stats.py:394 | Epoch[542] completed. Training Speed: 309.28 samples/sec across all devices. Epoch Time: 56.70 sec. Average Epoch Time: 56.70 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:57:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:27:11 INFO stats.py:314 | Epoch[543] Step[8] GlobalStep[74399] Training Speed: 432.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:57:02. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:27:17 INFO loss_tracker.py:84 | Epoch[543/NA] Step[24] GlobalStep[74415/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:27:20 INFO stats.py:314 | Epoch[543] Step[33] GlobalStep[74424] Training Speed: 434.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:56:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:27:27 INFO loss_tracker.py:84 | Epoch[543/NA] Step[49] GlobalStep[74440/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0054] loss_depth[0.0126] total_loss[0.0180] Rank[0/16] 06/24/2025 21:27:31 INFO stats.py:314 | Epoch[543] Step[58] GlobalStep[74449] Training Speed: 413.67 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:56:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:27:38 INFO loss_tracker.py:84 | Epoch[543/NA] Step[74] GlobalStep[74465/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:27:41 INFO stats.py:314 | Epoch[543] Step[83] GlobalStep[74474] Training Speed: 424.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:56:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:27:48 INFO loss_tracker.py:84 | Epoch[543/NA] Step[99] GlobalStep[74490/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:27:52 INFO stats.py:314 | Epoch[543] Step[108] GlobalStep[74499] Training Speed: 420.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:56:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:27:58 INFO loss_tracker.py:84 | Epoch[543/NA] Step[124] GlobalStep[74515/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:28:02 INFO stats.py:314 | Epoch[543] Step[133] GlobalStep[74524] Training Speed: 433.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:56:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:28:03 INFO stats.py:394 | Epoch[543] completed. Training Speed: 306.26 samples/sec across all devices. Epoch Time: 57.26 sec. Average Epoch Time: 57.26 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:56:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:28:13 INFO stats.py:314 | Epoch[544] Step[21] GlobalStep[74549] Training Speed: 437.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:55:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:28:14 INFO loss_tracker.py:84 | Epoch[544/NA] Step[24] GlobalStep[74552/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:28:23 INFO stats.py:314 | Epoch[544] Step[46] GlobalStep[74574] Training Speed: 434.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:55:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:28:25 INFO loss_tracker.py:84 | Epoch[544/NA] Step[49] GlobalStep[74577/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 21:28:33 INFO stats.py:314 | Epoch[544] Step[71] GlobalStep[74599] Training Speed: 437.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:55:39. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:28:35 INFO loss_tracker.py:84 | Epoch[544/NA] Step[74] GlobalStep[74602/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:28:44 INFO stats.py:314 | Epoch[544] Step[96] GlobalStep[74624] Training Speed: 423.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:55:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:28:45 INFO loss_tracker.py:84 | Epoch[544/NA] Step[99] GlobalStep[74627/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:28:54 INFO stats.py:314 | Epoch[544] Step[121] GlobalStep[74649] Training Speed: 434.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:55:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:28:55 INFO loss_tracker.py:84 | Epoch[544/NA] Step[124] GlobalStep[74652/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:28:59 INFO stats.py:394 | Epoch[544] completed. Training Speed: 312.33 samples/sec across all devices. Epoch Time: 56.15 sec. Average Epoch Time: 56.15 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:55:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:29:04 INFO stats.py:314 | Epoch[545] Step[9] GlobalStep[74674] Training Speed: 429.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:55:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:29:11 INFO loss_tracker.py:84 | Epoch[545/NA] Step[24] GlobalStep[74689/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:29:15 INFO stats.py:314 | Epoch[545] Step[34] GlobalStep[74699] Training Speed: 433.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:54:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:29:21 INFO loss_tracker.py:84 | Epoch[545/NA] Step[49] GlobalStep[74714/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:29:25 INFO stats.py:314 | Epoch[545] Step[59] GlobalStep[74724] Training Speed: 430.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:54:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:29:31 INFO loss_tracker.py:84 | Epoch[545/NA] Step[74] GlobalStep[74739/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:29:35 INFO stats.py:314 | Epoch[545] Step[84] GlobalStep[74749] Training Speed: 431.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:54:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:29:41 INFO loss_tracker.py:84 | Epoch[545/NA] Step[99] GlobalStep[74764/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:29:45 INFO stats.py:314 | Epoch[545] Step[109] GlobalStep[74774] Training Speed: 419.27 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:54:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:29:51 INFO loss_tracker.py:84 | Epoch[545/NA] Step[124] GlobalStep[74789/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:29:55 INFO stats.py:314 | Epoch[545] Step[134] GlobalStep[74799] Training Speed: 445.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:54:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:29:55 INFO stats.py:394 | Epoch[545] completed. Training Speed: 311.02 samples/sec across all devices. Epoch Time: 56.38 sec. Average Epoch Time: 56.38 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:54:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:30:06 INFO stats.py:314 | Epoch[546] Step[22] GlobalStep[74824] Training Speed: 430.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:54:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:30:07 INFO loss_tracker.py:84 | Epoch[546/NA] Step[24] GlobalStep[74826/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:30:17 INFO stats.py:314 | Epoch[546] Step[47] GlobalStep[74849] Training Speed: 425.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:53:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:30:18 INFO loss_tracker.py:84 | Epoch[546/NA] Step[49] GlobalStep[74851/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:30:27 INFO stats.py:314 | Epoch[546] Step[72] GlobalStep[74874] Training Speed: 430.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:53:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:30:28 INFO loss_tracker.py:84 | Epoch[546/NA] Step[74] GlobalStep[74876/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:30:37 INFO stats.py:314 | Epoch[546] Step[97] GlobalStep[74899] Training Speed: 433.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:53:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:30:38 INFO loss_tracker.py:84 | Epoch[546/NA] Step[99] GlobalStep[74901/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:30:47 INFO stats.py:314 | Epoch[546] Step[122] GlobalStep[74924] Training Speed: 454.31 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:53:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:30:48 INFO loss_tracker.py:84 | Epoch[546/NA] Step[124] GlobalStep[74926/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:30:53 INFO stats.py:394 | Epoch[546] completed. Training Speed: 306.23 samples/sec across all devices. Epoch Time: 57.26 sec. Average Epoch Time: 57.26 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:53:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:30:58 INFO stats.py:314 | Epoch[547] Step[10] GlobalStep[74949] Training Speed: 418.66 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:53:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:31:04 INFO loss_tracker.py:84 | Epoch[547/NA] Step[24] GlobalStep[74963/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:31:09 INFO stats.py:314 | Epoch[547] Step[35] GlobalStep[74974] Training Speed: 423.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:53:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:31:15 INFO loss_tracker.py:84 | Epoch[547/NA] Step[49] GlobalStep[74988/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:31:19 INFO stats.py:314 | Epoch[547] Step[60] GlobalStep[74999] Training Speed: 411.40 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:52:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:31:25 INFO loss_tracker.py:84 | Epoch[547/NA] Step[74] GlobalStep[75013/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 21:31:29 INFO stats.py:314 | Epoch[547] Step[85] GlobalStep[75024] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:52:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:31:35 INFO loss_tracker.py:84 | Epoch[547/NA] Step[99] GlobalStep[75038/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:31:39 INFO stats.py:314 | Epoch[547] Step[110] GlobalStep[75049] Training Speed: 423.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:52:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:31:45 INFO loss_tracker.py:84 | Epoch[547/NA] Step[124] GlobalStep[75063/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:31:49 INFO stats.py:314 | Epoch[547] Step[135] GlobalStep[75074] Training Speed: 453.11 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:52:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:31:50 INFO stats.py:394 | Epoch[547] completed. Training Speed: 308.43 samples/sec across all devices. Epoch Time: 56.86 sec. Average Epoch Time: 56.86 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:52:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:32:00 INFO stats.py:314 | Epoch[548] Step[23] GlobalStep[75099] Training Speed: 441.05 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:52:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:32:01 INFO loss_tracker.py:84 | Epoch[548/NA] Step[24] GlobalStep[75100/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:32:11 INFO stats.py:314 | Epoch[548] Step[48] GlobalStep[75124] Training Speed: 428.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:52:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:32:11 INFO loss_tracker.py:84 | Epoch[548/NA] Step[49] GlobalStep[75125/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:32:21 INFO stats.py:314 | Epoch[548] Step[73] GlobalStep[75149] Training Speed: 433.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:51:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:32:21 INFO loss_tracker.py:84 | Epoch[548/NA] Step[74] GlobalStep[75150/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:32:31 INFO stats.py:314 | Epoch[548] Step[98] GlobalStep[75174] Training Speed: 424.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:51:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:32:32 INFO loss_tracker.py:84 | Epoch[548/NA] Step[99] GlobalStep[75175/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:32:41 INFO stats.py:314 | Epoch[548] Step[123] GlobalStep[75199] Training Speed: 452.07 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:51:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:32:42 INFO loss_tracker.py:84 | Epoch[548/NA] Step[124] GlobalStep[75200/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 21:32:46 INFO stats.py:394 | Epoch[548] completed. Training Speed: 310.41 samples/sec across all devices. Epoch Time: 56.49 sec. Average Epoch Time: 56.49 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:51:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:32:52 INFO stats.py:314 | Epoch[549] Step[11] GlobalStep[75224] Training Speed: 431.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:51:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:32:58 INFO loss_tracker.py:84 | Epoch[549/NA] Step[24] GlobalStep[75237/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:33:02 INFO stats.py:314 | Epoch[549] Step[36] GlobalStep[75249] Training Speed: 433.97 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:51:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:33:08 INFO loss_tracker.py:84 | Epoch[549/NA] Step[49] GlobalStep[75262/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 21:33:13 INFO stats.py:314 | Epoch[549] Step[61] GlobalStep[75274] Training Speed: 428.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:50:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:33:18 INFO loss_tracker.py:84 | Epoch[549/NA] Step[74] GlobalStep[75287/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 21:33:23 INFO stats.py:314 | Epoch[549] Step[86] GlobalStep[75299] Training Speed: 429.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:50:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:33:29 INFO loss_tracker.py:84 | Epoch[549/NA] Step[99] GlobalStep[75312/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:33:34 INFO stats.py:314 | Epoch[549] Step[111] GlobalStep[75324] Training Speed: 413.00 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:50:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:33:39 INFO loss_tracker.py:84 | Epoch[549/NA] Step[124] GlobalStep[75337/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 21:33:44 INFO stats.py:314 | Epoch[549] Step[136] GlobalStep[75349] Training Speed: 432.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:50:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:33:44 INFO stats.py:394 | Epoch[549] completed. Training Speed: 304.37 samples/sec across all devices. Epoch Time: 57.61 sec. Average Epoch Time: 57.61 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:50:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:33:55 INFO stats.py:314 | Epoch[550] Step[24] GlobalStep[75374] Training Speed: 434.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:50:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:33:55 INFO loss_tracker.py:84 | Epoch[550/NA] Step[24] GlobalStep[75374/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 21:34:05 INFO stats.py:314 | Epoch[550] Step[49] GlobalStep[75399] Training Speed: 436.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:50:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:34:06 INFO loss_tracker.py:84 | Epoch[550/NA] Step[49] GlobalStep[75399/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:34:16 INFO stats.py:314 | Epoch[550] Step[74] GlobalStep[75424] Training Speed: 426.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:49:56. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:34:16 INFO loss_tracker.py:84 | Epoch[550/NA] Step[74] GlobalStep[75424/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0126] total_loss[0.0179] Rank[0/16] 06/24/2025 21:34:26 INFO stats.py:314 | Epoch[550] Step[99] GlobalStep[75449] Training Speed: 428.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:49:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:34:26 INFO loss_tracker.py:84 | Epoch[550/NA] Step[99] GlobalStep[75449/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:34:37 INFO stats.py:314 | Epoch[550] Step[124] GlobalStep[75474] Training Speed: 439.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:49:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:34:37 INFO loss_tracker.py:84 | Epoch[550/NA] Step[124] GlobalStep[75474/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 21:34:41 INFO stats.py:394 | Epoch[550] completed. Training Speed: 305.34 samples/sec across all devices. Epoch Time: 57.43 sec. Average Epoch Time: 57.43 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:49:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:34:48 INFO stats.py:314 | Epoch[551] Step[12] GlobalStep[75499] Training Speed: 429.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:49:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:34:53 INFO loss_tracker.py:84 | Epoch[551/NA] Step[24] GlobalStep[75511/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:34:58 INFO stats.py:314 | Epoch[551] Step[37] GlobalStep[75524] Training Speed: 428.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:49:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:35:03 INFO loss_tracker.py:84 | Epoch[551/NA] Step[49] GlobalStep[75536/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 21:35:09 INFO stats.py:314 | Epoch[551] Step[62] GlobalStep[75549] Training Speed: 438.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:49:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:35:14 INFO loss_tracker.py:84 | Epoch[551/NA] Step[74] GlobalStep[75561/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:35:19 INFO stats.py:314 | Epoch[551] Step[87] GlobalStep[75574] Training Speed: 435.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:48:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:35:24 INFO loss_tracker.py:84 | Epoch[551/NA] Step[99] GlobalStep[75586/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:35:29 INFO stats.py:314 | Epoch[551] Step[112] GlobalStep[75599] Training Speed: 436.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:48:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:35:34 INFO loss_tracker.py:84 | Epoch[551/NA] Step[124] GlobalStep[75611/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:35:38 INFO stats.py:394 | Epoch[551] completed. Training Speed: 306.07 samples/sec across all devices. Epoch Time: 57.29 sec. Average Epoch Time: 57.29 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:48:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:35:40 INFO stats.py:314 | Epoch[552] Step[0] GlobalStep[75624] Training Speed: 332.10 samples/sec across all devices. Average Step Time: 0.39 sec. Estimated Remaining Time: 2:48:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:35:50 INFO loss_tracker.py:84 | Epoch[552/NA] Step[24] GlobalStep[75648/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:35:50 INFO stats.py:314 | Epoch[552] Step[25] GlobalStep[75649] Training Speed: 420.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:48:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:00 INFO loss_tracker.py:84 | Epoch[552/NA] Step[49] GlobalStep[75673/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:36:01 INFO stats.py:314 | Epoch[552] Step[50] GlobalStep[75674] Training Speed: 403.05 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:48:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:11 INFO loss_tracker.py:84 | Epoch[552/NA] Step[74] GlobalStep[75698/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 21:36:11 INFO stats.py:314 | Epoch[552] Step[75] GlobalStep[75699] Training Speed: 422.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:48:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:20 INFO loss_tracker.py:84 | Epoch[552/NA] Step[99] GlobalStep[75723/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 21:36:21 INFO stats.py:314 | Epoch[552] Step[100] GlobalStep[75724] Training Speed: 429.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:47:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:36:31 INFO loss_tracker.py:84 | Epoch[552/NA] Step[124] GlobalStep[75748/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:36:31 INFO stats.py:314 | Epoch[552] Step[125] GlobalStep[75749] Training Speed: 437.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:47:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:35 INFO stats.py:394 | Epoch[552] completed. Training Speed: 309.96 samples/sec across all devices. Epoch Time: 56.57 sec. Average Epoch Time: 56.57 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:47:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:41 INFO stats.py:314 | Epoch[553] Step[13] GlobalStep[75774] Training Speed: 415.78 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:47:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:47 INFO loss_tracker.py:84 | Epoch[553/NA] Step[24] GlobalStep[75785/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:36:53 INFO stats.py:314 | Epoch[553] Step[38] GlobalStep[75799] Training Speed: 422.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:47:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:36:57 INFO loss_tracker.py:84 | Epoch[553/NA] Step[49] GlobalStep[75810/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:37:02 INFO stats.py:314 | Epoch[553] Step[63] GlobalStep[75824] Training Speed: 433.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:47:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:37:08 INFO loss_tracker.py:84 | Epoch[553/NA] Step[74] GlobalStep[75835/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:37:14 INFO stats.py:314 | Epoch[553] Step[88] GlobalStep[75849] Training Speed: 440.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:47:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:37:18 INFO loss_tracker.py:84 | Epoch[553/NA] Step[99] GlobalStep[75860/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:37:23 INFO stats.py:314 | Epoch[553] Step[113] GlobalStep[75874] Training Speed: 395.21 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:46:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:37:28 INFO loss_tracker.py:84 | Epoch[553/NA] Step[124] GlobalStep[75885/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 21:37:32 INFO stats.py:394 | Epoch[553] completed. Training Speed: 305.54 samples/sec across all devices. Epoch Time: 57.39 sec. Average Epoch Time: 57.39 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:46:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:37:34 INFO stats.py:314 | Epoch[554] Step[1] GlobalStep[75899] Training Speed: 433.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:46:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:37:43 INFO loss_tracker.py:84 | Epoch[554/NA] Step[24] GlobalStep[75922/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:37:44 INFO stats.py:314 | Epoch[554] Step[26] GlobalStep[75924] Training Speed: 420.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:46:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:37:54 INFO loss_tracker.py:84 | Epoch[554/NA] Step[49] GlobalStep[75947/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:37:55 INFO stats.py:314 | Epoch[554] Step[51] GlobalStep[75949] Training Speed: 427.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:46:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:38:04 INFO loss_tracker.py:84 | Epoch[554/NA] Step[74] GlobalStep[75972/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 21:38:05 INFO stats.py:314 | Epoch[554] Step[76] GlobalStep[75974] Training Speed: 234.76 samples/sec across all devices. Average Step Time: 0.55 sec. Estimated Remaining Time: 2:46:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:38:15 INFO loss_tracker.py:84 | Epoch[554/NA] Step[99] GlobalStep[75997/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0174] Rank[0/16] 06/24/2025 21:38:16 INFO stats.py:314 | Epoch[554] Step[101] GlobalStep[75999] Training Speed: 429.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:45:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:38:16 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 21:38:17 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_18 Rank[5/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[10/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[12/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[8/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[7/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[11/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[9/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[15/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[3/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[14/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[2/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[13/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[6/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[1/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18
Rank[4/16] 06/24/2025 21:38:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[0/16] 06/24/2025 21:38:18 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_18/model.safetensors Rank[0/16] 06/24/2025 21:38:19 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_18/optimizer.bin Rank[0/16] 06/24/2025 21:38:19 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_18/scheduler.bin Rank[0/16] 06/24/2025 21:38:19 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_18/sampler.bin Rank[0/16] 06/24/2025 21:38:19 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_18/random_states_0.pkl Rank[0/16] 06/24/2025 21:38:19 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_18/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 21:38:19 INFO checkpoint.py:110 | Save checkpoint at the end of step 75999 to /job_data/checkpoints/checkpoint_18 Rank[0/16] 06/24/2025 21:38:28 INFO loss_tracker.py:84 | Epoch[554/NA] Step[124] GlobalStep[76022/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0126] total_loss[0.0158] Rank[0/16] 06/24/2025 21:38:29 INFO stats.py:314 | Epoch[554] Step[126] GlobalStep[76024] Training Speed: 455.07 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:45:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:38:32 INFO stats.py:394 | Epoch[554] completed. Training Speed: 292.23 samples/sec across all devices. Epoch Time: 60.01 sec. Average Epoch Time: 60.01 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 2:45:44. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:38:39 INFO stats.py:314 | Epoch[555] Step[14] GlobalStep[76049] Training Speed: 426.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:45:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:38:43 INFO loss_tracker.py:84 | Epoch[555/NA] Step[24] GlobalStep[76059/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:38:50 INFO stats.py:314 | Epoch[555] Step[39] GlobalStep[76074] Training Speed: 438.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:45:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:38:54 INFO loss_tracker.py:84 | Epoch[555/NA] Step[49] GlobalStep[76084/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:39:00 INFO stats.py:314 | Epoch[555] Step[64] GlobalStep[76099] Training Speed: 432.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:45:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:39:04 INFO loss_tracker.py:84 | Epoch[555/NA] Step[74] GlobalStep[76109/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0126] total_loss[0.0178] Rank[0/16] 06/24/2025 21:39:10 INFO stats.py:314 | Epoch[555] Step[89] GlobalStep[76124] Training Speed: 435.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:45:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:39:15 INFO loss_tracker.py:84 | Epoch[555/NA] Step[99] GlobalStep[76134/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 21:39:21 INFO stats.py:314 | Epoch[555] Step[114] GlobalStep[76149] Training Speed: 417.03 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:44:57. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:39:24 INFO loss_tracker.py:84 | Epoch[555/NA] Step[124] GlobalStep[76159/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0127] total_loss[0.0166] Rank[0/16] 06/24/2025 21:39:29 INFO stats.py:394 | Epoch[555] completed. Training Speed: 312.40 samples/sec across all devices. Epoch Time: 56.13 sec. Average Epoch Time: 56.13 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:44:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:39:31 INFO stats.py:314 | Epoch[556] Step[2] GlobalStep[76174] Training Speed: 424.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:44:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:39:40 INFO loss_tracker.py:84 | Epoch[556/NA] Step[24] GlobalStep[76196/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:39:41 INFO stats.py:314 | Epoch[556] Step[27] GlobalStep[76199] Training Speed: 420.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:44:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:39:50 INFO loss_tracker.py:84 | Epoch[556/NA] Step[49] GlobalStep[76221/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:39:51 INFO stats.py:314 | Epoch[556] Step[52] GlobalStep[76224] Training Speed: 436.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:44:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:40:00 INFO loss_tracker.py:84 | Epoch[556/NA] Step[74] GlobalStep[76246/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 21:40:01 INFO stats.py:314 | Epoch[556] Step[77] GlobalStep[76249] Training Speed: 428.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:44:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:40:11 INFO loss_tracker.py:84 | Epoch[556/NA] Step[99] GlobalStep[76271/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:40:12 INFO stats.py:314 | Epoch[556] Step[102] GlobalStep[76274] Training Speed: 427.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:44:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:40:21 INFO loss_tracker.py:84 | Epoch[556/NA] Step[124] GlobalStep[76296/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:40:22 INFO stats.py:314 | Epoch[556] Step[127] GlobalStep[76299] Training Speed: 430.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:43:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:40:25 INFO stats.py:394 | Epoch[556] completed. Training Speed: 310.21 samples/sec across all devices. Epoch Time: 56.53 sec. Average Epoch Time: 56.53 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:43:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:40:33 INFO stats.py:314 | Epoch[557] Step[15] GlobalStep[76324] Training Speed: 416.49 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:43:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:40:37 INFO loss_tracker.py:84 | Epoch[557/NA] Step[24] GlobalStep[76333/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:40:43 INFO stats.py:314 | Epoch[557] Step[40] GlobalStep[76349] Training Speed: 436.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:43:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:40:47 INFO loss_tracker.py:84 | Epoch[557/NA] Step[49] GlobalStep[76358/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:40:54 INFO stats.py:314 | Epoch[557] Step[65] GlobalStep[76374] Training Speed: 436.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:43:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:40:57 INFO loss_tracker.py:84 | Epoch[557/NA] Step[74] GlobalStep[76383/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:41:03 INFO stats.py:314 | Epoch[557] Step[90] GlobalStep[76399] Training Speed: 431.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:43:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:41:08 INFO loss_tracker.py:84 | Epoch[557/NA] Step[99] GlobalStep[76408/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:41:14 INFO stats.py:314 | Epoch[557] Step[115] GlobalStep[76424] Training Speed: 417.21 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:43:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:41:18 INFO loss_tracker.py:84 | Epoch[557/NA] Step[124] GlobalStep[76433/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 21:41:23 INFO stats.py:394 | Epoch[557] completed. Training Speed: 305.00 samples/sec across all devices. Epoch Time: 57.50 sec. Average Epoch Time: 57.50 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:42:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:41:25 INFO stats.py:314 | Epoch[558] Step[3] GlobalStep[76449] Training Speed: 426.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:42:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:41:35 INFO loss_tracker.py:84 | Epoch[558/NA] Step[24] GlobalStep[76470/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:41:36 INFO stats.py:314 | Epoch[558] Step[28] GlobalStep[76474] Training Speed: 392.49 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 2:42:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:41:45 INFO loss_tracker.py:84 | Epoch[558/NA] Step[49] GlobalStep[76495/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 21:41:46 INFO stats.py:314 | Epoch[558] Step[53] GlobalStep[76499] Training Speed: 428.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:42:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:41:55 INFO loss_tracker.py:84 | Epoch[558/NA] Step[74] GlobalStep[76520/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:41:57 INFO stats.py:314 | Epoch[558] Step[78] GlobalStep[76524] Training Speed: 434.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:42:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:42:05 INFO loss_tracker.py:84 | Epoch[558/NA] Step[99] GlobalStep[76545/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:42:07 INFO stats.py:314 | Epoch[558] Step[103] GlobalStep[76549] Training Speed: 431.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:42:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:42:16 INFO loss_tracker.py:84 | Epoch[558/NA] Step[124] GlobalStep[76570/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:42:17 INFO stats.py:314 | Epoch[558] Step[128] GlobalStep[76574] Training Speed: 436.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:42:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:42:20 INFO stats.py:394 | Epoch[558] completed. Training Speed: 304.11 samples/sec across all devices. Epoch Time: 57.66 sec. Average Epoch Time: 57.66 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:41:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:42:28 INFO stats.py:314 | Epoch[559] Step[16] GlobalStep[76599] Training Speed: 423.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:41:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:42:32 INFO loss_tracker.py:84 | Epoch[559/NA] Step[24] GlobalStep[76607/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:42:39 INFO stats.py:314 | Epoch[559] Step[41] GlobalStep[76624] Training Speed: 433.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:41:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:42:42 INFO loss_tracker.py:84 | Epoch[559/NA] Step[49] GlobalStep[76632/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:42:49 INFO stats.py:314 | Epoch[559] Step[66] GlobalStep[76649] Training Speed: 419.24 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:41:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:42:53 INFO loss_tracker.py:84 | Epoch[559/NA] Step[74] GlobalStep[76657/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 21:43:00 INFO stats.py:314 | Epoch[559] Step[91] GlobalStep[76674] Training Speed: 435.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:41:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:43:03 INFO loss_tracker.py:84 | Epoch[559/NA] Step[99] GlobalStep[76682/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:43:10 INFO stats.py:314 | Epoch[559] Step[116] GlobalStep[76699] Training Speed: 433.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:41:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:43:13 INFO loss_tracker.py:84 | Epoch[559/NA] Step[124] GlobalStep[76707/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 21:43:17 INFO stats.py:394 | Epoch[559] completed. Training Speed: 306.55 samples/sec across all devices. Epoch Time: 57.20 sec. Average Epoch Time: 57.20 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:41:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:43:21 INFO stats.py:314 | Epoch[560] Step[4] GlobalStep[76724] Training Speed: 435.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:40:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:43:29 INFO loss_tracker.py:84 | Epoch[560/NA] Step[24] GlobalStep[76744/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:43:31 INFO stats.py:314 | Epoch[560] Step[29] GlobalStep[76749] Training Speed: 419.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:40:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:43:40 INFO loss_tracker.py:84 | Epoch[560/NA] Step[49] GlobalStep[76769/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:43:42 INFO stats.py:314 | Epoch[560] Step[54] GlobalStep[76774] Training Speed: 422.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:40:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:43:50 INFO loss_tracker.py:84 | Epoch[560/NA] Step[74] GlobalStep[76794/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 21:43:52 INFO stats.py:314 | Epoch[560] Step[79] GlobalStep[76799] Training Speed: 426.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:40:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:01 INFO loss_tracker.py:84 | Epoch[560/NA] Step[99] GlobalStep[76819/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:44:03 INFO stats.py:314 | Epoch[560] Step[104] GlobalStep[76824] Training Speed: 420.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:40:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:11 INFO loss_tracker.py:84 | Epoch[560/NA] Step[124] GlobalStep[76844/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 21:44:13 INFO stats.py:314 | Epoch[560] Step[129] GlobalStep[76849] Training Speed: 454.65 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:40:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:15 INFO stats.py:394 | Epoch[560] completed. Training Speed: 303.25 samples/sec across all devices. Epoch Time: 57.83 sec. Average Epoch Time: 57.83 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:40:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:44:23 INFO stats.py:314 | Epoch[561] Step[17] GlobalStep[76874] Training Speed: 434.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:39:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:26 INFO loss_tracker.py:84 | Epoch[561/NA] Step[24] GlobalStep[76881/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:44:33 INFO stats.py:314 | Epoch[561] Step[42] GlobalStep[76899] Training Speed: 420.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:39:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:37 INFO loss_tracker.py:84 | Epoch[561/NA] Step[49] GlobalStep[76906/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:44:44 INFO stats.py:314 | Epoch[561] Step[67] GlobalStep[76924] Training Speed: 411.41 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:39:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:47 INFO loss_tracker.py:84 | Epoch[561/NA] Step[74] GlobalStep[76931/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:44:54 INFO stats.py:314 | Epoch[561] Step[92] GlobalStep[76949] Training Speed: 425.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:39:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:44:57 INFO loss_tracker.py:84 | Epoch[561/NA] Step[99] GlobalStep[76956/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 21:45:04 INFO stats.py:314 | Epoch[561] Step[117] GlobalStep[76974] Training Speed: 420.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:39:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:45:07 INFO loss_tracker.py:84 | Epoch[561/NA] Step[124] GlobalStep[76981/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 21:45:12 INFO stats.py:394 | Epoch[561] completed. Training Speed: 311.06 samples/sec across all devices. Epoch Time: 56.38 sec. Average Epoch Time: 56.38 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:39:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:45:15 INFO stats.py:314 | Epoch[562] Step[5] GlobalStep[76999] Training Speed: 424.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:39:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:45:23 INFO loss_tracker.py:84 | Epoch[562/NA] Step[24] GlobalStep[77018/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 21:45:26 INFO stats.py:314 | Epoch[562] Step[30] GlobalStep[77024] Training Speed: 430.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:38:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:45:33 INFO loss_tracker.py:84 | Epoch[562/NA] Step[49] GlobalStep[77043/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:45:36 INFO stats.py:314 | Epoch[562] Step[55] GlobalStep[77049] Training Speed: 431.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:38:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:45:44 INFO loss_tracker.py:84 | Epoch[562/NA] Step[74] GlobalStep[77068/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:45:46 INFO stats.py:314 | Epoch[562] Step[80] GlobalStep[77074] Training Speed: 400.96 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:38:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:45:54 INFO loss_tracker.py:84 | Epoch[562/NA] Step[99] GlobalStep[77093/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 21:45:57 INFO stats.py:314 | Epoch[562] Step[105] GlobalStep[77099] Training Speed: 422.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:38:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:46:04 INFO loss_tracker.py:84 | Epoch[562/NA] Step[124] GlobalStep[77118/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:46:06 INFO stats.py:314 | Epoch[562] Step[130] GlobalStep[77124] Training Speed: 431.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:38:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:46:09 INFO stats.py:394 | Epoch[562] completed. Training Speed: 307.64 samples/sec across all devices. Epoch Time: 57.00 sec. Average Epoch Time: 57.00 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:38:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:46:18 INFO stats.py:314 | Epoch[563] Step[18] GlobalStep[77149] Training Speed: 431.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:38:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:46:20 INFO loss_tracker.py:84 | Epoch[563/NA] Step[24] GlobalStep[77155/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:46:28 INFO stats.py:314 | Epoch[563] Step[43] GlobalStep[77174] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:37:52. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:46:30 INFO loss_tracker.py:84 | Epoch[563/NA] Step[49] GlobalStep[77180/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:46:38 INFO stats.py:314 | Epoch[563] Step[68] GlobalStep[77199] Training Speed: 432.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:37:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:46:40 INFO loss_tracker.py:84 | Epoch[563/NA] Step[74] GlobalStep[77205/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:46:48 INFO stats.py:314 | Epoch[563] Step[93] GlobalStep[77224] Training Speed: 435.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:37:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:46:50 INFO loss_tracker.py:84 | Epoch[563/NA] Step[99] GlobalStep[77230/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 21:46:59 INFO stats.py:314 | Epoch[563] Step[118] GlobalStep[77249] Training Speed: 433.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:37:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:47:01 INFO loss_tracker.py:84 | Epoch[563/NA] Step[124] GlobalStep[77255/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0174] Rank[0/16] 06/24/2025 21:47:06 INFO stats.py:394 | Epoch[563] completed. Training Speed: 308.25 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:37:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:47:09 INFO stats.py:314 | Epoch[564] Step[6] GlobalStep[77274] Training Speed: 426.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:37:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:47:18 INFO loss_tracker.py:84 | Epoch[564/NA] Step[24] GlobalStep[77292/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 21:47:21 INFO stats.py:314 | Epoch[564] Step[31] GlobalStep[77299] Training Speed: 433.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:37:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:47:28 INFO loss_tracker.py:84 | Epoch[564/NA] Step[49] GlobalStep[77317/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:47:30 INFO stats.py:314 | Epoch[564] Step[56] GlobalStep[77324] Training Speed: 432.73 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:36:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:47:39 INFO loss_tracker.py:84 | Epoch[564/NA] Step[74] GlobalStep[77342/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:47:41 INFO stats.py:314 | Epoch[564] Step[81] GlobalStep[77349] Training Speed: 442.42 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:36:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:47:48 INFO loss_tracker.py:84 | Epoch[564/NA] Step[99] GlobalStep[77367/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 21:47:51 INFO stats.py:314 | Epoch[564] Step[106] GlobalStep[77374] Training Speed: 431.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:36:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:47:59 INFO loss_tracker.py:84 | Epoch[564/NA] Step[124] GlobalStep[77392/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:48:01 INFO stats.py:314 | Epoch[564] Step[131] GlobalStep[77399] Training Speed: 439.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:36:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:48:03 INFO stats.py:394 | Epoch[564] completed. Training Speed: 304.78 samples/sec across all devices. Epoch Time: 57.54 sec. Average Epoch Time: 57.54 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:36:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:48:12 INFO stats.py:314 | Epoch[565] Step[19] GlobalStep[77424] Training Speed: 426.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:36:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:48:15 INFO loss_tracker.py:84 | Epoch[565/NA] Step[24] GlobalStep[77429/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 21:48:23 INFO stats.py:314 | Epoch[565] Step[44] GlobalStep[77449] Training Speed: 425.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:35:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:48:25 INFO loss_tracker.py:84 | Epoch[565/NA] Step[49] GlobalStep[77454/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:48:33 INFO stats.py:314 | Epoch[565] Step[69] GlobalStep[77474] Training Speed: 436.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:35:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:48:36 INFO loss_tracker.py:84 | Epoch[565/NA] Step[74] GlobalStep[77479/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:48:44 INFO stats.py:314 | Epoch[565] Step[94] GlobalStep[77499] Training Speed: 437.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:35:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:48:46 INFO loss_tracker.py:84 | Epoch[565/NA] Step[99] GlobalStep[77504/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:48:53 INFO stats.py:314 | Epoch[565] Step[119] GlobalStep[77524] Training Speed: 442.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:35:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:48:55 INFO loss_tracker.py:84 | Epoch[565/NA] Step[124] GlobalStep[77529/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0031] loss_depth[0.0126] total_loss[0.0157] Rank[0/16] 06/24/2025 21:49:00 INFO stats.py:394 | Epoch[565] completed. Training Speed: 308.35 samples/sec across all devices. Epoch Time: 56.87 sec. Average Epoch Time: 56.87 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:35:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:49:04 INFO stats.py:314 | Epoch[566] Step[7] GlobalStep[77549] Training Speed: 435.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:35:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:49:11 INFO loss_tracker.py:84 | Epoch[566/NA] Step[24] GlobalStep[77566/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:49:14 INFO stats.py:314 | Epoch[566] Step[32] GlobalStep[77574] Training Speed: 418.35 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:35:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:49:22 INFO loss_tracker.py:84 | Epoch[566/NA] Step[49] GlobalStep[77591/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:49:25 INFO stats.py:314 | Epoch[566] Step[57] GlobalStep[77599] Training Speed: 429.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:34:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:49:32 INFO loss_tracker.py:84 | Epoch[566/NA] Step[74] GlobalStep[77616/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:49:35 INFO stats.py:314 | Epoch[566] Step[82] GlobalStep[77624] Training Speed: 429.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:34:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:49:43 INFO loss_tracker.py:84 | Epoch[566/NA] Step[99] GlobalStep[77641/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:49:46 INFO stats.py:314 | Epoch[566] Step[107] GlobalStep[77649] Training Speed: 410.27 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:34:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:49:52 INFO loss_tracker.py:84 | Epoch[566/NA] Step[124] GlobalStep[77666/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:49:56 INFO stats.py:314 | Epoch[566] Step[132] GlobalStep[77674] Training Speed: 451.76 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:34:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:49:57 INFO stats.py:394 | Epoch[566] completed. Training Speed: 307.22 samples/sec across all devices. Epoch Time: 57.08 sec. Average Epoch Time: 57.08 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:34:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:50:07 INFO stats.py:314 | Epoch[567] Step[20] GlobalStep[77699] Training Speed: 268.05 samples/sec across all devices. Average Step Time: 0.48 sec. Estimated Remaining Time: 2:34:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:50:08 INFO loss_tracker.py:84 | Epoch[567/NA] Step[24] GlobalStep[77703/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:50:17 INFO stats.py:314 | Epoch[567] Step[45] GlobalStep[77724] Training Speed: 423.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:34:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:50:19 INFO loss_tracker.py:84 | Epoch[567/NA] Step[49] GlobalStep[77728/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 21:50:27 INFO stats.py:314 | Epoch[567] Step[70] GlobalStep[77749] Training Speed: 433.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:33:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:50:29 INFO loss_tracker.py:84 | Epoch[567/NA] Step[74] GlobalStep[77753/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 21:50:38 INFO stats.py:314 | Epoch[567] Step[95] GlobalStep[77774] Training Speed: 431.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:33:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:50:39 INFO loss_tracker.py:84 | Epoch[567/NA] Step[99] GlobalStep[77778/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 21:50:48 INFO stats.py:314 | Epoch[567] Step[120] GlobalStep[77799] Training Speed: 439.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:33:33. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:50:49 INFO loss_tracker.py:84 | Epoch[567/NA] Step[124] GlobalStep[77803/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 21:50:53 INFO stats.py:394 | Epoch[567] completed. Training Speed: 311.15 samples/sec across all devices. Epoch Time: 56.36 sec. Average Epoch Time: 56.36 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:33:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:50:58 INFO stats.py:314 | Epoch[568] Step[8] GlobalStep[77824] Training Speed: 429.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:33:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:51:04 INFO loss_tracker.py:84 | Epoch[568/NA] Step[24] GlobalStep[77840/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:51:08 INFO stats.py:314 | Epoch[568] Step[33] GlobalStep[77849] Training Speed: 439.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:33:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:51:14 INFO loss_tracker.py:84 | Epoch[568/NA] Step[49] GlobalStep[77865/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:51:18 INFO stats.py:314 | Epoch[568] Step[58] GlobalStep[77874] Training Speed: 435.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:33:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:51:25 INFO loss_tracker.py:84 | Epoch[568/NA] Step[74] GlobalStep[77890/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:51:29 INFO stats.py:314 | Epoch[568] Step[83] GlobalStep[77899] Training Speed: 432.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:32:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:51:35 INFO loss_tracker.py:84 | Epoch[568/NA] Step[99] GlobalStep[77915/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 21:51:39 INFO stats.py:314 | Epoch[568] Step[108] GlobalStep[77924] Training Speed: 429.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:32:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:51:46 INFO loss_tracker.py:84 | Epoch[568/NA] Step[124] GlobalStep[77940/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0125] total_loss[0.0175] Rank[0/16] 06/24/2025 21:51:49 INFO stats.py:314 | Epoch[568] Step[133] GlobalStep[77949] Training Speed: 443.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:32:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:51:50 INFO stats.py:394 | Epoch[568] completed. Training Speed: 309.45 samples/sec across all devices. Epoch Time: 56.67 sec. Average Epoch Time: 56.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:32:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:52:01 INFO stats.py:314 | Epoch[569] Step[21] GlobalStep[77974] Training Speed: 409.00 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:32:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:52:02 INFO loss_tracker.py:84 | Epoch[569/NA] Step[24] GlobalStep[77977/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:52:11 INFO stats.py:314 | Epoch[569] Step[46] GlobalStep[77999] Training Speed: 440.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:32:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:52:12 INFO loss_tracker.py:84 | Epoch[569/NA] Step[49] GlobalStep[78002/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:52:21 INFO stats.py:314 | Epoch[569] Step[71] GlobalStep[78024] Training Speed: 436.39 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:31:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:52:22 INFO loss_tracker.py:84 | Epoch[569/NA] Step[74] GlobalStep[78027/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:52:31 INFO stats.py:314 | Epoch[569] Step[96] GlobalStep[78049] Training Speed: 434.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:31:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:52:32 INFO loss_tracker.py:84 | Epoch[569/NA] Step[99] GlobalStep[78052/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:52:41 INFO stats.py:314 | Epoch[569] Step[121] GlobalStep[78074] Training Speed: 439.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:31:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:52:42 INFO loss_tracker.py:84 | Epoch[569/NA] Step[124] GlobalStep[78077/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:52:46 INFO stats.py:394 | Epoch[569] completed. Training Speed: 310.85 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:31:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:52:52 INFO stats.py:314 | Epoch[570] Step[9] GlobalStep[78099] Training Speed: 427.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:31:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:52:58 INFO loss_tracker.py:84 | Epoch[570/NA] Step[24] GlobalStep[78114/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:53:02 INFO stats.py:314 | Epoch[570] Step[34] GlobalStep[78124] Training Speed: 433.30 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:31:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:53:09 INFO loss_tracker.py:84 | Epoch[570/NA] Step[49] GlobalStep[78139/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:53:13 INFO stats.py:314 | Epoch[570] Step[59] GlobalStep[78149] Training Speed: 431.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:31:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:53:19 INFO loss_tracker.py:84 | Epoch[570/NA] Step[74] GlobalStep[78164/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 21:53:23 INFO stats.py:314 | Epoch[570] Step[84] GlobalStep[78174] Training Speed: 253.07 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 2:30:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:53:29 INFO loss_tracker.py:84 | Epoch[570/NA] Step[99] GlobalStep[78189/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0052] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 21:53:33 INFO stats.py:314 | Epoch[570] Step[109] GlobalStep[78199] Training Speed: 438.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:30:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:53:40 INFO loss_tracker.py:84 | Epoch[570/NA] Step[124] GlobalStep[78214/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0031] loss_depth[0.0125] total_loss[0.0156] Rank[0/16] 06/24/2025 21:53:43 INFO stats.py:314 | Epoch[570] Step[134] GlobalStep[78224] Training Speed: 442.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:30:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:53:44 INFO stats.py:394 | Epoch[570] completed. Training Speed: 306.31 samples/sec across all devices. Epoch Time: 57.25 sec. Average Epoch Time: 57.25 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:30:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:53:54 INFO stats.py:314 | Epoch[571] Step[22] GlobalStep[78249] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:30:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:53:55 INFO loss_tracker.py:84 | Epoch[571/NA] Step[24] GlobalStep[78251/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:54:05 INFO stats.py:314 | Epoch[571] Step[47] GlobalStep[78274] Training Speed: 391.98 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 2:30:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:54:06 INFO loss_tracker.py:84 | Epoch[571/NA] Step[49] GlobalStep[78276/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:54:15 INFO stats.py:314 | Epoch[571] Step[72] GlobalStep[78299] Training Speed: 427.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:30:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:54:16 INFO loss_tracker.py:84 | Epoch[571/NA] Step[74] GlobalStep[78301/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:54:25 INFO stats.py:314 | Epoch[571] Step[97] GlobalStep[78324] Training Speed: 426.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:29:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:54:26 INFO loss_tracker.py:84 | Epoch[571/NA] Step[99] GlobalStep[78326/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:54:35 INFO stats.py:314 | Epoch[571] Step[122] GlobalStep[78349] Training Speed: 455.48 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:29:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:54:36 INFO loss_tracker.py:84 | Epoch[571/NA] Step[124] GlobalStep[78351/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:54:40 INFO stats.py:394 | Epoch[571] completed. Training Speed: 312.30 samples/sec across all devices. Epoch Time: 56.15 sec. Average Epoch Time: 56.15 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:29:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:54:45 INFO stats.py:314 | Epoch[572] Step[10] GlobalStep[78374] Training Speed: 430.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:29:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:54:51 INFO loss_tracker.py:84 | Epoch[572/NA] Step[24] GlobalStep[78388/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:54:56 INFO stats.py:314 | Epoch[572] Step[35] GlobalStep[78399] Training Speed: 428.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:29:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:55:02 INFO loss_tracker.py:84 | Epoch[572/NA] Step[49] GlobalStep[78413/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 21:55:06 INFO stats.py:314 | Epoch[572] Step[60] GlobalStep[78424] Training Speed: 428.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:29:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:55:12 INFO loss_tracker.py:84 | Epoch[572/NA] Step[74] GlobalStep[78438/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 21:55:17 INFO stats.py:314 | Epoch[572] Step[85] GlobalStep[78449] Training Speed: 423.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:29:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:55:22 INFO loss_tracker.py:84 | Epoch[572/NA] Step[99] GlobalStep[78463/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:55:27 INFO stats.py:314 | Epoch[572] Step[110] GlobalStep[78474] Training Speed: 437.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:28:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:55:32 INFO loss_tracker.py:84 | Epoch[572/NA] Step[124] GlobalStep[78488/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 21:55:36 INFO stats.py:314 | Epoch[572] Step[135] GlobalStep[78499] Training Speed: 452.04 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:28:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:55:36 INFO stats.py:394 | Epoch[572] completed. Training Speed: 311.41 samples/sec across all devices. Epoch Time: 56.31 sec. Average Epoch Time: 56.31 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:28:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:55:47 INFO stats.py:314 | Epoch[573] Step[23] GlobalStep[78524] Training Speed: 444.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:28:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:55:48 INFO loss_tracker.py:84 | Epoch[573/NA] Step[24] GlobalStep[78525/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 21:55:57 INFO stats.py:314 | Epoch[573] Step[48] GlobalStep[78549] Training Speed: 422.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:28:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:55:58 INFO loss_tracker.py:84 | Epoch[573/NA] Step[49] GlobalStep[78550/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:56:07 INFO stats.py:314 | Epoch[573] Step[73] GlobalStep[78574] Training Speed: 406.22 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:28:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:56:08 INFO loss_tracker.py:84 | Epoch[573/NA] Step[74] GlobalStep[78575/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 21:56:18 INFO stats.py:314 | Epoch[573] Step[98] GlobalStep[78599] Training Speed: 421.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:28:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:56:18 INFO loss_tracker.py:84 | Epoch[573/NA] Step[99] GlobalStep[78600/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:56:28 INFO stats.py:314 | Epoch[573] Step[123] GlobalStep[78624] Training Speed: 447.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:27:50. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:56:29 INFO loss_tracker.py:84 | Epoch[573/NA] Step[124] GlobalStep[78625/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 21:56:33 INFO stats.py:394 | Epoch[573] completed. Training Speed: 309.27 samples/sec across all devices. Epoch Time: 56.70 sec. Average Epoch Time: 56.70 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:27:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:56:39 INFO stats.py:314 | Epoch[574] Step[11] GlobalStep[78649] Training Speed: 428.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:27:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:56:44 INFO loss_tracker.py:84 | Epoch[574/NA] Step[24] GlobalStep[78662/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:56:49 INFO stats.py:314 | Epoch[574] Step[36] GlobalStep[78674] Training Speed: 424.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:27:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:56:54 INFO loss_tracker.py:84 | Epoch[574/NA] Step[49] GlobalStep[78687/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 21:56:59 INFO stats.py:314 | Epoch[574] Step[61] GlobalStep[78699] Training Speed: 435.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:27:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:57:04 INFO loss_tracker.py:84 | Epoch[574/NA] Step[74] GlobalStep[78712/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:57:09 INFO stats.py:314 | Epoch[574] Step[86] GlobalStep[78724] Training Speed: 431.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:27:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:57:14 INFO loss_tracker.py:84 | Epoch[574/NA] Step[99] GlobalStep[78737/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 21:57:19 INFO stats.py:314 | Epoch[574] Step[111] GlobalStep[78749] Training Speed: 415.91 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:26:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:57:24 INFO loss_tracker.py:84 | Epoch[574/NA] Step[124] GlobalStep[78762/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 21:57:29 INFO stats.py:314 | Epoch[574] Step[136] GlobalStep[78774] Training Speed: 442.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:26:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:57:29 INFO stats.py:394 | Epoch[574] completed. Training Speed: 313.55 samples/sec across all devices. Epoch Time: 55.93 sec. Average Epoch Time: 55.93 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:26:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:57:40 INFO stats.py:314 | Epoch[575] Step[24] GlobalStep[78799] Training Speed: 431.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:26:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:57:40 INFO loss_tracker.py:84 | Epoch[575/NA] Step[24] GlobalStep[78799/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 21:57:50 INFO stats.py:314 | Epoch[575] Step[49] GlobalStep[78824] Training Speed: 441.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:26:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:57:50 INFO loss_tracker.py:84 | Epoch[575/NA] Step[49] GlobalStep[78824/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0175] Rank[0/16] 06/24/2025 21:58:00 INFO stats.py:314 | Epoch[575] Step[74] GlobalStep[78849] Training Speed: 430.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:26:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:58:01 INFO loss_tracker.py:84 | Epoch[575/NA] Step[74] GlobalStep[78849/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0174] Rank[0/16] 06/24/2025 21:58:10 INFO stats.py:314 | Epoch[575] Step[99] GlobalStep[78874] Training Speed: 428.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:26:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:58:11 INFO loss_tracker.py:84 | Epoch[575/NA] Step[99] GlobalStep[78874/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:58:20 INFO stats.py:314 | Epoch[575] Step[124] GlobalStep[78899] Training Speed: 442.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:25:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:58:20 INFO loss_tracker.py:84 | Epoch[575/NA] Step[124] GlobalStep[78899/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:58:25 INFO stats.py:394 | Epoch[575] completed. Training Speed: 310.98 samples/sec across all devices. Epoch Time: 56.39 sec. Average Epoch Time: 56.39 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:25:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:58:32 INFO stats.py:314 | Epoch[576] Step[12] GlobalStep[78924] Training Speed: 425.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:25:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 21:58:37 INFO loss_tracker.py:84 | Epoch[576/NA] Step[24] GlobalStep[78936/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 21:58:42 INFO stats.py:314 | Epoch[576] Step[37] GlobalStep[78949] Training Speed: 431.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:25:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:58:46 INFO loss_tracker.py:84 | Epoch[576/NA] Step[49] GlobalStep[78961/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:58:52 INFO stats.py:314 | Epoch[576] Step[62] GlobalStep[78974] Training Speed: 430.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:25:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:58:56 INFO loss_tracker.py:84 | Epoch[576/NA] Step[74] GlobalStep[78986/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:59:02 INFO stats.py:314 | Epoch[576] Step[87] GlobalStep[78999] Training Speed: 436.28 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:25:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:59:07 INFO loss_tracker.py:84 | Epoch[576/NA] Step[99] GlobalStep[79011/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 21:59:12 INFO stats.py:314 | Epoch[576] Step[112] GlobalStep[79024] Training Speed: 434.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:25:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:59:16 INFO loss_tracker.py:84 | Epoch[576/NA] Step[124] GlobalStep[79036/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170]
Rank[0/16] 06/24/2025 21:59:20 INFO stats.py:394 | Epoch[576] completed. Training Speed: 317.37 samples/sec across all devices. Epoch Time: 55.25 sec. Average Epoch Time: 55.25 sec. Average Step Time: 0.40 sec. Estimated Remaining Time: 2:24:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:59:22 INFO stats.py:314 | Epoch[577] Step[0] GlobalStep[79049] Training Speed: 314.04 samples/sec across all devices. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:24:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:59:32 INFO loss_tracker.py:84 | Epoch[577/NA] Step[24] GlobalStep[79073/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 21:59:32 INFO stats.py:314 | Epoch[577] Step[25] GlobalStep[79074] Training Speed: 425.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:24:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:59:42 INFO loss_tracker.py:84 | Epoch[577/NA] Step[49] GlobalStep[79098/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 21:59:42 INFO stats.py:314 | Epoch[577] Step[50] GlobalStep[79099] Training Speed: 415.86 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:24:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 21:59:52 INFO loss_tracker.py:84 | Epoch[577/NA] Step[74] GlobalStep[79123/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 21:59:53 INFO stats.py:314 | Epoch[577] Step[75] GlobalStep[79124] Training Speed: 417.52 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:24:21. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:00:03 INFO loss_tracker.py:84 | Epoch[577/NA] Step[99] GlobalStep[79148/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:00:03 INFO stats.py:314 | Epoch[577] Step[100] GlobalStep[79149] Training Speed: 428.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:24:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:00:12 INFO loss_tracker.py:84 | Epoch[577/NA] Step[124] GlobalStep[79173/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:00:13 INFO stats.py:314 | Epoch[577] Step[125] GlobalStep[79174] Training Speed: 437.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:24:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:00:17 INFO stats.py:394 | Epoch[577] completed. Training Speed: 312.08 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:23:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:00:23 INFO stats.py:314 | Epoch[578] Step[13] GlobalStep[79199] Training Speed: 436.66 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:23:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:00:27 INFO loss_tracker.py:84 | Epoch[578/NA] Step[24] GlobalStep[79210/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:00:34 INFO stats.py:314 | Epoch[578] Step[38] GlobalStep[79224] Training Speed: 429.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:23:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:00:38 INFO loss_tracker.py:84 | Epoch[578/NA] Step[49] GlobalStep[79235/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:00:44 INFO stats.py:314 | Epoch[578] Step[63] GlobalStep[79249] Training Speed: 423.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:23:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:00:49 INFO loss_tracker.py:84 | Epoch[578/NA] Step[74] GlobalStep[79260/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 22:00:54 INFO stats.py:314 | Epoch[578] Step[88] GlobalStep[79274] Training Speed: 427.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:23:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:00:59 INFO loss_tracker.py:84 | Epoch[578/NA] Step[99] GlobalStep[79285/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 22:01:05 INFO stats.py:314 | Epoch[578] Step[113] GlobalStep[79299] Training Speed: 424.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:23:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:01:09 INFO loss_tracker.py:84 | Epoch[578/NA] Step[124] GlobalStep[79310/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:01:13 INFO stats.py:394 | Epoch[578] completed. Training Speed: 309.75 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:22:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:01:15 INFO stats.py:314 | Epoch[579] Step[1] GlobalStep[79324] Training Speed: 423.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:22:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:01:24 INFO loss_tracker.py:84 | Epoch[579/NA] Step[24] GlobalStep[79347/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:01:25 INFO stats.py:314 | Epoch[579] Step[26] GlobalStep[79349] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:22:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:01:34 INFO loss_tracker.py:84 | Epoch[579/NA] Step[49] GlobalStep[79372/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 22:01:35 INFO stats.py:314 | Epoch[579] Step[51] GlobalStep[79374] Training Speed: 437.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:22:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:01:45 INFO loss_tracker.py:84 | Epoch[579/NA] Step[74] GlobalStep[79397/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 22:01:45 INFO stats.py:314 | Epoch[579] Step[76] GlobalStep[79399] Training Speed: 432.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:22:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:01:55 INFO loss_tracker.py:84 | Epoch[579/NA] Step[99] GlobalStep[79422/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:01:56 INFO stats.py:314 | Epoch[579] Step[101] GlobalStep[79424] Training Speed: 251.05 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 2:22:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:02:05 INFO loss_tracker.py:84 | Epoch[579/NA] Step[124] GlobalStep[79447/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 22:02:06 INFO stats.py:314 | Epoch[579] Step[126] GlobalStep[79449] Training Speed: 440.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:22:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:02:09 INFO stats.py:394 | Epoch[579] completed. Training Speed: 313.37 samples/sec across all devices. Epoch Time: 55.96 sec. Average Epoch Time: 55.96 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:22:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:02:16 INFO stats.py:314 | Epoch[580] Step[14] GlobalStep[79474] Training Speed: 434.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:21:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:02:21 INFO loss_tracker.py:84 | Epoch[580/NA] Step[24] GlobalStep[79484/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:02:26 INFO stats.py:314 | Epoch[580] Step[39] GlobalStep[79499] Training Speed: 423.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:21:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:02:31 INFO loss_tracker.py:84 | Epoch[580/NA] Step[49] GlobalStep[79509/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 22:02:36 INFO stats.py:314 | Epoch[580] Step[64] GlobalStep[79524] Training Speed: 439.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:21:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:02:40 INFO loss_tracker.py:84 | Epoch[580/NA] Step[74] GlobalStep[79534/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 22:02:46 INFO stats.py:314 | Epoch[580] Step[89] GlobalStep[79549] Training Speed: 420.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:21:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:02:50 INFO loss_tracker.py:84 | Epoch[580/NA] Step[99] GlobalStep[79559/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:02:57 INFO stats.py:314 | Epoch[580] Step[114] GlobalStep[79574] Training Speed: 431.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:21:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:03:01 INFO loss_tracker.py:84 | Epoch[580/NA] Step[124] GlobalStep[79584/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:03:05 INFO stats.py:394 | Epoch[580] completed. Training Speed: 315.96 samples/sec across all devices. Epoch Time: 55.50 sec. Average Epoch Time: 55.50 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:21:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:03:07 INFO stats.py:314 | Epoch[581] Step[2] GlobalStep[79599] Training Speed: 427.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:21:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:03:16 INFO loss_tracker.py:84 | Epoch[581/NA] Step[24] GlobalStep[79621/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:03:17 INFO stats.py:314 | Epoch[581] Step[27] GlobalStep[79624] Training Speed: 424.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:20:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:03:26 INFO loss_tracker.py:84 | Epoch[581/NA] Step[49] GlobalStep[79646/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:03:28 INFO stats.py:314 | Epoch[581] Step[52] GlobalStep[79649] Training Speed: 434.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:20:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:03:36 INFO loss_tracker.py:84 | Epoch[581/NA] Step[74] GlobalStep[79671/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:03:37 INFO stats.py:314 | Epoch[581] Step[77] GlobalStep[79674] Training Speed: 434.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:20:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:03:46 INFO loss_tracker.py:84 | Epoch[581/NA] Step[99] GlobalStep[79696/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 22:03:48 INFO stats.py:314 | Epoch[581] Step[102] GlobalStep[79699] Training Speed: 429.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:20:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:03:57 INFO loss_tracker.py:84 | Epoch[581/NA] Step[124] GlobalStep[79721/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 22:03:58 INFO stats.py:314 | Epoch[581] Step[127] GlobalStep[79724] Training Speed: 439.07 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:20:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:04:01 INFO stats.py:394 | Epoch[581] completed. Training Speed: 311.77 samples/sec across all devices. Epoch Time: 56.25 sec. Average Epoch Time: 56.25 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:20:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:04:08 INFO stats.py:314 | Epoch[582] Step[15] GlobalStep[79749] Training Speed: 429.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:20:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:04:12 INFO loss_tracker.py:84 | Epoch[582/NA] Step[24] GlobalStep[79758/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:04:20 INFO stats.py:314 | Epoch[582] Step[40] GlobalStep[79774] Training Speed: 433.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:19:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:04:23 INFO loss_tracker.py:84 | Epoch[582/NA] Step[49] GlobalStep[79783/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:04:29 INFO stats.py:314 | Epoch[582] Step[65] GlobalStep[79799] Training Speed: 404.88 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:19:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:04:33 INFO loss_tracker.py:84 | Epoch[582/NA] Step[74] GlobalStep[79808/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 22:04:40 INFO stats.py:314 | Epoch[582] Step[90] GlobalStep[79824] Training Speed: 432.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:19:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:04:44 INFO loss_tracker.py:84 | Epoch[582/NA] Step[99] GlobalStep[79833/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:04:50 INFO stats.py:314 | Epoch[582] Step[115] GlobalStep[79849] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:19:20. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:04:53 INFO loss_tracker.py:84 | Epoch[582/NA] Step[124] GlobalStep[79858/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 22:04:58 INFO stats.py:394 | Epoch[582] completed. Training Speed: 309.61 samples/sec across all devices. Epoch Time: 56.64 sec. Average Epoch Time: 56.64 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:19:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:05:00 INFO stats.py:314 | Epoch[583] Step[3] GlobalStep[79874] Training Speed: 422.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:19:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:05:09 INFO loss_tracker.py:84 | Epoch[583/NA] Step[24] GlobalStep[79895/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:05:10 INFO stats.py:314 | Epoch[583] Step[28] GlobalStep[79899] Training Speed: 424.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:18:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:05:20 INFO loss_tracker.py:84 | Epoch[583/NA] Step[49] GlobalStep[79920/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0030] loss_depth[0.0126] total_loss[0.0156] Rank[0/16] 06/24/2025 22:05:21 INFO stats.py:314 | Epoch[583] Step[53] GlobalStep[79924] Training Speed: 426.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:18:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:05:30 INFO loss_tracker.py:84 | Epoch[583/NA] Step[74] GlobalStep[79945/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:05:31 INFO stats.py:314 | Epoch[583] Step[78] GlobalStep[79949] Training Speed: 429.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:18:38. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:05:40 INFO loss_tracker.py:84 | Epoch[583/NA] Step[99] GlobalStep[79970/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169]
Rank[0/16] 06/24/2025 22:05:42 INFO stats.py:314 | Epoch[583] Step[103] GlobalStep[79974] Training Speed: 316.74 samples/sec across all devices. Average Step Time: 0.40 sec. Estimated Remaining Time: 2:18:28. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:05:50 INFO loss_tracker.py:84 | Epoch[583/NA] Step[124] GlobalStep[79995/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0171]
Rank[0/16] 06/24/2025 22:05:51 INFO stats.py:314 | Epoch[583] Step[128] GlobalStep[79999] Training Speed: 448.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:18:17. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:05:52 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint.
Rank[0/16] 06/24/2025 22:05:52 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_19
Rank[4/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[14/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[5/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[11/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[7/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[8/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[6/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[12/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[2/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[10/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[1/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[3/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[13/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[9/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[15/16] 06/24/2025 22:05:53 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[0/16] 06/24/2025 22:05:53 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_19/model.safetensors
Rank[0/16] 06/24/2025 22:05:54 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_19/optimizer.bin
Rank[0/16] 06/24/2025 22:05:54 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_19/scheduler.bin
Rank[0/16] 06/24/2025 22:05:54 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_19/sampler.bin
Rank[0/16] 06/24/2025 22:05:54 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_19/random_states_0.pkl
Rank[0/16] 06/24/2025 22:05:54 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_19/custom_checkpoint_0.pkl
Rank[0/16] 06/24/2025 22:05:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 79999 to /job_data/checkpoints/checkpoint_19
Rank[0/16] 06/24/2025 22:05:57 INFO stats.py:394 | Epoch[583] completed. Training Speed: 294.69 samples/sec across all devices. Epoch Time: 59.51 sec. Average Epoch Time: 59.51 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 2:18:15. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:06:05 INFO stats.py:314 | Epoch[584] Step[16] GlobalStep[80024] Training Speed: 438.54 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:18:08. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:06:09 INFO loss_tracker.py:84 | Epoch[584/NA] Step[24] GlobalStep[80032/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167]
Rank[0/16] 06/24/2025 22:06:16 INFO stats.py:314 | Epoch[584] Step[41] GlobalStep[80049] Training Speed: 398.47 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:17:57. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:06:19 INFO loss_tracker.py:84 | Epoch[584/NA] Step[49] GlobalStep[80057/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167]
Rank[0/16] 06/24/2025 22:06:26 INFO stats.py:314 | Epoch[584] Step[66] GlobalStep[80074] Training Speed: 425.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:17:47. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:06:30 INFO loss_tracker.py:84 | Epoch[584/NA] Step[74] GlobalStep[80082/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169]
Rank[0/16] 06/24/2025 22:06:37 INFO stats.py:314 | Epoch[584] Step[91] GlobalStep[80099] Training Speed: 422.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:17:37. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:06:40 INFO loss_tracker.py:84 | Epoch[584/NA] Step[99] GlobalStep[80107/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:06:47 INFO stats.py:314 | Epoch[584] Step[116] GlobalStep[80124] Training Speed: 428.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:17:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:06:50 INFO loss_tracker.py:84 | Epoch[584/NA] Step[124] GlobalStep[80132/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:06:55 INFO stats.py:394 | Epoch[584] completed. Training Speed: 303.52 samples/sec across all devices. Epoch Time: 57.78 sec. Average Epoch Time: 57.78 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:17:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:06:58 INFO stats.py:314 | Epoch[585] Step[4] GlobalStep[80149] Training Speed: 408.68 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:17:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:07:06 INFO loss_tracker.py:84 | Epoch[585/NA] Step[24] GlobalStep[80169/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:07:08 INFO stats.py:314 | Epoch[585] Step[29] GlobalStep[80174] Training Speed: 435.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:17:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:07:16 INFO loss_tracker.py:84 | Epoch[585/NA] Step[49] GlobalStep[80194/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:07:18 INFO stats.py:314 | Epoch[585] Step[54] GlobalStep[80199] Training Speed: 434.89 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:16:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:07:27 INFO loss_tracker.py:84 | Epoch[585/NA] Step[74] GlobalStep[80219/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:07:29 INFO stats.py:314 | Epoch[585] Step[79] GlobalStep[80224] Training Speed: 425.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:16:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:07:37 INFO loss_tracker.py:84 | Epoch[585/NA] Step[99] GlobalStep[80244/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:07:39 INFO stats.py:314 | Epoch[585] Step[104] GlobalStep[80249] Training Speed: 419.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:16:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:07:47 INFO loss_tracker.py:84 | Epoch[585/NA] Step[124] GlobalStep[80269/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:07:49 INFO stats.py:314 | Epoch[585] Step[129] GlobalStep[80274] Training Speed: 441.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:16:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:07:52 INFO stats.py:394 | Epoch[585] completed. Training Speed: 307.06 samples/sec across all devices. Epoch Time: 57.11 sec. Average Epoch Time: 57.11 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:16:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:08:00 INFO stats.py:314 | Epoch[586] Step[17] GlobalStep[80299] Training Speed: 429.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:16:14. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:08:04 INFO loss_tracker.py:84 | Epoch[586/NA] Step[24] GlobalStep[80306/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0174]
Rank[0/16] 06/24/2025 22:08:11 INFO stats.py:314 | Epoch[586] Step[42] GlobalStep[80324] Training Speed: 430.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:16:04. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:08:13 INFO loss_tracker.py:84 | Epoch[586/NA] Step[49] GlobalStep[80331/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172]
Rank[0/16] 06/24/2025 22:08:21 INFO stats.py:314 | Epoch[586] Step[67] GlobalStep[80349] Training Speed: 424.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:15:53. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:08:24 INFO loss_tracker.py:84 | Epoch[586/NA] Step[74] GlobalStep[80356/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174]
Rank[0/16] 06/24/2025 22:08:31 INFO stats.py:314 | Epoch[586] Step[92] GlobalStep[80374] Training Speed: 398.05 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:15:43. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:08:34 INFO loss_tracker.py:84 | Epoch[586/NA] Step[99] GlobalStep[80381/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163]
Rank[0/16] 06/24/2025 22:08:42 INFO stats.py:314 | Epoch[586] Step[117] GlobalStep[80399] Training Speed: 428.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:15:32. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:08:45 INFO loss_tracker.py:84 | Epoch[586/NA] Step[124] GlobalStep[80406/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168]
Rank[0/16] 06/24/2025 22:08:49 INFO stats.py:394 | Epoch[586] completed. Training Speed: 306.89 samples/sec across all devices. Epoch Time: 57.14 sec. Average Epoch Time: 57.14 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:15:24. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:08:53 INFO stats.py:314 | Epoch[587] Step[5] GlobalStep[80424] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:15:22. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:09:01 INFO loss_tracker.py:84 | Epoch[587/NA] Step[24] GlobalStep[80443/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165]
Rank[0/16] 06/24/2025 22:09:03 INFO stats.py:314 | Epoch[587] Step[30] GlobalStep[80449] Training Speed: 431.33 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:15:12. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:09:12 INFO loss_tracker.py:84 | Epoch[587/NA] Step[49] GlobalStep[80468/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171]
Rank[0/16] 06/24/2025 22:09:14 INFO stats.py:314 | Epoch[587] Step[55] GlobalStep[80474] Training Speed: 438.34 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:15:02. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:09:22 INFO loss_tracker.py:84 | Epoch[587/NA] Step[74] GlobalStep[80493/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168]
Rank[0/16] 06/24/2025 22:09:24 INFO stats.py:314 | Epoch[587] Step[80] GlobalStep[80499] Training Speed: 437.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:14:51. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 22:09:32 INFO loss_tracker.py:84 | Epoch[587/NA] Step[99] GlobalStep[80518/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:09:34 INFO stats.py:314 | Epoch[587] Step[105] GlobalStep[80524] Training Speed: 430.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:14:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:09:42 INFO loss_tracker.py:84 | Epoch[587/NA] Step[124] GlobalStep[80543/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:09:44 INFO stats.py:314 | Epoch[587] Step[130] GlobalStep[80549] Training Speed: 443.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:14:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:09:46 INFO stats.py:394 | Epoch[587] completed. Training Speed: 306.85 samples/sec across all devices. Epoch Time: 57.15 sec. Average Epoch Time: 57.15 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:14:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:09:55 INFO stats.py:314 | Epoch[588] Step[18] GlobalStep[80574] Training Speed: 415.56 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:14:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:09:58 INFO loss_tracker.py:84 | Epoch[588/NA] Step[24] GlobalStep[80580/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:10:06 INFO stats.py:314 | Epoch[588] Step[43] GlobalStep[80599] Training Speed: 395.50 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:14:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:10:09 INFO loss_tracker.py:84 | Epoch[588/NA] Step[49] GlobalStep[80605/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:10:17 INFO stats.py:314 | Epoch[588] Step[68] GlobalStep[80624] Training Speed: 429.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:13:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:10:19 INFO loss_tracker.py:84 | Epoch[588/NA] Step[74] GlobalStep[80630/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:10:27 INFO stats.py:314 | Epoch[588] Step[93] GlobalStep[80649] Training Speed: 431.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:13:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:10:30 INFO loss_tracker.py:84 | Epoch[588/NA] Step[99] GlobalStep[80655/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:10:37 INFO stats.py:314 | Epoch[588] Step[118] GlobalStep[80674] Training Speed: 427.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:13:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:10:39 INFO loss_tracker.py:84 | Epoch[588/NA] Step[124] GlobalStep[80680/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:10:44 INFO stats.py:394 | Epoch[588] completed. Training Speed: 304.99 samples/sec across all devices. Epoch Time: 57.50 sec. Average Epoch Time: 57.50 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:13:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:10:48 INFO stats.py:314 | Epoch[589] Step[6] GlobalStep[80699] Training Speed: 428.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:13:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:10:56 INFO loss_tracker.py:84 | Epoch[589/NA] Step[24] GlobalStep[80717/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:10:58 INFO stats.py:314 | Epoch[589] Step[31] GlobalStep[80724] Training Speed: 424.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:13:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:11:06 INFO loss_tracker.py:84 | Epoch[589/NA] Step[49] GlobalStep[80742/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:11:09 INFO stats.py:314 | Epoch[589] Step[56] GlobalStep[80749] Training Speed: 423.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:13:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:11:16 INFO loss_tracker.py:84 | Epoch[589/NA] Step[74] GlobalStep[80767/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:11:19 INFO stats.py:314 | Epoch[589] Step[81] GlobalStep[80774] Training Speed: 427.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:12:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:11:26 INFO loss_tracker.py:84 | Epoch[589/NA] Step[99] GlobalStep[80792/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 22:11:29 INFO stats.py:314 | Epoch[589] Step[106] GlobalStep[80799] Training Speed: 430.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:12:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:11:36 INFO loss_tracker.py:84 | Epoch[589/NA] Step[124] GlobalStep[80817/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:11:39 INFO stats.py:314 | Epoch[589] Step[131] GlobalStep[80824] Training Speed: 449.83 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:12:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:11:40 INFO stats.py:394 | Epoch[589] completed. Training Speed: 310.08 samples/sec across all devices. Epoch Time: 56.55 sec. Average Epoch Time: 56.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:12:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:11:50 INFO stats.py:314 | Epoch[590] Step[19] GlobalStep[80849] Training Speed: 408.98 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:12:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:11:52 INFO loss_tracker.py:84 | Epoch[590/NA] Step[24] GlobalStep[80854/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:12:00 INFO stats.py:314 | Epoch[590] Step[44] GlobalStep[80874] Training Speed: 413.41 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:12:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:12:02 INFO loss_tracker.py:84 | Epoch[590/NA] Step[49] GlobalStep[80879/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 22:12:11 INFO stats.py:314 | Epoch[590] Step[69] GlobalStep[80899] Training Speed: 430.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:12:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:12:13 INFO loss_tracker.py:84 | Epoch[590/NA] Step[74] GlobalStep[80904/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:12:21 INFO stats.py:314 | Epoch[590] Step[94] GlobalStep[80924] Training Speed: 428.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:11:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:12:23 INFO loss_tracker.py:84 | Epoch[590/NA] Step[99] GlobalStep[80929/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:12:32 INFO stats.py:314 | Epoch[590] Step[119] GlobalStep[80949] Training Speed: 439.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:11:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:12:34 INFO loss_tracker.py:84 | Epoch[590/NA] Step[124] GlobalStep[80954/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 22:12:38 INFO stats.py:394 | Epoch[590] completed. Training Speed: 305.47 samples/sec across all devices. Epoch Time: 57.41 sec. Average Epoch Time: 57.41 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:11:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:12:42 INFO stats.py:314 | Epoch[591] Step[7] GlobalStep[80974] Training Speed: 439.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:11:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:12:49 INFO loss_tracker.py:84 | Epoch[591/NA] Step[24] GlobalStep[80991/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:12:53 INFO stats.py:314 | Epoch[591] Step[32] GlobalStep[80999] Training Speed: 438.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:11:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:12:59 INFO loss_tracker.py:84 | Epoch[591/NA] Step[49] GlobalStep[81016/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:13:03 INFO stats.py:314 | Epoch[591] Step[57] GlobalStep[81024] Training Speed: 431.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:11:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:13:10 INFO loss_tracker.py:84 | Epoch[591/NA] Step[74] GlobalStep[81041/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:13:14 INFO stats.py:314 | Epoch[591] Step[82] GlobalStep[81049] Training Speed: 425.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:11:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:13:21 INFO loss_tracker.py:84 | Epoch[591/NA] Step[99] GlobalStep[81066/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:13:24 INFO stats.py:314 | Epoch[591] Step[107] GlobalStep[81074] Training Speed: 406.89 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:10:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:13:30 INFO loss_tracker.py:84 | Epoch[591/NA] Step[124] GlobalStep[81091/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:13:33 INFO stats.py:314 | Epoch[591] Step[132] GlobalStep[81099] Training Speed: 436.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:10:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:13:35 INFO stats.py:394 | Epoch[591] completed. Training Speed: 307.57 samples/sec across all devices. Epoch Time: 57.01 sec. Average Epoch Time: 57.01 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:10:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:13:45 INFO stats.py:314 | Epoch[592] Step[20] GlobalStep[81124] Training Speed: 436.87 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:10:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:13:46 INFO loss_tracker.py:84 | Epoch[592/NA] Step[24] GlobalStep[81128/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:13:55 INFO stats.py:314 | Epoch[592] Step[45] GlobalStep[81149] Training Speed: 424.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:10:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:13:57 INFO loss_tracker.py:84 | Epoch[592/NA] Step[49] GlobalStep[81153/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:14:05 INFO stats.py:314 | Epoch[592] Step[70] GlobalStep[81174] Training Speed: 431.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:10:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:14:07 INFO loss_tracker.py:84 | Epoch[592/NA] Step[74] GlobalStep[81178/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:14:16 INFO stats.py:314 | Epoch[592] Step[95] GlobalStep[81199] Training Speed: 432.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:10:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:14:17 INFO loss_tracker.py:84 | Epoch[592/NA] Step[99] GlobalStep[81203/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:14:26 INFO stats.py:314 | Epoch[592] Step[120] GlobalStep[81224] Training Speed: 438.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:09:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:14:28 INFO loss_tracker.py:84 | Epoch[592/NA] Step[124] GlobalStep[81228/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:14:32 INFO stats.py:394 | Epoch[592] completed. Training Speed: 305.48 samples/sec across all devices. Epoch Time: 57.40 sec. Average Epoch Time: 57.40 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:09:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:14:36 INFO stats.py:314 | Epoch[593] Step[8] GlobalStep[81249] Training Speed: 432.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:09:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:14:43 INFO loss_tracker.py:84 | Epoch[593/NA] Step[24] GlobalStep[81265/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:14:47 INFO stats.py:314 | Epoch[593] Step[33] GlobalStep[81274] Training Speed: 436.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:09:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:14:53 INFO loss_tracker.py:84 | Epoch[593/NA] Step[49] GlobalStep[81290/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 22:14:56 INFO stats.py:314 | Epoch[593] Step[58] GlobalStep[81299] Training Speed: 434.57 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:09:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:15:03 INFO loss_tracker.py:84 | Epoch[593/NA] Step[74] GlobalStep[81315/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 22:15:07 INFO stats.py:314 | Epoch[593] Step[83] GlobalStep[81324] Training Speed: 424.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:09:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:15:13 INFO loss_tracker.py:84 | Epoch[593/NA] Step[99] GlobalStep[81340/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0126] total_loss[0.0158] Rank[0/16] 06/24/2025 22:15:17 INFO stats.py:314 | Epoch[593] Step[108] GlobalStep[81349] Training Speed: 420.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:08:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:15:23 INFO loss_tracker.py:84 | Epoch[593/NA] Step[124] GlobalStep[81365/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:15:27 INFO stats.py:314 | Epoch[593] Step[133] GlobalStep[81374] Training Speed: 434.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:08:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:15:28 INFO stats.py:394 | Epoch[593] completed. Training Speed: 314.82 samples/sec across all devices. Epoch Time: 55.70 sec. Average Epoch Time: 55.70 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:08:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:15:38 INFO stats.py:314 | Epoch[594] Step[21] GlobalStep[81399] Training Speed: 414.05 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:08:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:15:39 INFO loss_tracker.py:84 | Epoch[594/NA] Step[24] GlobalStep[81402/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:15:49 INFO stats.py:314 | Epoch[594] Step[46] GlobalStep[81424] Training Speed: 405.31 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:08:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:15:50 INFO loss_tracker.py:84 | Epoch[594/NA] Step[49] GlobalStep[81427/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:15:59 INFO stats.py:314 | Epoch[594] Step[71] GlobalStep[81449] Training Speed: 432.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:08:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:16:00 INFO loss_tracker.py:84 | Epoch[594/NA] Step[74] GlobalStep[81452/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:16:09 INFO stats.py:314 | Epoch[594] Step[96] GlobalStep[81474] Training Speed: 429.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:08:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:16:11 INFO loss_tracker.py:84 | Epoch[594/NA] Step[99] GlobalStep[81477/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:16:19 INFO stats.py:314 | Epoch[594] Step[121] GlobalStep[81499] Training Speed: 454.38 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:07:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:16:21 INFO loss_tracker.py:84 | Epoch[594/NA] Step[124] GlobalStep[81502/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 22:16:25 INFO stats.py:394 | Epoch[594] completed. Training Speed: 305.61 samples/sec across all devices. Epoch Time: 57.38 sec. Average Epoch Time: 57.38 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:07:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:16:31 INFO stats.py:314 | Epoch[595] Step[9] GlobalStep[81524] Training Speed: 422.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:07:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:16:37 INFO loss_tracker.py:84 | Epoch[595/NA] Step[24] GlobalStep[81539/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 22:16:41 INFO stats.py:314 | Epoch[595] Step[34] GlobalStep[81549] Training Speed: 421.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:07:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:16:47 INFO loss_tracker.py:84 | Epoch[595/NA] Step[49] GlobalStep[81564/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:16:52 INFO stats.py:314 | Epoch[595] Step[59] GlobalStep[81574] Training Speed: 415.69 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 2:07:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:16:58 INFO loss_tracker.py:84 | Epoch[595/NA] Step[74] GlobalStep[81589/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:17:02 INFO stats.py:314 | Epoch[595] Step[84] GlobalStep[81599] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:07:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:17:08 INFO loss_tracker.py:84 | Epoch[595/NA] Step[99] GlobalStep[81614/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:17:12 INFO stats.py:314 | Epoch[595] Step[109] GlobalStep[81624] Training Speed: 432.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:07:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:17:18 INFO loss_tracker.py:84 | Epoch[595/NA] Step[124] GlobalStep[81639/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:17:22 INFO stats.py:314 | Epoch[595] Step[134] GlobalStep[81649] Training Speed: 450.36 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:06:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:17:22 INFO stats.py:394 | Epoch[595] completed. Training Speed: 307.31 samples/sec across all devices. Epoch Time: 57.06 sec. Average Epoch Time: 57.06 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:06:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:17:33 INFO stats.py:314 | Epoch[596] Step[22] GlobalStep[81674] Training Speed: 426.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:06:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:17:34 INFO loss_tracker.py:84 | Epoch[596/NA] Step[24] GlobalStep[81676/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:17:43 INFO stats.py:314 | Epoch[596] Step[47] GlobalStep[81699] Training Speed: 429.66 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:06:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:17:45 INFO loss_tracker.py:84 | Epoch[596/NA] Step[49] GlobalStep[81701/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 22:17:54 INFO stats.py:314 | Epoch[596] Step[72] GlobalStep[81724] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:06:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:17:55 INFO loss_tracker.py:84 | Epoch[596/NA] Step[74] GlobalStep[81726/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 22:18:04 INFO stats.py:314 | Epoch[596] Step[97] GlobalStep[81749] Training Speed: 430.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:06:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:18:05 INFO loss_tracker.py:84 | Epoch[596/NA] Step[99] GlobalStep[81751/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:18:14 INFO stats.py:314 | Epoch[596] Step[122] GlobalStep[81774] Training Speed: 452.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:06:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:18:15 INFO loss_tracker.py:84 | Epoch[596/NA] Step[124] GlobalStep[81776/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:18:19 INFO stats.py:394 | Epoch[596] completed. Training Speed: 308.65 samples/sec across all devices. Epoch Time: 56.82 sec. Average Epoch Time: 56.82 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:05:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:18:25 INFO stats.py:314 | Epoch[597] Step[10] GlobalStep[81799] Training Speed: 434.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:05:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:18:30 INFO loss_tracker.py:84 | Epoch[597/NA] Step[24] GlobalStep[81813/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:18:35 INFO stats.py:314 | Epoch[597] Step[35] GlobalStep[81824] Training Speed: 431.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:05:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:18:40 INFO loss_tracker.py:84 | Epoch[597/NA] Step[49] GlobalStep[81838/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:18:45 INFO stats.py:314 | Epoch[597] Step[60] GlobalStep[81849] Training Speed: 424.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:05:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:18:51 INFO loss_tracker.py:84 | Epoch[597/NA] Step[74] GlobalStep[81863/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:18:55 INFO stats.py:314 | Epoch[597] Step[85] GlobalStep[81874] Training Speed: 430.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:05:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:01 INFO loss_tracker.py:84 | Epoch[597/NA] Step[99] GlobalStep[81888/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:19:06 INFO stats.py:314 | Epoch[597] Step[110] GlobalStep[81899] Training Speed: 422.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:05:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:11 INFO loss_tracker.py:84 | Epoch[597/NA] Step[124] GlobalStep[81913/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:19:15 INFO stats.py:314 | Epoch[597] Step[135] GlobalStep[81924] Training Speed: 440.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:05:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:15 INFO stats.py:394 | Epoch[597] completed. Training Speed: 312.11 samples/sec across all devices. Epoch Time: 56.19 sec. Average Epoch Time: 56.19 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:04:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:19:26 INFO stats.py:314 | Epoch[598] Step[23] GlobalStep[81949] Training Speed: 443.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:04:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:27 INFO loss_tracker.py:84 | Epoch[598/NA] Step[24] GlobalStep[81950/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:19:37 INFO stats.py:314 | Epoch[598] Step[48] GlobalStep[81974] Training Speed: 425.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:04:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:37 INFO loss_tracker.py:84 | Epoch[598/NA] Step[49] GlobalStep[81975/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:19:47 INFO stats.py:314 | Epoch[598] Step[73] GlobalStep[81999] Training Speed: 439.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:04:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:47 INFO loss_tracker.py:84 | Epoch[598/NA] Step[74] GlobalStep[82000/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:19:57 INFO stats.py:314 | Epoch[598] Step[98] GlobalStep[82024] Training Speed: 429.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:04:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:19:57 INFO loss_tracker.py:84 | Epoch[598/NA] Step[99] GlobalStep[82025/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:20:07 INFO stats.py:314 | Epoch[598] Step[123] GlobalStep[82049] Training Speed: 438.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:04:08. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:20:07 INFO loss_tracker.py:84 | Epoch[598/NA] Step[124] GlobalStep[82050/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:20:12 INFO stats.py:394 | Epoch[598] completed. Training Speed: 311.01 samples/sec across all devices. Epoch Time: 56.38 sec. Average Epoch Time: 56.38 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:04:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:20:18 INFO stats.py:314 | Epoch[599] Step[11] GlobalStep[82074] Training Speed: 432.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:03:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:20:24 INFO loss_tracker.py:84 | Epoch[599/NA] Step[24] GlobalStep[82087/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:20:28 INFO stats.py:314 | Epoch[599] Step[36] GlobalStep[82099] Training Speed: 435.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:03:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:20:34 INFO loss_tracker.py:84 | Epoch[599/NA] Step[49] GlobalStep[82112/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:20:39 INFO stats.py:314 | Epoch[599] Step[61] GlobalStep[82124] Training Speed: 437.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:03:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:20:44 INFO loss_tracker.py:84 | Epoch[599/NA] Step[74] GlobalStep[82137/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:20:49 INFO stats.py:314 | Epoch[599] Step[86] GlobalStep[82149] Training Speed: 436.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:03:27. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:20:54 INFO loss_tracker.py:84 | Epoch[599/NA] Step[99] GlobalStep[82162/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:20:59 INFO stats.py:314 | Epoch[599] Step[111] GlobalStep[82174] Training Speed: 432.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:03:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:21:05 INFO loss_tracker.py:84 | Epoch[599/NA] Step[124] GlobalStep[82187/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 22:21:09 INFO stats.py:314 | Epoch[599] Step[136] GlobalStep[82199] Training Speed: 441.32 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:03:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:21:09 INFO stats.py:394 | Epoch[599] completed. Training Speed: 304.39 samples/sec across all devices. Epoch Time: 57.61 sec. Average Epoch Time: 57.61 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:03:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:21:21 INFO stats.py:314 | Epoch[600] Step[24] GlobalStep[82224] Training Speed: 437.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:02:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:21:21 INFO loss_tracker.py:84 | Epoch[600/NA] Step[24] GlobalStep[82224/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:21:31 INFO stats.py:314 | Epoch[600] Step[49] GlobalStep[82249] Training Speed: 438.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:02:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:21:31 INFO loss_tracker.py:84 | Epoch[600/NA] Step[49] GlobalStep[82249/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:21:41 INFO stats.py:314 | Epoch[600] Step[74] GlobalStep[82274] Training Speed: 436.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:02:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:21:41 INFO loss_tracker.py:84 | Epoch[600/NA] Step[74] GlobalStep[82274/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:21:51 INFO stats.py:314 | Epoch[600] Step[99] GlobalStep[82299] Training Speed: 434.31 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:02:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:21:52 INFO loss_tracker.py:84 | Epoch[600/NA] Step[99] GlobalStep[82299/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0175] Rank[0/16] 06/24/2025 22:22:02 INFO stats.py:314 | Epoch[600] Step[124] GlobalStep[82324] Training Speed: 456.84 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 2:02:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:22:02 INFO loss_tracker.py:84 | Epoch[600/NA] Step[124] GlobalStep[82324/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0053] loss_depth[0.0125] total_loss[0.0179] Rank[0/16] 06/24/2025 22:22:06 INFO stats.py:394 | Epoch[600] completed. Training Speed: 310.36 samples/sec across all devices. Epoch Time: 56.50 sec. Average Epoch Time: 56.50 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 2:02:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:22:12 INFO stats.py:314 | Epoch[601] Step[12] GlobalStep[82349] Training Speed: 402.36 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:02:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:22:17 INFO loss_tracker.py:84 | Epoch[601/NA] Step[24] GlobalStep[82361/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:22:23 INFO stats.py:314 | Epoch[601] Step[37] GlobalStep[82374] Training Speed: 422.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:01:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:22:28 INFO loss_tracker.py:84 | Epoch[601/NA] Step[49] GlobalStep[82386/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 22:22:33 INFO stats.py:314 | Epoch[601] Step[62] GlobalStep[82399] Training Speed: 423.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:01:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:22:39 INFO loss_tracker.py:84 | Epoch[601/NA] Step[74] GlobalStep[82411/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:22:44 INFO stats.py:314 | Epoch[601] Step[87] GlobalStep[82424] Training Speed: 426.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:01:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:22:49 INFO loss_tracker.py:84 | Epoch[601/NA] Step[99] GlobalStep[82436/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0029] loss_depth[0.0126] total_loss[0.0156] Rank[0/16] 06/24/2025 22:22:54 INFO stats.py:314 | Epoch[601] Step[112] GlobalStep[82449] Training Speed: 441.75 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:01:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:22:59 INFO loss_tracker.py:84 | Epoch[601/NA] Step[124] GlobalStep[82461/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:23:03 INFO stats.py:394 | Epoch[601] completed. 
Training Speed: 305.26 samples/sec across all devices. Epoch Time: 57.45 sec. Average Epoch Time: 57.45 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:01:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:23:04 INFO stats.py:314 | Epoch[602] Step[0] GlobalStep[82474] Training Speed: 365.35 samples/sec across all devices. Average Step Time: 0.35 sec. Estimated Remaining Time: 2:01:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:23:14 INFO loss_tracker.py:84 | Epoch[602/NA] Step[24] GlobalStep[82498/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:23:15 INFO stats.py:314 | Epoch[602] Step[25] GlobalStep[82499] Training Speed: 396.57 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:01:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:23:25 INFO loss_tracker.py:84 | Epoch[602/NA] Step[49] GlobalStep[82523/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:23:26 INFO stats.py:314 | Epoch[602] Step[50] GlobalStep[82524] Training Speed: 422.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:00:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:23:36 INFO loss_tracker.py:84 | Epoch[602/NA] Step[74] GlobalStep[82548/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 22:23:36 INFO stats.py:314 | Epoch[602] Step[75] GlobalStep[82549] Training Speed: 403.73 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 2:00:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:23:46 INFO loss_tracker.py:84 | Epoch[602/NA] Step[99] GlobalStep[82573/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:23:47 INFO stats.py:314 | Epoch[602] Step[100] GlobalStep[82574] Training Speed: 430.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:00:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:23:56 INFO loss_tracker.py:84 | Epoch[602/NA] Step[124] GlobalStep[82598/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0172] Rank[0/16] 06/24/2025 22:23:56 INFO stats.py:314 | Epoch[602] Step[125] GlobalStep[82599] Training Speed: 440.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 2:00:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:24:01 INFO stats.py:394 | Epoch[602] completed. Training Speed: 306.00 samples/sec across all devices. Epoch Time: 57.31 sec. Average Epoch Time: 57.31 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 2:00:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:24:07 INFO stats.py:314 | Epoch[603] Step[13] GlobalStep[82624] Training Speed: 430.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 2:00:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:24:12 INFO loss_tracker.py:84 | Epoch[603/NA] Step[24] GlobalStep[82635/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:24:17 INFO stats.py:314 | Epoch[603] Step[38] GlobalStep[82649] Training Speed: 430.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:59:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:24:22 INFO loss_tracker.py:84 | Epoch[603/NA] Step[49] GlobalStep[82660/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:24:28 INFO stats.py:314 | Epoch[603] Step[63] GlobalStep[82674] Training Speed: 416.57 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:59:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:24:33 INFO loss_tracker.py:84 | Epoch[603/NA] Step[74] GlobalStep[82685/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:24:39 INFO stats.py:314 | Epoch[603] Step[88] GlobalStep[82699] Training Speed: 436.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:59:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:24:43 INFO loss_tracker.py:84 | Epoch[603/NA] Step[99] GlobalStep[82710/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:24:50 INFO stats.py:314 | Epoch[603] Step[113] GlobalStep[82724] Training Speed: 409.31 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:59:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:24:55 INFO loss_tracker.py:84 | Epoch[603/NA] Step[124] GlobalStep[82735/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:24:59 INFO stats.py:394 | Epoch[603] completed. Training Speed: 300.72 samples/sec across all devices. Epoch Time: 58.31 sec. Average Epoch Time: 58.31 sec. Average Step Time: 0.43 sec. Estimated Remaining Time: 1:59:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:25:01 INFO stats.py:314 | Epoch[604] Step[1] GlobalStep[82749] Training Speed: 373.93 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 1:59:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:25:10 INFO loss_tracker.py:84 | Epoch[604/NA] Step[24] GlobalStep[82772/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:25:11 INFO stats.py:314 | Epoch[604] Step[26] GlobalStep[82774] Training Speed: 432.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:59:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:25:21 INFO loss_tracker.py:84 | Epoch[604/NA] Step[49] GlobalStep[82797/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 22:25:21 INFO stats.py:314 | Epoch[604] Step[51] GlobalStep[82799] Training Speed: 437.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:58:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:25:30 INFO loss_tracker.py:84 | Epoch[604/NA] Step[74] GlobalStep[82822/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:25:31 INFO stats.py:314 | Epoch[604] Step[76] GlobalStep[82824] Training Speed: 425.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:58:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:25:41 INFO loss_tracker.py:84 | Epoch[604/NA] Step[99] GlobalStep[82847/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0125] total_loss[0.0158] Rank[0/16] 06/24/2025 22:25:41 INFO stats.py:314 | Epoch[604] Step[101] GlobalStep[82849] Training Speed: 429.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:58:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:25:51 INFO loss_tracker.py:84 | Epoch[604/NA] Step[124] GlobalStep[82872/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 22:25:52 INFO stats.py:314 | Epoch[604] Step[126] GlobalStep[82874] Training Speed: 436.69 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:58:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:25:55 INFO stats.py:394 | Epoch[604] completed. Training Speed: 310.15 samples/sec across all devices. Epoch Time: 56.54 sec. Average Epoch Time: 56.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:58:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:26:03 INFO stats.py:314 | Epoch[605] Step[14] GlobalStep[82899] Training Speed: 434.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:58:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:26:07 INFO loss_tracker.py:84 | Epoch[605/NA] Step[24] GlobalStep[82909/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 22:26:13 INFO stats.py:314 | Epoch[605] Step[39] GlobalStep[82924] Training Speed: 439.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:58:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:26:17 INFO loss_tracker.py:84 | Epoch[605/NA] Step[49] GlobalStep[82934/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:26:23 INFO stats.py:314 | Epoch[605] Step[64] GlobalStep[82949] Training Speed: 438.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:57:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:26:27 INFO loss_tracker.py:84 | Epoch[605/NA] Step[74] GlobalStep[82959/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:26:33 INFO stats.py:314 | Epoch[605] Step[89] GlobalStep[82974] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:57:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:26:37 INFO loss_tracker.py:84 | Epoch[605/NA] Step[99] GlobalStep[82984/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0126] total_loss[0.0159] Rank[0/16] 06/24/2025 22:26:43 INFO stats.py:314 | Epoch[605] Step[114] GlobalStep[82999] Training Speed: 429.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:57:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:26:47 INFO loss_tracker.py:84 | Epoch[605/NA] Step[124] GlobalStep[83009/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:26:52 INFO stats.py:394 | Epoch[605] completed. Training Speed: 310.39 samples/sec across all devices. Epoch Time: 56.50 sec. Average Epoch Time: 56.50 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:57:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:26:54 INFO stats.py:314 | Epoch[606] Step[2] GlobalStep[83024] Training Speed: 429.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:57:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:27:03 INFO loss_tracker.py:84 | Epoch[606/NA] Step[24] GlobalStep[83046/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:27:04 INFO stats.py:314 | Epoch[606] Step[27] GlobalStep[83049] Training Speed: 411.43 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:57:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:27:14 INFO loss_tracker.py:84 | Epoch[606/NA] Step[49] GlobalStep[83071/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:27:15 INFO stats.py:314 | Epoch[606] Step[52] GlobalStep[83074] Training Speed: 434.66 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:57:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:27:23 INFO loss_tracker.py:84 | Epoch[606/NA] Step[74] GlobalStep[83096/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 22:27:25 INFO stats.py:314 | Epoch[606] Step[77] GlobalStep[83099] Training Speed: 437.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:56:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:27:34 INFO loss_tracker.py:84 | Epoch[606/NA] Step[99] GlobalStep[83121/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:27:35 INFO stats.py:314 | Epoch[606] Step[102] GlobalStep[83124] Training Speed: 437.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:56:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:27:43 INFO loss_tracker.py:84 | Epoch[606/NA] Step[124] GlobalStep[83146/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:27:44 INFO stats.py:314 | Epoch[606] Step[127] GlobalStep[83149] Training Speed: 452.74 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:56:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:27:48 INFO stats.py:394 | Epoch[606] completed. Training Speed: 312.99 samples/sec across all devices. Epoch Time: 56.03 sec. Average Epoch Time: 56.03 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:56:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:27:56 INFO stats.py:314 | Epoch[607] Step[15] GlobalStep[83174] Training Speed: 431.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:56:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:28:00 INFO loss_tracker.py:84 | Epoch[607/NA] Step[24] GlobalStep[83183/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:28:06 INFO stats.py:314 | Epoch[607] Step[40] GlobalStep[83199] Training Speed: 422.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:56:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:28:10 INFO loss_tracker.py:84 | Epoch[607/NA] Step[49] GlobalStep[83208/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 22:28:17 INFO stats.py:314 | Epoch[607] Step[65] GlobalStep[83224] Training Speed: 430.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:56:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:28:20 INFO loss_tracker.py:84 | Epoch[607/NA] Step[74] GlobalStep[83233/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:28:27 INFO stats.py:314 | Epoch[607] Step[90] GlobalStep[83249] Training Speed: 432.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:55:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:28:31 INFO loss_tracker.py:84 | Epoch[607/NA] Step[99] GlobalStep[83258/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:28:37 INFO stats.py:314 | Epoch[607] Step[115] GlobalStep[83274] Training Speed: 429.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:55:40. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:28:40 INFO loss_tracker.py:84 | Epoch[607/NA] Step[124] GlobalStep[83283/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 22:28:44 INFO stats.py:394 | Epoch[607] completed. Training Speed: 310.89 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:55:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:28:47 INFO stats.py:314 | Epoch[608] Step[3] GlobalStep[83299] Training Speed: 432.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:55:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:28:56 INFO loss_tracker.py:84 | Epoch[608/NA] Step[24] GlobalStep[83320/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:28:57 INFO stats.py:314 | Epoch[608] Step[28] GlobalStep[83324] Training Speed: 425.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:55:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:29:06 INFO loss_tracker.py:84 | Epoch[608/NA] Step[49] GlobalStep[83345/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:29:08 INFO stats.py:314 | Epoch[608] Step[53] GlobalStep[83349] Training Speed: 423.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:55:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:29:17 INFO loss_tracker.py:84 | Epoch[608/NA] Step[74] GlobalStep[83370/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:29:18 INFO stats.py:314 | Epoch[608] Step[78] GlobalStep[83374] Training Speed: 431.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:54:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:29:27 INFO loss_tracker.py:84 | Epoch[608/NA] Step[99] GlobalStep[83395/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:29:29 INFO stats.py:314 | Epoch[608] Step[103] GlobalStep[83399] Training Speed: 426.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:54:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:29:37 INFO loss_tracker.py:84 | Epoch[608/NA] Step[124] GlobalStep[83420/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:29:39 INFO stats.py:314 | Epoch[608] Step[128] GlobalStep[83424] Training Speed: 448.87 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:54:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:29:41 INFO stats.py:394 | Epoch[608] completed. Training Speed: 307.49 samples/sec across all devices. Epoch Time: 57.03 sec. Average Epoch Time: 57.03 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:54:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:29:50 INFO stats.py:314 | Epoch[609] Step[16] GlobalStep[83449] Training Speed: 438.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:54:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:29:53 INFO loss_tracker.py:84 | Epoch[609/NA] Step[24] GlobalStep[83457/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:30:00 INFO stats.py:314 | Epoch[609] Step[41] GlobalStep[83474] Training Speed: 434.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:54:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:30:03 INFO loss_tracker.py:84 | Epoch[609/NA] Step[49] GlobalStep[83482/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 22:30:10 INFO stats.py:314 | Epoch[609] Step[66] GlobalStep[83499] Training Speed: 424.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:54:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:30:13 INFO loss_tracker.py:84 | Epoch[609/NA] Step[74] GlobalStep[83507/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0161] Rank[0/16] 06/24/2025 22:30:20 INFO stats.py:314 | Epoch[609] Step[91] GlobalStep[83524] Training Speed: 355.11 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 1:53:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:30:23 INFO loss_tracker.py:84 | Epoch[609/NA] Step[99] GlobalStep[83532/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:30:31 INFO stats.py:314 | Epoch[609] Step[116] GlobalStep[83549] Training Speed: 257.27 samples/sec across all devices. Average Step Time: 0.50 sec. Estimated Remaining Time: 1:53:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:30:34 INFO loss_tracker.py:84 | Epoch[609/NA] Step[124] GlobalStep[83557/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:30:39 INFO stats.py:394 | Epoch[609] completed. Training Speed: 306.50 samples/sec across all devices. Epoch Time: 57.21 sec. Average Epoch Time: 57.21 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:53:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:30:41 INFO stats.py:314 | Epoch[610] Step[4] GlobalStep[83574] Training Speed: 436.74 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:53:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:30:50 INFO loss_tracker.py:84 | Epoch[610/NA] Step[24] GlobalStep[83594/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:30:51 INFO stats.py:314 | Epoch[610] Step[29] GlobalStep[83599] Training Speed: 435.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:53:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:00 INFO loss_tracker.py:84 | Epoch[610/NA] Step[49] GlobalStep[83619/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0125] total_loss[0.0175] Rank[0/16] 06/24/2025 22:31:02 INFO stats.py:314 | Epoch[610] Step[54] GlobalStep[83624] Training Speed: 426.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:53:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:10 INFO loss_tracker.py:84 | Epoch[610/NA] Step[74] GlobalStep[83644/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:31:12 INFO stats.py:314 | Epoch[610] Step[79] GlobalStep[83649] Training Speed: 431.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:53:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:20 INFO loss_tracker.py:84 | Epoch[610/NA] Step[99] GlobalStep[83669/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:31:22 INFO stats.py:314 | Epoch[610] Step[104] GlobalStep[83674] Training Speed: 422.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:52:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:31:31 INFO loss_tracker.py:84 | Epoch[610/NA] Step[124] GlobalStep[83694/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:31:33 INFO stats.py:314 | Epoch[610] Step[129] GlobalStep[83699] Training Speed: 436.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:52:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:35 INFO stats.py:394 | Epoch[610] completed. Training Speed: 309.48 samples/sec across all devices. Epoch Time: 56.66 sec. Average Epoch Time: 56.66 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:52:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:43 INFO stats.py:314 | Epoch[611] Step[17] GlobalStep[83724] Training Speed: 433.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:52:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:47 INFO loss_tracker.py:84 | Epoch[611/NA] Step[24] GlobalStep[83731/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:31:54 INFO stats.py:314 | Epoch[611] Step[42] GlobalStep[83749] Training Speed: 436.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:52:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:31:57 INFO loss_tracker.py:84 | Epoch[611/NA] Step[49] GlobalStep[83756/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:32:04 INFO stats.py:314 | Epoch[611] Step[67] GlobalStep[83774] Training Speed: 435.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:52:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:32:07 INFO loss_tracker.py:84 | Epoch[611/NA] Step[74] GlobalStep[83781/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:32:15 INFO stats.py:314 | Epoch[611] Step[92] GlobalStep[83799] Training Speed: 423.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:52:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:32:17 INFO loss_tracker.py:84 | Epoch[611/NA] Step[99] GlobalStep[83806/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:32:24 INFO stats.py:314 | Epoch[611] Step[117] GlobalStep[83824] Training Speed: 417.72 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:51:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:32:27 INFO loss_tracker.py:84 | Epoch[611/NA] Step[124] GlobalStep[83831/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:32:31 INFO stats.py:394 | Epoch[611] completed. Training Speed: 312.37 samples/sec across all devices. Epoch Time: 56.14 sec. Average Epoch Time: 56.14 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:51:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:32:35 INFO stats.py:314 | Epoch[612] Step[5] GlobalStep[83849] Training Speed: 424.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:51:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:32:43 INFO loss_tracker.py:84 | Epoch[612/NA] Step[24] GlobalStep[83868/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 22:32:45 INFO stats.py:314 | Epoch[612] Step[30] GlobalStep[83874] Training Speed: 431.22 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:51:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:32:53 INFO loss_tracker.py:84 | Epoch[612/NA] Step[49] GlobalStep[83893/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:32:56 INFO stats.py:314 | Epoch[612] Step[55] GlobalStep[83899] Training Speed: 433.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:51:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:33:03 INFO loss_tracker.py:84 | Epoch[612/NA] Step[74] GlobalStep[83918/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:33:06 INFO stats.py:314 | Epoch[612] Step[80] GlobalStep[83924] Training Speed: 435.76 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:51:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:33:14 INFO loss_tracker.py:84 | Epoch[612/NA] Step[99] GlobalStep[83943/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:33:16 INFO stats.py:314 | Epoch[612] Step[105] GlobalStep[83949] Training Speed: 437.37 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:51:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:33:23 INFO loss_tracker.py:84 | Epoch[612/NA] Step[124] GlobalStep[83968/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:33:25 INFO stats.py:314 | Epoch[612] Step[130] GlobalStep[83974] Training Speed: 453.39 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:50:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:33:27 INFO stats.py:394 | Epoch[612] completed. Training Speed: 313.09 samples/sec across all devices. Epoch Time: 56.01 sec. Average Epoch Time: 56.01 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:50:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:33:36 INFO stats.py:314 | Epoch[613] Step[18] GlobalStep[83999] Training Speed: 431.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:50:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:33:36 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/24/2025 22:33:37 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_20 Rank[10/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[11/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[1/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[3/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[14/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[4/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[6/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[5/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[8/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[7/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[2/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 
Rank[12/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[13/16] 06/24/2025 22:33:37 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[9/16] 06/24/2025 22:33:38 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[15/16] 06/24/2025 22:33:38 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[0/16] 06/24/2025 22:33:38 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_20/model.safetensors Rank[0/16] 06/24/2025 22:33:39 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_20/optimizer.bin Rank[0/16] 06/24/2025 22:33:40 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_20/scheduler.bin Rank[0/16] 06/24/2025 22:33:40 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_20/sampler.bin Rank[0/16] 06/24/2025 22:33:40 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_20/random_states_0.pkl Rank[0/16] 06/24/2025 22:33:40 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_20/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 22:33:40 INFO checkpoint.py:110 | Save checkpoint at the end of step 83999 to /job_data/checkpoints/checkpoint_20 Rank[0/16] 06/24/2025 22:33:43 INFO loss_tracker.py:84 | Epoch[613/NA] Step[24] GlobalStep[84005/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:33:51 INFO stats.py:314 | Epoch[613] Step[43] GlobalStep[84024] Training Speed: 418.32 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:50:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:33:53 INFO loss_tracker.py:84 | Epoch[613/NA] Step[49] GlobalStep[84030/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:34:01 INFO stats.py:314 | Epoch[613] Step[68] GlobalStep[84049] Training Speed: 433.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:50:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:34:03 INFO loss_tracker.py:84 | Epoch[613/NA] Step[74] GlobalStep[84055/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:34:11 INFO stats.py:314 | Epoch[613] Step[93] GlobalStep[84074] Training Speed: 431.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:50:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:34:13 INFO loss_tracker.py:84 | Epoch[613/NA] Step[99] GlobalStep[84080/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:34:21 INFO stats.py:314 | Epoch[613] Step[118] GlobalStep[84099] Training Speed: 430.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:49:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:34:23 INFO loss_tracker.py:84 | Epoch[613/NA] Step[124] GlobalStep[84105/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:34:27 INFO stats.py:394 | Epoch[613] completed. Training Speed: 293.37 samples/sec across all devices. Epoch Time: 59.77 sec. Average Epoch Time: 59.77 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 1:49:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:34:31 INFO stats.py:314 | Epoch[614] Step[6] GlobalStep[84124] Training Speed: 411.17 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:49:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:34:38 INFO loss_tracker.py:84 | Epoch[614/NA] Step[24] GlobalStep[84142/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:34:41 INFO stats.py:314 | Epoch[614] Step[31] GlobalStep[84149] Training Speed: 427.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:49:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:34:49 INFO loss_tracker.py:84 | Epoch[614/NA] Step[49] GlobalStep[84167/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:34:52 INFO stats.py:314 | Epoch[614] Step[56] GlobalStep[84174] Training Speed: 430.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:49:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:34:59 INFO loss_tracker.py:84 | Epoch[614/NA] Step[74] GlobalStep[84192/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:35:02 INFO stats.py:314 | Epoch[614] Step[81] GlobalStep[84199] Training Speed: 430.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:49:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:35:10 INFO loss_tracker.py:84 | Epoch[614/NA] Step[99] GlobalStep[84217/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:35:12 INFO stats.py:314 | Epoch[614] Step[106] GlobalStep[84224] Training Speed: 432.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:49:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:35:19 INFO loss_tracker.py:84 | Epoch[614/NA] Step[124] GlobalStep[84242/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 22:35:21 INFO stats.py:314 | Epoch[614] Step[131] GlobalStep[84249] Training Speed: 453.43 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:48:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:35:23 INFO stats.py:394 | Epoch[614] completed. Training Speed: 314.81 samples/sec across all devices. Epoch Time: 55.70 sec. Average Epoch Time: 55.70 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:48:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:35:32 INFO stats.py:314 | Epoch[615] Step[19] GlobalStep[84274] Training Speed: 431.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:48:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:35:34 INFO loss_tracker.py:84 | Epoch[615/NA] Step[24] GlobalStep[84279/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:35:42 INFO stats.py:314 | Epoch[615] Step[44] GlobalStep[84299] Training Speed: 430.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:48:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:35:44 INFO loss_tracker.py:84 | Epoch[615/NA] Step[49] GlobalStep[84304/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:35:53 INFO stats.py:314 | Epoch[615] Step[69] GlobalStep[84324] Training Speed: 431.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:48:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:35:55 INFO loss_tracker.py:84 | Epoch[615/NA] Step[74] GlobalStep[84329/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0125] total_loss[0.0157] Rank[0/16] 06/24/2025 22:36:03 INFO stats.py:314 | Epoch[615] Step[94] GlobalStep[84349] Training Speed: 425.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:48:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:36:05 INFO loss_tracker.py:84 | Epoch[615/NA] Step[99] GlobalStep[84354/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:36:14 INFO stats.py:314 | Epoch[615] Step[119] GlobalStep[84374] Training Speed: 434.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:48:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:36:16 INFO loss_tracker.py:84 | Epoch[615/NA] Step[124] GlobalStep[84379/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 22:36:20 INFO stats.py:394 | Epoch[615] completed. Training Speed: 308.39 samples/sec across all devices. Epoch Time: 56.86 sec. Average Epoch Time: 56.86 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:47:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:36:24 INFO stats.py:314 | Epoch[616] Step[7] GlobalStep[84399] Training Speed: 434.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:47:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:36:32 INFO loss_tracker.py:84 | Epoch[616/NA] Step[24] GlobalStep[84416/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:36:35 INFO stats.py:314 | Epoch[616] Step[32] GlobalStep[84424] Training Speed: 439.61 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:47:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:36:41 INFO loss_tracker.py:84 | Epoch[616/NA] Step[49] GlobalStep[84441/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:36:44 INFO stats.py:314 | Epoch[616] Step[57] GlobalStep[84449] Training Speed: 422.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:47:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:36:52 INFO loss_tracker.py:84 | Epoch[616/NA] Step[74] GlobalStep[84466/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0125] total_loss[0.0176] Rank[0/16] 06/24/2025 22:36:55 INFO stats.py:314 | Epoch[616] Step[82] GlobalStep[84474] Training Speed: 420.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:47:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:02 INFO loss_tracker.py:84 | Epoch[616/NA] Step[99] GlobalStep[84491/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:37:05 INFO stats.py:314 | Epoch[616] Step[107] GlobalStep[84499] Training Speed: 426.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:47:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:12 INFO loss_tracker.py:84 | Epoch[616/NA] Step[124] GlobalStep[84516/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:37:15 INFO stats.py:314 | Epoch[616] Step[132] GlobalStep[84524] Training Speed: 439.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:47:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:17 INFO stats.py:394 | Epoch[616] completed. Training Speed: 308.01 samples/sec across all devices. Epoch Time: 56.93 sec. Average Epoch Time: 56.93 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:46:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:37:26 INFO stats.py:314 | Epoch[617] Step[20] GlobalStep[84549] Training Speed: 435.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:46:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:28 INFO loss_tracker.py:84 | Epoch[617/NA] Step[24] GlobalStep[84553/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:37:36 INFO stats.py:314 | Epoch[617] Step[45] GlobalStep[84574] Training Speed: 428.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:46:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:38 INFO loss_tracker.py:84 | Epoch[617/NA] Step[49] GlobalStep[84578/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 22:37:47 INFO stats.py:314 | Epoch[617] Step[70] GlobalStep[84599] Training Speed: 434.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:46:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:48 INFO loss_tracker.py:84 | Epoch[617/NA] Step[74] GlobalStep[84603/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 22:37:56 INFO stats.py:314 | Epoch[617] Step[95] GlobalStep[84624] Training Speed: 438.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:46:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:37:58 INFO loss_tracker.py:84 | Epoch[617/NA] Step[99] GlobalStep[84628/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:38:07 INFO stats.py:314 | Epoch[617] Step[120] GlobalStep[84649] Training Speed: 429.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:46:09. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:38:09 INFO loss_tracker.py:84 | Epoch[617/NA] Step[124] GlobalStep[84653/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 22:38:13 INFO stats.py:394 | Epoch[617] completed. Training Speed: 309.93 samples/sec across all devices. Epoch Time: 56.58 sec. Average Epoch Time: 56.58 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:46:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:38:18 INFO stats.py:314 | Epoch[618] Step[8] GlobalStep[84674] Training Speed: 434.28 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:45:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:38:24 INFO loss_tracker.py:84 | Epoch[618/NA] Step[24] GlobalStep[84690/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 22:38:29 INFO stats.py:314 | Epoch[618] Step[33] GlobalStep[84699] Training Speed: 383.46 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:45:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:38:35 INFO loss_tracker.py:84 | Epoch[618/NA] Step[49] GlobalStep[84715/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:38:38 INFO stats.py:314 | Epoch[618] Step[58] GlobalStep[84724] Training Speed: 431.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:45:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:38:45 INFO loss_tracker.py:84 | Epoch[618/NA] Step[74] GlobalStep[84740/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:38:48 INFO stats.py:314 | Epoch[618] Step[83] GlobalStep[84749] Training Speed: 424.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:45:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:38:55 INFO loss_tracker.py:84 | Epoch[618/NA] Step[99] GlobalStep[84765/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:38:58 INFO stats.py:314 | Epoch[618] Step[108] GlobalStep[84774] Training Speed: 434.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:45:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:39:05 INFO loss_tracker.py:84 | Epoch[618/NA] Step[124] GlobalStep[84790/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0125] total_loss[0.0157] Rank[0/16] 06/24/2025 22:39:08 INFO stats.py:314 | Epoch[618] Step[133] GlobalStep[84799] Training Speed: 433.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:45:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:39:09 INFO stats.py:394 | Epoch[618] completed. Training Speed: 315.02 samples/sec across all devices. Epoch Time: 55.67 sec. Average Epoch Time: 55.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:45:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:39:18 INFO stats.py:314 | Epoch[619] Step[21] GlobalStep[84824] Training Speed: 391.22 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:44:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:39:20 INFO loss_tracker.py:84 | Epoch[619/NA] Step[24] GlobalStep[84827/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 22:39:29 INFO stats.py:314 | Epoch[619] Step[46] GlobalStep[84849] Training Speed: 428.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:44:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:39:31 INFO loss_tracker.py:84 | Epoch[619/NA] Step[49] GlobalStep[84852/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:39:39 INFO stats.py:314 | Epoch[619] Step[71] GlobalStep[84874] Training Speed: 426.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:44:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:39:41 INFO loss_tracker.py:84 | Epoch[619/NA] Step[74] GlobalStep[84877/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:39:50 INFO stats.py:314 | Epoch[619] Step[96] GlobalStep[84899] Training Speed: 437.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:44:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:39:51 INFO loss_tracker.py:84 | Epoch[619/NA] Step[99] GlobalStep[84902/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0029] loss_depth[0.0125] total_loss[0.0154] Rank[0/16] 06/24/2025 22:40:00 INFO stats.py:314 | Epoch[619] Step[121] GlobalStep[84924] Training Speed: 441.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:44:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:40:01 INFO loss_tracker.py:84 | Epoch[619/NA] Step[124] GlobalStep[84927/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:40:05 INFO stats.py:394 | Epoch[619] completed. Training Speed: 313.72 samples/sec across all devices. Epoch Time: 55.90 sec. Average Epoch Time: 55.90 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:44:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:40:10 INFO stats.py:314 | Epoch[620] Step[9] GlobalStep[84949] Training Speed: 440.08 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:44:04. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:40:16 INFO loss_tracker.py:84 | Epoch[620/NA] Step[24] GlobalStep[84964/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:40:20 INFO stats.py:314 | Epoch[620] Step[34] GlobalStep[84974] Training Speed: 422.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:43:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:40:26 INFO loss_tracker.py:84 | Epoch[620/NA] Step[49] GlobalStep[84989/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:40:31 INFO stats.py:314 | Epoch[620] Step[59] GlobalStep[84999] Training Speed: 426.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:43:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:40:37 INFO loss_tracker.py:84 | Epoch[620/NA] Step[74] GlobalStep[85014/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:40:41 INFO stats.py:314 | Epoch[620] Step[84] GlobalStep[85024] Training Speed: 411.53 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:43:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:40:47 INFO loss_tracker.py:84 | Epoch[620/NA] Step[99] GlobalStep[85039/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:40:52 INFO stats.py:314 | Epoch[620] Step[109] GlobalStep[85049] Training Speed: 425.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:43:23. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:40:57 INFO loss_tracker.py:84 | Epoch[620/NA] Step[124] GlobalStep[85064/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:41:00 INFO stats.py:314 | Epoch[620] Step[134] GlobalStep[85074] Training Speed: 451.65 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:43:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:41:01 INFO stats.py:394 | Epoch[620] completed. Training Speed: 310.60 samples/sec across all devices. Epoch Time: 56.46 sec. Average Epoch Time: 56.46 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:43:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:41:12 INFO stats.py:314 | Epoch[621] Step[22] GlobalStep[85099] Training Speed: 439.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:43:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:41:12 INFO loss_tracker.py:84 | Epoch[621/NA] Step[24] GlobalStep[85101/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:41:21 INFO stats.py:314 | Epoch[621] Step[47] GlobalStep[85124] Training Speed: 436.85 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:42:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:41:22 INFO loss_tracker.py:84 | Epoch[621/NA] Step[49] GlobalStep[85126/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:41:32 INFO stats.py:314 | Epoch[621] Step[72] GlobalStep[85149] Training Speed: 429.07 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:42:41. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:41:33 INFO loss_tracker.py:84 | Epoch[621/NA] Step[74] GlobalStep[85151/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:41:42 INFO stats.py:314 | Epoch[621] Step[97] GlobalStep[85174] Training Speed: 434.66 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:42:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:41:43 INFO loss_tracker.py:84 | Epoch[621/NA] Step[99] GlobalStep[85176/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 22:41:53 INFO stats.py:314 | Epoch[621] Step[122] GlobalStep[85199] Training Speed: 424.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:42:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:41:54 INFO loss_tracker.py:84 | Epoch[621/NA] Step[124] GlobalStep[85201/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:41:58 INFO stats.py:394 | Epoch[621] completed. Training Speed: 310.17 samples/sec across all devices. Epoch Time: 56.54 sec. Average Epoch Time: 56.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:42:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:42:04 INFO stats.py:314 | Epoch[622] Step[10] GlobalStep[85224] Training Speed: 425.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:42:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:42:10 INFO loss_tracker.py:84 | Epoch[622/NA] Step[24] GlobalStep[85238/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0175] Rank[0/16] 06/24/2025 22:42:14 INFO stats.py:314 | Epoch[622] Step[35] GlobalStep[85249] Training Speed: 427.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:42:00. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:42:19 INFO loss_tracker.py:84 | Epoch[622/NA] Step[49] GlobalStep[85263/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:42:24 INFO stats.py:314 | Epoch[622] Step[60] GlobalStep[85274] Training Speed: 436.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:41:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:42:30 INFO loss_tracker.py:84 | Epoch[622/NA] Step[74] GlobalStep[85288/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 22:42:34 INFO stats.py:314 | Epoch[622] Step[85] GlobalStep[85299] Training Speed: 423.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:41:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:42:40 INFO loss_tracker.py:84 | Epoch[622/NA] Step[99] GlobalStep[85313/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:42:44 INFO stats.py:314 | Epoch[622] Step[110] GlobalStep[85324] Training Speed: 418.97 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:41:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:42:50 INFO loss_tracker.py:84 | Epoch[622/NA] Step[124] GlobalStep[85338/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:42:54 INFO stats.py:314 | Epoch[622] Step[135] GlobalStep[85349] Training Speed: 437.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:41:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:42:55 INFO stats.py:394 | Epoch[622] completed. Training Speed: 308.95 samples/sec across all devices. Epoch Time: 56.76 sec. Average Epoch Time: 56.76 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:41:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:43:05 INFO stats.py:314 | Epoch[623] Step[23] GlobalStep[85374] Training Speed: 427.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:41:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:43:06 INFO loss_tracker.py:84 | Epoch[623/NA] Step[24] GlobalStep[85375/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:43:16 INFO stats.py:314 | Epoch[623] Step[48] GlobalStep[85399] Training Speed: 436.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:40:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:43:16 INFO loss_tracker.py:84 | Epoch[623/NA] Step[49] GlobalStep[85400/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:43:26 INFO stats.py:314 | Epoch[623] Step[73] GlobalStep[85424] Training Speed: 436.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:40:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:43:27 INFO loss_tracker.py:84 | Epoch[623/NA] Step[74] GlobalStep[85425/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:43:36 INFO stats.py:314 | Epoch[623] Step[98] GlobalStep[85449] Training Speed: 429.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:40:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:43:36 INFO loss_tracker.py:84 | Epoch[623/NA] Step[99] GlobalStep[85450/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:43:46 INFO stats.py:314 | Epoch[623] Step[123] GlobalStep[85474] Training Speed: 441.49 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:40:26. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:43:47 INFO loss_tracker.py:84 | Epoch[623/NA] Step[124] GlobalStep[85475/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 22:43:51 INFO stats.py:394 | Epoch[623] completed. Training Speed: 310.49 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:40:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:43:57 INFO stats.py:314 | Epoch[624] Step[11] GlobalStep[85499] Training Speed: 430.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:40:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:02 INFO loss_tracker.py:84 | Epoch[624/NA] Step[24] GlobalStep[85512/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:44:08 INFO stats.py:314 | Epoch[624] Step[36] GlobalStep[85524] Training Speed: 429.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:40:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:13 INFO loss_tracker.py:84 | Epoch[624/NA] Step[49] GlobalStep[85537/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 22:44:17 INFO stats.py:314 | Epoch[624] Step[61] GlobalStep[85549] Training Speed: 407.61 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:39:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:23 INFO loss_tracker.py:84 | Epoch[624/NA] Step[74] GlobalStep[85562/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0031] loss_depth[0.0126] total_loss[0.0156] Rank[0/16] 06/24/2025 22:44:28 INFO stats.py:314 | Epoch[624] Step[86] GlobalStep[85574] Training Speed: 435.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:39:45. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:44:33 INFO loss_tracker.py:84 | Epoch[624/NA] Step[99] GlobalStep[85587/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0126] total_loss[0.0160] Rank[0/16] 06/24/2025 22:44:38 INFO stats.py:314 | Epoch[624] Step[111] GlobalStep[85599] Training Speed: 437.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:39:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:44 INFO loss_tracker.py:84 | Epoch[624/NA] Step[124] GlobalStep[85612/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:44:48 INFO stats.py:314 | Epoch[624] Step[136] GlobalStep[85624] Training Speed: 430.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:39:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:48 INFO stats.py:394 | Epoch[624] completed. Training Speed: 308.62 samples/sec across all devices. Epoch Time: 56.82 sec. Average Epoch Time: 56.82 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:39:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:59 INFO stats.py:314 | Epoch[625] Step[24] GlobalStep[85649] Training Speed: 435.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:39:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:44:59 INFO loss_tracker.py:84 | Epoch[625/NA] Step[24] GlobalStep[85649/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:45:10 INFO stats.py:314 | Epoch[625] Step[49] GlobalStep[85674] Training Speed: 429.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:39:03. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:45:10 INFO loss_tracker.py:84 | Epoch[625/NA] Step[49] GlobalStep[85674/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:45:20 INFO stats.py:314 | Epoch[625] Step[74] GlobalStep[85699] Training Speed: 429.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:38:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:45:20 INFO loss_tracker.py:84 | Epoch[625/NA] Step[74] GlobalStep[85699/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:45:31 INFO stats.py:314 | Epoch[625] Step[99] GlobalStep[85724] Training Speed: 413.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:38:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:45:31 INFO loss_tracker.py:84 | Epoch[625/NA] Step[99] GlobalStep[85724/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:45:41 INFO stats.py:314 | Epoch[625] Step[124] GlobalStep[85749] Training Speed: 452.45 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:38:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:45:41 INFO loss_tracker.py:84 | Epoch[625/NA] Step[124] GlobalStep[85749/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:45:45 INFO stats.py:394 | Epoch[625] completed. Training Speed: 306.67 samples/sec across all devices. Epoch Time: 57.18 sec. Average Epoch Time: 57.18 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:38:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:45:52 INFO stats.py:314 | Epoch[626] Step[12] GlobalStep[85774] Training Speed: 432.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:38:22. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:45:56 INFO loss_tracker.py:84 | Epoch[626/NA] Step[24] GlobalStep[85786/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:46:02 INFO stats.py:314 | Epoch[626] Step[37] GlobalStep[85799] Training Speed: 426.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:38:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:46:07 INFO loss_tracker.py:84 | Epoch[626/NA] Step[49] GlobalStep[85811/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 22:46:13 INFO stats.py:314 | Epoch[626] Step[62] GlobalStep[85824] Training Speed: 428.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:38:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:46:18 INFO loss_tracker.py:84 | Epoch[626/NA] Step[74] GlobalStep[85836/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 22:46:23 INFO stats.py:314 | Epoch[626] Step[87] GlobalStep[85849] Training Speed: 412.90 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:37:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:46:28 INFO loss_tracker.py:84 | Epoch[626/NA] Step[99] GlobalStep[85861/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:46:34 INFO stats.py:314 | Epoch[626] Step[112] GlobalStep[85874] Training Speed: 425.82 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:37:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:46:38 INFO loss_tracker.py:84 | Epoch[626/NA] Step[124] GlobalStep[85886/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0159] 
Rank[0/16] 06/24/2025 22:46:43 INFO stats.py:394 | Epoch[626] completed. Training Speed: 305.29 samples/sec across all devices. Epoch Time: 57.44 sec. Average Epoch Time: 57.44 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:37:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:46:44 INFO stats.py:314 | Epoch[627] Step[0] GlobalStep[85899] Training Speed: 227.82 samples/sec across all devices. Average Step Time: 0.56 sec. Estimated Remaining Time: 1:37:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:46:54 INFO loss_tracker.py:84 | Epoch[627/NA] Step[24] GlobalStep[85923/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 22:46:54 INFO stats.py:314 | Epoch[627] Step[25] GlobalStep[85924] Training Speed: 384.39 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:37:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:47:04 INFO loss_tracker.py:84 | Epoch[627/NA] Step[49] GlobalStep[85948/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:47:05 INFO stats.py:314 | Epoch[627] Step[50] GlobalStep[85949] Training Speed: 417.32 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:37:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:47:15 INFO loss_tracker.py:84 | Epoch[627/NA] Step[74] GlobalStep[85973/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:47:15 INFO stats.py:314 | Epoch[627] Step[75] GlobalStep[85974] Training Speed: 411.88 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:36:59. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:47:25 INFO loss_tracker.py:84 | Epoch[627/NA] Step[99] GlobalStep[85998/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 22:47:26 INFO stats.py:314 | Epoch[627] Step[100] GlobalStep[85999] Training Speed: 408.73 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:36:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:47:35 INFO loss_tracker.py:84 | Epoch[627/NA] Step[124] GlobalStep[86023/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:47:36 INFO stats.py:314 | Epoch[627] Step[125] GlobalStep[86024] Training Speed: 406.11 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:36:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:47:39 INFO stats.py:394 | Epoch[627] completed. Training Speed: 308.19 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:36:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:47:46 INFO stats.py:314 | Epoch[628] Step[13] GlobalStep[86049] Training Speed: 440.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:36:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:47:50 INFO loss_tracker.py:84 | Epoch[628/NA] Step[24] GlobalStep[86060/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:47:56 INFO stats.py:314 | Epoch[628] Step[38] GlobalStep[86074] Training Speed: 434.58 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:36:18. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:48:01 INFO loss_tracker.py:84 | Epoch[628/NA] Step[49] GlobalStep[86085/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:48:06 INFO stats.py:314 | Epoch[628] Step[63] GlobalStep[86099] Training Speed: 423.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:36:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:48:11 INFO loss_tracker.py:84 | Epoch[628/NA] Step[74] GlobalStep[86110/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:48:17 INFO stats.py:314 | Epoch[628] Step[88] GlobalStep[86124] Training Speed: 427.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:35:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:48:21 INFO loss_tracker.py:84 | Epoch[628/NA] Step[99] GlobalStep[86135/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:48:27 INFO stats.py:314 | Epoch[628] Step[113] GlobalStep[86149] Training Speed: 430.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:35:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:48:31 INFO loss_tracker.py:84 | Epoch[628/NA] Step[124] GlobalStep[86160/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:48:36 INFO stats.py:394 | Epoch[628] completed. Training Speed: 309.47 samples/sec across all devices. Epoch Time: 56.67 sec. Average Epoch Time: 56.67 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:35:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:48:38 INFO stats.py:314 | Epoch[629] Step[1] GlobalStep[86174] Training Speed: 421.24 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:35:36. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:48:48 INFO loss_tracker.py:84 | Epoch[629/NA] Step[24] GlobalStep[86197/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:48:49 INFO stats.py:314 | Epoch[629] Step[26] GlobalStep[86199] Training Speed: 432.13 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:35:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:48:58 INFO loss_tracker.py:84 | Epoch[629/NA] Step[49] GlobalStep[86222/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:48:59 INFO stats.py:314 | Epoch[629] Step[51] GlobalStep[86224] Training Speed: 429.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:35:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:49:08 INFO loss_tracker.py:84 | Epoch[629/NA] Step[74] GlobalStep[86247/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:49:09 INFO stats.py:314 | Epoch[629] Step[76] GlobalStep[86249] Training Speed: 435.21 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:35:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:49:19 INFO loss_tracker.py:84 | Epoch[629/NA] Step[99] GlobalStep[86272/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 22:49:19 INFO stats.py:314 | Epoch[629] Step[101] GlobalStep[86274] Training Speed: 432.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:34:55. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:49:29 INFO loss_tracker.py:84 | Epoch[629/NA] Step[124] GlobalStep[86297/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:49:29 INFO stats.py:314 | Epoch[629] Step[126] GlobalStep[86299] Training Speed: 438.77 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:34:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:49:33 INFO stats.py:394 | Epoch[629] completed. Training Speed: 307.98 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:34:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:49:40 INFO stats.py:314 | Epoch[630] Step[14] GlobalStep[86324] Training Speed: 259.82 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 1:34:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:49:44 INFO loss_tracker.py:84 | Epoch[630/NA] Step[24] GlobalStep[86334/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:49:51 INFO stats.py:314 | Epoch[630] Step[39] GlobalStep[86349] Training Speed: 403.31 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:34:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:49:55 INFO loss_tracker.py:84 | Epoch[630/NA] Step[49] GlobalStep[86359/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:50:01 INFO stats.py:314 | Epoch[630] Step[64] GlobalStep[86374] Training Speed: 409.30 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:34:13. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:50:05 INFO loss_tracker.py:84 | Epoch[630/NA] Step[74] GlobalStep[86384/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 22:50:11 INFO stats.py:314 | Epoch[630] Step[89] GlobalStep[86399] Training Speed: 437.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:34:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:50:15 INFO loss_tracker.py:84 | Epoch[630/NA] Step[99] GlobalStep[86409/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 22:50:22 INFO stats.py:314 | Epoch[630] Step[114] GlobalStep[86424] Training Speed: 425.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:33:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:50:26 INFO loss_tracker.py:84 | Epoch[630/NA] Step[124] GlobalStep[86434/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:50:30 INFO stats.py:394 | Epoch[630] completed. Training Speed: 305.71 samples/sec across all devices. Epoch Time: 57.36 sec. Average Epoch Time: 57.36 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:33:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:50:32 INFO stats.py:314 | Epoch[631] Step[2] GlobalStep[86449] Training Speed: 431.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:33:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:50:42 INFO loss_tracker.py:84 | Epoch[631/NA] Step[24] GlobalStep[86471/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:50:43 INFO stats.py:314 | Epoch[631] Step[27] GlobalStep[86474] Training Speed: 431.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:33:32. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:50:52 INFO loss_tracker.py:84 | Epoch[631/NA] Step[49] GlobalStep[86496/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:50:53 INFO stats.py:314 | Epoch[631] Step[52] GlobalStep[86499] Training Speed: 394.09 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:33:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:51:03 INFO loss_tracker.py:84 | Epoch[631/NA] Step[74] GlobalStep[86521/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:51:04 INFO stats.py:314 | Epoch[631] Step[77] GlobalStep[86524] Training Speed: 429.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:33:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:51:13 INFO loss_tracker.py:84 | Epoch[631/NA] Step[99] GlobalStep[86546/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:51:14 INFO stats.py:314 | Epoch[631] Step[102] GlobalStep[86549] Training Speed: 431.35 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:33:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:51:23 INFO loss_tracker.py:84 | Epoch[631/NA] Step[124] GlobalStep[86571/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0126] total_loss[0.0177] Rank[0/16] 06/24/2025 22:51:24 INFO stats.py:314 | Epoch[631] Step[127] GlobalStep[86574] Training Speed: 451.85 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:32:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:51:27 INFO stats.py:394 | Epoch[631] completed. Training Speed: 307.28 samples/sec across all devices. Epoch Time: 57.07 sec. Average Epoch Time: 57.07 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:32:46. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:51:35 INFO stats.py:314 | Epoch[632] Step[15] GlobalStep[86599] Training Speed: 441.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:32:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:51:39 INFO loss_tracker.py:84 | Epoch[632/NA] Step[24] GlobalStep[86608/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:51:45 INFO stats.py:314 | Epoch[632] Step[40] GlobalStep[86624] Training Speed: 429.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:32:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:51:49 INFO loss_tracker.py:84 | Epoch[632/NA] Step[49] GlobalStep[86633/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 22:51:56 INFO stats.py:314 | Epoch[632] Step[65] GlobalStep[86649] Training Speed: 433.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:32:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:52:00 INFO loss_tracker.py:84 | Epoch[632/NA] Step[74] GlobalStep[86658/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:52:07 INFO stats.py:314 | Epoch[632] Step[90] GlobalStep[86674] Training Speed: 425.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:32:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:52:11 INFO loss_tracker.py:84 | Epoch[632/NA] Step[99] GlobalStep[86683/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:52:17 INFO stats.py:314 | Epoch[632] Step[115] GlobalStep[86699] Training Speed: 434.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:31:58. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:52:21 INFO loss_tracker.py:84 | Epoch[632/NA] Step[124] GlobalStep[86708/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:52:25 INFO stats.py:394 | Epoch[632] completed. Training Speed: 303.13 samples/sec across all devices. Epoch Time: 57.85 sec. Average Epoch Time: 57.85 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:31:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:52:28 INFO stats.py:314 | Epoch[633] Step[3] GlobalStep[86724] Training Speed: 439.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:31:48. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:52:37 INFO loss_tracker.py:84 | Epoch[633/NA] Step[24] GlobalStep[86745/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 22:52:38 INFO stats.py:314 | Epoch[633] Step[28] GlobalStep[86749] Training Speed: 432.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:31:38. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:52:47 INFO loss_tracker.py:84 | Epoch[633/NA] Step[49] GlobalStep[86770/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0164] Rank[0/16] 06/24/2025 22:52:49 INFO stats.py:314 | Epoch[633] Step[53] GlobalStep[86774] Training Speed: 408.21 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:31:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:52:57 INFO loss_tracker.py:84 | Epoch[633/NA] Step[74] GlobalStep[86795/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:52:58 INFO stats.py:314 | Epoch[633] Step[78] GlobalStep[86799] Training Speed: 420.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:31:17. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:53:08 INFO loss_tracker.py:84 | Epoch[633/NA] Step[99] GlobalStep[86820/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:53:09 INFO stats.py:314 | Epoch[633] Step[103] GlobalStep[86824] Training Speed: 424.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:31:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:53:17 INFO loss_tracker.py:84 | Epoch[633/NA] Step[124] GlobalStep[86845/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 22:53:19 INFO stats.py:314 | Epoch[633] Step[128] GlobalStep[86849] Training Speed: 433.17 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:30:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:53:22 INFO stats.py:394 | Epoch[633] completed. Training Speed: 310.67 samples/sec across all devices. Epoch Time: 56.45 sec. Average Epoch Time: 56.45 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:30:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:53:30 INFO stats.py:314 | Epoch[634] Step[16] GlobalStep[86874] Training Speed: 430.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:30:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:53:33 INFO loss_tracker.py:84 | Epoch[634/NA] Step[24] GlobalStep[86882/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0174] Rank[0/16] 06/24/2025 22:53:40 INFO stats.py:314 | Epoch[634] Step[41] GlobalStep[86899] Training Speed: 440.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:30:35. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:53:43 INFO loss_tracker.py:84 | Epoch[634/NA] Step[49] GlobalStep[86907/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:53:50 INFO stats.py:314 | Epoch[634] Step[66] GlobalStep[86924] Training Speed: 437.71 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:30:25. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:53:53 INFO loss_tracker.py:84 | Epoch[634/NA] Step[74] GlobalStep[86932/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 22:54:00 INFO stats.py:314 | Epoch[634] Step[91] GlobalStep[86949] Training Speed: 259.16 samples/sec across all devices. Average Step Time: 0.49 sec. Estimated Remaining Time: 1:30:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:54:03 INFO loss_tracker.py:84 | Epoch[634/NA] Step[99] GlobalStep[86957/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:54:10 INFO stats.py:314 | Epoch[634] Step[116] GlobalStep[86974] Training Speed: 434.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:30:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:54:13 INFO loss_tracker.py:84 | Epoch[634/NA] Step[124] GlobalStep[86982/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0050] loss_depth[0.0126] total_loss[0.0176] Rank[0/16] 06/24/2025 22:54:18 INFO stats.py:394 | Epoch[634] completed. Training Speed: 313.51 samples/sec across all devices. Epoch Time: 55.93 sec. Average Epoch Time: 55.93 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:29:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:54:21 INFO stats.py:314 | Epoch[635] Step[4] GlobalStep[86999] Training Speed: 430.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:29:54. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:54:30 INFO loss_tracker.py:84 | Epoch[635/NA] Step[24] GlobalStep[87019/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:54:32 INFO stats.py:314 | Epoch[635] Step[29] GlobalStep[87024] Training Speed: 431.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:29:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:54:40 INFO loss_tracker.py:84 | Epoch[635/NA] Step[49] GlobalStep[87044/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:54:42 INFO stats.py:314 | Epoch[635] Step[54] GlobalStep[87049] Training Speed: 401.35 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:29:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:54:50 INFO loss_tracker.py:84 | Epoch[635/NA] Step[74] GlobalStep[87069/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 22:54:52 INFO stats.py:314 | Epoch[635] Step[79] GlobalStep[87074] Training Speed: 435.01 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:29:23. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:55:00 INFO loss_tracker.py:84 | Epoch[635/NA] Step[99] GlobalStep[87094/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0126] total_loss[0.0171] Rank[0/16] 06/24/2025 22:55:02 INFO stats.py:314 | Epoch[635] Step[104] GlobalStep[87099] Training Speed: 423.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:29:12. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:55:11 INFO loss_tracker.py:84 | Epoch[635/NA] Step[124] GlobalStep[87119/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 22:55:12 INFO stats.py:314 | Epoch[635] Step[129] GlobalStep[87124] Training Speed: 435.40 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:29:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:55:15 INFO stats.py:394 | Epoch[635] completed. Training Speed: 305.98 samples/sec across all devices. Epoch Time: 57.31 sec. Average Epoch Time: 57.31 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:28:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:55:24 INFO stats.py:314 | Epoch[636] Step[17] GlobalStep[87149] Training Speed: 427.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:28:52. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:55:27 INFO loss_tracker.py:84 | Epoch[636/NA] Step[24] GlobalStep[87156/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:55:34 INFO stats.py:314 | Epoch[636] Step[42] GlobalStep[87174] Training Speed: 421.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:28:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:55:37 INFO loss_tracker.py:84 | Epoch[636/NA] Step[49] GlobalStep[87181/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 22:55:45 INFO stats.py:314 | Epoch[636] Step[67] GlobalStep[87199] Training Speed: 435.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:28:31. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:55:48 INFO loss_tracker.py:84 | Epoch[636/NA] Step[74] GlobalStep[87206/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:55:55 INFO stats.py:314 | Epoch[636] Step[92] GlobalStep[87224] Training Speed: 435.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:28:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:55:58 INFO loss_tracker.py:84 | Epoch[636/NA] Step[99] GlobalStep[87231/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:56:05 INFO stats.py:314 | Epoch[636] Step[117] GlobalStep[87249] Training Speed: 436.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:28:10. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:56:08 INFO loss_tracker.py:84 | Epoch[636/NA] Step[124] GlobalStep[87256/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:56:12 INFO stats.py:394 | Epoch[636] completed. Training Speed: 307.84 samples/sec across all devices. Epoch Time: 56.97 sec. Average Epoch Time: 56.97 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:28:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:56:15 INFO stats.py:314 | Epoch[637] Step[5] GlobalStep[87274] Training Speed: 427.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:28:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:56:23 INFO loss_tracker.py:84 | Epoch[637/NA] Step[24] GlobalStep[87293/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:56:26 INFO stats.py:314 | Epoch[637] Step[30] GlobalStep[87299] Training Speed: 422.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:27:49. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:56:34 INFO loss_tracker.py:84 | Epoch[637/NA] Step[49] GlobalStep[87318/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0025] loss_depth[0.0125] total_loss[0.0150] Rank[0/16] 06/24/2025 22:56:36 INFO stats.py:314 | Epoch[637] Step[55] GlobalStep[87324] Training Speed: 428.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:27:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:56:44 INFO loss_tracker.py:84 | Epoch[637/NA] Step[74] GlobalStep[87343/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 22:56:47 INFO stats.py:314 | Epoch[637] Step[80] GlobalStep[87349] Training Speed: 436.87 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:27:29. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:56:54 INFO loss_tracker.py:84 | Epoch[637/NA] Step[99] GlobalStep[87368/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:56:57 INFO stats.py:314 | Epoch[637] Step[105] GlobalStep[87374] Training Speed: 444.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:27:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:57:04 INFO loss_tracker.py:84 | Epoch[637/NA] Step[124] GlobalStep[87393/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:57:06 INFO stats.py:314 | Epoch[637] Step[130] GlobalStep[87399] Training Speed: 453.37 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:27:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:57:09 INFO stats.py:394 | Epoch[637] completed. Training Speed: 307.66 samples/sec across all devices. Epoch Time: 57.00 sec. Average Epoch Time: 57.00 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:27:05. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:57:17 INFO stats.py:314 | Epoch[638] Step[18] GlobalStep[87424] Training Speed: 434.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:26:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:57:19 INFO loss_tracker.py:84 | Epoch[638/NA] Step[24] GlobalStep[87430/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 22:57:27 INFO stats.py:314 | Epoch[638] Step[43] GlobalStep[87449] Training Speed: 426.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:26:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:57:30 INFO loss_tracker.py:84 | Epoch[638/NA] Step[49] GlobalStep[87455/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 22:57:37 INFO stats.py:314 | Epoch[638] Step[68] GlobalStep[87474] Training Speed: 427.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:26:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:57:40 INFO loss_tracker.py:84 | Epoch[638/NA] Step[74] GlobalStep[87480/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:57:48 INFO stats.py:314 | Epoch[638] Step[93] GlobalStep[87499] Training Speed: 393.68 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:26:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:57:51 INFO loss_tracker.py:84 | Epoch[638/NA] Step[99] GlobalStep[87505/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 22:57:58 INFO stats.py:314 | Epoch[638] Step[118] GlobalStep[87524] Training Speed: 435.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:26:16. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:58:00 INFO loss_tracker.py:84 | Epoch[638/NA] Step[124] GlobalStep[87530/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 22:58:05 INFO stats.py:394 | Epoch[638] completed. Training Speed: 314.20 samples/sec across all devices. Epoch Time: 55.81 sec. Average Epoch Time: 55.81 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:26:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:58:09 INFO stats.py:314 | Epoch[639] Step[6] GlobalStep[87549] Training Speed: 440.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:26:06. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:58:16 INFO loss_tracker.py:84 | Epoch[639/NA] Step[24] GlobalStep[87567/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0125] total_loss[0.0158] Rank[0/16] 06/24/2025 22:58:19 INFO stats.py:314 | Epoch[639] Step[31] GlobalStep[87574] Training Speed: 439.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:25:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:58:26 INFO loss_tracker.py:84 | Epoch[639/NA] Step[49] GlobalStep[87592/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 22:58:29 INFO stats.py:314 | Epoch[639] Step[56] GlobalStep[87599] Training Speed: 389.76 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:25:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:58:36 INFO loss_tracker.py:84 | Epoch[639/NA] Step[74] GlobalStep[87617/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:58:40 INFO stats.py:314 | Epoch[639] Step[81] GlobalStep[87624] Training Speed: 413.17 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:25:34. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:58:47 INFO loss_tracker.py:84 | Epoch[639/NA] Step[99] GlobalStep[87642/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 22:58:50 INFO stats.py:314 | Epoch[639] Step[106] GlobalStep[87649] Training Speed: 426.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:25:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:58:57 INFO loss_tracker.py:84 | Epoch[639/NA] Step[124] GlobalStep[87667/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 22:58:59 INFO stats.py:314 | Epoch[639] Step[131] GlobalStep[87674] Training Speed: 436.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:25:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:59:02 INFO stats.py:394 | Epoch[639] completed. Training Speed: 309.03 samples/sec across all devices. Epoch Time: 56.74 sec. Average Epoch Time: 56.74 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:25:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:59:10 INFO stats.py:314 | Epoch[640] Step[19] GlobalStep[87699] Training Speed: 433.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:25:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:59:13 INFO loss_tracker.py:84 | Epoch[640/NA] Step[24] GlobalStep[87704/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:59:21 INFO stats.py:314 | Epoch[640] Step[44] GlobalStep[87724] Training Speed: 438.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:24:53. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 22:59:23 INFO loss_tracker.py:84 | Epoch[640/NA] Step[49] GlobalStep[87729/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 22:59:31 INFO stats.py:314 | Epoch[640] Step[69] GlobalStep[87749] Training Speed: 434.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:24:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:59:33 INFO loss_tracker.py:84 | Epoch[640/NA] Step[74] GlobalStep[87754/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 22:59:41 INFO stats.py:314 | Epoch[640] Step[94] GlobalStep[87774] Training Speed: 433.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:24:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:59:43 INFO loss_tracker.py:84 | Epoch[640/NA] Step[99] GlobalStep[87779/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:59:51 INFO stats.py:314 | Epoch[640] Step[119] GlobalStep[87799] Training Speed: 432.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:24:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 22:59:53 INFO loss_tracker.py:84 | Epoch[640/NA] Step[124] GlobalStep[87804/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 22:59:58 INFO stats.py:394 | Epoch[640] completed. Training Speed: 312.98 samples/sec across all devices. Epoch Time: 56.03 sec. Average Epoch Time: 56.03 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:24:15. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:00:02 INFO stats.py:314 | Epoch[641] Step[7] GlobalStep[87824] Training Speed: 439.69 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:24:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:00:09 INFO loss_tracker.py:84 | Epoch[641/NA] Step[24] GlobalStep[87841/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 23:00:12 INFO stats.py:314 | Epoch[641] Step[32] GlobalStep[87849] Training Speed: 432.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:24:01. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:00:19 INFO loss_tracker.py:84 | Epoch[641/NA] Step[49] GlobalStep[87866/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:00:22 INFO stats.py:314 | Epoch[641] Step[57] GlobalStep[87874] Training Speed: 433.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:23:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:00:29 INFO loss_tracker.py:84 | Epoch[641/NA] Step[74] GlobalStep[87891/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:00:33 INFO stats.py:314 | Epoch[641] Step[82] GlobalStep[87899] Training Speed: 438.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:23:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:00:40 INFO loss_tracker.py:84 | Epoch[641/NA] Step[99] GlobalStep[87916/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0125] total_loss[0.0158] Rank[0/16] 06/24/2025 23:00:43 INFO stats.py:314 | Epoch[641] Step[107] GlobalStep[87924] Training Speed: 425.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:23:30. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:00:50 INFO loss_tracker.py:84 | Epoch[641/NA] Step[124] GlobalStep[87941/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0170] Rank[0/16] 06/24/2025 23:00:53 INFO stats.py:314 | Epoch[641] Step[132] GlobalStep[87949] Training Speed: 441.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:23:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:00:54 INFO stats.py:394 | Epoch[641] completed. Training Speed: 311.09 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:23:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:01:04 INFO stats.py:314 | Epoch[642] Step[20] GlobalStep[87974] Training Speed: 433.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:23:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:01:06 INFO loss_tracker.py:84 | Epoch[642/NA] Step[24] GlobalStep[87978/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0126] total_loss[0.0167] Rank[0/16] 06/24/2025 23:01:14 INFO stats.py:314 | Epoch[642] Step[45] GlobalStep[87999] Training Speed: 434.88 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:22:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:01:15 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 23:01:15 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_21 Rank[1/16] 06/24/2025 23:01:15 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[3/16] 06/24/2025 23:01:15 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[2/16] 06/24/2025 23:01:15 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[7/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[5/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[6/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[4/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[9/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[8/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[13/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[14/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[11/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[12/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[15/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21
Rank[10/16] 06/24/2025 23:01:16 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[0/16] 06/24/2025 23:01:16 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_21/model.safetensors Rank[0/16] 06/24/2025 23:01:17 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_21/optimizer.bin Rank[0/16] 06/24/2025 23:01:17 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_21/scheduler.bin Rank[0/16] 06/24/2025 23:01:17 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_21/sampler.bin Rank[0/16] 06/24/2025 23:01:17 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_21/random_states_0.pkl Rank[0/16] 06/24/2025 23:01:17 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_21/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 23:01:17 INFO checkpoint.py:110 | Save checkpoint at the end of step 87999 to /job_data/checkpoints/checkpoint_21 Rank[0/16] 06/24/2025 23:01:19 INFO loss_tracker.py:84 | Epoch[642/NA] Step[49] GlobalStep[88003/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 23:01:27 INFO stats.py:314 | Epoch[642] Step[70] GlobalStep[88024] Training Speed: 436.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:22:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:01:29 INFO loss_tracker.py:84 | Epoch[642/NA] Step[74] GlobalStep[88028/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 23:01:37 INFO stats.py:314 | Epoch[642] Step[95] GlobalStep[88049] Training Speed: 425.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:22:38. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 23:01:39 INFO loss_tracker.py:84 | Epoch[642/NA] Step[99] GlobalStep[88053/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0051] loss_depth[0.0125] total_loss[0.0177] Rank[0/16] 06/24/2025 23:01:48 INFO stats.py:314 | Epoch[642] Step[120] GlobalStep[88074] Training Speed: 450.82 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:22:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:01:49 INFO loss_tracker.py:84 | Epoch[642/NA] Step[124] GlobalStep[88078/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 23:01:54 INFO stats.py:394 | Epoch[642] completed. Training Speed: 292.82 samples/sec across all devices. Epoch Time: 59.89 sec. Average Epoch Time: 59.89 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 1:22:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:01:59 INFO stats.py:314 | Epoch[643] Step[8] GlobalStep[88099] Training Speed: 440.42 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:22:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:02:05 INFO loss_tracker.py:84 | Epoch[643/NA] Step[24] GlobalStep[88115/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 23:02:09 INFO stats.py:314 | Epoch[643] Step[33] GlobalStep[88124] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:22:07. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:02:16 INFO loss_tracker.py:84 | Epoch[643/NA] Step[49] GlobalStep[88140/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 23:02:20 INFO stats.py:314 | Epoch[643] Step[58] GlobalStep[88149] Training Speed: 430.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:21:57. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 23:02:26 INFO loss_tracker.py:84 | Epoch[643/NA] Step[74] GlobalStep[88165/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 23:02:30 INFO stats.py:314 | Epoch[643] Step[83] GlobalStep[88174] Training Speed: 432.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:21:47. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:02:37 INFO loss_tracker.py:84 | Epoch[643/NA] Step[99] GlobalStep[88190/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 23:02:40 INFO stats.py:314 | Epoch[643] Step[108] GlobalStep[88199] Training Speed: 428.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:21:36. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:02:47 INFO loss_tracker.py:84 | Epoch[643/NA] Step[124] GlobalStep[88215/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0033] loss_depth[0.0125] total_loss[0.0159] Rank[0/16] 06/24/2025 23:02:50 INFO stats.py:314 | Epoch[643] Step[133] GlobalStep[88224] Training Speed: 453.44 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:21:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:02:51 INFO stats.py:394 | Epoch[643] completed. Training Speed: 306.97 samples/sec across all devices. Epoch Time: 57.13 sec. Average Epoch Time: 57.13 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:21:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:01 INFO stats.py:314 | Epoch[644] Step[21] GlobalStep[88249] Training Speed: 416.74 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:21:15. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:03:03 INFO loss_tracker.py:84 | Epoch[644/NA] Step[24] GlobalStep[88252/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 23:03:12 INFO stats.py:314 | Epoch[644] Step[46] GlobalStep[88274] Training Speed: 430.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:21:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:13 INFO loss_tracker.py:84 | Epoch[644/NA] Step[49] GlobalStep[88277/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 23:03:22 INFO stats.py:314 | Epoch[644] Step[71] GlobalStep[88299] Training Speed: 430.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:20:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:24 INFO loss_tracker.py:84 | Epoch[644/NA] Step[74] GlobalStep[88302/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:03:33 INFO stats.py:314 | Epoch[644] Step[96] GlobalStep[88324] Training Speed: 434.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:20:44. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:34 INFO loss_tracker.py:84 | Epoch[644/NA] Step[99] GlobalStep[88327/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 23:03:43 INFO stats.py:314 | Epoch[644] Step[121] GlobalStep[88349] Training Speed: 454.86 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:20:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:44 INFO loss_tracker.py:84 | Epoch[644/NA] Step[124] GlobalStep[88352/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166]
Rank[0/16] 06/24/2025 23:03:48 INFO stats.py:394 | Epoch[644] completed. Training Speed: 306.32 samples/sec across all devices. Epoch Time: 57.25 sec. Average Epoch Time: 57.25 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:20:28. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:53 INFO stats.py:314 | Epoch[645] Step[9] GlobalStep[88374] Training Speed: 404.72 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:20:24. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:03:59 INFO loss_tracker.py:84 | Epoch[645/NA] Step[24] GlobalStep[88389/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0125] total_loss[0.0173] Rank[0/16] 06/24/2025 23:04:03 INFO stats.py:314 | Epoch[645] Step[34] GlobalStep[88399] Training Speed: 431.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:20:13. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:04:10 INFO loss_tracker.py:84 | Epoch[645/NA] Step[49] GlobalStep[88414/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 23:04:13 INFO stats.py:314 | Epoch[645] Step[59] GlobalStep[88424] Training Speed: 424.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:20:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:04:20 INFO loss_tracker.py:84 | Epoch[645/NA] Step[74] GlobalStep[88439/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0126] total_loss[0.0162] Rank[0/16] 06/24/2025 23:04:24 INFO stats.py:314 | Epoch[645] Step[84] GlobalStep[88449] Training Speed: 389.75 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:19:52. Learning Rate: 1.00000e-04.
Rank[0/16] 06/24/2025 23:04:30 INFO loss_tracker.py:84 | Epoch[645/NA] Step[99] GlobalStep[88464/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0032] loss_depth[0.0126] total_loss[0.0157] Rank[0/16] 06/24/2025 23:04:34 INFO stats.py:314 | Epoch[645] Step[109] GlobalStep[88474] Training Speed: 432.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:19:42. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:04:40 INFO loss_tracker.py:84 | Epoch[645/NA] Step[124] GlobalStep[88489/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 23:04:44 INFO stats.py:314 | Epoch[645] Step[134] GlobalStep[88499] Training Speed: 444.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:19:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:04:44 INFO stats.py:394 | Epoch[645] completed. Training Speed: 311.72 samples/sec across all devices. Epoch Time: 56.26 sec. Average Epoch Time: 56.26 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:19:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:04:55 INFO stats.py:314 | Epoch[646] Step[22] GlobalStep[88524] Training Speed: 431.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:19:21. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:04:56 INFO loss_tracker.py:84 | Epoch[646/NA] Step[24] GlobalStep[88526/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 23:05:05 INFO stats.py:314 | Epoch[646] Step[47] GlobalStep[88549] Training Speed: 438.29 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:19:11. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:05:06 INFO loss_tracker.py:84 | Epoch[646/NA] Step[49] GlobalStep[88551/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0171] Rank[0/16] 06/24/2025 23:05:15 INFO stats.py:314 | Epoch[646] Step[72] GlobalStep[88574] Training Speed: 442.03 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:19:00. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:05:16 INFO loss_tracker.py:84 | Epoch[646/NA] Step[74] GlobalStep[88576/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 23:05:25 INFO stats.py:314 | Epoch[646] Step[97] GlobalStep[88599] Training Speed: 423.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:18:50. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:05:27 INFO loss_tracker.py:84 | Epoch[646/NA] Step[99] GlobalStep[88601/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:05:36 INFO stats.py:314 | Epoch[646] Step[122] GlobalStep[88624] Training Speed: 416.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:18:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:05:37 INFO loss_tracker.py:84 | Epoch[646/NA] Step[124] GlobalStep[88626/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 23:05:41 INFO stats.py:394 | Epoch[646] completed. Training Speed: 308.24 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:18:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:05:47 INFO stats.py:314 | Epoch[647] Step[10] GlobalStep[88649] Training Speed: 433.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:18:29. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:05:53 INFO loss_tracker.py:84 | Epoch[647/NA] Step[24] GlobalStep[88663/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 23:05:57 INFO stats.py:314 | Epoch[647] Step[35] GlobalStep[88674] Training Speed: 429.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:18:19. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:03 INFO loss_tracker.py:84 | Epoch[647/NA] Step[49] GlobalStep[88688/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 23:06:07 INFO stats.py:314 | Epoch[647] Step[60] GlobalStep[88699] Training Speed: 428.88 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:18:09. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:13 INFO loss_tracker.py:84 | Epoch[647/NA] Step[74] GlobalStep[88713/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 23:06:17 INFO stats.py:314 | Epoch[647] Step[85] GlobalStep[88724] Training Speed: 437.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:17:58. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:23 INFO loss_tracker.py:84 | Epoch[647/NA] Step[99] GlobalStep[88738/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 23:06:27 INFO stats.py:314 | Epoch[647] Step[110] GlobalStep[88749] Training Speed: 423.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:17:48. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:06:33 INFO loss_tracker.py:84 | Epoch[647/NA] Step[124] GlobalStep[88763/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0125] total_loss[0.0168] Rank[0/16] 06/24/2025 23:06:37 INFO stats.py:314 | Epoch[647] Step[135] GlobalStep[88774] Training Speed: 441.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:17:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:37 INFO stats.py:394 | Epoch[647] completed. Training Speed: 314.12 samples/sec across all devices. Epoch Time: 55.83 sec. Average Epoch Time: 55.83 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:17:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:48 INFO stats.py:314 | Epoch[648] Step[23] GlobalStep[88799] Training Speed: 432.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:17:27. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:49 INFO loss_tracker.py:84 | Epoch[648/NA] Step[24] GlobalStep[88800/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 23:06:59 INFO stats.py:314 | Epoch[648] Step[48] GlobalStep[88824] Training Speed: 393.47 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:17:17. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:06:59 INFO loss_tracker.py:84 | Epoch[648/NA] Step[49] GlobalStep[88825/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 23:07:09 INFO stats.py:314 | Epoch[648] Step[73] GlobalStep[88849] Training Speed: 439.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:17:06. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:07:10 INFO loss_tracker.py:84 | Epoch[648/NA] Step[74] GlobalStep[88850/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 23:07:19 INFO stats.py:314 | Epoch[648] Step[98] GlobalStep[88874] Training Speed: 430.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:16:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:07:20 INFO loss_tracker.py:84 | Epoch[648/NA] Step[99] GlobalStep[88875/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 23:07:30 INFO stats.py:314 | Epoch[648] Step[123] GlobalStep[88899] Training Speed: 452.66 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:16:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:07:30 INFO loss_tracker.py:84 | Epoch[648/NA] Step[124] GlobalStep[88900/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:07:34 INFO stats.py:394 | Epoch[648] completed. Training Speed: 306.15 samples/sec across all devices. Epoch Time: 57.28 sec. Average Epoch Time: 57.28 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:16:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:07:40 INFO stats.py:314 | Epoch[649] Step[11] GlobalStep[88924] Training Speed: 403.32 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:16:35. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:07:45 INFO loss_tracker.py:84 | Epoch[649/NA] Step[24] GlobalStep[88937/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:07:50 INFO stats.py:314 | Epoch[649] Step[36] GlobalStep[88949] Training Speed: 420.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:16:25. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:07:56 INFO loss_tracker.py:84 | Epoch[649/NA] Step[49] GlobalStep[88962/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0046] loss_depth[0.0125] total_loss[0.0172] Rank[0/16] 06/24/2025 23:08:01 INFO stats.py:314 | Epoch[649] Step[61] GlobalStep[88974] Training Speed: 427.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:16:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:08:06 INFO loss_tracker.py:84 | Epoch[649/NA] Step[74] GlobalStep[88987/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 23:08:11 INFO stats.py:314 | Epoch[649] Step[86] GlobalStep[88999] Training Speed: 433.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:16:04. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:08:16 INFO loss_tracker.py:84 | Epoch[649/NA] Step[99] GlobalStep[89012/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0041] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 23:08:21 INFO stats.py:314 | Epoch[649] Step[111] GlobalStep[89024] Training Speed: 436.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:15:54. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:08:26 INFO loss_tracker.py:84 | Epoch[649/NA] Step[124] GlobalStep[89037/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0126] total_loss[0.0165] Rank[0/16] 06/24/2025 23:08:31 INFO stats.py:314 | Epoch[649] Step[136] GlobalStep[89049] Training Speed: 451.26 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:15:43. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:08:31 INFO stats.py:394 | Epoch[649] completed. Training Speed: 312.42 samples/sec across all devices. Epoch Time: 56.13 sec. Average Epoch Time: 56.13 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:15:43. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:08:42 INFO stats.py:314 | Epoch[650] Step[24] GlobalStep[89074] Training Speed: 411.45 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:15:33. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:08:42 INFO loss_tracker.py:84 | Epoch[650/NA] Step[24] GlobalStep[89074/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 23:08:52 INFO stats.py:314 | Epoch[650] Step[49] GlobalStep[89099] Training Speed: 425.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:15:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:08:52 INFO loss_tracker.py:84 | Epoch[650/NA] Step[49] GlobalStep[89099/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:09:03 INFO stats.py:314 | Epoch[650] Step[74] GlobalStep[89124] Training Speed: 426.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:15:12. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:09:03 INFO loss_tracker.py:84 | Epoch[650/NA] Step[74] GlobalStep[89124/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:09:13 INFO stats.py:314 | Epoch[650] Step[99] GlobalStep[89149] Training Speed: 400.64 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:15:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:09:13 INFO loss_tracker.py:84 | Epoch[650/NA] Step[99] GlobalStep[89149/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0048] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 23:09:23 INFO stats.py:314 | Epoch[650] Step[124] GlobalStep[89174] Training Speed: 453.28 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:14:51. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:09:23 INFO loss_tracker.py:84 | Epoch[650/NA] Step[124] GlobalStep[89174/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0126] total_loss[0.0174] Rank[0/16] 06/24/2025 23:09:28 INFO stats.py:394 | Epoch[650] completed. Training Speed: 307.98 samples/sec across all devices. Epoch Time: 56.94 sec. Average Epoch Time: 56.94 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:14:46. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:09:34 INFO stats.py:314 | Epoch[651] Step[12] GlobalStep[89199] Training Speed: 432.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:14:41. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:09:39 INFO loss_tracker.py:84 | Epoch[651/NA] Step[24] GlobalStep[89211/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0169] Rank[0/16] 06/24/2025 23:09:45 INFO stats.py:314 | Epoch[651] Step[37] GlobalStep[89224] Training Speed: 432.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:14:31. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:09:49 INFO loss_tracker.py:84 | Epoch[651/NA] Step[49] GlobalStep[89236/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0047] loss_depth[0.0126] total_loss[0.0173] Rank[0/16] 06/24/2025 23:09:55 INFO stats.py:314 | Epoch[651] Step[62] GlobalStep[89249] Training Speed: 386.28 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 1:14:20. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:10:00 INFO loss_tracker.py:84 | Epoch[651/NA] Step[74] GlobalStep[89261/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0169] Rank[0/16] 06/24/2025 23:10:05 INFO stats.py:314 | Epoch[651] Step[87] GlobalStep[89274] Training Speed: 427.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:14:10. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:10:10 INFO loss_tracker.py:84 | Epoch[651/NA] Step[99] GlobalStep[89286/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0049] loss_depth[0.0125] total_loss[0.0175] Rank[0/16] 06/24/2025 23:10:15 INFO stats.py:314 | Epoch[651] Step[112] GlobalStep[89299] Training Speed: 430.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:13:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:10:20 INFO loss_tracker.py:84 | Epoch[651/NA] Step[124] GlobalStep[89311/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:10:25 INFO stats.py:394 | Epoch[651] completed. Training Speed: 306.64 samples/sec across all devices. Epoch Time: 57.19 sec. Average Epoch Time: 57.19 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:13:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:10:26 INFO stats.py:314 | Epoch[652] Step[0] GlobalStep[89324] Training Speed: 356.88 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 1:13:49. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:10:36 INFO loss_tracker.py:84 | Epoch[652/NA] Step[24] GlobalStep[89348/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 23:10:36 INFO stats.py:314 | Epoch[652] Step[25] GlobalStep[89349] Training Speed: 398.77 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:13:39. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:10:47 INFO loss_tracker.py:84 | Epoch[652/NA] Step[49] GlobalStep[89373/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 23:10:47 INFO stats.py:314 | Epoch[652] Step[50] GlobalStep[89374] Training Speed: 424.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:13:28. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:10:57 INFO loss_tracker.py:84 | Epoch[652/NA] Step[74] GlobalStep[89398/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:10:57 INFO stats.py:314 | Epoch[652] Step[75] GlobalStep[89399] Training Speed: 425.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:13:18. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:11:07 INFO loss_tracker.py:84 | Epoch[652/NA] Step[99] GlobalStep[89423/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:11:07 INFO stats.py:314 | Epoch[652] Step[100] GlobalStep[89424] Training Speed: 400.73 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:13:08. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:11:18 INFO loss_tracker.py:84 | Epoch[652/NA] Step[124] GlobalStep[89448/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0043] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 23:11:18 INFO stats.py:314 | Epoch[652] Step[125] GlobalStep[89449] Training Speed: 414.40 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:12:57. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:11:22 INFO stats.py:394 | Epoch[652] completed. Training Speed: 306.67 samples/sec across all devices. Epoch Time: 57.18 sec. Average Epoch Time: 57.18 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:12:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:11:29 INFO stats.py:314 | Epoch[653] Step[13] GlobalStep[89474] Training Speed: 429.65 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:12:47. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:11:33 INFO loss_tracker.py:84 | Epoch[653/NA] Step[24] GlobalStep[89485/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0036] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 23:11:39 INFO stats.py:314 | Epoch[653] Step[38] GlobalStep[89499] Training Speed: 434.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:12:37. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:11:44 INFO loss_tracker.py:84 | Epoch[653/NA] Step[49] GlobalStep[89510/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 23:11:49 INFO stats.py:314 | Epoch[653] Step[63] GlobalStep[89524] Training Speed: 438.14 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:12:26. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:11:54 INFO loss_tracker.py:84 | Epoch[653/NA] Step[74] GlobalStep[89535/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0126] total_loss[0.0168] Rank[0/16] 06/24/2025 23:12:00 INFO stats.py:314 | Epoch[653] Step[88] GlobalStep[89549] Training Speed: 428.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:12:16. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:12:04 INFO loss_tracker.py:84 | Epoch[653/NA] Step[99] GlobalStep[89560/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:12:09 INFO stats.py:314 | Epoch[653] Step[113] GlobalStep[89574] Training Speed: 429.53 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:12:05. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:12:14 INFO loss_tracker.py:84 | Epoch[653/NA] Step[124] GlobalStep[89585/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0125] total_loss[0.0163] Rank[0/16] 06/24/2025 23:12:19 INFO stats.py:394 | Epoch[653] completed. 
Training Speed: 307.76 samples/sec across all devices. Epoch Time: 56.98 sec. Average Epoch Time: 56.98 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:11:56. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:12:21 INFO stats.py:314 | Epoch[654] Step[1] GlobalStep[89599] Training Speed: 411.76 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:11:55. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:12:30 INFO loss_tracker.py:84 | Epoch[654/NA] Step[24] GlobalStep[89622/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0045] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 23:12:31 INFO stats.py:314 | Epoch[654] Step[26] GlobalStep[89624] Training Speed: 422.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:11:45. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:12:41 INFO loss_tracker.py:84 | Epoch[654/NA] Step[49] GlobalStep[89647/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0037] loss_depth[0.0125] total_loss[0.0162] Rank[0/16] 06/24/2025 23:12:42 INFO stats.py:314 | Epoch[654] Step[51] GlobalStep[89649] Training Speed: 420.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:11:34. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:12:51 INFO loss_tracker.py:84 | Epoch[654/NA] Step[74] GlobalStep[89672/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0034] loss_depth[0.0125] total_loss[0.0160] Rank[0/16] 06/24/2025 23:12:52 INFO stats.py:314 | Epoch[654] Step[76] GlobalStep[89674] Training Speed: 431.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:11:24. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:13:01 INFO loss_tracker.py:84 | Epoch[654/NA] Step[99] GlobalStep[89697/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 23:13:02 INFO stats.py:314 | Epoch[654] Step[101] GlobalStep[89699] Training Speed: 430.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:11:14. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:13:10 INFO loss_tracker.py:84 | Epoch[654/NA] Step[124] GlobalStep[89722/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0044] loss_depth[0.0125] total_loss[0.0170] Rank[0/16] 06/24/2025 23:13:11 INFO stats.py:314 | Epoch[654] Step[126] GlobalStep[89724] Training Speed: 450.76 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:11:03. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:13:15 INFO stats.py:394 | Epoch[654] completed. Training Speed: 313.53 samples/sec across all devices. Epoch Time: 55.93 sec. Average Epoch Time: 55.93 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:10:59. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:13:22 INFO stats.py:314 | Epoch[655] Step[14] GlobalStep[89749] Training Speed: 426.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:10:53. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:13:26 INFO loss_tracker.py:84 | Epoch[655/NA] Step[24] GlobalStep[89759/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0126] total_loss[0.0166] Rank[0/16] 06/24/2025 23:13:33 INFO stats.py:314 | Epoch[655] Step[39] GlobalStep[89774] Training Speed: 436.95 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:10:42. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:13:37 INFO loss_tracker.py:84 | Epoch[655/NA] Step[49] GlobalStep[89784/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0055] loss_depth[0.0126] total_loss[0.0181] Rank[0/16] 06/24/2025 23:13:43 INFO stats.py:314 | Epoch[655] Step[64] GlobalStep[89799] Training Speed: 436.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:10:32. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:13:47 INFO loss_tracker.py:84 | Epoch[655/NA] Step[74] GlobalStep[89809/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0165] Rank[0/16] 06/24/2025 23:13:54 INFO stats.py:314 | Epoch[655] Step[89] GlobalStep[89824] Training Speed: 435.19 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:10:22. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:13:58 INFO loss_tracker.py:84 | Epoch[655/NA] Step[99] GlobalStep[89834/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0035] loss_depth[0.0125] total_loss[0.0161] Rank[0/16] 06/24/2025 23:14:04 INFO stats.py:314 | Epoch[655] Step[114] GlobalStep[89849] Training Speed: 437.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:10:11. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:14:08 INFO loss_tracker.py:84 | Epoch[655/NA] Step[124] GlobalStep[89859/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0039] loss_depth[0.0125] total_loss[0.0164] Rank[0/16] 06/24/2025 23:14:12 INFO stats.py:394 | Epoch[655] completed. Training Speed: 304.91 samples/sec across all devices. Epoch Time: 57.51 sec. Average Epoch Time: 57.51 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:10:02. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:14:14 INFO stats.py:314 | Epoch[656] Step[2] GlobalStep[89874] Training Speed: 435.28 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:10:01. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:14:23 INFO loss_tracker.py:84 | Epoch[656/NA] Step[24] GlobalStep[89896/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0042] loss_depth[0.0125] total_loss[0.0167] Rank[0/16] 06/24/2025 23:14:25 INFO stats.py:314 | Epoch[656] Step[27] GlobalStep[89899] Training Speed: 426.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:09:51. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:14:34 INFO loss_tracker.py:84 | Epoch[656/NA] Step[49] GlobalStep[89921/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0038] loss_depth[0.0126] total_loss[0.0163] Rank[0/16] 06/24/2025 23:14:36 INFO stats.py:314 | Epoch[656] Step[52] GlobalStep[89924] Training Speed: 425.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:09:40. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:14:45 INFO loss_tracker.py:84 | Epoch[656/NA] Step[74] GlobalStep[89946/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 23:14:46 INFO stats.py:314 | Epoch[656] Step[77] GlobalStep[89949] Training Speed: 421.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:09:30. Learning Rate: 1.00000e-04. Rank[0/16] 06/24/2025 23:14:55 INFO loss_tracker.py:84 | Epoch[656/NA] Step[99] GlobalStep[89971/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0040] loss_depth[0.0125] total_loss[0.0166] Rank[0/16] 06/24/2025 23:14:56 INFO stats.py:314 | Epoch[656] Step[102] GlobalStep[89974] Training Speed: 421.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:09:19. Learning Rate: 1.00000e-04. 
Rank[0/16] 06/24/2025 23:15:05 INFO loss_tracker.py:84 | Epoch[656/NA] Step[124] GlobalStep[89996/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0030] loss_depth[0.0125] total_loss[0.0156] Rank[0/16] 06/24/2025 23:15:06 INFO stats.py:314 | Epoch[656] Step[127] GlobalStep[89999] Training Speed: 439.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:09:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:15:09 INFO stats.py:394 | Epoch[656] completed. Training Speed: 306.81 samples/sec across all devices. Epoch Time: 57.16 sec. Average Epoch Time: 57.16 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:09:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:15:17 INFO stats.py:314 | Epoch[657] Step[15] GlobalStep[90024] Training Speed: 437.73 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:08:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:15:21 INFO loss_tracker.py:84 | Epoch[657/NA] Step[24] GlobalStep[90033/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0031] loss_depth[0.0125] total_loss[0.0157] Rank[0/16] 06/24/2025 23:15:27 INFO stats.py:314 | Epoch[657] Step[40] GlobalStep[90049] Training Speed: 440.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:08:48. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:15:31 INFO loss_tracker.py:84 | Epoch[657/NA] Step[49] GlobalStep[90058/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0126] total_loss[0.0139] Rank[0/16] 06/24/2025 23:15:38 INFO stats.py:314 | Epoch[657] Step[65] GlobalStep[90074] Training Speed: 427.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:08:38. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:15:41 INFO loss_tracker.py:84 | Epoch[657/NA] Step[74] GlobalStep[90083/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:15:48 INFO stats.py:314 | Epoch[657] Step[90] GlobalStep[90099] Training Speed: 432.61 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:08:28. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:15:52 INFO loss_tracker.py:84 | Epoch[657/NA] Step[99] GlobalStep[90108/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:15:58 INFO stats.py:314 | Epoch[657] Step[115] GlobalStep[90124] Training Speed: 426.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:08:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:16:02 INFO loss_tracker.py:84 | Epoch[657/NA] Step[124] GlobalStep[90133/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:16:06 INFO stats.py:394 | Epoch[657] completed. Training Speed: 309.18 samples/sec across all devices. Epoch Time: 56.72 sec. Average Epoch Time: 56.72 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:08:08. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:16:09 INFO stats.py:314 | Epoch[658] Step[3] GlobalStep[90149] Training Speed: 432.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:08:07. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:16:18 INFO loss_tracker.py:84 | Epoch[658/NA] Step[24] GlobalStep[90170/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:16:19 INFO stats.py:314 | Epoch[658] Step[28] GlobalStep[90174] Training Speed: 433.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:07:56. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:16:27 INFO loss_tracker.py:84 | Epoch[658/NA] Step[49] GlobalStep[90195/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:16:29 INFO stats.py:314 | Epoch[658] Step[53] GlobalStep[90199] Training Speed: 420.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:07:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:16:38 INFO loss_tracker.py:84 | Epoch[658/NA] Step[74] GlobalStep[90220/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:16:39 INFO stats.py:314 | Epoch[658] Step[78] GlobalStep[90224] Training Speed: 406.09 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:07:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:16:48 INFO loss_tracker.py:84 | Epoch[658/NA] Step[99] GlobalStep[90245/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0126] total_loss[0.0133] Rank[0/16] 06/24/2025 23:16:50 INFO stats.py:314 | Epoch[658] Step[103] GlobalStep[90249] Training Speed: 429.76 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:07:25. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:16:58 INFO loss_tracker.py:84 | Epoch[658/NA] Step[124] GlobalStep[90270/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:17:00 INFO stats.py:314 | Epoch[658] Step[128] GlobalStep[90274] Training Speed: 441.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:07:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:17:03 INFO stats.py:394 | Epoch[658] completed. Training Speed: 309.61 samples/sec across all devices. Epoch Time: 56.64 sec. Average Epoch Time: 56.64 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:07:12. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:17:11 INFO stats.py:314 | Epoch[659] Step[16] GlobalStep[90299] Training Speed: 419.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:07:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:17:14 INFO loss_tracker.py:84 | Epoch[659/NA] Step[24] GlobalStep[90307/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:17:21 INFO stats.py:314 | Epoch[659] Step[41] GlobalStep[90324] Training Speed: 432.85 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:06:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:17:25 INFO loss_tracker.py:84 | Epoch[659/NA] Step[49] GlobalStep[90332/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:17:32 INFO stats.py:314 | Epoch[659] Step[66] GlobalStep[90349] Training Speed: 431.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:06:44. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:17:35 INFO loss_tracker.py:84 | Epoch[659/NA] Step[74] GlobalStep[90357/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:17:42 INFO stats.py:314 | Epoch[659] Step[91] GlobalStep[90374] Training Speed: 433.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:06:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:17:45 INFO loss_tracker.py:84 | Epoch[659/NA] Step[99] GlobalStep[90382/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0006] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:17:52 INFO stats.py:314 | Epoch[659] Step[116] GlobalStep[90399] Training Speed: 426.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:06:23. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:17:55 INFO loss_tracker.py:84 | Epoch[659/NA] Step[124] GlobalStep[90407/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/24/2025 23:18:00 INFO stats.py:394 | Epoch[659] completed. Training Speed: 307.32 samples/sec across all devices. Epoch Time: 57.06 sec. Average Epoch Time: 57.06 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 1:06:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:18:03 INFO stats.py:314 | Epoch[660] Step[4] GlobalStep[90424] Training Speed: 437.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:06:13. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:18:11 INFO loss_tracker.py:84 | Epoch[660/NA] Step[24] GlobalStep[90444/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:18:13 INFO stats.py:314 | Epoch[660] Step[29] GlobalStep[90449] Training Speed: 432.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:06:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:18:22 INFO loss_tracker.py:84 | Epoch[660/NA] Step[49] GlobalStep[90469/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:18:24 INFO stats.py:314 | Epoch[660] Step[54] GlobalStep[90474] Training Speed: 408.79 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:05:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:18:32 INFO loss_tracker.py:84 | Epoch[660/NA] Step[74] GlobalStep[90494/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:18:34 INFO stats.py:314 | Epoch[660] Step[79] GlobalStep[90499] Training Speed: 430.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:05:42. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:18:43 INFO loss_tracker.py:84 | Epoch[660/NA] Step[99] GlobalStep[90519/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0006] loss_depth[0.0125] total_loss[0.0131] Rank[0/16] 06/24/2025 23:18:45 INFO stats.py:314 | Epoch[660] Step[104] GlobalStep[90524] Training Speed: 440.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:05:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:18:52 INFO loss_tracker.py:84 | Epoch[660/NA] Step[124] GlobalStep[90544/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:18:54 INFO stats.py:314 | Epoch[660] Step[129] GlobalStep[90549] Training Speed: 451.83 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:05:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:18:56 INFO stats.py:394 | Epoch[660] completed. Training Speed: 310.46 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:05:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:19:05 INFO stats.py:314 | Epoch[661] Step[17] GlobalStep[90574] Training Speed: 423.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:05:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:19:08 INFO loss_tracker.py:84 | Epoch[661/NA] Step[24] GlobalStep[90581/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0126] total_loss[0.0132] Rank[0/16] 06/24/2025 23:19:15 INFO stats.py:314 | Epoch[661] Step[42] GlobalStep[90599] Training Speed: 403.44 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:05:00. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:19:18 INFO loss_tracker.py:84 | Epoch[661/NA] Step[49] GlobalStep[90606/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:19:26 INFO stats.py:314 | Epoch[661] Step[67] GlobalStep[90624] Training Speed: 433.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:04:50. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:19:29 INFO loss_tracker.py:84 | Epoch[661/NA] Step[74] GlobalStep[90631/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:19:36 INFO stats.py:314 | Epoch[661] Step[92] GlobalStep[90649] Training Speed: 441.92 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:04:39. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:19:39 INFO loss_tracker.py:84 | Epoch[661/NA] Step[99] GlobalStep[90656/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:19:46 INFO stats.py:314 | Epoch[661] Step[117] GlobalStep[90674] Training Speed: 395.38 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:04:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:19:49 INFO loss_tracker.py:84 | Epoch[661/NA] Step[124] GlobalStep[90681/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:19:53 INFO stats.py:394 | Epoch[661] completed. Training Speed: 310.75 samples/sec across all devices. Epoch Time: 56.43 sec. Average Epoch Time: 56.43 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:04:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:19:56 INFO stats.py:314 | Epoch[662] Step[5] GlobalStep[90699] Training Speed: 399.96 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:04:19. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:20:04 INFO loss_tracker.py:84 | Epoch[662/NA] Step[24] GlobalStep[90718/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:20:07 INFO stats.py:314 | Epoch[662] Step[30] GlobalStep[90724] Training Speed: 402.26 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:04:08. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:20:15 INFO loss_tracker.py:84 | Epoch[662/NA] Step[49] GlobalStep[90743/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:20:17 INFO stats.py:314 | Epoch[662] Step[55] GlobalStep[90749] Training Speed: 416.39 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 1:03:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:20:25 INFO loss_tracker.py:84 | Epoch[662/NA] Step[74] GlobalStep[90768/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0126] total_loss[0.0133] Rank[0/16] 06/24/2025 23:20:27 INFO stats.py:314 | Epoch[662] Step[80] GlobalStep[90774] Training Speed: 431.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:03:47. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:20:35 INFO loss_tracker.py:84 | Epoch[662/NA] Step[99] GlobalStep[90793/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:20:37 INFO stats.py:314 | Epoch[662] Step[105] GlobalStep[90799] Training Speed: 425.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:03:37. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:20:45 INFO loss_tracker.py:84 | Epoch[662/NA] Step[124] GlobalStep[90818/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:20:47 INFO stats.py:314 | Epoch[662] Step[130] GlobalStep[90824] Training Speed: 453.81 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 1:03:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:20:49 INFO stats.py:394 | Epoch[662] completed. Training Speed: 312.40 samples/sec across all devices. Epoch Time: 56.13 sec. Average Epoch Time: 56.13 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:03:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:20:58 INFO stats.py:314 | Epoch[663] Step[18] GlobalStep[90849] Training Speed: 434.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:03:16. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:21:00 INFO loss_tracker.py:84 | Epoch[663/NA] Step[24] GlobalStep[90855/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:21:08 INFO stats.py:314 | Epoch[663] Step[43] GlobalStep[90874] Training Speed: 434.42 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:03:06. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:21:10 INFO loss_tracker.py:84 | Epoch[663/NA] Step[49] GlobalStep[90880/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:21:18 INFO stats.py:314 | Epoch[663] Step[68] GlobalStep[90899] Training Speed: 432.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:02:55. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:21:20 INFO loss_tracker.py:84 | Epoch[663/NA] Step[74] GlobalStep[90905/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:21:28 INFO stats.py:314 | Epoch[663] Step[93] GlobalStep[90924] Training Speed: 431.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:02:45. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:21:31 INFO loss_tracker.py:84 | Epoch[663/NA] Step[99] GlobalStep[90930/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:21:39 INFO stats.py:314 | Epoch[663] Step[118] GlobalStep[90949] Training Speed: 396.50 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:02:35. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:21:41 INFO loss_tracker.py:84 | Epoch[663/NA] Step[124] GlobalStep[90955/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:21:45 INFO stats.py:394 | Epoch[663] completed. Training Speed: 311.24 samples/sec across all devices. Epoch Time: 56.34 sec. Average Epoch Time: 56.34 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:02:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:21:50 INFO stats.py:314 | Epoch[664] Step[6] GlobalStep[90974] Training Speed: 399.48 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:02:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:21:57 INFO loss_tracker.py:84 | Epoch[664/NA] Step[24] GlobalStep[90992/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:22:00 INFO stats.py:314 | Epoch[664] Step[31] GlobalStep[90999] Training Speed: 426.51 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:02:14. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:22:07 INFO loss_tracker.py:84 | Epoch[664/NA] Step[49] GlobalStep[91017/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:22:10 INFO stats.py:314 | Epoch[664] Step[56] GlobalStep[91024] Training Speed: 441.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:02:04. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:22:17 INFO loss_tracker.py:84 | Epoch[664/NA] Step[74] GlobalStep[91042/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/24/2025 23:22:20 INFO stats.py:314 | Epoch[664] Step[81] GlobalStep[91049] Training Speed: 437.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:01:53. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:22:28 INFO loss_tracker.py:84 | Epoch[664/NA] Step[99] GlobalStep[91067/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:22:31 INFO stats.py:314 | Epoch[664] Step[106] GlobalStep[91074] Training Speed: 428.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:01:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:22:37 INFO loss_tracker.py:84 | Epoch[664/NA] Step[124] GlobalStep[91092/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:22:40 INFO stats.py:314 | Epoch[664] Step[131] GlobalStep[91099] Training Speed: 440.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:01:32. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:22:42 INFO stats.py:394 | Epoch[664] completed. Training Speed: 309.75 samples/sec across all devices. Epoch Time: 56.61 sec. Average Epoch Time: 56.61 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:01:30. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:22:51 INFO stats.py:314 | Epoch[665] Step[19] GlobalStep[91124] Training Speed: 428.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:01:22. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:22:53 INFO loss_tracker.py:84 | Epoch[665/NA] Step[24] GlobalStep[91129/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:23:01 INFO stats.py:314 | Epoch[665] Step[44] GlobalStep[91149] Training Speed: 437.02 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:01:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:23:03 INFO loss_tracker.py:84 | Epoch[665/NA] Step[49] GlobalStep[91154/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:23:11 INFO stats.py:314 | Epoch[665] Step[69] GlobalStep[91174] Training Speed: 428.71 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:01:01. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:23:13 INFO loss_tracker.py:84 | Epoch[665/NA] Step[74] GlobalStep[91179/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:23:21 INFO stats.py:314 | Epoch[665] Step[94] GlobalStep[91199] Training Speed: 403.11 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 1:00:51. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:23:23 INFO loss_tracker.py:84 | Epoch[665/NA] Step[99] GlobalStep[91204/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:23:31 INFO stats.py:314 | Epoch[665] Step[119] GlobalStep[91224] Training Speed: 432.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:00:40. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:23:33 INFO loss_tracker.py:84 | Epoch[665/NA] Step[124] GlobalStep[91229/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:23:37 INFO stats.py:394 | Epoch[665] completed. Training Speed: 315.76 samples/sec across all devices. Epoch Time: 55.54 sec. Average Epoch Time: 55.54 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 1:00:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:23:42 INFO stats.py:314 | Epoch[666] Step[7] GlobalStep[91249] Training Speed: 436.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 1:00:30. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:23:49 INFO loss_tracker.py:84 | Epoch[666/NA] Step[24] GlobalStep[91266/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:23:52 INFO stats.py:314 | Epoch[666] Step[32] GlobalStep[91274] Training Speed: 426.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:00:20. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:23:59 INFO loss_tracker.py:84 | Epoch[666/NA] Step[49] GlobalStep[91291/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:24:03 INFO stats.py:314 | Epoch[666] Step[57] GlobalStep[91299] Training Speed: 422.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 1:00:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:24:09 INFO loss_tracker.py:84 | Epoch[666/NA] Step[74] GlobalStep[91316/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:24:13 INFO stats.py:314 | Epoch[666] Step[82] GlobalStep[91324] Training Speed: 433.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:59:59. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:24:20 INFO loss_tracker.py:84 | Epoch[666/NA] Step[99] GlobalStep[91341/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0126] total_loss[0.0135] Rank[0/16] 06/24/2025 23:24:23 INFO stats.py:314 | Epoch[666] Step[107] GlobalStep[91349] Training Speed: 423.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:59:49. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:24:30 INFO loss_tracker.py:84 | Epoch[666/NA] Step[124] GlobalStep[91366/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:24:33 INFO stats.py:314 | Epoch[666] Step[132] GlobalStep[91374] Training Speed: 452.17 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:59:38. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:24:34 INFO stats.py:394 | Epoch[666] completed. Training Speed: 309.05 samples/sec across all devices. Epoch Time: 56.74 sec. Average Epoch Time: 56.74 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:59:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:24:43 INFO stats.py:314 | Epoch[667] Step[20] GlobalStep[91399] Training Speed: 419.41 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:59:28. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:24:45 INFO loss_tracker.py:84 | Epoch[667/NA] Step[24] GlobalStep[91403/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/24/2025 23:24:54 INFO stats.py:314 | Epoch[667] Step[45] GlobalStep[91424] Training Speed: 428.36 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:59:17. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:24:56 INFO loss_tracker.py:84 | Epoch[667/NA] Step[49] GlobalStep[91428/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:25:04 INFO stats.py:314 | Epoch[667] Step[70] GlobalStep[91449] Training Speed: 419.15 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:59:07. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:25:06 INFO loss_tracker.py:84 | Epoch[667/NA] Step[74] GlobalStep[91453/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:25:15 INFO stats.py:314 | Epoch[667] Step[95] GlobalStep[91474] Training Speed: 417.76 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:58:57. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:25:16 INFO loss_tracker.py:84 | Epoch[667/NA] Step[99] GlobalStep[91478/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:25:25 INFO stats.py:314 | Epoch[667] Step[120] GlobalStep[91499] Training Speed: 450.62 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:58:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:25:26 INFO loss_tracker.py:84 | Epoch[667/NA] Step[124] GlobalStep[91503/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:25:31 INFO stats.py:394 | Epoch[667] completed. Training Speed: 310.55 samples/sec across all devices. Epoch Time: 56.47 sec. Average Epoch Time: 56.47 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:58:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:25:35 INFO stats.py:314 | Epoch[668] Step[8] GlobalStep[91524] Training Speed: 444.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:58:36. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:25:42 INFO loss_tracker.py:84 | Epoch[668/NA] Step[24] GlobalStep[91540/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:25:46 INFO stats.py:314 | Epoch[668] Step[33] GlobalStep[91549] Training Speed: 438.60 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:58:26. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:25:52 INFO loss_tracker.py:84 | Epoch[668/NA] Step[49] GlobalStep[91565/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:25:56 INFO stats.py:314 | Epoch[668] Step[58] GlobalStep[91574] Training Speed: 432.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:58:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:26:02 INFO loss_tracker.py:84 | Epoch[668/NA] Step[74] GlobalStep[91590/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:26:06 INFO stats.py:314 | Epoch[668] Step[83] GlobalStep[91599] Training Speed: 417.04 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:58:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:26:13 INFO loss_tracker.py:84 | Epoch[668/NA] Step[99] GlobalStep[91615/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0135] Rank[0/16] 06/24/2025 23:26:16 INFO stats.py:314 | Epoch[668] Step[108] GlobalStep[91624] Training Speed: 403.88 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:57:54. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:26:23 INFO loss_tracker.py:84 | Epoch[668/NA] Step[124] GlobalStep[91640/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/24/2025 23:26:26 INFO stats.py:314 | Epoch[668] Step[133] GlobalStep[91649] Training Speed: 443.85 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:57:44. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:26:27 INFO stats.py:394 | Epoch[668] completed. Training Speed: 310.50 samples/sec across all devices. Epoch Time: 56.48 sec. Average Epoch Time: 56.48 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:57:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:26:37 INFO stats.py:314 | Epoch[669] Step[21] GlobalStep[91674] Training Speed: 432.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:57:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:26:38 INFO loss_tracker.py:84 | Epoch[669/NA] Step[24] GlobalStep[91677/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:26:48 INFO stats.py:314 | Epoch[669] Step[46] GlobalStep[91699] Training Speed: 405.39 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:57:23. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:26:49 INFO loss_tracker.py:84 | Epoch[669/NA] Step[49] GlobalStep[91702/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:26:58 INFO stats.py:314 | Epoch[669] Step[71] GlobalStep[91724] Training Speed: 435.05 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:57:13. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:26:59 INFO loss_tracker.py:84 | Epoch[669/NA] Step[74] GlobalStep[91727/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:27:08 INFO stats.py:314 | Epoch[669] Step[96] GlobalStep[91749] Training Speed: 434.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:57:03. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:27:10 INFO loss_tracker.py:84 | Epoch[669/NA] Step[99] GlobalStep[91752/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:27:19 INFO stats.py:314 | Epoch[669] Step[121] GlobalStep[91774] Training Speed: 437.38 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:56:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:27:20 INFO loss_tracker.py:84 | Epoch[669/NA] Step[124] GlobalStep[91777/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0126] total_loss[0.0133] Rank[0/16] 06/24/2025 23:27:24 INFO stats.py:394 | Epoch[669] completed. Training Speed: 306.62 samples/sec across all devices. Epoch Time: 57.19 sec. Average Epoch Time: 57.19 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:56:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:27:29 INFO stats.py:314 | Epoch[670] Step[9] GlobalStep[91799] Training Speed: 426.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:56:42. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:27:36 INFO loss_tracker.py:84 | Epoch[670/NA] Step[24] GlobalStep[91814/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:27:40 INFO stats.py:314 | Epoch[670] Step[34] GlobalStep[91824] Training Speed: 434.52 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:56:31. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:27:46 INFO loss_tracker.py:84 | Epoch[670/NA] Step[49] GlobalStep[91839/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:27:50 INFO stats.py:314 | Epoch[670] Step[59] GlobalStep[91849] Training Speed: 434.82 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:56:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:27:56 INFO loss_tracker.py:84 | Epoch[670/NA] Step[74] GlobalStep[91864/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:28:00 INFO stats.py:314 | Epoch[670] Step[84] GlobalStep[91874] Training Speed: 436.99 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:56:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:28:06 INFO loss_tracker.py:84 | Epoch[670/NA] Step[99] GlobalStep[91889/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:28:10 INFO stats.py:314 | Epoch[670] Step[109] GlobalStep[91899] Training Speed: 420.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:56:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:28:17 INFO loss_tracker.py:84 | Epoch[670/NA] Step[124] GlobalStep[91914/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:28:20 INFO stats.py:314 | Epoch[670] Step[134] GlobalStep[91924] Training Speed: 440.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:55:50. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:28:21 INFO stats.py:394 | Epoch[670] completed. Training Speed: 308.87 samples/sec across all devices. Epoch Time: 56.77 sec. Average Epoch Time: 56.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:55:49. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:28:32 INFO stats.py:314 | Epoch[671] Step[22] GlobalStep[91949] Training Speed: 428.52 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:55:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:28:33 INFO loss_tracker.py:84 | Epoch[671/NA] Step[24] GlobalStep[91951/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/24/2025 23:28:42 INFO stats.py:314 | Epoch[671] Step[47] GlobalStep[91974] Training Speed: 430.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:55:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:28:43 INFO loss_tracker.py:84 | Epoch[671/NA] Step[49] GlobalStep[91976/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:28:53 INFO stats.py:314 | Epoch[671] Step[72] GlobalStep[91999] Training Speed: 421.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:55:19. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:28:53 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 23:28:54 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_22 Rank[15/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[13/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[12/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[4/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[10/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[1/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[8/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[9/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[7/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[11/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[2/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[14/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[5/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[3/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22
Rank[6/16] 06/24/2025 23:28:54 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[0/16] 06/24/2025 23:28:55 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_22/model.safetensors Rank[0/16] 06/24/2025 23:28:56 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_22/optimizer.bin Rank[0/16] 06/24/2025 23:28:56 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_22/scheduler.bin Rank[0/16] 06/24/2025 23:28:56 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_22/sampler.bin Rank[0/16] 06/24/2025 23:28:56 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_22/random_states_0.pkl Rank[0/16] 06/24/2025 23:28:56 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_22/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 23:28:56 INFO checkpoint.py:110 | Save checkpoint at the end of step 91999 to /job_data/checkpoints/checkpoint_22 Rank[0/16] 06/24/2025 23:28:57 INFO loss_tracker.py:84 | Epoch[671/NA] Step[74] GlobalStep[92001/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:29:06 INFO stats.py:314 | Epoch[671] Step[97] GlobalStep[92024] Training Speed: 431.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:55:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:29:07 INFO loss_tracker.py:84 | Epoch[671/NA] Step[99] GlobalStep[92026/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:29:16 INFO stats.py:314 | Epoch[671] Step[122] GlobalStep[92049] Training Speed: 445.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:54:58. Learning Rate: 1.00000e-05.
Rank[0/16] 06/24/2025 23:29:17 INFO loss_tracker.py:84 | Epoch[671/NA] Step[124] GlobalStep[92051/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:29:21 INFO stats.py:394 | Epoch[671] completed. Training Speed: 290.19 samples/sec across all devices. Epoch Time: 60.43 sec. Average Epoch Time: 60.43 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 0:54:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:29:28 INFO stats.py:314 | Epoch[672] Step[10] GlobalStep[92074] Training Speed: 429.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:54:48. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:29:34 INFO loss_tracker.py:84 | Epoch[672/NA] Step[24] GlobalStep[92088/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:29:38 INFO stats.py:314 | Epoch[672] Step[35] GlobalStep[92099] Training Speed: 426.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:54:38. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:29:44 INFO loss_tracker.py:84 | Epoch[672/NA] Step[49] GlobalStep[92113/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:29:49 INFO stats.py:314 | Epoch[672] Step[60] GlobalStep[92124] Training Speed: 422.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:54:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:29:55 INFO loss_tracker.py:84 | Epoch[672/NA] Step[74] GlobalStep[92138/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:29:59 INFO stats.py:314 | Epoch[672] Step[85] GlobalStep[92149] Training Speed: 431.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:54:17. Learning Rate: 1.00000e-05.
Rank[0/16] 06/24/2025 23:30:05 INFO loss_tracker.py:84 | Epoch[672/NA] Step[99] GlobalStep[92163/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:30:09 INFO stats.py:314 | Epoch[672] Step[110] GlobalStep[92174] Training Speed: 438.49 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:54:07. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:30:15 INFO loss_tracker.py:84 | Epoch[672/NA] Step[124] GlobalStep[92188/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/24/2025 23:30:19 INFO stats.py:314 | Epoch[672] Step[135] GlobalStep[92199] Training Speed: 450.99 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:53:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:30:19 INFO stats.py:394 | Epoch[672] completed. Training Speed: 303.35 samples/sec across all devices. Epoch Time: 57.81 sec. Average Epoch Time: 57.81 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:53:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:30:30 INFO stats.py:314 | Epoch[673] Step[23] GlobalStep[92224] Training Speed: 402.47 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:53:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:30:31 INFO loss_tracker.py:84 | Epoch[673/NA] Step[24] GlobalStep[92225/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:30:41 INFO stats.py:314 | Epoch[673] Step[48] GlobalStep[92249] Training Speed: 431.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:53:36. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:30:42 INFO loss_tracker.py:84 | Epoch[673/NA] Step[49] GlobalStep[92250/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:30:52 INFO stats.py:314 | Epoch[673] Step[73] GlobalStep[92274] Training Speed: 431.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:53:25. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:30:52 INFO loss_tracker.py:84 | Epoch[673/NA] Step[74] GlobalStep[92275/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:31:02 INFO stats.py:314 | Epoch[673] Step[98] GlobalStep[92299] Training Speed: 423.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:53:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:31:02 INFO loss_tracker.py:84 | Epoch[673/NA] Step[99] GlobalStep[92300/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:31:12 INFO stats.py:314 | Epoch[673] Step[123] GlobalStep[92324] Training Speed: 453.17 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:53:04. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:31:13 INFO loss_tracker.py:84 | Epoch[673/NA] Step[124] GlobalStep[92325/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:31:17 INFO stats.py:394 | Epoch[673] completed. Training Speed: 302.49 samples/sec across all devices. Epoch Time: 57.97 sec. Average Epoch Time: 57.97 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:52:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:31:23 INFO stats.py:314 | Epoch[674] Step[11] GlobalStep[92349] Training Speed: 435.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:52:54. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:31:28 INFO loss_tracker.py:84 | Epoch[674/NA] Step[24] GlobalStep[92362/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:31:34 INFO stats.py:314 | Epoch[674] Step[36] GlobalStep[92374] Training Speed: 425.14 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:52:44. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:31:39 INFO loss_tracker.py:84 | Epoch[674/NA] Step[49] GlobalStep[92387/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:31:44 INFO stats.py:314 | Epoch[674] Step[61] GlobalStep[92399] Training Speed: 429.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:52:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:31:49 INFO loss_tracker.py:84 | Epoch[674/NA] Step[74] GlobalStep[92412/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/24/2025 23:31:55 INFO stats.py:314 | Epoch[674] Step[86] GlobalStep[92424] Training Speed: 424.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:52:23. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:32:00 INFO loss_tracker.py:84 | Epoch[674/NA] Step[99] GlobalStep[92437/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:32:05 INFO stats.py:314 | Epoch[674] Step[111] GlobalStep[92449] Training Speed: 399.06 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:52:13. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:32:10 INFO loss_tracker.py:84 | Epoch[674/NA] Step[124] GlobalStep[92462/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:32:15 INFO stats.py:314 | Epoch[674] Step[136] GlobalStep[92474] Training Speed: 454.64 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:52:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:32:15 INFO stats.py:394 | Epoch[674] completed. Training Speed: 304.05 samples/sec across all devices. Epoch Time: 57.67 sec. Average Epoch Time: 57.67 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:52:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:32:26 INFO stats.py:314 | Epoch[675] Step[24] GlobalStep[92499] Training Speed: 440.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:51:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:32:26 INFO loss_tracker.py:84 | Epoch[675/NA] Step[24] GlobalStep[92499/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:32:36 INFO stats.py:314 | Epoch[675] Step[49] GlobalStep[92524] Training Speed: 438.27 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:51:41. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:32:36 INFO loss_tracker.py:84 | Epoch[675/NA] Step[49] GlobalStep[92524/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:32:46 INFO stats.py:314 | Epoch[675] Step[74] GlobalStep[92549] Training Speed: 237.73 samples/sec across all devices. Average Step Time: 0.54 sec. Estimated Remaining Time: 0:51:31. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:32:46 INFO loss_tracker.py:84 | Epoch[675/NA] Step[74] GlobalStep[92549/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:32:57 INFO stats.py:314 | Epoch[675] Step[99] GlobalStep[92574] Training Speed: 426.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:51:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:32:58 INFO loss_tracker.py:84 | Epoch[675/NA] Step[99] GlobalStep[92574/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:33:07 INFO stats.py:314 | Epoch[675] Step[124] GlobalStep[92599] Training Speed: 443.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:51:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:33:08 INFO loss_tracker.py:84 | Epoch[675/NA] Step[124] GlobalStep[92599/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:33:12 INFO stats.py:394 | Epoch[675] completed. Training Speed: 306.97 samples/sec across all devices. Epoch Time: 57.13 sec. Average Epoch Time: 57.13 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:51:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:33:19 INFO stats.py:314 | Epoch[676] Step[12] GlobalStep[92624] Training Speed: 428.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:51:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:33:24 INFO loss_tracker.py:84 | Epoch[676/NA] Step[24] GlobalStep[92636/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:33:29 INFO stats.py:314 | Epoch[676] Step[37] GlobalStep[92649] Training Speed: 436.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:50:50. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:33:34 INFO loss_tracker.py:84 | Epoch[676/NA] Step[49] GlobalStep[92661/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:33:39 INFO stats.py:314 | Epoch[676] Step[62] GlobalStep[92674] Training Speed: 435.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:50:39. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:33:44 INFO loss_tracker.py:84 | Epoch[676/NA] Step[74] GlobalStep[92686/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:33:49 INFO stats.py:314 | Epoch[676] Step[87] GlobalStep[92699] Training Speed: 417.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:50:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:33:54 INFO loss_tracker.py:84 | Epoch[676/NA] Step[99] GlobalStep[92711/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:34:00 INFO stats.py:314 | Epoch[676] Step[112] GlobalStep[92724] Training Speed: 433.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:50:19. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:34:05 INFO loss_tracker.py:84 | Epoch[676/NA] Step[124] GlobalStep[92736/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:34:09 INFO stats.py:394 | Epoch[676] completed. Training Speed: 307.22 samples/sec across all devices. Epoch Time: 57.08 sec. Average Epoch Time: 57.08 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:50:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:34:10 INFO stats.py:314 | Epoch[677] Step[0] GlobalStep[92749] Training Speed: 345.44 samples/sec across all devices. Average Step Time: 0.37 sec. Estimated Remaining Time: 0:50:08. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:34:21 INFO loss_tracker.py:84 | Epoch[677/NA] Step[24] GlobalStep[92773/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:34:21 INFO stats.py:314 | Epoch[677] Step[25] GlobalStep[92774] Training Speed: 412.89 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:49:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:34:31 INFO loss_tracker.py:84 | Epoch[677/NA] Step[49] GlobalStep[92798/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0126] total_loss[0.0138] Rank[0/16] 06/24/2025 23:34:31 INFO stats.py:314 | Epoch[677] Step[50] GlobalStep[92799] Training Speed: 410.00 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:49:47. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:34:41 INFO loss_tracker.py:84 | Epoch[677/NA] Step[74] GlobalStep[92823/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:34:42 INFO stats.py:314 | Epoch[677] Step[75] GlobalStep[92824] Training Speed: 433.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:49:37. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:34:52 INFO loss_tracker.py:84 | Epoch[677/NA] Step[99] GlobalStep[92848/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:34:52 INFO stats.py:314 | Epoch[677] Step[100] GlobalStep[92849] Training Speed: 419.01 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:49:27. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:35:02 INFO loss_tracker.py:84 | Epoch[677/NA] Step[124] GlobalStep[92873/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0126] total_loss[0.0142] Rank[0/16] 06/24/2025 23:35:02 INFO stats.py:314 | Epoch[677] Step[125] GlobalStep[92874] Training Speed: 415.83 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:49:16. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:35:06 INFO stats.py:394 | Epoch[677] completed. Training Speed: 307.76 samples/sec across all devices. Epoch Time: 56.98 sec. Average Epoch Time: 56.98 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:49:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:35:13 INFO stats.py:314 | Epoch[678] Step[13] GlobalStep[92899] Training Speed: 428.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:49:06. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:35:18 INFO loss_tracker.py:84 | Epoch[678/NA] Step[24] GlobalStep[92910/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:35:24 INFO stats.py:314 | Epoch[678] Step[38] GlobalStep[92924] Training Speed: 428.44 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:48:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:35:28 INFO loss_tracker.py:84 | Epoch[678/NA] Step[49] GlobalStep[92935/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:35:34 INFO stats.py:314 | Epoch[678] Step[63] GlobalStep[92949] Training Speed: 436.47 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:48:45. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:35:39 INFO loss_tracker.py:84 | Epoch[678/NA] Step[74] GlobalStep[92960/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:35:44 INFO stats.py:314 | Epoch[678] Step[88] GlobalStep[92974] Training Speed: 437.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:48:35. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:35:49 INFO loss_tracker.py:84 | Epoch[678/NA] Step[99] GlobalStep[92985/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:35:54 INFO stats.py:314 | Epoch[678] Step[113] GlobalStep[92999] Training Speed: 435.04 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:48:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:35:59 INFO loss_tracker.py:84 | Epoch[678/NA] Step[124] GlobalStep[93010/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:36:03 INFO stats.py:394 | Epoch[678] completed. Training Speed: 307.57 samples/sec across all devices. Epoch Time: 57.01 sec. Average Epoch Time: 57.01 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:48:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:36:05 INFO stats.py:314 | Epoch[679] Step[1] GlobalStep[93024] Training Speed: 403.45 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:48:14. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:36:15 INFO loss_tracker.py:84 | Epoch[679/NA] Step[24] GlobalStep[93047/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:36:15 INFO stats.py:314 | Epoch[679] Step[26] GlobalStep[93049] Training Speed: 418.48 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:48:04. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:36:25 INFO loss_tracker.py:84 | Epoch[679/NA] Step[49] GlobalStep[93072/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:36:26 INFO stats.py:314 | Epoch[679] Step[51] GlobalStep[93074] Training Speed: 252.75 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 0:47:53. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:36:35 INFO loss_tracker.py:84 | Epoch[679/NA] Step[74] GlobalStep[93097/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:36:36 INFO stats.py:314 | Epoch[679] Step[76] GlobalStep[93099] Training Speed: 423.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:47:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:36:46 INFO loss_tracker.py:84 | Epoch[679/NA] Step[99] GlobalStep[93122/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:36:46 INFO stats.py:314 | Epoch[679] Step[101] GlobalStep[93124] Training Speed: 430.00 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:47:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:36:56 INFO loss_tracker.py:84 | Epoch[679/NA] Step[124] GlobalStep[93147/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:36:57 INFO stats.py:314 | Epoch[679] Step[126] GlobalStep[93149] Training Speed: 451.86 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:47:22. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:37:00 INFO stats.py:394 | Epoch[679] completed. Training Speed: 307.05 samples/sec across all devices. Epoch Time: 57.11 sec. Average Epoch Time: 57.11 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:47:18. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:37:07 INFO stats.py:314 | Epoch[680] Step[14] GlobalStep[93174] Training Speed: 426.39 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:47:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:37:11 INFO loss_tracker.py:84 | Epoch[680/NA] Step[24] GlobalStep[93184/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:37:18 INFO stats.py:314 | Epoch[680] Step[39] GlobalStep[93199] Training Speed: 430.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:47:01. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:37:22 INFO loss_tracker.py:84 | Epoch[680/NA] Step[49] GlobalStep[93209/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:37:28 INFO stats.py:314 | Epoch[680] Step[64] GlobalStep[93224] Training Speed: 429.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:46:51. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:37:32 INFO loss_tracker.py:84 | Epoch[680/NA] Step[74] GlobalStep[93234/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:37:39 INFO stats.py:314 | Epoch[680] Step[89] GlobalStep[93249] Training Speed: 397.80 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:46:41. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:37:43 INFO loss_tracker.py:84 | Epoch[680/NA] Step[99] GlobalStep[93259/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:37:49 INFO stats.py:314 | Epoch[680] Step[114] GlobalStep[93274] Training Speed: 424.43 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:46:30. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:37:53 INFO loss_tracker.py:84 | Epoch[680/NA] Step[124] GlobalStep[93284/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:37:57 INFO stats.py:394 | Epoch[680] completed. Training Speed: 306.68 samples/sec across all devices. Epoch Time: 57.18 sec. Average Epoch Time: 57.18 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:46:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:38:00 INFO stats.py:314 | Epoch[681] Step[2] GlobalStep[93299] Training Speed: 433.20 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:46:20. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:38:09 INFO loss_tracker.py:84 | Epoch[681/NA] Step[24] GlobalStep[93321/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:38:10 INFO stats.py:314 | Epoch[681] Step[27] GlobalStep[93324] Training Speed: 420.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:46:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:38:19 INFO loss_tracker.py:84 | Epoch[681/NA] Step[49] GlobalStep[93346/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:38:20 INFO stats.py:314 | Epoch[681] Step[52] GlobalStep[93349] Training Speed: 418.93 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:45:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:38:29 INFO loss_tracker.py:84 | Epoch[681/NA] Step[74] GlobalStep[93371/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0126] total_loss[0.0143] Rank[0/16] 06/24/2025 23:38:30 INFO stats.py:314 | Epoch[681] Step[77] GlobalStep[93374] Training Speed: 412.64 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:45:49. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:38:40 INFO loss_tracker.py:84 | Epoch[681/NA] Step[99] GlobalStep[93396/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:38:41 INFO stats.py:314 | Epoch[681] Step[102] GlobalStep[93399] Training Speed: 429.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:45:38. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:38:50 INFO loss_tracker.py:84 | Epoch[681/NA] Step[124] GlobalStep[93421/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:38:51 INFO stats.py:314 | Epoch[681] Step[127] GlobalStep[93424] Training Speed: 439.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:45:28. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:38:54 INFO stats.py:394 | Epoch[681] completed. Training Speed: 308.50 samples/sec across all devices. Epoch Time: 56.84 sec. Average Epoch Time: 56.84 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:45:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:39:02 INFO stats.py:314 | Epoch[682] Step[15] GlobalStep[93449] Training Speed: 435.43 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:45:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:39:06 INFO loss_tracker.py:84 | Epoch[682/NA] Step[24] GlobalStep[93458/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:39:12 INFO stats.py:314 | Epoch[682] Step[40] GlobalStep[93474] Training Speed: 435.81 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:45:07. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:39:16 INFO loss_tracker.py:84 | Epoch[682/NA] Step[49] GlobalStep[93483/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:39:22 INFO stats.py:314 | Epoch[682] Step[65] GlobalStep[93499] Training Speed: 383.68 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 0:44:57. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:39:26 INFO loss_tracker.py:84 | Epoch[682/NA] Step[74] GlobalStep[93508/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:39:32 INFO stats.py:314 | Epoch[682] Step[90] GlobalStep[93524] Training Speed: 433.96 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:44:47. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:39:36 INFO loss_tracker.py:84 | Epoch[682/NA] Step[99] GlobalStep[93533/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:39:43 INFO stats.py:314 | Epoch[682] Step[115] GlobalStep[93549] Training Speed: 428.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:44:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:39:46 INFO loss_tracker.py:84 | Epoch[682/NA] Step[124] GlobalStep[93558/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/24/2025 23:39:51 INFO stats.py:394 | Epoch[682] completed. Training Speed: 311.09 samples/sec across all devices. Epoch Time: 56.37 sec. Average Epoch Time: 56.37 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:44:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:39:53 INFO stats.py:314 | Epoch[683] Step[3] GlobalStep[93574] Training Speed: 423.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:44:26. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:40:02 INFO loss_tracker.py:84 | Epoch[683/NA] Step[24] GlobalStep[93595/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:40:04 INFO stats.py:314 | Epoch[683] Step[28] GlobalStep[93599] Training Speed: 423.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:44:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:40:13 INFO loss_tracker.py:84 | Epoch[683/NA] Step[49] GlobalStep[93620/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:40:14 INFO stats.py:314 | Epoch[683] Step[53] GlobalStep[93624] Training Speed: 418.04 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:44:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:40:23 INFO loss_tracker.py:84 | Epoch[683/NA] Step[74] GlobalStep[93645/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:40:25 INFO stats.py:314 | Epoch[683] Step[78] GlobalStep[93649] Training Speed: 431.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:43:55. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:40:34 INFO loss_tracker.py:84 | Epoch[683/NA] Step[99] GlobalStep[93670/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:40:35 INFO stats.py:314 | Epoch[683] Step[103] GlobalStep[93674] Training Speed: 435.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:43:44. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:40:44 INFO loss_tracker.py:84 | Epoch[683/NA] Step[124] GlobalStep[93695/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:40:45 INFO stats.py:314 | Epoch[683] Step[128] GlobalStep[93699] Training Speed: 415.80 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:43:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:40:48 INFO stats.py:394 | Epoch[683] completed. Training Speed: 305.99 samples/sec across all devices. Epoch Time: 57.31 sec. Average Epoch Time: 57.31 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:43:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:40:56 INFO stats.py:314 | Epoch[684] Step[16] GlobalStep[93724] Training Speed: 408.50 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:43:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:40:59 INFO loss_tracker.py:84 | Epoch[684/NA] Step[24] GlobalStep[93732/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:41:06 INFO stats.py:314 | Epoch[684] Step[41] GlobalStep[93749] Training Speed: 400.35 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:43:13. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:41:10 INFO loss_tracker.py:84 | Epoch[684/NA] Step[49] GlobalStep[93757/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:41:17 INFO stats.py:314 | Epoch[684] Step[66] GlobalStep[93774] Training Speed: 438.65 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:43:03. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:41:20 INFO loss_tracker.py:84 | Epoch[684/NA] Step[74] GlobalStep[93782/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:41:27 INFO stats.py:314 | Epoch[684] Step[91] GlobalStep[93799] Training Speed: 432.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:42:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:41:30 INFO loss_tracker.py:84 | Epoch[684/NA] Step[99] GlobalStep[93807/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:41:37 INFO stats.py:314 | Epoch[684] Step[116] GlobalStep[93824] Training Speed: 431.02 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:42:42. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:41:40 INFO loss_tracker.py:84 | Epoch[684/NA] Step[124] GlobalStep[93832/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:41:45 INFO stats.py:394 | Epoch[684] completed. Training Speed: 310.09 samples/sec across all devices. Epoch Time: 56.55 sec. Average Epoch Time: 56.55 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:42:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:41:47 INFO stats.py:314 | Epoch[685] Step[4] GlobalStep[93849] Training Speed: 433.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:42:32. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:41:56 INFO loss_tracker.py:84 | Epoch[685/NA] Step[24] GlobalStep[93869/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0126] total_loss[0.0143] Rank[0/16] 06/24/2025 23:41:58 INFO stats.py:314 | Epoch[685] Step[29] GlobalStep[93874] Training Speed: 411.19 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:42:21. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:42:06 INFO loss_tracker.py:84 | Epoch[685/NA] Step[49] GlobalStep[93894/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:42:09 INFO stats.py:314 | Epoch[685] Step[54] GlobalStep[93899] Training Speed: 226.50 samples/sec across all devices. Average Step Time: 0.57 sec. Estimated Remaining Time: 0:42:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:42:17 INFO loss_tracker.py:84 | Epoch[685/NA] Step[74] GlobalStep[93919/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:42:19 INFO stats.py:314 | Epoch[685] Step[79] GlobalStep[93924] Training Speed: 414.23 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:42:01. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:42:27 INFO loss_tracker.py:84 | Epoch[685/NA] Step[99] GlobalStep[93944/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0126] total_loss[0.0142] Rank[0/16] 06/24/2025 23:42:29 INFO stats.py:314 | Epoch[685] Step[104] GlobalStep[93949] Training Speed: 441.73 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:41:50. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:42:37 INFO loss_tracker.py:84 | Epoch[685/NA] Step[124] GlobalStep[93969/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:42:39 INFO stats.py:314 | Epoch[685] Step[129] GlobalStep[93974] Training Speed: 452.26 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:41:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:42:42 INFO stats.py:394 | Epoch[685] completed. Training Speed: 306.13 samples/sec across all devices. Epoch Time: 57.28 sec. Average Epoch Time: 57.28 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:41:37. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:42:50 INFO stats.py:314 | Epoch[686] Step[17] GlobalStep[93999] Training Speed: 432.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:41:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:42:53 INFO loss_tracker.py:84 | Epoch[686/NA] Step[24] GlobalStep[94006/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:43:00 INFO stats.py:314 | Epoch[686] Step[42] GlobalStep[94024] Training Speed: 427.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:41:19. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:43:03 INFO loss_tracker.py:84 | Epoch[686/NA] Step[49] GlobalStep[94031/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0126] total_loss[0.0138] Rank[0/16] 06/24/2025 23:43:10 INFO stats.py:314 | Epoch[686] Step[67] GlobalStep[94049] Training Speed: 427.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:41:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:43:13 INFO loss_tracker.py:84 | Epoch[686/NA] Step[74] GlobalStep[94056/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:43:21 INFO stats.py:314 | Epoch[686] Step[92] GlobalStep[94074] Training Speed: 428.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:40:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:43:24 INFO loss_tracker.py:84 | Epoch[686/NA] Step[99] GlobalStep[94081/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:43:31 INFO stats.py:314 | Epoch[686] Step[117] GlobalStep[94099] Training Speed: 431.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:40:48. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:43:34 INFO loss_tracker.py:84 | Epoch[686/NA] Step[124] GlobalStep[94106/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:43:38 INFO stats.py:394 | Epoch[686] completed. Training Speed: 311.31 samples/sec across all devices. Epoch Time: 56.33 sec. Average Epoch Time: 56.33 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:40:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:43:41 INFO stats.py:314 | Epoch[687] Step[5] GlobalStep[94124] Training Speed: 445.56 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:40:38. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:43:49 INFO loss_tracker.py:84 | Epoch[687/NA] Step[24] GlobalStep[94143/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0126] total_loss[0.0144] Rank[0/16] 06/24/2025 23:43:52 INFO stats.py:314 | Epoch[687] Step[30] GlobalStep[94149] Training Speed: 406.55 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:40:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:43:59 INFO loss_tracker.py:84 | Epoch[687/NA] Step[49] GlobalStep[94168/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:44:02 INFO stats.py:314 | Epoch[687] Step[55] GlobalStep[94174] Training Speed: 437.29 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:40:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:44:09 INFO loss_tracker.py:84 | Epoch[687/NA] Step[74] GlobalStep[94193/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:44:12 INFO stats.py:314 | Epoch[687] Step[80] GlobalStep[94199] Training Speed: 432.62 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:40:06. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:44:20 INFO loss_tracker.py:84 | Epoch[687/NA] Step[99] GlobalStep[94218/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:44:22 INFO stats.py:314 | Epoch[687] Step[105] GlobalStep[94224] Training Speed: 426.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:39:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:44:30 INFO loss_tracker.py:84 | Epoch[687/NA] Step[124] GlobalStep[94243/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0126] total_loss[0.0139] Rank[0/16] 06/24/2025 23:44:32 INFO stats.py:314 | Epoch[687] Step[130] GlobalStep[94249] Training Speed: 435.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:39:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:44:34 INFO stats.py:394 | Epoch[687] completed. Training Speed: 313.20 samples/sec across all devices. Epoch Time: 55.99 sec. Average Epoch Time: 55.99 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:39:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:44:43 INFO stats.py:314 | Epoch[688] Step[18] GlobalStep[94274] Training Speed: 424.77 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:39:35. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:44:45 INFO loss_tracker.py:84 | Epoch[688/NA] Step[24] GlobalStep[94280/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:44:53 INFO stats.py:314 | Epoch[688] Step[43] GlobalStep[94299] Training Speed: 435.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:39:25. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:44:55 INFO loss_tracker.py:84 | Epoch[688/NA] Step[49] GlobalStep[94305/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:45:02 INFO stats.py:314 | Epoch[688] Step[68] GlobalStep[94324] Training Speed: 404.39 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:39:14. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:45:05 INFO loss_tracker.py:84 | Epoch[688/NA] Step[74] GlobalStep[94330/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:45:13 INFO stats.py:314 | Epoch[688] Step[93] GlobalStep[94349] Training Speed: 431.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:39:04. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:45:16 INFO loss_tracker.py:84 | Epoch[688/NA] Step[99] GlobalStep[94355/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:45:23 INFO stats.py:314 | Epoch[688] Step[118] GlobalStep[94374] Training Speed: 399.67 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:38:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:45:26 INFO loss_tracker.py:84 | Epoch[688/NA] Step[124] GlobalStep[94380/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:45:30 INFO stats.py:394 | Epoch[688] completed. Training Speed: 314.10 samples/sec across all devices. Epoch Time: 55.83 sec. Average Epoch Time: 55.83 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:38:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:45:34 INFO stats.py:314 | Epoch[689] Step[6] GlobalStep[94399] Training Speed: 395.38 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:38:43. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:45:42 INFO loss_tracker.py:84 | Epoch[689/NA] Step[24] GlobalStep[94417/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:45:44 INFO stats.py:314 | Epoch[689] Step[31] GlobalStep[94424] Training Speed: 430.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:38:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:45:52 INFO loss_tracker.py:84 | Epoch[689/NA] Step[49] GlobalStep[94442/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:45:55 INFO stats.py:314 | Epoch[689] Step[56] GlobalStep[94449] Training Speed: 431.96 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:38:23. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:46:02 INFO loss_tracker.py:84 | Epoch[689/NA] Step[74] GlobalStep[94467/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0126] total_loss[0.0135] Rank[0/16] 06/24/2025 23:46:05 INFO stats.py:314 | Epoch[689] Step[81] GlobalStep[94474] Training Speed: 427.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:38:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:46:13 INFO loss_tracker.py:84 | Epoch[689/NA] Step[99] GlobalStep[94492/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:46:15 INFO stats.py:314 | Epoch[689] Step[106] GlobalStep[94499] Training Speed: 433.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:38:02. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:46:23 INFO loss_tracker.py:84 | Epoch[689/NA] Step[124] GlobalStep[94517/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:46:25 INFO stats.py:314 | Epoch[689] Step[131] GlobalStep[94524] Training Speed: 439.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:37:51. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:46:27 INFO stats.py:394 | Epoch[689] completed. Training Speed: 306.58 samples/sec across all devices. Epoch Time: 57.20 sec. Average Epoch Time: 57.20 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:37:49. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:46:37 INFO stats.py:314 | Epoch[690] Step[19] GlobalStep[94549] Training Speed: 423.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:37:41. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:46:39 INFO loss_tracker.py:84 | Epoch[690/NA] Step[24] GlobalStep[94554/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:46:47 INFO stats.py:314 | Epoch[690] Step[44] GlobalStep[94574] Training Speed: 429.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:37:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:46:50 INFO loss_tracker.py:84 | Epoch[690/NA] Step[49] GlobalStep[94579/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:46:58 INFO stats.py:314 | Epoch[690] Step[69] GlobalStep[94599] Training Speed: 439.41 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:37:20. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:47:00 INFO loss_tracker.py:84 | Epoch[690/NA] Step[74] GlobalStep[94604/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0126] total_loss[0.0139] Rank[0/16] 06/24/2025 23:47:08 INFO stats.py:314 | Epoch[690] Step[94] GlobalStep[94624] Training Speed: 405.45 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:37:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:47:10 INFO loss_tracker.py:84 | Epoch[690/NA] Step[99] GlobalStep[94629/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:47:19 INFO stats.py:314 | Epoch[690] Step[119] GlobalStep[94649] Training Speed: 432.05 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:37:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:47:20 INFO loss_tracker.py:84 | Epoch[690/NA] Step[124] GlobalStep[94654/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:47:25 INFO stats.py:394 | Epoch[690] completed. Training Speed: 305.78 samples/sec across all devices. Epoch Time: 57.35 sec. Average Epoch Time: 57.35 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:36:53. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:47:29 INFO stats.py:314 | Epoch[691] Step[7] GlobalStep[94674] Training Speed: 431.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:36:49. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:47:36 INFO loss_tracker.py:84 | Epoch[691/NA] Step[24] GlobalStep[94691/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:47:39 INFO stats.py:314 | Epoch[691] Step[32] GlobalStep[94699] Training Speed: 422.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:36:39. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:47:47 INFO loss_tracker.py:84 | Epoch[691/NA] Step[49] GlobalStep[94716/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0022] loss_depth[0.0126] total_loss[0.0147] Rank[0/16] 06/24/2025 23:47:50 INFO stats.py:314 | Epoch[691] Step[57] GlobalStep[94724] Training Speed: 409.71 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:36:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:47:57 INFO loss_tracker.py:84 | Epoch[691/NA] Step[74] GlobalStep[94741/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:48:00 INFO stats.py:314 | Epoch[691] Step[82] GlobalStep[94749] Training Speed: 425.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:36:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:48:07 INFO loss_tracker.py:84 | Epoch[691/NA] Step[99] GlobalStep[94766/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:48:10 INFO stats.py:314 | Epoch[691] Step[107] GlobalStep[94774] Training Speed: 432.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:36:08. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:48:17 INFO loss_tracker.py:84 | Epoch[691/NA] Step[124] GlobalStep[94791/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0021] loss_depth[0.0125] total_loss[0.0146] Rank[0/16] 06/24/2025 23:48:20 INFO stats.py:314 | Epoch[691] Step[132] GlobalStep[94799] Training Speed: 453.27 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:35:57. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:48:21 INFO stats.py:394 | Epoch[691] completed. Training Speed: 308.08 samples/sec across all devices. Epoch Time: 56.92 sec. Average Epoch Time: 56.92 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:35:56. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:48:31 INFO stats.py:314 | Epoch[692] Step[20] GlobalStep[94824] Training Speed: 428.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:35:47. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:48:33 INFO loss_tracker.py:84 | Epoch[692/NA] Step[24] GlobalStep[94828/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:48:41 INFO stats.py:314 | Epoch[692] Step[45] GlobalStep[94849] Training Speed: 401.62 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:35:37. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:48:43 INFO loss_tracker.py:84 | Epoch[692/NA] Step[49] GlobalStep[94853/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:48:52 INFO stats.py:314 | Epoch[692] Step[70] GlobalStep[94874] Training Speed: 426.11 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:35:26. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:48:53 INFO loss_tracker.py:84 | Epoch[692/NA] Step[74] GlobalStep[94878/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:49:02 INFO stats.py:314 | Epoch[692] Step[95] GlobalStep[94899] Training Speed: 424.93 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:35:16. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:49:04 INFO loss_tracker.py:84 | Epoch[692/NA] Step[99] GlobalStep[94903/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0022] loss_depth[0.0125] total_loss[0.0147] Rank[0/16] 06/24/2025 23:49:13 INFO stats.py:314 | Epoch[692] Step[120] GlobalStep[94924] Training Speed: 445.17 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:35:06. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:49:15 INFO loss_tracker.py:84 | Epoch[692/NA] Step[124] GlobalStep[94928/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:49:19 INFO stats.py:394 | Epoch[692] completed. Training Speed: 305.56 samples/sec across all devices. Epoch Time: 57.39 sec. Average Epoch Time: 57.39 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:34:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:49:23 INFO stats.py:314 | Epoch[693] Step[8] GlobalStep[94949] Training Speed: 438.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:34:55. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:49:30 INFO loss_tracker.py:84 | Epoch[693/NA] Step[24] GlobalStep[94965/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:49:34 INFO stats.py:314 | Epoch[693] Step[33] GlobalStep[94974] Training Speed: 429.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:34:45. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:49:40 INFO loss_tracker.py:84 | Epoch[693/NA] Step[49] GlobalStep[94990/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:49:44 INFO stats.py:314 | Epoch[693] Step[58] GlobalStep[94999] Training Speed: 432.78 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:34:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:49:50 INFO loss_tracker.py:84 | Epoch[693/NA] Step[74] GlobalStep[95015/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:49:54 INFO stats.py:314 | Epoch[693] Step[83] GlobalStep[95024] Training Speed: 415.55 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:34:24. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:50:01 INFO loss_tracker.py:84 | Epoch[693/NA] Step[99] GlobalStep[95040/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:50:05 INFO stats.py:314 | Epoch[693] Step[108] GlobalStep[95049] Training Speed: 429.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:34:14. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:50:11 INFO loss_tracker.py:84 | Epoch[693/NA] Step[124] GlobalStep[95065/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:50:14 INFO stats.py:314 | Epoch[693] Step[133] GlobalStep[95074] Training Speed: 440.79 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:34:03. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:50:16 INFO stats.py:394 | Epoch[693] completed. Training Speed: 308.25 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:34:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:50:26 INFO stats.py:314 | Epoch[694] Step[21] GlobalStep[95099] Training Speed: 426.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:33:53. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:50:27 INFO loss_tracker.py:84 | Epoch[694/NA] Step[24] GlobalStep[95102/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:50:36 INFO stats.py:314 | Epoch[694] Step[46] GlobalStep[95124] Training Speed: 437.44 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:33:43. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:50:38 INFO loss_tracker.py:84 | Epoch[694/NA] Step[49] GlobalStep[95127/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:50:46 INFO stats.py:314 | Epoch[694] Step[71] GlobalStep[95149] Training Speed: 438.24 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:33:32. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:50:47 INFO loss_tracker.py:84 | Epoch[694/NA] Step[74] GlobalStep[95152/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:50:56 INFO stats.py:314 | Epoch[694] Step[96] GlobalStep[95174] Training Speed: 436.18 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:33:22. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:50:58 INFO loss_tracker.py:84 | Epoch[694/NA] Step[99] GlobalStep[95177/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:51:06 INFO stats.py:314 | Epoch[694] Step[121] GlobalStep[95199] Training Speed: 438.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:33:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:51:08 INFO loss_tracker.py:84 | Epoch[694/NA] Step[124] GlobalStep[95202/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:51:12 INFO stats.py:394 | Epoch[694] completed. Training Speed: 310.85 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:33:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:51:17 INFO stats.py:314 | Epoch[695] Step[9] GlobalStep[95224] Training Speed: 391.34 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 0:33:01. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:51:23 INFO loss_tracker.py:84 | Epoch[695/NA] Step[24] GlobalStep[95239/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:51:27 INFO stats.py:314 | Epoch[695] Step[34] GlobalStep[95249] Training Speed: 432.42 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:32:51. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:51:34 INFO loss_tracker.py:84 | Epoch[695/NA] Step[49] GlobalStep[95264/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:51:38 INFO stats.py:314 | Epoch[695] Step[59] GlobalStep[95274] Training Speed: 424.19 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:32:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:51:44 INFO loss_tracker.py:84 | Epoch[695/NA] Step[74] GlobalStep[95289/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/24/2025 23:51:48 INFO stats.py:314 | Epoch[695] Step[84] GlobalStep[95299] Training Speed: 435.93 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:32:30. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:51:54 INFO loss_tracker.py:84 | Epoch[695/NA] Step[99] GlobalStep[95314/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:51:58 INFO stats.py:314 | Epoch[695] Step[109] GlobalStep[95324] Training Speed: 427.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:32:20. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:52:04 INFO loss_tracker.py:84 | Epoch[695/NA] Step[124] GlobalStep[95339/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:52:08 INFO stats.py:314 | Epoch[695] Step[134] GlobalStep[95349] Training Speed: 436.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:32:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:52:09 INFO stats.py:394 | Epoch[695] completed. Training Speed: 310.86 samples/sec across all devices. Epoch Time: 56.41 sec. Average Epoch Time: 56.41 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:32:08. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:52:19 INFO stats.py:314 | Epoch[696] Step[22] GlobalStep[95374] Training Speed: 432.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:31:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:52:20 INFO loss_tracker.py:84 | Epoch[696/NA] Step[24] GlobalStep[95376/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0126] total_loss[0.0140] Rank[0/16] 06/24/2025 23:52:30 INFO stats.py:314 | Epoch[696] Step[47] GlobalStep[95399] Training Speed: 416.80 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:31:48. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:52:30 INFO loss_tracker.py:84 | Epoch[696/NA] Step[49] GlobalStep[95401/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:52:40 INFO stats.py:314 | Epoch[696] Step[72] GlobalStep[95424] Training Speed: 414.63 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:31:38. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:52:41 INFO loss_tracker.py:84 | Epoch[696/NA] Step[74] GlobalStep[95426/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:52:50 INFO stats.py:314 | Epoch[696] Step[97] GlobalStep[95449] Training Speed: 437.48 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:31:28. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:52:51 INFO loss_tracker.py:84 | Epoch[696/NA] Step[99] GlobalStep[95451/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0126] total_loss[0.0140] Rank[0/16] 06/24/2025 23:53:01 INFO stats.py:314 | Epoch[696] Step[122] GlobalStep[95474] Training Speed: 450.10 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:31:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:53:01 INFO loss_tracker.py:84 | Epoch[696/NA] Step[124] GlobalStep[95476/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0022] loss_depth[0.0125] total_loss[0.0147] Rank[0/16] 06/24/2025 23:53:06 INFO stats.py:394 | Epoch[696] completed. Training Speed: 307.42 samples/sec across all devices. Epoch Time: 57.04 sec. Average Epoch Time: 57.04 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:31:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:53:11 INFO stats.py:314 | Epoch[697] Step[10] GlobalStep[95499] Training Speed: 427.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:31:07. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:53:17 INFO loss_tracker.py:84 | Epoch[697/NA] Step[24] GlobalStep[95513/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:53:22 INFO stats.py:314 | Epoch[697] Step[35] GlobalStep[95524] Training Speed: 423.38 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:30:57. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:53:27 INFO loss_tracker.py:84 | Epoch[697/NA] Step[49] GlobalStep[95538/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0023] loss_depth[0.0125] total_loss[0.0149] Rank[0/16] 06/24/2025 23:53:32 INFO stats.py:314 | Epoch[697] Step[60] GlobalStep[95549] Training Speed: 423.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:30:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:53:37 INFO loss_tracker.py:84 | Epoch[697/NA] Step[74] GlobalStep[95563/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/24/2025 23:53:42 INFO stats.py:314 | Epoch[697] Step[85] GlobalStep[95574] Training Speed: 410.66 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:30:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:53:48 INFO loss_tracker.py:84 | Epoch[697/NA] Step[99] GlobalStep[95588/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:53:52 INFO stats.py:314 | Epoch[697] Step[110] GlobalStep[95599] Training Speed: 436.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:30:25. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:53:57 INFO loss_tracker.py:84 | Epoch[697/NA] Step[124] GlobalStep[95613/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:54:01 INFO stats.py:314 | Epoch[697] Step[135] GlobalStep[95624] Training Speed: 452.70 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:30:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:54:02 INFO stats.py:394 | Epoch[697] completed. Training Speed: 312.88 samples/sec across all devices. Epoch Time: 56.05 sec. Average Epoch Time: 56.05 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:30:15. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:54:12 INFO stats.py:314 | Epoch[698] Step[23] GlobalStep[95649] Training Speed: 436.35 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:30:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:54:13 INFO loss_tracker.py:84 | Epoch[698/NA] Step[24] GlobalStep[95650/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0023] loss_depth[0.0125] total_loss[0.0148] Rank[0/16] 06/24/2025 23:54:23 INFO stats.py:314 | Epoch[698] Step[48] GlobalStep[95674] Training Speed: 397.99 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:29:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:54:23 INFO loss_tracker.py:84 | Epoch[698/NA] Step[49] GlobalStep[95675/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:54:33 INFO stats.py:314 | Epoch[698] Step[73] GlobalStep[95699] Training Speed: 430.47 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:29:44. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:54:34 INFO loss_tracker.py:84 | Epoch[698/NA] Step[74] GlobalStep[95700/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:54:44 INFO stats.py:314 | Epoch[698] Step[98] GlobalStep[95724] Training Speed: 410.26 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:29:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:54:44 INFO loss_tracker.py:84 | Epoch[698/NA] Step[99] GlobalStep[95725/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/24/2025 23:54:54 INFO stats.py:314 | Epoch[698] Step[123] GlobalStep[95749] Training Speed: 449.36 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:29:23. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:54:54 INFO loss_tracker.py:84 | Epoch[698/NA] Step[124] GlobalStep[95750/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0024] loss_depth[0.0125] total_loss[0.0149] Rank[0/16] 06/24/2025 23:54:59 INFO stats.py:394 | Epoch[698] completed. Training Speed: 305.99 samples/sec across all devices. Epoch Time: 57.31 sec. Average Epoch Time: 57.31 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:29:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:55:04 INFO stats.py:314 | Epoch[699] Step[11] GlobalStep[95774] Training Speed: 435.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:29:13. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:55:10 INFO loss_tracker.py:84 | Epoch[699/NA] Step[24] GlobalStep[95787/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:55:15 INFO stats.py:314 | Epoch[699] Step[36] GlobalStep[95799] Training Speed: 429.95 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:29:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:55:20 INFO loss_tracker.py:84 | Epoch[699/NA] Step[49] GlobalStep[95812/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/24/2025 23:55:25 INFO stats.py:314 | Epoch[699] Step[61] GlobalStep[95824] Training Speed: 440.90 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:28:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:55:30 INFO loss_tracker.py:84 | Epoch[699/NA] Step[74] GlobalStep[95837/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:55:35 INFO stats.py:314 | Epoch[699] Step[86] GlobalStep[95849] Training Speed: 430.91 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:28:42. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:55:40 INFO loss_tracker.py:84 | Epoch[699/NA] Step[99] GlobalStep[95862/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:55:45 INFO stats.py:314 | Epoch[699] Step[111] GlobalStep[95874] Training Speed: 428.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:28:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:55:50 INFO loss_tracker.py:84 | Epoch[699/NA] Step[124] GlobalStep[95887/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0126] total_loss[0.0145] Rank[0/16] 06/24/2025 23:55:55 INFO stats.py:314 | Epoch[699] Step[136] GlobalStep[95899] Training Speed: 443.30 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:28:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:55:55 INFO stats.py:394 | Epoch[699] completed. Training Speed: 314.41 samples/sec across all devices. Epoch Time: 55.77 sec. Average Epoch Time: 55.77 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:28:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:56:06 INFO stats.py:314 | Epoch[700] Step[24] GlobalStep[95924] Training Speed: 440.50 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:28:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:56:06 INFO loss_tracker.py:84 | Epoch[700/NA] Step[24] GlobalStep[95924/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:56:16 INFO stats.py:314 | Epoch[700] Step[49] GlobalStep[95949] Training Speed: 428.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:28:00. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:56:16 INFO loss_tracker.py:84 | Epoch[700/NA] Step[49] GlobalStep[95949/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:56:26 INFO stats.py:314 | Epoch[700] Step[74] GlobalStep[95974] Training Speed: 441.16 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:27:50. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:56:26 INFO loss_tracker.py:84 | Epoch[700/NA] Step[74] GlobalStep[95974/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:56:37 INFO stats.py:314 | Epoch[700] Step[99] GlobalStep[95999] Training Speed: 426.56 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:27:39. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:56:37 INFO loss_tracker.py:84 | Epoch[700/NA] Step[99] GlobalStep[95999/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:56:37 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. 
Rank[0/16] 06/24/2025 23:56:38 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_23 Rank[3/16] 06/24/2025 23:56:38 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[15/16] 06/24/2025 23:56:38 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[11/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[5/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[4/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[12/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[2/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[9/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[8/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[6/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[10/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[7/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[1/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[13/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23
Rank[14/16] 06/24/2025 23:56:39 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[0/16] 06/24/2025 23:56:39 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_23/model.safetensors Rank[0/16] 06/24/2025 23:56:40 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_23/optimizer.bin Rank[0/16] 06/24/2025 23:56:40 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_23/scheduler.bin Rank[0/16] 06/24/2025 23:56:40 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_23/sampler.bin Rank[0/16] 06/24/2025 23:56:40 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_23/random_states_0.pkl Rank[0/16] 06/24/2025 23:56:40 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_23/custom_checkpoint_0.pkl Rank[0/16] 06/24/2025 23:56:41 INFO checkpoint.py:110 | Save checkpoint at the end of step 95999 to /job_data/checkpoints/checkpoint_23 Rank[0/16] 06/24/2025 23:56:50 INFO stats.py:314 | Epoch[700] Step[124] GlobalStep[96024] Training Speed: 444.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:27:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:56:51 INFO loss_tracker.py:84 | Epoch[700/NA] Step[124] GlobalStep[96024/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/24/2025 23:56:55 INFO stats.py:394 | Epoch[700] completed. Training Speed: 290.89 samples/sec across all devices. Epoch Time: 60.28 sec. Average Epoch Time: 60.28 sec. Average Step Time: 0.44 sec. Estimated Remaining Time: 0:27:24. Learning Rate: 1.00000e-05.
Rank[0/16] 06/24/2025 23:57:02 INFO stats.py:314 | Epoch[701] Step[12] GlobalStep[96049] Training Speed: 399.27 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:27:19. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:57:06 INFO loss_tracker.py:84 | Epoch[701/NA] Step[24] GlobalStep[96061/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:57:12 INFO stats.py:314 | Epoch[701] Step[37] GlobalStep[96074] Training Speed: 439.28 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:27:08. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:57:17 INFO loss_tracker.py:84 | Epoch[701/NA] Step[49] GlobalStep[96086/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/24/2025 23:57:22 INFO stats.py:314 | Epoch[701] Step[62] GlobalStep[96099] Training Speed: 414.88 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:26:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:57:27 INFO loss_tracker.py:84 | Epoch[701/NA] Step[74] GlobalStep[96111/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:57:32 INFO stats.py:314 | Epoch[701] Step[87] GlobalStep[96124] Training Speed: 431.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:26:48. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:57:37 INFO loss_tracker.py:84 | Epoch[701/NA] Step[99] GlobalStep[96136/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/24/2025 23:57:43 INFO stats.py:314 | Epoch[701] Step[112] GlobalStep[96149] Training Speed: 432.74 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:26:37. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:57:47 INFO loss_tracker.py:84 | Epoch[701/NA] Step[124] GlobalStep[96161/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/24/2025 23:57:52 INFO stats.py:394 | Epoch[701] completed. Training Speed: 310.04 samples/sec across all devices. Epoch Time: 56.56 sec. Average Epoch Time: 56.56 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:26:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:57:52 INFO stats.py:314 | Epoch[702] Step[0] GlobalStep[96174] Training Speed: 356.23 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 0:26:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:58:03 INFO loss_tracker.py:84 | Epoch[702/NA] Step[24] GlobalStep[96198/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:58:04 INFO stats.py:314 | Epoch[702] Step[25] GlobalStep[96199] Training Speed: 425.27 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:26:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:58:13 INFO loss_tracker.py:84 | Epoch[702/NA] Step[49] GlobalStep[96223/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:58:14 INFO stats.py:314 | Epoch[702] Step[50] GlobalStep[96224] Training Speed: 428.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:26:06. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:58:24 INFO loss_tracker.py:84 | Epoch[702/NA] Step[74] GlobalStep[96248/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0126] total_loss[0.0143] Rank[0/16] 06/24/2025 23:58:24 INFO stats.py:314 | Epoch[702] Step[75] GlobalStep[96249] Training Speed: 428.68 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:25:56. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:58:33 INFO loss_tracker.py:84 | Epoch[702/NA] Step[99] GlobalStep[96273/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:58:34 INFO stats.py:314 | Epoch[702] Step[100] GlobalStep[96274] Training Speed: 352.08 samples/sec across all devices. Average Step Time: 0.36 sec. Estimated Remaining Time: 0:25:45. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:58:44 INFO loss_tracker.py:84 | Epoch[702/NA] Step[124] GlobalStep[96298/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0126] total_loss[0.0142] Rank[0/16] 06/24/2025 23:58:44 INFO stats.py:314 | Epoch[702] Step[125] GlobalStep[96299] Training Speed: 433.69 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:25:35. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:58:48 INFO stats.py:394 | Epoch[702] completed. Training Speed: 309.35 samples/sec across all devices. Epoch Time: 56.69 sec. Average Epoch Time: 56.69 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:25:30. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:58:55 INFO stats.py:314 | Epoch[703] Step[13] GlobalStep[96324] Training Speed: 439.68 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:25:25. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:59:00 INFO loss_tracker.py:84 | Epoch[703/NA] Step[24] GlobalStep[96335/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/24/2025 23:59:06 INFO stats.py:314 | Epoch[703] Step[38] GlobalStep[96349] Training Speed: 403.99 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:25:14. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:59:10 INFO loss_tracker.py:84 | Epoch[703/NA] Step[49] GlobalStep[96360/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/24/2025 23:59:16 INFO stats.py:314 | Epoch[703] Step[63] GlobalStep[96374] Training Speed: 430.92 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:25:04. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:59:21 INFO loss_tracker.py:84 | Epoch[703/NA] Step[74] GlobalStep[96385/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0126] total_loss[0.0144] Rank[0/16] 06/24/2025 23:59:27 INFO stats.py:314 | Epoch[703] Step[88] GlobalStep[96399] Training Speed: 438.11 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:24:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:59:31 INFO loss_tracker.py:84 | Epoch[703/NA] Step[99] GlobalStep[96410/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/24/2025 23:59:37 INFO stats.py:314 | Epoch[703] Step[113] GlobalStep[96424] Training Speed: 440.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:24:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:59:41 INFO loss_tracker.py:84 | Epoch[703/NA] Step[124] GlobalStep[96435/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/24/2025 23:59:46 INFO stats.py:394 | Epoch[703] completed. Training Speed: 305.30 samples/sec across all devices. Epoch Time: 57.44 sec. Average Epoch Time: 57.44 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:24:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/24/2025 23:59:47 INFO stats.py:314 | Epoch[704] Step[1] GlobalStep[96449] Training Speed: 438.53 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:24:33. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/24/2025 23:59:56 INFO loss_tracker.py:84 | Epoch[704/NA] Step[24] GlobalStep[96472/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/24/2025 23:59:57 INFO stats.py:314 | Epoch[704] Step[26] GlobalStep[96474] Training Speed: 408.88 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:24:22. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:00:07 INFO loss_tracker.py:84 | Epoch[704/NA] Step[49] GlobalStep[96497/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:00:08 INFO stats.py:314 | Epoch[704] Step[51] GlobalStep[96499] Training Speed: 424.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:24:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:00:18 INFO loss_tracker.py:84 | Epoch[704/NA] Step[74] GlobalStep[96522/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:00:18 INFO stats.py:314 | Epoch[704] Step[76] GlobalStep[96524] Training Speed: 427.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:24:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:00:28 INFO loss_tracker.py:84 | Epoch[704/NA] Step[99] GlobalStep[96547/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/25/2025 00:00:29 INFO stats.py:314 | Epoch[704] Step[101] GlobalStep[96549] Training Speed: 431.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:23:51. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:00:38 INFO loss_tracker.py:84 | Epoch[704/NA] Step[124] GlobalStep[96572/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0126] total_loss[0.0139] Rank[0/16] 06/25/2025 00:00:39 INFO stats.py:314 | Epoch[704] Step[126] GlobalStep[96574] Training Speed: 454.64 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:23:41. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:00:43 INFO stats.py:394 | Epoch[704] completed. Training Speed: 305.71 samples/sec across all devices. Epoch Time: 57.36 sec. Average Epoch Time: 57.36 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:23:37. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:00:50 INFO stats.py:314 | Epoch[705] Step[14] GlobalStep[96599] Training Speed: 431.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:23:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:00:54 INFO loss_tracker.py:84 | Epoch[705/NA] Step[24] GlobalStep[96609/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:01:00 INFO stats.py:314 | Epoch[705] Step[39] GlobalStep[96624] Training Speed: 437.78 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:23:20. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:01:04 INFO loss_tracker.py:84 | Epoch[705/NA] Step[49] GlobalStep[96634/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:01:10 INFO stats.py:314 | Epoch[705] Step[64] GlobalStep[96649] Training Speed: 430.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:23:10. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:01:14 INFO loss_tracker.py:84 | Epoch[705/NA] Step[74] GlobalStep[96659/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/25/2025 00:01:21 INFO stats.py:314 | Epoch[705] Step[89] GlobalStep[96674] Training Speed: 429.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:22:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:01:25 INFO loss_tracker.py:84 | Epoch[705/NA] Step[99] GlobalStep[96684/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/25/2025 00:01:30 INFO stats.py:314 | Epoch[705] Step[114] GlobalStep[96699] Training Speed: 428.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:22:49. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:01:35 INFO loss_tracker.py:84 | Epoch[705/NA] Step[124] GlobalStep[96709/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:01:39 INFO stats.py:394 | Epoch[705] completed. Training Speed: 314.00 samples/sec across all devices. Epoch Time: 55.85 sec. Average Epoch Time: 55.85 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:22:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:01:41 INFO stats.py:314 | Epoch[706] Step[2] GlobalStep[96724] Training Speed: 438.67 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:22:39. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:01:50 INFO loss_tracker.py:84 | Epoch[706/NA] Step[24] GlobalStep[96746/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:01:51 INFO stats.py:314 | Epoch[706] Step[27] GlobalStep[96749] Training Speed: 437.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:22:28. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:02:00 INFO loss_tracker.py:84 | Epoch[706/NA] Step[49] GlobalStep[96771/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:02:02 INFO stats.py:314 | Epoch[706] Step[52] GlobalStep[96774] Training Speed: 428.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:22:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:02:10 INFO loss_tracker.py:84 | Epoch[706/NA] Step[74] GlobalStep[96796/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:02:11 INFO stats.py:314 | Epoch[706] Step[77] GlobalStep[96799] Training Speed: 431.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:22:07. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:02:21 INFO loss_tracker.py:84 | Epoch[706/NA] Step[99] GlobalStep[96821/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/25/2025 00:02:22 INFO stats.py:314 | Epoch[706] Step[102] GlobalStep[96824] Training Speed: 425.41 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:21:57. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:02:31 INFO loss_tracker.py:84 | Epoch[706/NA] Step[124] GlobalStep[96846/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:02:32 INFO stats.py:314 | Epoch[706] Step[127] GlobalStep[96849] Training Speed: 435.09 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:21:47. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:02:35 INFO stats.py:394 | Epoch[706] completed. Training Speed: 311.68 samples/sec across all devices. Epoch Time: 56.26 sec. Average Epoch Time: 56.26 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:21:43. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:02:43 INFO stats.py:314 | Epoch[707] Step[15] GlobalStep[96874] Training Speed: 428.18 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:21:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:02:47 INFO loss_tracker.py:84 | Epoch[707/NA] Step[24] GlobalStep[96883/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:02:53 INFO stats.py:314 | Epoch[707] Step[40] GlobalStep[96899] Training Speed: 438.26 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:21:26. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:02:57 INFO loss_tracker.py:84 | Epoch[707/NA] Step[49] GlobalStep[96908/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:03:03 INFO stats.py:314 | Epoch[707] Step[65] GlobalStep[96924] Training Speed: 438.46 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:21:16. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:03:07 INFO loss_tracker.py:84 | Epoch[707/NA] Step[74] GlobalStep[96933/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:03:14 INFO stats.py:314 | Epoch[707] Step[90] GlobalStep[96949] Training Speed: 436.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:21:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:03:18 INFO loss_tracker.py:84 | Epoch[707/NA] Step[99] GlobalStep[96958/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0020] loss_depth[0.0125] total_loss[0.0145] Rank[0/16] 06/25/2025 00:03:24 INFO stats.py:314 | Epoch[707] Step[115] GlobalStep[96974] Training Speed: 426.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:20:55. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:03:27 INFO loss_tracker.py:84 | Epoch[707/NA] Step[124] GlobalStep[96983/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:03:32 INFO stats.py:394 | Epoch[707] completed. Training Speed: 308.84 samples/sec across all devices. Epoch Time: 56.78 sec. Average Epoch Time: 56.78 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:20:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:03:35 INFO stats.py:314 | Epoch[708] Step[3] GlobalStep[96999] Training Speed: 434.25 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:20:44. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:03:44 INFO loss_tracker.py:84 | Epoch[708/NA] Step[24] GlobalStep[97020/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:03:45 INFO stats.py:314 | Epoch[708] Step[28] GlobalStep[97024] Training Speed: 424.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:20:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:03:54 INFO loss_tracker.py:84 | Epoch[708/NA] Step[49] GlobalStep[97045/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:03:55 INFO stats.py:314 | Epoch[708] Step[53] GlobalStep[97049] Training Speed: 431.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:20:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:04:04 INFO loss_tracker.py:84 | Epoch[708/NA] Step[74] GlobalStep[97070/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0025] loss_depth[0.0125] total_loss[0.0150] Rank[0/16] 06/25/2025 00:04:05 INFO stats.py:314 | Epoch[708] Step[78] GlobalStep[97074] Training Speed: 433.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:20:13. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:04:14 INFO loss_tracker.py:84 | Epoch[708/NA] Step[99] GlobalStep[97095/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/25/2025 00:04:16 INFO stats.py:314 | Epoch[708] Step[103] GlobalStep[97099] Training Speed: 249.17 samples/sec across all devices. Average Step Time: 0.51 sec. Estimated Remaining Time: 0:20:03. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:04:24 INFO loss_tracker.py:84 | Epoch[708/NA] Step[124] GlobalStep[97120/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:04:26 INFO stats.py:314 | Epoch[708] Step[128] GlobalStep[97124] Training Speed: 451.93 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:19:53. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:04:28 INFO stats.py:394 | Epoch[708] completed. Training Speed: 311.13 samples/sec across all devices. Epoch Time: 56.36 sec. Average Epoch Time: 56.36 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:19:49. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:04:36 INFO stats.py:314 | Epoch[709] Step[16] GlobalStep[97149] Training Speed: 433.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:19:42. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:04:40 INFO loss_tracker.py:84 | Epoch[709/NA] Step[24] GlobalStep[97157/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:04:47 INFO stats.py:314 | Epoch[709] Step[41] GlobalStep[97174] Training Speed: 433.12 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:19:32. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:04:50 INFO loss_tracker.py:84 | Epoch[709/NA] Step[49] GlobalStep[97182/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:04:57 INFO stats.py:314 | Epoch[709] Step[66] GlobalStep[97199] Training Speed: 424.25 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:19:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:05:00 INFO loss_tracker.py:84 | Epoch[709/NA] Step[74] GlobalStep[97207/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:05:07 INFO stats.py:314 | Epoch[709] Step[91] GlobalStep[97224] Training Speed: 431.70 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:19:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:05:11 INFO loss_tracker.py:84 | Epoch[709/NA] Step[99] GlobalStep[97232/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:05:18 INFO stats.py:314 | Epoch[709] Step[116] GlobalStep[97249] Training Speed: 431.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:19:01. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:05:21 INFO loss_tracker.py:84 | Epoch[709/NA] Step[124] GlobalStep[97257/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/25/2025 00:05:25 INFO stats.py:394 | Epoch[709] completed. Training Speed: 311.03 samples/sec across all devices. Epoch Time: 56.38 sec. Average Epoch Time: 56.38 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:18:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:05:28 INFO stats.py:314 | Epoch[710] Step[4] GlobalStep[97274] Training Speed: 434.22 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:18:50. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:05:36 INFO loss_tracker.py:84 | Epoch[710/NA] Step[24] GlobalStep[97294/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:05:38 INFO stats.py:314 | Epoch[710] Step[29] GlobalStep[97299] Training Speed: 427.86 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:18:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:05:47 INFO loss_tracker.py:84 | Epoch[710/NA] Step[49] GlobalStep[97319/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:05:49 INFO stats.py:314 | Epoch[710] Step[54] GlobalStep[97324] Training Speed: 423.59 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:18:30. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:05:58 INFO loss_tracker.py:84 | Epoch[710/NA] Step[74] GlobalStep[97344/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:06:00 INFO stats.py:314 | Epoch[710] Step[79] GlobalStep[97349] Training Speed: 425.40 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:18:19. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:06:08 INFO loss_tracker.py:84 | Epoch[710/NA] Step[99] GlobalStep[97369/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:06:10 INFO stats.py:314 | Epoch[710] Step[104] GlobalStep[97374] Training Speed: 425.64 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:18:09. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:06:18 INFO loss_tracker.py:84 | Epoch[710/NA] Step[124] GlobalStep[97394/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/25/2025 00:06:20 INFO stats.py:314 | Epoch[710] Step[129] GlobalStep[97399] Training Speed: 450.73 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:17:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:06:23 INFO stats.py:394 | Epoch[710] completed. Training Speed: 302.14 samples/sec across all devices. Epoch Time: 58.04 sec. Average Epoch Time: 58.04 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:17:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:06:31 INFO stats.py:314 | Epoch[711] Step[17] GlobalStep[97424] Training Speed: 414.06 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:17:48. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:06:34 INFO loss_tracker.py:84 | Epoch[711/NA] Step[24] GlobalStep[97431/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:06:41 INFO stats.py:314 | Epoch[711] Step[42] GlobalStep[97449] Training Speed: 432.84 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:17:38. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:06:44 INFO loss_tracker.py:84 | Epoch[711/NA] Step[49] GlobalStep[97456/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0027] loss_depth[0.0125] total_loss[0.0152] Rank[0/16] 06/25/2025 00:06:52 INFO stats.py:314 | Epoch[711] Step[67] GlobalStep[97474] Training Speed: 428.04 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:17:27. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:06:55 INFO loss_tracker.py:84 | Epoch[711/NA] Step[74] GlobalStep[97481/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/25/2025 00:07:02 INFO stats.py:314 | Epoch[711] Step[92] GlobalStep[97499] Training Speed: 427.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:17:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:07:05 INFO loss_tracker.py:84 | Epoch[711/NA] Step[99] GlobalStep[97506/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:07:13 INFO stats.py:314 | Epoch[711] Step[117] GlobalStep[97524] Training Speed: 427.60 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:17:07. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:07:16 INFO loss_tracker.py:84 | Epoch[711/NA] Step[124] GlobalStep[97531/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:07:20 INFO stats.py:394 | Epoch[711] completed. Training Speed: 303.94 samples/sec across all devices. Epoch Time: 57.70 sec. Average Epoch Time: 57.70 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:16:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:07:24 INFO stats.py:314 | Epoch[712] Step[5] GlobalStep[97549] Training Speed: 432.50 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:16:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:07:32 INFO loss_tracker.py:84 | Epoch[712/NA] Step[24] GlobalStep[97568/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:07:34 INFO stats.py:314 | Epoch[712] Step[30] GlobalStep[97574] Training Speed: 421.03 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:16:46. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:07:42 INFO loss_tracker.py:84 | Epoch[712/NA] Step[49] GlobalStep[97593/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:07:44 INFO stats.py:314 | Epoch[712] Step[55] GlobalStep[97599] Training Speed: 421.63 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:16:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:07:52 INFO loss_tracker.py:84 | Epoch[712/NA] Step[74] GlobalStep[97618/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/25/2025 00:07:55 INFO stats.py:314 | Epoch[712] Step[80] GlobalStep[97624] Training Speed: 433.57 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:16:25. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:08:03 INFO loss_tracker.py:84 | Epoch[712/NA] Step[99] GlobalStep[97643/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:08:05 INFO stats.py:314 | Epoch[712] Step[105] GlobalStep[97649] Training Speed: 431.94 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:16:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:08:13 INFO loss_tracker.py:84 | Epoch[712/NA] Step[124] GlobalStep[97668/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:08:15 INFO stats.py:314 | Epoch[712] Step[130] GlobalStep[97674] Training Speed: 436.00 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:16:04. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:08:17 INFO stats.py:394 | Epoch[712] completed. Training Speed: 307.71 samples/sec across all devices. Epoch Time: 56.99 sec. Average Epoch Time: 56.99 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:16:02. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:08:26 INFO stats.py:314 | Epoch[713] Step[18] GlobalStep[97699] Training Speed: 388.60 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 0:15:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:08:29 INFO loss_tracker.py:84 | Epoch[713/NA] Step[24] GlobalStep[97705/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/25/2025 00:08:37 INFO stats.py:314 | Epoch[713] Step[43] GlobalStep[97724] Training Speed: 433.06 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:15:44. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:08:39 INFO loss_tracker.py:84 | Epoch[713/NA] Step[49] GlobalStep[97730/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/25/2025 00:08:47 INFO stats.py:314 | Epoch[713] Step[68] GlobalStep[97749] Training Speed: 433.45 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:15:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:08:49 INFO loss_tracker.py:84 | Epoch[713/NA] Step[74] GlobalStep[97755/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0020] loss_depth[0.0125] total_loss[0.0145] Rank[0/16] 06/25/2025 00:08:57 INFO stats.py:314 | Epoch[713] Step[93] GlobalStep[97774] Training Speed: 432.75 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:15:23. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:09:00 INFO loss_tracker.py:84 | Epoch[713/NA] Step[99] GlobalStep[97780/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:09:07 INFO stats.py:314 | Epoch[713] Step[118] GlobalStep[97799] Training Speed: 427.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:15:12. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:09:10 INFO loss_tracker.py:84 | Epoch[713/NA] Step[124] GlobalStep[97805/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/25/2025 00:09:14 INFO stats.py:394 | Epoch[713] completed. Training Speed: 310.44 samples/sec across all devices. Epoch Time: 56.49 sec. Average Epoch Time: 56.49 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:15:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:09:18 INFO stats.py:314 | Epoch[714] Step[6] GlobalStep[97824] Training Speed: 422.72 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:15:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:09:25 INFO loss_tracker.py:84 | Epoch[714/NA] Step[24] GlobalStep[97842/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0020] loss_depth[0.0125] total_loss[0.0145] Rank[0/16] 06/25/2025 00:09:28 INFO stats.py:314 | Epoch[714] Step[31] GlobalStep[97849] Training Speed: 431.89 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:14:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:09:35 INFO loss_tracker.py:84 | Epoch[714/NA] Step[49] GlobalStep[97867/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:09:38 INFO stats.py:314 | Epoch[714] Step[56] GlobalStep[97874] Training Speed: 434.91 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:14:41. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:09:46 INFO loss_tracker.py:84 | Epoch[714/NA] Step[74] GlobalStep[97892/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:09:49 INFO stats.py:314 | Epoch[714] Step[81] GlobalStep[97899] Training Speed: 436.10 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:14:31. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:09:56 INFO loss_tracker.py:84 | Epoch[714/NA] Step[99] GlobalStep[97917/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:09:59 INFO stats.py:314 | Epoch[714] Step[106] GlobalStep[97924] Training Speed: 433.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:14:21. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:10:06 INFO loss_tracker.py:84 | Epoch[714/NA] Step[124] GlobalStep[97942/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:10:08 INFO stats.py:314 | Epoch[714] Step[131] GlobalStep[97949] Training Speed: 450.79 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:14:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:10:10 INFO stats.py:394 | Epoch[714] completed. Training Speed: 312.51 samples/sec across all devices. Epoch Time: 56.11 sec. Average Epoch Time: 56.11 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:14:08. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:10:19 INFO stats.py:314 | Epoch[715] Step[19] GlobalStep[97974] Training Speed: 430.81 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:14:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:10:21 INFO loss_tracker.py:84 | Epoch[715/NA] Step[24] GlobalStep[97979/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:10:30 INFO stats.py:314 | Epoch[715] Step[44] GlobalStep[97999] Training Speed: 429.58 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:13:49. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:10:32 INFO loss_tracker.py:84 | Epoch[715/NA] Step[49] GlobalStep[98004/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:10:40 INFO stats.py:314 | Epoch[715] Step[69] GlobalStep[98024] Training Speed: 424.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:13:39. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:10:42 INFO loss_tracker.py:84 | Epoch[715/NA] Step[74] GlobalStep[98029/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:10:51 INFO stats.py:314 | Epoch[715] Step[94] GlobalStep[98049] Training Speed: 425.90 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:13:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:10:53 INFO loss_tracker.py:84 | Epoch[715/NA] Step[99] GlobalStep[98054/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:11:01 INFO stats.py:314 | Epoch[715] Step[119] GlobalStep[98074] Training Speed: 439.45 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:13:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:11:03 INFO loss_tracker.py:84 | Epoch[715/NA] Step[124] GlobalStep[98079/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/25/2025 00:11:07 INFO stats.py:394 | Epoch[715] completed. Training Speed: 308.24 samples/sec across all devices. Epoch Time: 56.89 sec. Average Epoch Time: 56.89 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:13:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:11:11 INFO stats.py:314 | Epoch[716] Step[7] GlobalStep[98099] Training Speed: 439.62 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:13:08. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:11:18 INFO loss_tracker.py:84 | Epoch[716/NA] Step[24] GlobalStep[98116/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/25/2025 00:11:22 INFO stats.py:314 | Epoch[716] Step[32] GlobalStep[98124] Training Speed: 399.53 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:12:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:11:29 INFO loss_tracker.py:84 | Epoch[716/NA] Step[49] GlobalStep[98141/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:11:32 INFO stats.py:314 | Epoch[716] Step[57] GlobalStep[98149] Training Speed: 404.31 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:12:47. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:11:39 INFO loss_tracker.py:84 | Epoch[716/NA] Step[74] GlobalStep[98166/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0126] total_loss[0.0139] Rank[0/16] 06/25/2025 00:11:42 INFO stats.py:314 | Epoch[716] Step[82] GlobalStep[98174] Training Speed: 422.87 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:12:37. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:11:50 INFO loss_tracker.py:84 | Epoch[716/NA] Step[99] GlobalStep[98191/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:11:53 INFO stats.py:314 | Epoch[716] Step[107] GlobalStep[98199] Training Speed: 426.01 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:12:26. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:12:00 INFO loss_tracker.py:84 | Epoch[716/NA] Step[124] GlobalStep[98216/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/25/2025 00:12:02 INFO stats.py:314 | Epoch[716] Step[132] GlobalStep[98224] Training Speed: 450.34 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:12:16. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:12:04 INFO stats.py:394 | Epoch[716] completed. Training Speed: 307.21 samples/sec across all devices. Epoch Time: 57.08 sec. Average Epoch Time: 57.08 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:12:14. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:12:13 INFO stats.py:314 | Epoch[717] Step[20] GlobalStep[98249] Training Speed: 428.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:12:06. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:12:15 INFO loss_tracker.py:84 | Epoch[717/NA] Step[24] GlobalStep[98253/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0126] total_loss[0.0140] Rank[0/16] 06/25/2025 00:12:24 INFO stats.py:314 | Epoch[717] Step[45] GlobalStep[98274] Training Speed: 414.38 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:11:55. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:12:25 INFO loss_tracker.py:84 | Epoch[717/NA] Step[49] GlobalStep[98278/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:12:34 INFO stats.py:314 | Epoch[717] Step[70] GlobalStep[98299] Training Speed: 431.34 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:11:45. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:12:36 INFO loss_tracker.py:84 | Epoch[717/NA] Step[74] GlobalStep[98303/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:12:44 INFO stats.py:314 | Epoch[717] Step[95] GlobalStep[98324] Training Speed: 431.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:11:35. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:12:46 INFO loss_tracker.py:84 | Epoch[717/NA] Step[99] GlobalStep[98328/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0008] loss_depth[0.0125] total_loss[0.0133] Rank[0/16] 06/25/2025 00:12:55 INFO stats.py:314 | Epoch[717] Step[120] GlobalStep[98349] Training Speed: 436.58 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:11:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:12:56 INFO loss_tracker.py:84 | Epoch[717/NA] Step[124] GlobalStep[98353/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:13:01 INFO stats.py:394 | Epoch[717] completed. Training Speed: 307.76 samples/sec across all devices. Epoch Time: 56.98 sec. Average Epoch Time: 56.98 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:11:18. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:13:06 INFO stats.py:314 | Epoch[718] Step[8] GlobalStep[98374] Training Speed: 423.79 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:11:14. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:13:12 INFO loss_tracker.py:84 | Epoch[718/NA] Step[24] GlobalStep[98390/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0126] total_loss[0.0135] Rank[0/16] 06/25/2025 00:13:16 INFO stats.py:314 | Epoch[718] Step[33] GlobalStep[98399] Training Speed: 272.68 samples/sec across all devices. Average Step Time: 0.47 sec. Estimated Remaining Time: 0:11:03. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:13:23 INFO loss_tracker.py:84 | Epoch[718/NA] Step[49] GlobalStep[98415/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/25/2025 00:13:26 INFO stats.py:314 | Epoch[718] Step[58] GlobalStep[98424] Training Speed: 437.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:10:53. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:13:33 INFO loss_tracker.py:84 | Epoch[718/NA] Step[74] GlobalStep[98440/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:13:36 INFO stats.py:314 | Epoch[718] Step[83] GlobalStep[98449] Training Speed: 436.06 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:10:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:13:43 INFO loss_tracker.py:84 | Epoch[718/NA] Step[99] GlobalStep[98465/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/25/2025 00:13:47 INFO stats.py:314 | Epoch[718] Step[108] GlobalStep[98474] Training Speed: 423.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:10:32. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:13:53 INFO loss_tracker.py:84 | Epoch[718/NA] Step[124] GlobalStep[98490/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:13:57 INFO stats.py:314 | Epoch[718] Step[133] GlobalStep[98499] Training Speed: 439.42 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:10:22. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:13:58 INFO stats.py:394 | Epoch[718] completed. Training Speed: 308.64 samples/sec across all devices. Epoch Time: 56.82 sec. Average Epoch Time: 56.82 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:10:21. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:14:08 INFO stats.py:314 | Epoch[719] Step[21] GlobalStep[98524] Training Speed: 413.31 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:10:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:14:09 INFO loss_tracker.py:84 | Epoch[719/NA] Step[24] GlobalStep[98527/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/25/2025 00:14:18 INFO stats.py:314 | Epoch[719] Step[46] GlobalStep[98549] Training Speed: 424.49 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:10:01. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:14:20 INFO loss_tracker.py:84 | Epoch[719/NA] Step[49] GlobalStep[98552/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/25/2025 00:14:28 INFO stats.py:314 | Epoch[719] Step[71] GlobalStep[98574] Training Speed: 439.49 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:09:51. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:14:29 INFO loss_tracker.py:84 | Epoch[719/NA] Step[74] GlobalStep[98577/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:14:38 INFO stats.py:314 | Epoch[719] Step[96] GlobalStep[98599] Training Speed: 440.13 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:09:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:14:40 INFO loss_tracker.py:84 | Epoch[719/NA] Step[99] GlobalStep[98602/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:14:48 INFO stats.py:314 | Epoch[719] Step[121] GlobalStep[98624] Training Speed: 452.22 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:09:30. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:14:49 INFO loss_tracker.py:84 | Epoch[719/NA] Step[124] GlobalStep[98627/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0134] Rank[0/16] 06/25/2025 00:14:54 INFO stats.py:394 | Epoch[719] completed. Training Speed: 314.52 samples/sec across all devices. Epoch Time: 55.75 sec. Average Epoch Time: 55.75 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:09:24. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:14:59 INFO stats.py:314 | Epoch[720] Step[9] GlobalStep[98649] Training Speed: 416.43 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:09:20. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:15:05 INFO loss_tracker.py:84 | Epoch[720/NA] Step[24] GlobalStep[98664/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0126] total_loss[0.0137] Rank[0/16] 06/25/2025 00:15:09 INFO stats.py:314 | Epoch[720] Step[34] GlobalStep[98674] Training Speed: 427.26 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:09:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:15:16 INFO loss_tracker.py:84 | Epoch[720/NA] Step[49] GlobalStep[98689/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0023] loss_depth[0.0125] total_loss[0.0148] Rank[0/16] 06/25/2025 00:15:20 INFO stats.py:314 | Epoch[720] Step[59] GlobalStep[98699] Training Speed: 424.21 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:08:59. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:15:26 INFO loss_tracker.py:84 | Epoch[720/NA] Step[74] GlobalStep[98714/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/25/2025 00:15:30 INFO stats.py:314 | Epoch[720] Step[84] GlobalStep[98724] Training Speed: 431.80 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:08:49. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:15:36 INFO loss_tracker.py:84 | Epoch[720/NA] Step[99] GlobalStep[98739/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/25/2025 00:15:40 INFO stats.py:314 | Epoch[720] Step[109] GlobalStep[98749] Training Speed: 432.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:08:38. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:15:46 INFO loss_tracker.py:84 | Epoch[720/NA] Step[124] GlobalStep[98764/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:15:49 INFO stats.py:314 | Epoch[720] Step[134] GlobalStep[98774] Training Speed: 450.80 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:08:28. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:15:50 INFO stats.py:394 | Epoch[720] completed. Training Speed: 311.35 samples/sec across all devices. Epoch Time: 56.32 sec. Average Epoch Time: 56.32 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:08:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:16:00 INFO stats.py:314 | Epoch[721] Step[22] GlobalStep[98799] Training Speed: 428.23 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:08:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:16:01 INFO loss_tracker.py:84 | Epoch[721/NA] Step[24] GlobalStep[98801/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:16:11 INFO stats.py:314 | Epoch[721] Step[47] GlobalStep[98824] Training Speed: 433.15 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:08:07. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:16:11 INFO loss_tracker.py:84 | Epoch[721/NA] Step[49] GlobalStep[98826/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/25/2025 00:16:21 INFO stats.py:314 | Epoch[721] Step[72] GlobalStep[98849] Training Speed: 431.54 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:07:57. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:16:22 INFO loss_tracker.py:84 | Epoch[721/NA] Step[74] GlobalStep[98851/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:16:32 INFO stats.py:314 | Epoch[721] Step[97] GlobalStep[98874] Training Speed: 416.23 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:07:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:16:32 INFO loss_tracker.py:84 | Epoch[721/NA] Step[99] GlobalStep[98876/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:16:42 INFO stats.py:314 | Epoch[721] Step[122] GlobalStep[98899] Training Speed: 454.01 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:07:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:16:43 INFO loss_tracker.py:84 | Epoch[721/NA] Step[124] GlobalStep[98901/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:16:48 INFO stats.py:394 | Epoch[721] completed. Training Speed: 303.90 samples/sec across all devices. Epoch Time: 57.70 sec. Average Epoch Time: 57.70 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:07:30. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:16:53 INFO stats.py:314 | Epoch[722] Step[10] GlobalStep[98924] Training Speed: 433.16 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:07:26. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:16:59 INFO loss_tracker.py:84 | Epoch[722/NA] Step[24] GlobalStep[98938/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0126] total_loss[0.0136] Rank[0/16] 06/25/2025 00:17:04 INFO stats.py:314 | Epoch[722] Step[35] GlobalStep[98949] Training Speed: 431.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:07:15. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:17:09 INFO loss_tracker.py:84 | Epoch[722/NA] Step[49] GlobalStep[98963/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/25/2025 00:17:14 INFO stats.py:314 | Epoch[722] Step[60] GlobalStep[98974] Training Speed: 423.48 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:07:05. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:17:20 INFO loss_tracker.py:84 | Epoch[722/NA] Step[74] GlobalStep[98988/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/25/2025 00:17:24 INFO stats.py:314 | Epoch[722] Step[85] GlobalStep[98999] Training Speed: 430.98 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:06:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:17:30 INFO loss_tracker.py:84 | Epoch[722/NA] Step[99] GlobalStep[99013/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0009] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:17:34 INFO stats.py:314 | Epoch[722] Step[110] GlobalStep[99024] Training Speed: 429.29 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:06:44. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:17:40 INFO loss_tracker.py:84 | Epoch[722/NA] Step[124] GlobalStep[99038/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:17:44 INFO stats.py:314 | Epoch[722] Step[135] GlobalStep[99049] Training Speed: 443.87 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:06:34. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:17:44 INFO stats.py:394 | Epoch[722] completed. Training Speed: 308.19 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:06:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:17:55 INFO stats.py:314 | Epoch[723] Step[23] GlobalStep[99074] Training Speed: 416.70 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:06:23. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:17:56 INFO loss_tracker.py:84 | Epoch[723/NA] Step[24] GlobalStep[99075/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/25/2025 00:18:06 INFO stats.py:314 | Epoch[723] Step[48] GlobalStep[99099] Training Speed: 436.51 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:06:13. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:18:06 INFO loss_tracker.py:84 | Epoch[723/NA] Step[49] GlobalStep[99100/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:18:16 INFO stats.py:314 | Epoch[723] Step[73] GlobalStep[99124] Training Speed: 434.72 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:06:03. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:18:17 INFO loss_tracker.py:84 | Epoch[723/NA] Step[74] GlobalStep[99125/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:18:26 INFO stats.py:314 | Epoch[723] Step[98] GlobalStep[99149] Training Speed: 423.31 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:05:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:18:27 INFO loss_tracker.py:84 | Epoch[723/NA] Step[99] GlobalStep[99150/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:18:37 INFO stats.py:314 | Epoch[723] Step[123] GlobalStep[99174] Training Speed: 440.94 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:05:42. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:18:37 INFO loss_tracker.py:84 | Epoch[723/NA] Step[124] GlobalStep[99175/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0126] total_loss[0.0139] Rank[0/16] 06/25/2025 00:18:42 INFO stats.py:394 | Epoch[723] completed. Training Speed: 305.23 samples/sec across all devices. Epoch Time: 57.45 sec. Average Epoch Time: 57.45 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:05:36. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:18:48 INFO stats.py:314 | Epoch[724] Step[11] GlobalStep[99199] Training Speed: 425.67 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:05:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:18:53 INFO loss_tracker.py:84 | Epoch[724/NA] Step[24] GlobalStep[99212/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:18:59 INFO stats.py:314 | Epoch[724] Step[36] GlobalStep[99224] Training Speed: 405.75 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:05:21. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:19:04 INFO loss_tracker.py:84 | Epoch[724/NA] Step[49] GlobalStep[99237/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:19:08 INFO stats.py:314 | Epoch[724] Step[61] GlobalStep[99249] Training Speed: 436.23 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:05:11. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:19:14 INFO loss_tracker.py:84 | Epoch[724/NA] Step[74] GlobalStep[99262/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0021] loss_depth[0.0125] total_loss[0.0146] Rank[0/16] 06/25/2025 00:19:19 INFO stats.py:314 | Epoch[724] Step[86] GlobalStep[99274] Training Speed: 434.55 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:05:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:19:24 INFO loss_tracker.py:84 | Epoch[724/NA] Step[99] GlobalStep[99287/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:19:29 INFO stats.py:314 | Epoch[724] Step[111] GlobalStep[99299] Training Speed: 415.44 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:04:50. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:19:35 INFO loss_tracker.py:84 | Epoch[724/NA] Step[124] GlobalStep[99312/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0018] loss_depth[0.0125] total_loss[0.0143] Rank[0/16] 06/25/2025 00:19:39 INFO stats.py:314 | Epoch[724] Step[136] GlobalStep[99324] Training Speed: 443.86 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:04:40. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:19:39 INFO stats.py:394 | Epoch[724] completed. Training Speed: 305.31 samples/sec across all devices. Epoch Time: 57.44 sec. Average Epoch Time: 57.44 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:04:40. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:19:50 INFO stats.py:314 | Epoch[725] Step[24] GlobalStep[99349] Training Speed: 430.32 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:04:29. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:19:50 INFO loss_tracker.py:84 | Epoch[725/NA] Step[24] GlobalStep[99349/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:20:01 INFO stats.py:314 | Epoch[725] Step[49] GlobalStep[99374] Training Speed: 432.83 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:04:19. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:20:02 INFO loss_tracker.py:84 | Epoch[725/NA] Step[49] GlobalStep[99374/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0020] loss_depth[0.0126] total_loss[0.0146] Rank[0/16] 06/25/2025 00:20:11 INFO stats.py:314 | Epoch[725] Step[74] GlobalStep[99399] Training Speed: 436.98 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:04:09. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:20:11 INFO loss_tracker.py:84 | Epoch[725/NA] Step[74] GlobalStep[99399/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0007] loss_depth[0.0125] total_loss[0.0132] Rank[0/16] 06/25/2025 00:20:22 INFO stats.py:314 | Epoch[725] Step[99] GlobalStep[99424] Training Speed: 432.28 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:03:58. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:20:22 INFO loss_tracker.py:84 | Epoch[725/NA] Step[99] GlobalStep[99424/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0021] loss_depth[0.0126] total_loss[0.0146] Rank[0/16] 06/25/2025 00:20:31 INFO stats.py:314 | Epoch[725] Step[124] GlobalStep[99449] Training Speed: 453.77 samples/sec across all devices. Average Step Time: 0.28 sec. Estimated Remaining Time: 0:03:48. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:20:31 INFO loss_tracker.py:84 | Epoch[725/NA] Step[124] GlobalStep[99449/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:20:36 INFO stats.py:394 | Epoch[725] completed. Training Speed: 308.20 samples/sec across all devices. Epoch Time: 56.90 sec. Average Epoch Time: 56.90 sec. Average Step Time: 0.42 sec. Estimated Remaining Time: 0:03:43. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:20:42 INFO stats.py:314 | Epoch[726] Step[12] GlobalStep[99474] Training Speed: 417.11 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:03:37. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:20:47 INFO loss_tracker.py:84 | Epoch[726/NA] Step[24] GlobalStep[99486/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:20:52 INFO stats.py:314 | Epoch[726] Step[37] GlobalStep[99499] Training Speed: 406.74 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:03:27. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:20:58 INFO loss_tracker.py:84 | Epoch[726/NA] Step[49] GlobalStep[99511/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:21:03 INFO stats.py:314 | Epoch[726] Step[62] GlobalStep[99524] Training Speed: 383.32 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 0:03:17. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:21:08 INFO loss_tracker.py:84 | Epoch[726/NA] Step[74] GlobalStep[99536/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:21:14 INFO stats.py:314 | Epoch[726] Step[87] GlobalStep[99549] Training Speed: 441.12 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:03:06. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:21:19 INFO loss_tracker.py:84 | Epoch[726/NA] Step[99] GlobalStep[99561/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0011] loss_depth[0.0125] total_loss[0.0136] Rank[0/16] 06/25/2025 00:21:24 INFO stats.py:314 | Epoch[726] Step[112] GlobalStep[99574] Training Speed: 434.70 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:02:56. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:21:28 INFO loss_tracker.py:84 | Epoch[726/NA] Step[124] GlobalStep[99586/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0125] total_loss[0.0144] Rank[0/16] 06/25/2025 00:21:32 INFO stats.py:394 | Epoch[726] completed. Training Speed: 313.68 samples/sec across all devices. Epoch Time: 55.90 sec. Average Epoch Time: 55.90 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:02:46. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:21:33 INFO stats.py:314 | Epoch[727] Step[0] GlobalStep[99599] Training Speed: 373.55 samples/sec across all devices. Average Step Time: 0.34 sec. Estimated Remaining Time: 0:02:45. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:21:43 INFO loss_tracker.py:84 | Epoch[727/NA] Step[24] GlobalStep[99623/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:21:43 INFO stats.py:314 | Epoch[727] Step[25] GlobalStep[99624] Training Speed: 416.83 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:02:35. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:21:53 INFO loss_tracker.py:84 | Epoch[727/NA] Step[49] GlobalStep[99648/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0017] loss_depth[0.0125] total_loss[0.0142] Rank[0/16] 06/25/2025 00:21:54 INFO stats.py:314 | Epoch[727] Step[50] GlobalStep[99649] Training Speed: 421.09 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:02:25. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:22:04 INFO loss_tracker.py:84 | Epoch[727/NA] Step[74] GlobalStep[99673/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:22:04 INFO stats.py:314 | Epoch[727] Step[75] GlobalStep[99674] Training Speed: 400.89 samples/sec across all devices. Average Step Time: 0.32 sec. Estimated Remaining Time: 0:02:14. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:22:14 INFO loss_tracker.py:84 | Epoch[727/NA] Step[99] GlobalStep[99698/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0010] loss_depth[0.0125] total_loss[0.0135] Rank[0/16] 06/25/2025 00:22:15 INFO stats.py:314 | Epoch[727] Step[100] GlobalStep[99699] Training Speed: 409.59 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:02:04. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:22:24 INFO loss_tracker.py:84 | Epoch[727/NA] Step[124] GlobalStep[99723/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:22:24 INFO stats.py:314 | Epoch[727] Step[125] GlobalStep[99724] Training Speed: 431.08 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:01:54. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:22:28 INFO stats.py:394 | Epoch[727] completed. Training Speed: 311.89 samples/sec across all devices. Epoch Time: 56.23 sec. Average Epoch Time: 56.23 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:01:49. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:22:35 INFO stats.py:314 | Epoch[728] Step[13] GlobalStep[99749] Training Speed: 437.15 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:01:43. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:22:39 INFO loss_tracker.py:84 | Epoch[728/NA] Step[24] GlobalStep[99760/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0016] loss_depth[0.0125] total_loss[0.0141] Rank[0/16] 06/25/2025 00:22:45 INFO stats.py:314 | Epoch[728] Step[38] GlobalStep[99774] Training Speed: 436.59 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:01:33. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:22:50 INFO loss_tracker.py:84 | Epoch[728/NA] Step[49] GlobalStep[99785/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0012] loss_depth[0.0125] total_loss[0.0137] Rank[0/16] 06/25/2025 00:22:55 INFO stats.py:314 | Epoch[728] Step[63] GlobalStep[99799] Training Speed: 423.46 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:01:22. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:22:59 INFO loss_tracker.py:84 | Epoch[728/NA] Step[74] GlobalStep[99810/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:23:05 INFO stats.py:314 | Epoch[728] Step[88] GlobalStep[99824] Training Speed: 384.08 samples/sec across all devices. Average Step Time: 0.33 sec. Estimated Remaining Time: 0:01:12. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:23:10 INFO loss_tracker.py:84 | Epoch[728/NA] Step[99] GlobalStep[99835/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:23:15 INFO stats.py:314 | Epoch[728] Step[113] GlobalStep[99849] Training Speed: 427.99 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:01:02. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:23:20 INFO loss_tracker.py:84 | Epoch[728/NA] Step[124] GlobalStep[99860/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0015] loss_depth[0.0125] total_loss[0.0140] Rank[0/16] 06/25/2025 00:23:24 INFO stats.py:394 | Epoch[728] completed. 
Training Speed: 313.13 samples/sec across all devices. Epoch Time: 56.00 sec. Average Epoch Time: 56.00 sec. Average Step Time: 0.41 sec. Estimated Remaining Time: 0:00:52. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:23:26 INFO stats.py:314 | Epoch[729] Step[1] GlobalStep[99874] Training Speed: 421.10 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:00:51. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:23:36 INFO loss_tracker.py:84 | Epoch[729/NA] Step[24] GlobalStep[99897/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0019] loss_depth[0.0126] total_loss[0.0144] Rank[0/16] 06/25/2025 00:23:37 INFO stats.py:314 | Epoch[729] Step[26] GlobalStep[99899] Training Speed: 434.36 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:00:41. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:23:46 INFO loss_tracker.py:84 | Epoch[729/NA] Step[49] GlobalStep[99922/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0013] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:23:47 INFO stats.py:314 | Epoch[729] Step[51] GlobalStep[99924] Training Speed: 406.36 samples/sec across all devices. Average Step Time: 0.31 sec. Estimated Remaining Time: 0:00:31. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:23:57 INFO loss_tracker.py:84 | Epoch[729/NA] Step[74] GlobalStep[99947/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0138] Rank[0/16] 06/25/2025 00:23:57 INFO stats.py:314 | Epoch[729] Step[76] GlobalStep[99949] Training Speed: 433.55 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:00:20. Learning Rate: 1.00000e-05. 
Rank[0/16] 06/25/2025 00:24:07 INFO loss_tracker.py:84 | Epoch[729/NA] Step[99] GlobalStep[99972/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0020] loss_depth[0.0125] total_loss[0.0145] Rank[0/16] 06/25/2025 00:24:08 INFO stats.py:314 | Epoch[729] Step[101] GlobalStep[99974] Training Speed: 420.37 samples/sec across all devices. Average Step Time: 0.30 sec. Estimated Remaining Time: 0:00:10. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:24:17 INFO loss_tracker.py:84 | Epoch[729/NA] Step[124] GlobalStep[99997/99999]: loss_noise_mse[0.0000] loss_fk_mse[0.0014] loss_depth[0.0125] total_loss[0.0139] Rank[0/16] 06/25/2025 00:24:18 INFO stats.py:314 | Epoch[729] Step[126] GlobalStep[99999] Training Speed: 437.64 samples/sec across all devices. Average Step Time: 0.29 sec. Estimated Remaining Time: 0:00:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:24:18 WARNING accelerator.py:3099 | Deleting 1 checkpoints to make room for new checkpoint. Rank[0/16] 06/25/2025 00:24:19 INFO accelerator.py:3111 | Saving current state to /job_data/checkpoints/checkpoint_24 Rank[5/16] 06/25/2025 00:24:19 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[6/16] 06/25/2025 00:24:19 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[3/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[14/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[2/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[4/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[7/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to 
/job_data/checkpoints/checkpoint_24 Rank[1/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[11/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[13/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[8/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[15/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[12/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[9/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[10/16] 06/25/2025 00:24:20 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[0/16] 06/25/2025 00:24:20 INFO checkpointing.py:106 | Model weights saved in /job_data/checkpoints/checkpoint_24/model.safetensors Rank[0/16] 06/25/2025 00:24:21 INFO checkpointing.py:113 | Optimizer state saved in /job_data/checkpoints/checkpoint_24/optimizer.bin Rank[0/16] 06/25/2025 00:24:21 INFO checkpointing.py:120 | Scheduler state saved in /job_data/checkpoints/checkpoint_24/scheduler.bin Rank[0/16] 06/25/2025 00:24:21 INFO checkpointing.py:137 | Sampler state for dataloader 0 saved in /job_data/checkpoints/checkpoint_24/sampler.bin Rank[0/16] 06/25/2025 00:24:21 INFO checkpointing.py:164 | Random states saved in /job_data/checkpoints/checkpoint_24/random_states_0.pkl Rank[0/16] 06/25/2025 00:24:21 INFO checkpointing.py:300 | Saving the state of TrainerProgressState to /job_data/checkpoints/checkpoint_24/custom_checkpoint_0.pkl Rank[0/16] 
06/25/2025 00:24:21 INFO checkpoint.py:110 | Save checkpoint at the end of step 99999 to /job_data/checkpoints/checkpoint_24 Rank[0/16] 06/25/2025 00:24:21 INFO stats.py:394 | Epoch[729] completed. Training Speed: 284.75 samples/sec across all devices. Epoch Time: 57.09 sec. Average Epoch Time: 57.09 sec. Average Step Time: 0.45 sec. Estimated Remaining Time: 0:00:00. Learning Rate: 1.00000e-05. Rank[0/16] 06/25/2025 00:24:21 INFO hook_based_trainer.py:396 | ==================================================FINISH TRAINING==================================================