2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_setup.py:_flush():80] Current SDK version is 0.21.0
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_setup.py:_flush():80] Configure stats pid to 104802
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_setup.py:_flush():80] Loading settings from /root/.config/wandb/settings
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_setup.py:_flush():80] Loading settings from /root/RWKV-LM-V7/wandb/settings
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_setup.py:_flush():80] Loading settings from environment variables
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_init.py:setup_run_log_directory():703] Logging user logs to /root/RWKV-LM-V7/wandb/run-20250720_085008-3bxq2c89/logs/debug.log
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_init.py:setup_run_log_directory():704] Logging internal logs to /root/RWKV-LM-V7/wandb/run-20250720_085008-3bxq2c89/logs/debug-internal.log
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_init.py:init():830] calling init triggers
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_init.py:init():835] wandb.init called with sweep_config: {}
config: {'load_model': '/root/RWKV-LM-V7/0.4B_tanslate/rwkv-init.pth', 'wandb': 'RWKV_x070_ctx2048_0.4B_Translate_MI300X', 'proj_dir': '/root/RWKV-LM-V7/0.4B_tanslate', 'random_seed': -1, 'data_file': '/root/data/datasets_59596644_text_document', 'data_type': 'binidx', 'vocab_size': 65536, 'ctx_len': 2048, 'epoch_steps': 1260, 'epoch_count': 137, 'epoch_begin': 0, 'epoch_save': 1, 'micro_bsz': 32, 'n_layer': 24, 'n_embd': 1024, 'dim_att': 1024, 'dim_ffn': 4096, 'lr_init': 2e-05, 'lr_final': 1e-06, 'warmup_steps': 10, 'beta1': 0.9, 'beta2': 0.99, 'adam_eps': 1e-18, 'grad_cp': 0, 'weight_decay': 0.001, 'grad_clip': 1.0, 'train_stage': 3, 'ds_bucket_mb': 200, 'head_size': 64, 'load_partial': 0, 'magic_prime': 5554103, 'my_testing': 'x070', 'my_exit_tokens': 11374865357, 'compile': 1, 'logger': False, 'enable_checkpointing': False, 'default_root_dir': None, 'gradient_clip_val': 1.0, 'gradient_clip_algorithm': None, 'num_nodes': 1, 'num_processes': None, 'devices': '1', 'gpus': None, 'auto_select_gpus': None, 'tpu_cores': None, 'ipus': None, 'enable_progress_bar': True, 'overfit_batches': 0.0, 'track_grad_norm': -1, 'check_val_every_n_epoch': 100000000000000000000, 'fast_dev_run': False, 'accumulate_grad_batches': None, 'max_epochs': -1, 'min_epochs': None, 'max_steps': -1, 'min_steps': None, 'max_time': None, 'limit_train_batches': None, 'limit_val_batches': None, 'limit_test_batches': None, 'limit_predict_batches': None, 'val_check_interval': None, 'log_every_n_steps': 100000000000000000000, 'accelerator': 'gpu', 'strategy': 'deepspeed_stage_2', 'sync_batchnorm': False, 'precision': 'bf16', 'enable_model_summary': True, 'num_sanity_val_steps': 0, 'resume_from_checkpoint': None, 'profiler': None, 'benchmark': None, 'reload_dataloaders_every_n_epochs': 0, 'auto_lr_find': False, 'replace_sampler_ddp': False, 'detect_anomaly': False, 'auto_scale_batch_size': False, 'plugins': None, 'amp_backend': None, 'amp_level': None, 'move_metrics_to_cpu': False, 'multiple_trainloader_mode': 'max_size_cycle', 'inference_mode': True, 'my_timestamp': '2025-07-20-08-49-25', 'betas': (0.9, 0.99), 'real_bsz': 32, 'run_name': '65536 ctx2048 L24 D1024', '_wandb': {}}
2025-07-20 08:50:08,361 INFO MainThread:104802 [wandb_init.py:init():871] starting backend
2025-07-20 08:50:08,567 INFO MainThread:104802 [wandb_init.py:init():874] sending inform_init request
2025-07-20 08:50:08,569 INFO MainThread:104802 [wandb_init.py:init():882] backend started and connected
2025-07-20 08:50:08,574 INFO MainThread:104802 [wandb_init.py:init():953] updated telemetry
2025-07-20 08:50:08,575 INFO MainThread:104802 [wandb_init.py:init():977] communicating run to backend with 90.0 second timeout
2025-07-20 08:50:08,995 INFO MainThread:104802 [wandb_init.py:init():1029] starting run threads in backend
2025-07-20 08:50:09,065 INFO MainThread:104802 [wandb_run.py:_console_start():2458] atexit reg
2025-07-20 08:50:09,065 INFO MainThread:104802 [wandb_run.py:_redirect():2306] redirect: wrap_raw
2025-07-20 08:50:09,065 INFO MainThread:104802 [wandb_run.py:_redirect():2375] Wrapping output streams.
2025-07-20 08:50:09,065 INFO MainThread:104802 [wandb_run.py:_redirect():2398] Redirects installed.
2025-07-20 08:50:09,066 INFO MainThread:104802 [wandb_init.py:init():1075] run started, returning control to user process