2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Current SDK version is 0.17.9
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Configure stats pid to 3349498
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Loading settings from /home/x_fahkh/.config/wandb/settings
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Loading settings from /proj/cvl/users/x_fahkh2/WorldMem_Repro/wandb/settings
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Loading settings from environment variables: {'disabled': 'true'}
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Applying setup settings: {'_disable_service': False}
2026-04-14 06:08:15,963 WARNING MainThread:3349498 [wandb_setup.py:_flush():77] Could not find program at -m main
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_setup.py:_flush():77] Inferring run settings from compute environment: {'program_relpath': None, 'program': '-m main'}
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_init.py:_log_setup():524] Logging user logs to /proj/cvl/users/x_fahkh2/WorldMem_Repro/checkpoints/hierarchy_bimamba_stage_b_joint/wandb/offline-run-20260414_060815-stage_b_joint_offline/logs/debug.log
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_init.py:_log_setup():525] Logging internal logs to /proj/cvl/users/x_fahkh2/WorldMem_Repro/checkpoints/hierarchy_bimamba_stage_b_joint/wandb/offline-run-20260414_060815-stage_b_joint_offline/logs/debug-internal.log
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_init.py:init():608] calling init triggers
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_init.py:init():615] wandb.init called with sweep_config: {}
config: {'experiment': {'debug': '${debug}', 'tasks': ['training'], 'num_nodes': 1, 'training': {'precision': '16-mixed', 'compile': False, 'lr': 2e-05, 'batch_size': 8, 'max_epochs': -1, 'max_steps': 175000, 'max_time': None, 'data': {'num_workers': 4, 'shuffle': True}, 'optim': {'accumulate_grad_batches': 1, 'gradient_clip_val': 1.0}, 'checkpointing': {'every_n_train_steps': 2500, 'every_n_epochs': None, 'train_time_interval': None, 'enable_version_counter': False}}, 'validation': {'precision': '16-mixed', 'compile': False, 'batch_size': 4, 'val_every_n_step': 2500, 'val_every_n_epoch': None, 'limit_batch': 1, 'inference_mode': False, 'data': {'num_workers': 4, 'shuffle': False}}, 'test': {'precision': '16-mixed', 'compile': False, 'batch_size': 1, 'limit_batch': 1, 'inference_mode': False, 'data': {'num_workers': 4, 'shuffle': False}}, 'logging': {'metrics': None}, '_name': 'exp_video'}, 'dataset': {'debug': '${debug}', 'metadata': 'data/${dataset.name}/metadata.json', 'data_mean': 0.5, 'data_std': 0.5, 'save_dir': '/proj/cvl/users/x_fahkh2/WorldMem_Repro/datasets/minecraft', 'n_frames': 200, 'context_length': 1, 'resolution': 128, 'observation_shape': [3, '${dataset.resolution}', '${dataset.resolution}'], 'external_cond_dim': 0, 'validation_multiplier': 1, 'frame_skip': 1, 'action_cond_dim': 25, '_name': 'video_minecraft', 'n_frames_valid': 200, 'angle_range': 110, 'pos_range': 8, 'wo_updown': False, 'customized_validation': True, 'add_timestamp_embedding': True, 'use_explicit_memory_frames': False}, 'algorithm': {'debug': '${debug}', 'lr': '${experiment.training.lr}', 'x_shape': '${dataset.observation_shape}', 'frame_stack': 1, 'frame_skip': '${dataset.frame_skip}', 'data_mean': '${dataset.data_mean}', 'data_std': '${dataset.data_std}', 'external_cond_dim': 0, 'context_frames': 100, 'weight_decay': 0.002, 'warmup_steps': 1000, 'optimizer_beta': [0.9, 0.99], 'uncertainty_scale': 1, 'guidance_scale': 0.0, 'chunk_size': 1, 'scheduling_matrix': 'autoregressive', 'noise_level': 'random_all', 'causal': True, 'diffusion': {'objective': 'pred_v', 'beta_schedule': 'sigmoid', 'schedule_fn_kwargs': {}, 'clip_noise': 20.0, 'use_snr': False, 'use_cum_snr': False, 'use_fused_snr': True, 'snr_clip': 5.0, 'cum_snr_decay': 0.96, 'timesteps': 1000, 'sampling_timesteps': 20, 'ddim_sampling_eta': 0.0, 'stabilization_level': 15, 'architecture': {'network_size': 64, 'attn_heads': 4, 'attn_dim_head': 64, 'dim_mults': [1, 2, 4, 8], 'resolution': '${dataset.resolution}', 'attn_resolutions': [16, 32, 64, 128], 'use_init_temporal_attn': True, 'use_linear_attn': True, 'time_emb_type': 'rotary'}}, 'n_frames': '${dataset.n_frames}', 'metadata': '${dataset.metadata}', 'action_cond_dim': 25, 'use_plucker': True, 'memory_condition_length': 0, 'log_video': True, 'use_mamba_memory_pipeline': True, 'training_stage': 'stage_b_diffusion_training', 'stage_b_joint_training': True, 'stage_b_memory_aux_weight': 0.1, 'stage_b_prefix_corruption_prob': 0.0, 'stage_b_prefix_corruption_timestep_range': [5, 50], 'stage_b_prefix_corruption_max_recent_frames': 32, 'diff_window_size': 8, 'use_precomputed_features': False, 'mamba_latent_channels': 16, 'mamba_model_dim': 256, 'mamba_depth': 4, 'mamba_cond_dim': 256, 'mamba_d_state': 16, 'mamba_d_conv': 4, 'mamba_expand': 2, 'memory_scene_grid': [2, 4], 'memory_num_tokens': 4, 'memory_retrieval_topk': 32, 'allow_mamba_fallback': False, 'strict_causal_training': True, 'strict_causal_evaluation': True, 'use_oracle_pose_eval': True, 'enable_memory_noise_curriculum': False, 'curriculum_phase_boundaries': [0.2, 0.7], 'curriculum_noise_ranges': [[600, 1000], [200, 900], [0, 400]], 'curriculum_horizons': [50, 100, 200], '_name': 'df_video_mamba3stage', 'use_memory_attention': False, 'relative_embedding': False, 'n_tokens': 8}, 'debug': False, 'wandb': {'entity': 'turlin', 'project': 'worldmem', 'mode': 'online'}, 'resume': 'stage_b_joint_offline', 'load': None, 'name': 'train_stage_b_hierarchy_bimamba_joint', 'output_dir': '/proj/cvl/users/x_fahkh2/WorldMem_Repro/checkpoints/hierarchy_bimamba_stage_b_joint/'}
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_init.py:init():658] starting backend
2026-04-14 06:08:15,963 INFO MainThread:3349498 [wandb_init.py:init():662] setting up manager
2026-04-14 06:08:15,980 INFO MainThread:3349498 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2026-04-14 06:08:15,988 INFO MainThread:3349498 [wandb_init.py:init():670] backend started and connected
2026-04-14 06:08:16,011 INFO MainThread:3349498 [wandb_init.py:init():768] updated telemetry
2026-04-14 06:08:16,066 INFO MainThread:3349498 [wandb_init.py:init():801] communicating run to backend with 90.0 second timeout
2026-04-14 06:08:16,070 INFO MainThread:3349498 [wandb_init.py:init():852] starting run threads in backend
2026-04-14 06:08:22,613 INFO MainThread:3349498 [wandb_run.py:_console_start():2465] atexit reg
2026-04-14 06:08:22,613 INFO MainThread:3349498 [wandb_run.py:_redirect():2311] redirect: wrap_raw
2026-04-14 06:08:22,613 INFO MainThread:3349498 [wandb_run.py:_redirect():2376] Wrapping output streams.
2026-04-14 06:08:22,613 INFO MainThread:3349498 [wandb_run.py:_redirect():2401] Redirects installed.
2026-04-14 06:08:22,621 INFO MainThread:3349498 [wandb_init.py:init():895] run started, returning control to user process
2026-04-14 23:23:15,098 WARNING MsgRouterThr:3349498 [router.py:message_loop():77] message_loop has been closed