2024-11-05 19:31:02,511 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Current SDK version is 0.18.5
2024-11-05 19:31:02,511 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Configure stats pid to 2188020
2024-11-05 19:31:02,511 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Loading settings from /hai/scratch/belkhale/.config/wandb/settings
2024-11-05 19:31:02,511 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Loading settings from /hai/scratch/belkhale/openvla-mini/wandb/settings
2024-11-05 19:31:02,511 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Loading settings from environment variables: {'_service_wait': '300'}
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Applying setup settings: {'mode': None, '_disable_service': None}
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Inferring run settings from compute environment: {'program_relpath': 'scripts/pretrain.py', 'program_abspath': '/hai/scratch/belkhale/openvla-mini/scripts/pretrain.py', 'program': '/hai/scratch/belkhale/openvla-mini/scripts/pretrain.py'}
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_setup.py:_flush():79] Applying login settings: {}
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_init.py:_log_setup():534] Logging user logs to runs/prism-qwen25-extra-dinosiglip-224px+0_5b+stage-finetune+x7/wandb/run-20241105_193102-jcj67gg8/logs/debug.log
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_init.py:_log_setup():535] Logging internal logs to runs/prism-qwen25-extra-dinosiglip-224px+0_5b+stage-finetune+x7/wandb/run-20241105_193102-jcj67gg8/logs/debug-internal.log
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_init.py:init():621] calling init triggers
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_init.py:init():628] wandb.init called with sweep_config: {}
config: {'model': {'type': 'prism-qwen25-extra-dinosiglip-224px+0_5b', 'model_id': 'prism-qwen25-extra-dinosiglip-224px+0_5b', 'arch_specifier': 'no-align+fused-gelu-mlp', 'vision_backbone_id': 'dinosiglip-vit-so-224px', 'llm_backbone_id': 'qwen25-0_5b-extra', 'image_resize_strategy': 'resize-naive', 'llm_max_length': 32768, 'align_epochs': 1, 'align_max_steps': None, 'align_save_every_n_steps': 10000, 'align_global_batch_size': 96, 'align_per_device_batch_size': 16, 'align_learning_rate': 0.001, 'align_weight_decay': 0.0, 'align_max_grad_norm': 1.0, 'align_lr_scheduler_type': 'linear-warmup+cosine-decay', 'align_warmup_ratio': 0.03, 'align_train_strategy': 'fsdp-shard-grad-op', 'finetune_epochs': 2, 'finetune_max_steps': None, 'finetune_save_every_n_steps': 10000, 'finetune_global_batch_size': 64, 'finetune_per_device_batch_size': 4, 'finetune_learning_rate': 2e-05, 'finetune_weight_decay': 0.1, 'finetune_max_grad_norm': 1.0, 'finetune_lr_scheduler_type': 'linear-warmup+cosine-decay', 'finetune_warmup_ratio': 0.03, 'finetune_train_strategy': 'fsdp-full-shard', 'enable_gradient_checkpointing': True, 'enable_mixed_precision_training': True, 'reduce_in_full_precision': False}, 'dataset': {'type': 'llava-v15', 'dataset_id': 'llava-v15', 'align_stage_components': ['download/llava-laion-cc-sbu-558k/chat.json', 'download/llava-laion-cc-sbu-558k'], 'finetune_stage_components': ['download/llava-v1.5-instruct/llava_v1_5_mix665k.json', 'download/llava-v1.5-instruct'], 'dataset_root_dir': '/hai/scratch/belkhale/datasets/prismatic-vlms'}, 'stage': 'finetune', 'pretrained_checkpoint': None, 'run_id': 'prism-qwen25-extra-dinosiglip-224px+0_5b+stage-finetune+x7', 'run_root_dir': 'runs', 'seed': 7, 'hf_token': '.hf_token', 'trackers': ['jsonl', 'wandb'], 'wandb_project': 'prismatic', 'wandb_entity': None}
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_init.py:init():671] starting backend
2024-11-05 19:31:02,512 INFO MainThread:2188020 [wandb_init.py:init():675] sending inform_init request
2024-11-05 19:31:02,513 INFO MainThread:2188020 [backend.py:_multiprocessing_setup():104] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2024-11-05 19:31:02,513 INFO MainThread:2188020 [wandb_init.py:init():688] backend started and connected
2024-11-05 19:31:02,515 INFO MainThread:2188020 [wandb_init.py:init():783] updated telemetry
2024-11-05 19:31:02,573 INFO MainThread:2188020 [wandb_init.py:init():816] communicating run to backend with 90.0 second timeout
2024-11-05 19:31:03,050 INFO MainThread:2188020 [wandb_init.py:init():867] starting run threads in backend
2024-11-05 19:31:03,226 INFO MainThread:2188020 [wandb_run.py:_console_start():2463] atexit reg
2024-11-05 19:31:03,227 INFO MainThread:2188020 [wandb_run.py:_redirect():2311] redirect: wrap_raw
2024-11-05 19:31:03,227 INFO MainThread:2188020 [wandb_run.py:_redirect():2376] Wrapping output streams.
2024-11-05 19:31:03,227 INFO MainThread:2188020 [wandb_run.py:_redirect():2401] Redirects installed.
2024-11-05 19:31:03,230 INFO MainThread:2188020 [wandb_init.py:init():911] run started, returning control to user process
2024-11-05 23:38:31,920 INFO MainThread:2188020 [wandb_run.py:_finish():2158] finishing run belkhale/prismatic/jcj67gg8
2024-11-05 23:38:31,920 INFO MainThread:2188020 [wandb_run.py:_atexit_cleanup():2426] got exitcode: 0
2024-11-05 23:38:31,921 INFO MainThread:2188020 [wandb_run.py:_restore():2408] restore
2024-11-05 23:38:31,921 INFO MainThread:2188020 [wandb_run.py:_restore():2414] restore done
2024-11-05 23:38:34,516 INFO MainThread:2188020 [wandb_run.py:_footer_history_summary_info():3975] rendering history
2024-11-05 23:38:34,517 INFO MainThread:2188020 [wandb_run.py:_footer_history_summary_info():4007] rendering summary
2024-11-05 23:38:34,534 INFO MainThread:2188020 [wandb_run.py:_footer_sync_info():3934] logging synced files