2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_setup.py:_flush():68] Configure stats pid to 41395
2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_setup.py:_flush():68] Loading settings from /home/paperspace/.config/wandb/settings
2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_setup.py:_flush():68] Loading settings from /home/paperspace/safe-rlhf/wandb/settings
2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_setup.py:_flush():68] Loading settings from environment variables: {'mode': 'offline', '_require_service': 'True'}
2023-07-25 19:40:14,650 WARNING MainThread:41395 [wandb_setup.py:_flush():68] Could not find program at -m safe_rlhf.finetune.__main__
2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_setup.py:_flush():68] Inferring run settings from compute environment: {'program_relpath': None, 'program': '-m safe_rlhf.finetune.__main__'}
2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_init.py:_log_setup():476] Logging user logs to /home/paperspace/safe-rlhf/output/sft/wandb/offline-run-20230725_194014-2rh62cpq/logs/debug.log
2023-07-25 19:40:14,650 INFO MainThread:41395 [wandb_init.py:_log_setup():477] Logging internal logs to /home/paperspace/safe-rlhf/output/sft/wandb/offline-run-20230725_194014-2rh62cpq/logs/debug-internal.log
2023-07-25 19:40:14,651 INFO MainThread:41395 [wandb_init.py:init():516] calling init triggers
2023-07-25 19:40:14,651 INFO MainThread:41395 [wandb_init.py:init():519] wandb.init called with sweep_config: {}
config: {'model_name_or_path': 'cerebras/btlm-3b-8k-base', 'max_length': 8092, 'trust_remote_code': True, 'train_datasets': [('bt', {'proportion': 1.0})], 'eval_datasets': None, 'epochs': 16, 'per_device_train_batch_size': 8, 'per_device_eval_batch_size': 2, 'gradient_accumulation_steps': 1, 'gradient_checkpointing': True, 'learning_rate': 4.7e-06, 'lr_scheduler_type': <SchedulerType.COSINE: 'cosine'>, 'num_warmup_steps': 20, 'weight_decay': 0.0, 'seed': 42, 'fp16': False, 'bf16': True, 'tf32': True, 'eval_strategy': 'epoch', 'eval_interval': 1000000, 'need_eval': False, 'eval_split_ratio': None, 'output_dir': '/home/paperspace/safe-rlhf/output/sft', 'log_type': 'wandb', 'log_dir': '/home/paperspace/safe-rlhf/output/sft', 'log_project': 'BT-Training', 'log_run_name': 'sft-2023-07-25-19-40-13', 'save_16bit': False, 'save_interval': 1000000, 'local_rank': 0, 'zero_stage': 2, 'deepspeed': False, 'deepspeed_config': None, 'deepscale': False, 'deepscale_config': None, 'deepspeed_mpi': False, 'global_rank': 0, 'device': device(type='cuda', index=0), 'num_update_steps_per_epoch': 55, 'total_training_steps': 880}
2023-07-25 19:40:14,651 INFO MainThread:41395 [wandb_init.py:init():569] starting backend
2023-07-25 19:40:14,651 INFO MainThread:41395 [wandb_init.py:init():573] setting up manager
2023-07-25 19:40:14,654 INFO MainThread:41395 [backend.py:_multiprocessing_setup():102] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2023-07-25 19:40:14,655 INFO MainThread:41395 [wandb_init.py:init():580] backend started and connected
2023-07-25 19:40:14,660 INFO MainThread:41395 [wandb_init.py:init():658] updated telemetry
2023-07-25 19:40:14,716 INFO MainThread:41395 [wandb_init.py:init():728] starting run threads in backend
2023-07-25 19:40:15,440 INFO MainThread:41395 [wandb_run.py:_console_start():1980] atexit reg
2023-07-25 19:40:15,440 INFO MainThread:41395 [wandb_run.py:_redirect():1838] redirect: SettingsConsole.WRAP_RAW
2023-07-25 19:40:15,441 INFO MainThread:41395 [wandb_run.py:_redirect():1903] Wrapping output streams.
2023-07-25 19:40:15,441 INFO MainThread:41395 [wandb_run.py:_redirect():1925] Redirects installed.
2023-07-25 19:40:15,443 INFO MainThread:41395 [wandb_init.py:init():765] run started, returning control to user process
2023-07-25 21:08:22,533 INFO MainThread:41395 [wandb_run.py:_finish():1746] finishing run BT-Training/2rh62cpq
2023-07-25 21:08:22,534 INFO MainThread:41395 [wandb_run.py:_atexit_cleanup():1949] got exitcode: 0
2023-07-25 21:08:22,534 INFO MainThread:41395 [wandb_run.py:_restore():1932] restore
2023-07-25 21:08:22,534 INFO MainThread:41395 [wandb_run.py:_restore():1938] restore done
2023-07-25 21:08:23,776 INFO MainThread:41395 [wandb_run.py:_footer_history_summary_info():3377] rendering history
2023-07-25 21:08:23,777 INFO MainThread:41395 [wandb_run.py:_footer_history_summary_info():3409] rendering summary