Model parameters: d_model 4096 ffw_size 16384 kv_size 128 n_heads 32 n_layers 42
Megatron-DeepSpeed/pretrain_gpt.py --tensor-model-parallel-size 4 --pipeline-model-parallel-size 4 --num-layers 42 --hidden-size 4096 --num-attention-heads 32 --kv-channels 128 --ffn-hidden-size 16384 --seq-length 2048 --max-position-embeddings 2048 --micro-batch-size 1 --global-batch-size 512 --train-samples 1 --vocab-file gpt2/vocab.json --merge-file gpt2/merges.txt --clip-grad 1.0 --kill-switch-path kill-switch-8b7178b4bval --bf16 --optimizer adam --adam-beta1 0.9 --adam-beta2 0.999 --adam-eps 1e-8 --lr 2e-4 --min-lr 2e-5 --lr-decay-style cosine --lr-decay-samples 1 --lr-warmup-samples 0 --clip-grad 1.0 --weight-decay 1e-1 --override-lr-scheduler --reset-progress --no-load-optim --log-interval 10 --save-interval 5000 --eval-interval 1 --eval-iters 100 --eval-only true --tensorboard-dir tensorboard_8b7178b4bval --tensorboard-queue-size 5 --log-timers-to-tensorboard --log-batch-size-to-tensorboard --log-validation-ppl-to-tensorboard --save lm1-8b7-178b-c4-repetitions/8b7178b4b --load lm1-8b7-178b-c4-repetitions/8b7178b4b --train-weighted-split-paths-path train400m.txt --valid-weighted-split-paths-path val.txt --data-impl mmap --num-workers 0 --valid-num-workers 0 --deepspeed --deepspeed_config ds_configs/3583607.json --zero-stage 0
START 3583607: Thu 25 May 2023 01:34:29 PM EEST
0:
0:
0: ======================= ROCm System Management Interface =======================
0: ================================= Concise Info =================================
0: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU%
0: 0 46.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0%
0: 1 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0%
0: 2 45.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0%
0: 3 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0%
0: 4 44.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0%
0: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0%
0: 6 45.0c 82.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0%
0: 7 46.0c N/A 800Mhz
1600Mhz 0% auto 0.0W 0% 0% 0: ================================================================================ 0: ============================= End of ROCm SMI Log ============================== 19: 19: 19: ======================= ROCm System Management Interface ======================= 19: ================================= Concise Info ================================= 19: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 19: 0 41.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 19: 1 52.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 19: 2 37.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 19: 3 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 19: 4 45.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 19: 5 50.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 19: 6 40.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 19: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 19: ================================================================================ 19: ============================= End of ROCm SMI Log ============================== 22: 22: 22: ======================= ROCm System Management Interface ======================= 22: ================================= Concise Info ================================= 22: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 22: 0 46.0c 97.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 22: 1 50.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 22: 2 47.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 22: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 22: 4 42.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 22: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 22: 6 39.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 22: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 22: ================================================================================ 22: ============================= End of ROCm SMI Log ============================== 13: 13: 13: ======================= ROCm System Management Interface ======================= 13: ================================= Concise Info 
================================= 13: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 13: 0 45.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 13: 1 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 13: 2 43.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 13: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 13: 4 42.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 13: 5 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 13: 6 43.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 13: 7 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 13: ================================================================================ 13: ============================= End of ROCm SMI Log ============================== 9: 9: 9: ======================= ROCm System Management Interface ======================= 9: ================================= Concise Info ================================= 9: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 9: 0 46.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 9: 1 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 9: 2 42.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 9: 3 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 9: 4 42.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 9: 5 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 9: 6 42.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 9: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 9: ================================================================================ 9: ============================= End of ROCm SMI Log ============================== 30: 30: 30: ======================= ROCm System Management Interface ======================= 30: ================================= Concise Info ================================= 30: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 30: 0 43.0c 98.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 30: 1 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 30: 2 41.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 30: 3 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 30: 4 40.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 30: 5 43.0c N/A 
800Mhz 1600Mhz 0% auto 0.0W 0% 0% 30: 6 43.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 30: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 30: ================================================================================ 30: ============================= End of ROCm SMI Log ============================== 16: 16: 16: ======================= ROCm System Management Interface ======================= 16: ================================= Concise Info ================================= 16: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 16: 0 42.0c 96.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 16: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 16: 2 41.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 16: 3 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 16: 4 43.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 16: 5 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 16: 6 43.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 16: 7 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 16: ================================================================================ 16: ============================= End of ROCm SMI Log ============================== 17: 17: 17: ======================= ROCm System Management Interface ======================= 17: ================================= Concise Info ================================= 17: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 17: 0 43.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 17: 1 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 17: 2 38.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 17: 3 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 17: 4 46.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 17: 5 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 17: 6 38.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 17: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 17: ================================================================================ 17: ============================= End of ROCm SMI Log ============================== 10: 10: 10: ======================= ROCm System 
Management Interface ======================= 10: ================================= Concise Info ================================= 10: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 10: 0 46.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 10: 1 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 10: 2 43.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 10: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 10: 4 39.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 10: 5 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 10: 6 43.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 10: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 10: ================================================================================ 10: ============================= End of ROCm SMI Log ============================== 26: 26: 26: ======================= ROCm System Management Interface ======================= 26: ================================= Concise Info ================================= 26: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 26: 0 45.0c 97.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 26: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 26: 2 42.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 26: 3 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 26: 4 44.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 26: 5 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 26: 6 38.0c 82.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 26: 7 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 26: ================================================================================ 26: ============================= End of ROCm SMI Log ============================== 24: 24: 24: ======================= ROCm System Management Interface ======================= 24: ================================= Concise Info ================================= 24: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 24: 0 46.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 24: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 24: 2 47.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 24: 3 43.0c 
N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 24: 4 41.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 24: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 24: 6 40.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 24: 7 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 24: ================================================================================ 24: ============================= End of ROCm SMI Log ============================== 27: 27: 27: ======================= ROCm System Management Interface ======================= 27: ================================= Concise Info ================================= 27: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 27: 0 42.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 27: 1 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 27: 2 38.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 27: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 27: 4 48.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 27: 5 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 27: 6 42.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 27: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 27: ================================================================================ 27: ============================= End of ROCm SMI Log ============================== 7: 7: 7: ======================= ROCm System Management Interface ======================= 7: ================================= Concise Info ================================= 7: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 7: 0 46.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 1 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: 2 45.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 3 40.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: 4 41.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 5 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: 6 35.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 7 40.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: ================================================================================ 7: ============================= End of 
ROCm SMI Log ============================== 1: 1: 1: ======================= ROCm System Management Interface ======================= 1: ================================= Concise Info ================================= 1: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 1: 0 44.0c 97.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 1 40.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: 2 37.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 3 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: 4 40.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 5 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: 6 36.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: ================================================================================ 1: ============================= End of ROCm SMI Log ============================== 14: 14: 14: ======================= ROCm System Management Interface ======================= 14: ================================= Concise Info ================================= 14: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 14: 0 46.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 14: 1 40.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 14: 2 46.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 14: 3 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 14: 4 48.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 14: 5 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 14: 6 45.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 14: 7 52.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 14: ================================================================================ 14: ============================= End of ROCm SMI Log ============================== 12: 12: 12: ======================= ROCm System Management Interface ======================= 12: ================================= Concise Info ================================= 12: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 12: 0 46.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 12: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 
0.0W 0% 0% 12: 2 36.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 12: 3 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 12: 4 43.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 12: 5 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 12: 6 41.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 12: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 12: ================================================================================ 12: ============================= End of ROCm SMI Log ============================== 31: 31: 31: ======================= ROCm System Management Interface ======================= 31: ================================= Concise Info ================================= 31: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 31: 0 46.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 31: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 31: 2 37.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 31: 3 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 31: 4 44.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 31: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 31: 6 43.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 31: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 31: ================================================================================ 31: ============================= End of ROCm SMI Log ============================== 28: 28: 28: ======================= ROCm System Management Interface ======================= 28: ================================= Concise Info ================================= 28: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 28: 0 47.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 28: 1 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 28: 2 39.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 28: 3 38.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 28: 4 42.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 28: 5 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 28: 6 39.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 28: 7 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 28: 
================================================================================ 28: ============================= End of ROCm SMI Log ============================== 20: 20: 20: ======================= ROCm System Management Interface ======================= 20: ================================= Concise Info ================================= 20: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 20: 0 48.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 20: 1 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 20: 2 41.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 20: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 20: 4 46.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 20: 5 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 20: 6 44.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 20: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 20: ================================================================================ 20: ============================= End of ROCm SMI Log ============================== 5: 5: 5: ======================= ROCm System Management Interface ======================= 5: ================================= Concise Info ================================= 5: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 5: 0 38.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 1 51.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: 2 42.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 3 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: 4 40.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 5 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: 6 39.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 7 38.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: ================================================================================ 5: ============================= End of ROCm SMI Log ============================== 23: 23: 23: ======================= ROCm System Management Interface ======================= 23: ================================= Concise Info ================================= 23: GPU Temp AvgPwr SCLK 
MCLK Fan Perf PwrCap VRAM% GPU% 23: 0 46.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 23: 1 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 23: 2 40.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 23: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 23: 4 41.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 23: 5 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 23: 6 37.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 23: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 23: ================================================================================ 23: ============================= End of ROCm SMI Log ============================== 25: 25: 25: ======================= ROCm System Management Interface ======================= 25: ================================= Concise Info ================================= 25: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 25: 0 40.0c 96.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 25: 1 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 25: 2 44.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 25: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 25: 4 44.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 25: 5 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 25: 6 34.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 25: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 25: ================================================================================ 25: ============================= End of ROCm SMI Log ============================== 29: 29: 29: ======================= ROCm System Management Interface ======================= 29: ================================= Concise Info ================================= 29: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 29: 0 41.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 29: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 29: 2 38.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 29: 3 38.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 29: 4 41.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 29: 5 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 29: 6 41.0c 
84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 29: 7 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 29: ================================================================================ 29: ============================= End of ROCm SMI Log ============================== 3: 3: 3: ======================= ROCm System Management Interface ======================= 3: ================================= Concise Info ================================= 3: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 3: 0 50.0c 95.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: 2 37.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 3 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: 4 39.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 5 50.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: 6 39.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 7 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: ================================================================================ 3: ============================= End of ROCm SMI Log ============================== 21: 21: 21: ======================= ROCm System Management Interface ======================= 21: ================================= Concise Info ================================= 21: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 21: 0 44.0c 96.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 21: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 21: 2 41.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 21: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 21: 4 41.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 21: 5 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 21: 6 47.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 21: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 21: ================================================================================ 21: ============================= End of ROCm SMI Log ============================== 15: 15: 15: ======================= ROCm System Management Interface ======================= 15: 
================================= Concise Info ================================= 15: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 15: 0 47.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 15: 1 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 15: 2 40.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 15: 3 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 15: 4 39.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 15: 5 51.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 15: 6 45.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 15: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 15: ================================================================================ 15: ============================= End of ROCm SMI Log ============================== 11: 11: 11: ======================= ROCm System Management Interface ======================= 11: ================================= Concise Info ================================= 11: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 11: 0 45.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 11: 1 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 11: 2 40.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 11: 3 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 11: 4 44.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 11: 5 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 11: 6 37.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 11: 7 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 11: ================================================================================ 11: ============================= End of ROCm SMI Log ============================== 18: 18: 18: ======================= ROCm System Management Interface ======================= 18: ================================= Concise Info ================================= 18: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 18: 0 48.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 18: 1 54.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 18: 2 38.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 18: 3 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 18: 4 40.0c 
93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 18: 5 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 18: 6 38.0c 82.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 18: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 18: ================================================================================ 18: ============================= End of ROCm SMI Log ============================== 6: 6: 6: ======================= ROCm System Management Interface ======================= 6: ================================= Concise Info ================================= 6: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 6: 0 48.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 1 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: 2 39.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: 4 41.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 5 49.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: 6 43.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: ================================================================================ 6: ============================= End of ROCm SMI Log ============================== 2: 2: 2: ======================= ROCm System Management Interface ======================= 2: ================================= Concise Info ================================= 2: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2: 0 44.0c 96.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 1 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: 2 45.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 3 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: 4 40.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 5 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: 6 37.0c 98.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 7 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: ================================================================================ 2: ============================= End of ROCm SMI Log ============================== 4: 4: 4: 
======================= ROCm System Management Interface ======================= 4: ================================= Concise Info ================================= 4: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 4: 0 44.0c 96.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 1 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: 2 40.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: 4 50.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: 6 40.0c 95.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 7 40.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: ================================================================================ 4: ============================= End of ROCm SMI Log ============================== 8: 8: 8: ======================= ROCm System Management Interface ======================= 8: ================================= Concise Info ================================= 8: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 8: 0 43.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 8: 1 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 8: 2 40.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 8: 3 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 8: 4 43.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 8: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 8: 6 38.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 8: 7 50.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 8: ================================================================================ 8: ============================= End of ROCm SMI Log ============================== 27: Launching on nid006132 (27/32), master nid006105 port 9999, GPUs 8, CUDA: True 20: Launching on nid006125 (20/32), master nid006105 port 9999, GPUs 8, CUDA: True 17: Launching on nid006122 (17/32), master nid006105 port 9999, GPUs 8, CUDA: True 30: Launching on nid006135 (30/32), master nid006105 port 9999, GPUs 8, CUDA: True 13: Launching on nid006118 (13/32), master nid006105 port 9999, 
GPUs 8, CUDA: True
19: Launching on nid006124 (19/32), master nid006105 port 9999, GPUs 8, CUDA: True
24: Launching on nid006129 (24/32), master nid006105 port 9999, GPUs 8, CUDA: True
14: Launching on nid006119 (14/32), master nid006105 port 9999, GPUs 8, CUDA: True
22: Launching on nid006127 (22/32), master nid006105 port 9999, GPUs 8, CUDA: True
10: Launching on nid006115 (10/32), master nid006105 port 9999, GPUs 8, CUDA: True
23: Launching on nid006128 (23/32), master nid006105 port 9999, GPUs 8, CUDA: True
9: Launching on nid006114 (9/32), master nid006105 port 9999, GPUs 8, CUDA: True
28: Launching on nid006133 (28/32), master nid006105 port 9999, GPUs 8, CUDA: True
5: Launching on nid006110 (5/32), master nid006105 port 9999, GPUs 8, CUDA: True
1: Launching on nid006106 (1/32), master nid006105 port 9999, GPUs 8, CUDA: True
21: Launching on nid006126 (21/32), master nid006105 port 9999, GPUs 8, CUDA: True
12: Launching on nid006117 (12/32), master nid006105 port 9999, GPUs 8, CUDA: True
7: Launching on nid006112 (7/32), master nid006105 port 9999, GPUs 8, CUDA: True
31: Launching on nid006136 (31/32), master nid006105 port 9999, GPUs 8, CUDA: True
25: Launching on nid006130 (25/32), master nid006105 port 9999, GPUs 8, CUDA: True
16: Launching on nid006121 (16/32), master nid006105 port 9999, GPUs 8, CUDA: True
26: Launching on nid006131 (26/32), master nid006105 port 9999, GPUs 8, CUDA: True
3: Launching on nid006108 (3/32), master nid006105 port 9999, GPUs 8, CUDA: True
2: Launching on nid006107 (2/32), master nid006105 port 9999, GPUs 8, CUDA: True
4: Launching on nid006109 (4/32), master nid006105 port 9999, GPUs 8, CUDA: True
6: Launching on nid006111 (6/32), master nid006105 port 9999, GPUs 8, CUDA: True
8: Launching on nid006113 (8/32), master nid006105 port 9999, GPUs 8, CUDA: True
18: Launching on nid006123 (18/32), master nid006105 port 9999, GPUs 8, CUDA: True
11: Launching on nid006116 (11/32), master nid006105 port 9999, GPUs 8, CUDA: True
0: 
Launching on nid006105 (0/32), master nid006105 port 9999, GPUs 8, CUDA: True 15: Launching on nid006120 (15/32), master nid006105 port 9999, GPUs 8, CUDA: True 29: Launching on nid006134 (29/32), master nid006105 port 9999, GPUs 8, CUDA: True 0: using world size: 256, data-parallel-size: 16, tensor-model-parallel size: 4, pipeline-model-parallel size: 4 0: accumulate and all-reduce gradients in fp32 for bfloat16 data type. 0: using torch.bfloat16 for parameters ... 0: ------------------------ arguments ------------------------ 0: abort_on_unmet_fused_kernel_constraints ......... False 0: accumulate_allreduce_grads_in_fp32 .............. True 0: adam_beta1 ...................................... 0.9 0: adam_beta2 ...................................... 0.999 0: adam_eps ........................................ 1e-08 0: adlr_autoresume ................................. False 0: adlr_autoresume_interval ........................ 1000 0: apply_query_key_layer_scaling ................... True 0: apply_residual_connection_post_layernorm ........ False 0: attention_dropout ............................... 0.1 0: attention_softmax_in_fp32 ....................... False 0: bert_binary_head ................................ True 0: bert_load ....................................... None 0: bf16 ............................................ True 0: bias_dropout_fusion ............................. True 0: bias_gelu_fusion ................................ True 0: biencoder_projection_dim ........................ 0 0: biencoder_shared_query_context_model ............ False 0: block_data_path ................................. None 0: checkpoint_activations .......................... False 0: checkpoint_in_cpu ............................... False 0: checkpoint_num_layers ........................... 1 0: clip_grad ....................................... 1.0 0: codecarbon_dir .................................. None 0: consumed_train_samples .......................... 
0 0: consumed_train_tokens ........................... 0 0: consumed_valid_samples .......................... 0 0: contigious_checkpointing ........................ False 0: cpu_optimizer ................................... False 0: cpu_torch_adam .................................. False 0: curriculum_learning ............................. False 0: data_impl ....................................... mmap 0: data_parallel_size .............................. 16 0: data_path ....................................... None 0: dataloader_type ................................. single 0: DDP_impl ........................................ local 0: decoder_seq_length .............................. None 0: deepscale ....................................... False 0: deepscale_config ................................ None 0: deepspeed ....................................... True 0: deepspeed_activation_checkpointing .............. False 0: deepspeed_config ................................ ds_configs/3583607.json 0: deepspeed_mpi ................................... False 0: distribute_checkpointed_activations ............. False 0: distributed_backend ............................. nccl 0: embed_layernorm ................................. False 0: embedding_path .................................. None 0: encoder_seq_length .............................. 2048 0: eod_mask_loss ................................... False 0: eval_interval ................................... 1 0: eval_iters ...................................... 100 0: eval_only ....................................... True 0: evidence_data_path .............................. None 0: exit_duration_in_mins ........................... None 0: exit_interval ................................... None 0: ffn_hidden_size ................................. 16384 0: finetune ........................................ False 0: fp16 ............................................ False 0: fp16_lm_cross_entropy ........................... 
False 0: fp32_residual_connection ........................ False 0: gigaflos_no_embeds .............................. 0 0: global_batch_size ............................... 512 0: glu_activation .................................. None 0: hidden_dropout .................................. 0.1 0: hidden_size ..................................... 4096 0: hysteresis ...................................... 2 0: ict_head_size ................................... None 0: ict_load ........................................ None 0: img_dim ......................................... 224 0: indexer_batch_size .............................. 128 0: indexer_log_interval ............................ 1000 0: inference ....................................... False 0: init_method_std ................................. 0.02 0: init_method_xavier_uniform ...................... False 0: initial_loss_scale .............................. 4294967296 0: kill_switch_path ................................ kill-switch-8b7178b4bval 0: kv_channels ..................................... 128 0: layer_norm_fusion ............................... True 0: layernorm_epsilon ............................... 1e-05 0: lazy_mpu_init ................................... None 0: load ............................................ lm1-8b7-178b-c4-repetitions/8b7178b4b 0: local_rank ...................................... None 0: log_batch_size_to_tensorboard ................... True 0: log_interval .................................... 10 0: log_learning_rate_to_tensorboard ................ True 0: log_level ....................................... None 0: log_level_replica ............................... None 0: log_loss_scale_to_tensorboard ................... True 0: log_num_zeros_in_grad ........................... False 0: log_params_norm ................................. False 0: log_path ........................................ None 0: log_timers_to_tensorboard ....................... 
True 0: log_validation_ppl_to_tensorboard ............... True 0: loss_on_targets_only ............................ False 0: loss_scale ...................................... None 0: loss_scale_window ............................... 1000 0: lr .............................................. 0.0002 0: lr_decay_iters .................................. None 0: lr_decay_samples ................................ 1 0: lr_decay_style .................................. cosine 0: lr_decay_tokens ................................. None 0: lr_warmup_fraction .............................. None 0: lr_warmup_iters ................................. 0 0: lr_warmup_samples ............................... 0 0: make_vocab_size_divisible_by .................... 128 0: mask_prob ....................................... 0.15 0: masked_softmax_fusion ........................... True 0: max_position_embeddings ......................... 2048 0: mean_noise_span_length .......................... None 0: memory_centric_tiled_linear ..................... False 0: merge_file ...................................... gpt2/merges.txt 0: micro_batch_size ................................ 1 0: min_loss_scale .................................. 1.0 0: min_lr .......................................... 2e-05 0: mmap_warmup ..................................... False 0: no_load_optim ................................... True 0: no_load_rng ..................................... None 0: no_save_optim ................................... None 0: no_save_rng ..................................... None 0: noise_density ................................... None 0: num_attention_heads ............................. 32 0: num_channels .................................... 3 0: num_classes ..................................... 1000 0: num_layers ...................................... 42 0: num_layers_per_virtual_pipeline_stage ........... None 0: num_workers ..................................... 
0 0: onnx_safe ....................................... None 0: openai_gelu ..................................... False 0: optimizer ....................................... adam 0: optimizer_fusion ................................ True 0: override_lr_scheduler ........................... True 0: pad_vocab_size_to ............................... None 0: params_dtype .................................... torch.bfloat16 0: partition_activations ........................... False 0: patch_dim ....................................... 16 0: pipeline_model_parallel_size .................... 4 0: position_embedding_type ......................... PositionEmbeddingType.absolute 0: pp_partition_method ............................. None 0: profile_backward ................................ False 0: query_in_block_prob ............................. 0.1 0: rampup_batch_size ............................... None 0: rank ............................................ 0 0: remote_device ................................... none 0: reset_attention_mask ............................ False 0: reset_position_ids .............................. False 0: reset_progress .................................. True 0: retriever_report_topk_accuracies ................ [] 0: retriever_score_scaling ......................... False 0: retriever_seq_length ............................ 256 0: reweight_loss_based_on_position_frequency ....... False 0: sample_rate ..................................... 1.0 0: save ............................................ lm1-8b7-178b-c4-repetitions/8b7178b4b 0: save_interval ................................... 5000 0: scatter_gather_tensors_in_pipeline .............. True 0: scattered_embeddings ............................ False 0: seed ............................................ 1234 0: seq_length ...................................... 2048 0: sgd_momentum .................................... 0.9 0: short_seq_prob .................................. 
0.1 0: skip_train_iteration_range ...................... None 0: split ........................................... None 0: split_transformers .............................. False 0: sync_tp_duplicated_parameters ................... False 0: synchronize_each_layer .......................... False 0: tensor_model_parallel_size ...................... 4 0: tensorboard_dir ................................. tensorboard_8b7178b4bval 0: tensorboard_log_interval ........................ 1 0: tensorboard_queue_size .......................... 5 0: test_weighted_split_paths ....................... None 0: test_weighted_split_paths_path .................. None 0: tile_factor ..................................... 1 0: titles_data_path ................................ None 0: tokenizer_name_or_path .......................... None 0: tokenizer_type .................................. GPT2BPETokenizer 0: train_iters ..................................... None 0: train_samples ................................... 1 0: train_tokens .................................... None 0: train_weighted_split_names ...................... ['train'] 0: train_weighted_split_paths ...................... [['/scratch/project_462000119/data/c4_subsampled/gpt2tok_c4_en_400M_text_document']] 0: train_weighted_split_paths_path ................. None 0: train_weighted_split_splits ..................... [['0:1']] 0: train_weighted_split_weights .................... [['1.0']] 0: universal_checkpoint ............................ False 0: use_bnb_optimizer ............................... False 0: use_checkpoint_lr_scheduler ..................... False 0: use_contiguous_buffers_in_ddp ................... True 0: use_cpu_initialization .......................... None 0: use_one_sent_docs ............................... False 0: use_pin_memory .................................. False 0: valid_num_workers ............................... 0 0: valid_weighted_split_names ...................... 
['validation'] 0: valid_weighted_split_paths ...................... [['/scratch/project_462000119/data/c4_validation/gpt2tok_c4validation_rerun_text_document']] 0: valid_weighted_split_paths_path ................. None 0: valid_weighted_split_splits ..................... [['0:1']] 0: valid_weighted_split_weights .................... [['1.0']] 0: virtual_pipeline_model_parallel_size ............ None 0: vocab_extra_ids ................................. 0 0: vocab_file ...................................... gpt2/vocab.json 0: weight_decay .................................... 0.1 0: world_size ...................................... 256 0: zero_allgather_bucket_size ...................... 0.0 0: zero_contigious_gradients ....................... False 0: zero_reduce_bucket_size ......................... 0.0 0: zero_reduce_scatter ............................. False 0: zero_stage ...................................... 0 0: -------------------- end of arguments --------------------- 0: setting number of micro-batches to constant 32 0: > building GPT2BPETokenizer tokenizer ... 0: > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688) 0: DeepSpeed general environment info: 0: torch install path ............... ['/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch'] 0: torch version .................... 1.13.0+rocm5.2 0: torch cuda version ............... None 0: torch hip version ................ 5.2.21151-afdc89f8 0: nvcc version ..................... None 0: deepspeed install path ........... ['/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/deepspeed'] 0: deepspeed info ................... 0.7.5, unknown, unknown 0: deepspeed wheel compiled w. ...... torch 1.13, hip 5.1 31: > setting tensorboard ... 0: **** Git info for Megatron: git_hash=unknown git_branch=unknown **** 0: > initializing torch distributed ... 
0: [2023-05-25 13:37:21,763] [INFO] [comm.py:633:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 0: > initializing tensor model parallel with size 4 0: > initializing pipeline model parallel with size 4 0: > setting random seeds to 1234 ... 0: > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 3952 and data parallel seed: 1234 0: > compiling dataset index builder ... 0: make: Entering directory '/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/data' 0: make: Nothing to be done for 'default'. 0: make: Leaving directory '/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/data' 0: >>> done with dataset index builder. Compilation time: 0.096 seconds 0: > compiling and loading fused kernels ... 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_cuda.cpp -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_cuda.cpp [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_cuda_kernel.cu -> 
/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_hip_kernel.hip [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.h [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_hip.h [skipped, already hipified] 0: Total number of unsupported CUDA function calls: 0 0: 0: 0: Total number of replaced kernel launches: 67 0: ninja: no work to do. 0: >>> done with compiling and loading fused kernels. Compilation time: 27.189 seconds 0: time to initialize megatron (seconds): -0.168 0: [after megatron is initialized] datetime: 2023-05-25 13:37:51 0: building GPT model ... 
0: [2023-05-25 13:37:51,982] [INFO] [utils.py:827:see_memory_usage] Before Building Model 0: [2023-05-25 13:37:51,984] [INFO] [utils.py:828:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB 0: [2023-05-25 13:37:51,984] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 39.23 GB, percent = 7.8% 0: SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None 0: Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, 0: model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=0, data=8, model=0): 32, ProcessCoord(pipe=0, data=8, model=1): 33, ProcessCoord(pipe=0, data=8, model=2): 34, ProcessCoord(pipe=0, data=8, model=3): 35, ProcessCoord(pipe=0, data=9, model=0): 36, ProcessCoord(pipe=0, 
data=9, model=1): 37, ProcessCoord(pipe=0, data=9, model=2): 38, ProcessCoord(pipe=0, data=9, model=3): 39, ProcessCoord(pipe=0, data=10, model=0): 40, ProcessCoord(pipe=0, data=10, model=1): 41, ProcessCoord(pipe=0, data=10, model=2): 42, ProcessCoord(pipe=0, data=10, model=3): 43, ProcessCoord(pipe=0, data=11, model=0): 44, ProcessCoord(pipe=0, data=11, model=1): 45, ProcessCoord(pipe=0, data=11, model=2): 46, ProcessCoord( 0: pipe=0, data=11, model=3): 47, ProcessCoord(pipe=0, data=12, model=0): 48, ProcessCoord(pipe=0, data=12, model=1): 49, ProcessCoord(pipe=0, data=12, model=2): 50, ProcessCoord(pipe=0, data=12, model=3): 51, ProcessCoord(pipe=0, data=13, model=0): 52, ProcessCoord(pipe=0, data=13, model=1): 53, ProcessCoord(pipe=0, data=13, model=2): 54, ProcessCoord(pipe=0, data=13, model=3): 55, ProcessCoord(pipe=0, data=14, model=0): 56, ProcessCoord(pipe=0, data=14, model=1): 57, ProcessCoord(pipe=0, data=14, model=2): 58, ProcessCoord(pipe=0, data=14, model=3): 59, ProcessCoord(pipe=0, data=15, model=0): 60, ProcessCoord(pipe=0, data=15, model=1): 61, ProcessCoord(pipe=0, data=15, model=2): 62, ProcessCoord(pipe=0, data=15, model=3): 63, ProcessCoord(pipe=1, data=0, model=0): 64, ProcessCoord(pipe=1, data=0, model=1): 65, ProcessCoord(pipe=1, data=0, model=2): 66, ProcessCoord(pipe=1, data=0, model=3): 67, ProcessCoord(pipe=1, data=1, model=0): 68, ProcessCoord(pipe=1, data=1, model=1): 69, ProcessCoord(pipe=1, data=1, mo 0: del=2): 70, ProcessCoord(pipe=1, data=1, model=3): 71, ProcessCoord(pipe=1, data=2, model=0): 72, ProcessCoord(pipe=1, data=2, model=1): 73, ProcessCoord(pipe=1, data=2, model=2): 74, ProcessCoord(pipe=1, data=2, model=3): 75, ProcessCoord(pipe=1, data=3, model=0): 76, ProcessCoord(pipe=1, data=3, model=1): 77, ProcessCoord(pipe=1, data=3, model=2): 78, ProcessCoord(pipe=1, data=3, model=3): 79, ProcessCoord(pipe=1, data=4, model=0): 80, ProcessCoord(pipe=1, data=4, model=1): 81, ProcessCoord(pipe=1, data=4, model=2): 82, 
ProcessCoord(pipe=1, data=4, model=3): 83, ProcessCoord(pipe=1, data=5, model=0): 84, ProcessCoord(pipe=1, data=5, model=1): 85, ProcessCoord(pipe=1, data=5, model=2): 86, ProcessCoord(pipe=1, data=5, model=3): 87, ProcessCoord(pipe=1, data=6, model=0): 88, ProcessCoord(pipe=1, data=6, model=1): 89, ProcessCoord(pipe=1, data=6, model=2): 90, ProcessCoord(pipe=1, data=6, model=3): 91, ProcessCoord(pipe=1, data=7, model=0): 92, ProcessCoord(pipe=1, data=7, model=1): 93, ProcessCoord(pipe=1, da 0: ta=7, model=2): 94, ProcessCoord(pipe=1, data=7, model=3): 95, ProcessCoord(pipe=1, data=8, model=0): 96, ProcessCoord(pipe=1, data=8, model=1): 97, ProcessCoord(pipe=1, data=8, model=2): 98, ProcessCoord(pipe=1, data=8, model=3): 99, ProcessCoord(pipe=1, data=9, model=0): 100, ProcessCoord(pipe=1, data=9, model=1): 101, ProcessCoord(pipe=1, data=9, model=2): 102, ProcessCoord(pipe=1, data=9, model=3): 103, ProcessCoord(pipe=1, data=10, model=0): 104, ProcessCoord(pipe=1, data=10, model=1): 105, ProcessCoord(pipe=1, data=10, model=2): 106, ProcessCoord(pipe=1, data=10, model=3): 107, ProcessCoord(pipe=1, data=11, model=0): 108, ProcessCoord(pipe=1, data=11, model=1): 109, ProcessCoord(pipe=1, data=11, model=2): 110, ProcessCoord(pipe=1, data=11, model=3): 111, ProcessCoord(pipe=1, data=12, model=0): 112, ProcessCoord(pipe=1, data=12, model=1): 113, ProcessCoord(pipe=1, data=12, model=2): 114, ProcessCoord(pipe=1, data=12, model=3): 115, ProcessCoord(pipe=1, data=13, model=0): 116, ProcessCoord(pipe=1, data=13 0: , model=1): 117, ProcessCoord(pipe=1, data=13, model=2): 118, ProcessCoord(pipe=1, data=13, model=3): 119, ProcessCoord(pipe=1, data=14, model=0): 120, ProcessCoord(pipe=1, data=14, model=1): 121, ProcessCoord(pipe=1, data=14, model=2): 122, ProcessCoord(pipe=1, data=14, model=3): 123, ProcessCoord(pipe=1, data=15, model=0): 124, ProcessCoord(pipe=1, data=15, model=1): 125, ProcessCoord(pipe=1, data=15, model=2): 126, ProcessCoord(pipe=1, data=15, model=3): 127, 
ProcessCoord(pipe=2, data=0, model=0): 128, ProcessCoord(pipe=2, data=0, model=1): 129, ProcessCoord(pipe=2, data=0, model=2): 130, ProcessCoord(pipe=2, data=0, model=3): 131, ProcessCoord(pipe=2, data=1, model=0): 132, ProcessCoord(pipe=2, data=1, model=1): 133, ProcessCoord(pipe=2, data=1, model=2): 134, ProcessCoord(pipe=2, data=1, model=3): 135, ProcessCoord(pipe=2, data=2, model=0): 136, ProcessCoord(pipe=2, data=2, model=1): 137, ProcessCoord(pipe=2, data=2, model=2): 138, ProcessCoord(pipe=2, data=2, model=3): 139, ProcessCoord(pipe=2, data=3, 0: model=0): 140, ProcessCoord(pipe=2, data=3, model=1): 141, ProcessCoord(pipe=2, data=3, model=2): 142, ProcessCoord(pipe=2, data=3, model=3): 143, ProcessCoord(pipe=2, data=4, model=0): 144, ProcessCoord(pipe=2, data=4, model=1): 145, ProcessCoord(pipe=2, data=4, model=2): 146, ProcessCoord(pipe=2, data=4, model=3): 147, ProcessCoord(pipe=2, data=5, model=0): 148, ProcessCoord(pipe=2, data=5, model=1): 149, ProcessCoord(pipe=2, data=5, model=2): 150, ProcessCoord(pipe=2, data=5, model=3): 151, ProcessCoord(pipe=2, data=6, model=0): 152, ProcessCoord(pipe=2, data=6, model=1): 153, ProcessCoord(pipe=2, data=6, model=2): 154, ProcessCoord(pipe=2, data=6, model=3): 155, ProcessCoord(pipe=2, data=7, model=0): 156, ProcessCoord(pipe=2, data=7, model=1): 157, ProcessCoord(pipe=2, data=7, model=2): 158, ProcessCoord(pipe=2, data=7, model=3): 159, ProcessCoord(pipe=2, data=8, model=0): 160, ProcessCoord(pipe=2, data=8, model=1): 161, ProcessCoord(pipe=2, data=8, model=2): 162, ProcessCoord(pipe=2, data=8, model=3): 16 0: 3, ProcessCoord(pipe=2, data=9, model=0): 164, ProcessCoord(pipe=2, data=9, model=1): 165, ProcessCoord(pipe=2, data=9, model=2): 166, ProcessCoord(pipe=2, data=9, model=3): 167, ProcessCoord(pipe=2, data=10, model=0): 168, ProcessCoord(pipe=2, data=10, model=1): 169, ProcessCoord(pipe=2, data=10, model=2): 170, ProcessCoord(pipe=2, data=10, model=3): 171, ProcessCoord(pipe=2, data=11, model=0): 172, 
ProcessCoord(pipe=2, data=11, model=1): 173, ProcessCoord(pipe=2, data=11, model=2): 174, ProcessCoord(pipe=2, data=11, model=3): 175, ProcessCoord(pipe=2, data=12, model=0): 176, ProcessCoord(pipe=2, data=12, model=1): 177, ProcessCoord(pipe=2, data=12, model=2): 178, ProcessCoord(pipe=2, data=12, model=3): 179, ProcessCoord(pipe=2, data=13, model=0): 180, ProcessCoord(pipe=2, data=13, model=1): 181, ProcessCoord(pipe=2, data=13, model=2): 182, ProcessCoord(pipe=2, data=13, model=3): 183, ProcessCoord(pipe=2, data=14, model=0): 184, ProcessCoord(pipe=2, data=14, model=1): 185, ProcessCoord(pipe=2, data=14, model 0: =2): 186, ProcessCoord(pipe=2, data=14, model=3): 187, ProcessCoord(pipe=2, data=15, model=0): 188, ProcessCoord(pipe=2, data=15, model=1): 189, ProcessCoord(pipe=2, data=15, model=2): 190, ProcessCoord(pipe=2, data=15, model=3): 191, ProcessCoord(pipe=3, data=0, model=0): 192, ProcessCoord(pipe=3, data=0, model=1): 193, ProcessCoord(pipe=3, data=0, model=2): 194, ProcessCoord(pipe=3, data=0, model=3): 195, ProcessCoord(pipe=3, data=1, model=0): 196, ProcessCoord(pipe=3, data=1, model=1): 197, ProcessCoord(pipe=3, data=1, model=2): 198, ProcessCoord(pipe=3, data=1, model=3): 199, ProcessCoord(pipe=3, data=2, model=0): 200, ProcessCoord(pipe=3, data=2, model=1): 201, ProcessCoord(pipe=3, data=2, model=2): 202, ProcessCoord(pipe=3, data=2, model=3): 203, ProcessCoord(pipe=3, data=3, model=0): 204, ProcessCoord(pipe=3, data=3, model=1): 205, ProcessCoord(pipe=3, data=3, model=2): 206, ProcessCoord(pipe=3, data=3, model=3): 207, ProcessCoord(pipe=3, data=4, model=0): 208, ProcessCoord(pipe=3, data=4, model=1): 20 0: 9, ProcessCoord(pipe=3, data=4, model=2): 210, ProcessCoord(pipe=3, data=4, model=3): 211, ProcessCoord(pipe=3, data=5, model=0): 212, ProcessCoord(pipe=3, data=5, model=1): 213, ProcessCoord(pipe=3, data=5, model=2): 214, ProcessCoord(pipe=3, data=5, model=3): 215, ProcessCoord(pipe=3, data=6, model=0): 216, ProcessCoord(pipe=3, data=6, 
model=1): 217, ProcessCoord(pipe=3, data=6, model=2): 218, ProcessCoord(pipe=3, data=6, model=3): 219, ProcessCoord(pipe=3, data=7, model=0): 220, ProcessCoord(pipe=3, data=7, model=1): 221, ProcessCoord(pipe=3, data=7, model=2): 222, ProcessCoord(pipe=3, data=7, model=3): 223, ProcessCoord(pipe=3, data=8, model=0): 224, ProcessCoord(pipe=3, data=8, model=1): 225, ProcessCoord(pipe=3, data=8, model=2): 226, ProcessCoord(pipe=3, data=8, model=3): 227, ProcessCoord(pipe=3, data=9, model=0): 228, ProcessCoord(pipe=3, data=9, model=1): 229, ProcessCoord(pipe=3, data=9, model=2): 230, ProcessCoord(pipe=3, data=9, model=3): 231, ProcessCoord(pipe=3, data=10, model=0): 232, ProcessC 0: oord(pipe=3, data=10, model=1): 233, ProcessCoord(pipe=3, data=10, model=2): 234, ProcessCoord(pipe=3, data=10, model=3): 235, ProcessCoord(pipe=3, data=11, model=0): 236, ProcessCoord(pipe=3, data=11, model=1): 237, ProcessCoord(pipe=3, data=11, model=2): 238, ProcessCoord(pipe=3, data=11, model=3): 239, ProcessCoord(pipe=3, data=12, model=0): 240, ProcessCoord(pipe=3, data=12, model=1): 241, ProcessCoord(pipe=3, data=12, model=2): 242, ProcessCoord(pipe=3, data=12, model=3): 243, ProcessCoord(pipe=3, data=13, model=0): 244, ProcessCoord(pipe=3, data=13, model=1): 245, ProcessCoord(pipe=3, data=13, model=2): 246, ProcessCoord(pipe=3, data=13, model=3): 247, ProcessCoord(pipe=3, data=14, model=0): 248, ProcessCoord(pipe=3, data=14, model=1): 249, ProcessCoord(pipe=3, data=14, model=2): 250, ProcessCoord(pipe=3, data=14, model=3): 251, ProcessCoord(pipe=3, data=15, model=0): 252, ProcessCoord(pipe=3, data=15, model=1): 253, ProcessCoord(pipe=3, data=15, model=2): 254, ProcessCoord(pipe=3, data=15, model=3): 25 0: 5} 0: [2023-05-25 13:37:53,694] [INFO] [module.py:366:_partition_layers] Partitioning pipeline stages with method type:transformer 0: stage=0 layers=14 0: 0: _to_float16 0: 1: EmbeddingPipe 0: 2: 0: 3: ParallelTransformerLayerPipe 0: 4: ParallelTransformerLayerPipe 0: 5: 
ParallelTransformerLayerPipe 0: 6: ParallelTransformerLayerPipe 0: 7: ParallelTransformerLayerPipe 0: 8: ParallelTransformerLayerPipe 0: 9: ParallelTransformerLayerPipe 0: 10: ParallelTransformerLayerPipe 0: 11: ParallelTransformerLayerPipe 0: 12: ParallelTransformerLayerPipe 0: 13: ParallelTransformerLayerPipe 0: stage=1 layers=11 0: 14: ParallelTransformerLayerPipe 0: 15: ParallelTransformerLayerPipe 0: 16: ParallelTransformerLayerPipe 0: 17: ParallelTransformerLayerPipe 0: 18: ParallelTransformerLayerPipe 0: 19: ParallelTransformerLayerPipe 0: 20: ParallelTransformerLayerPipe 0: 21: ParallelTransformerLayerPipe 0: 22: ParallelTransformerLayerPipe 0: 23: ParallelTransformerLayerPipe 0: 24: ParallelTransformerLayerPipe 0: stage=2 layers=11 0: 25: ParallelTransformerLayerPipe 0: 26: ParallelTransformerLayerPipe 0: 27: ParallelTransformerLayerPipe 0: 28: ParallelTransformerLayerPipe 0: 29: ParallelTransformerLayerPipe 0: 30: ParallelTransformerLayerPipe 0: 31: ParallelTransformerLayerPipe 0: 32: ParallelTransformerLayerPipe 0: 33: ParallelTransformerLayerPipe 0: 34: ParallelTransformerLayerPipe 0: 35: ParallelTransformerLayerPipe 0: stage=3 layers=13 0: 36: ParallelTransformerLayerPipe 0: 37: ParallelTransformerLayerPipe 0: 38: ParallelTransformerLayerPipe 0: 39: ParallelTransformerLayerPipe 0: 40: ParallelTransformerLayerPipe 0: 41: ParallelTransformerLayerPipe 0: 42: ParallelTransformerLayerPipe 0: 43: ParallelTransformerLayerPipe 0: 44: ParallelTransformerLayerPipe 0: 45: undo 0: 46: MixedFusedLayerNorm 0: 47: EmbeddingPipe 0: 48: float16_to_fp32 0: loss: CrossEntropy 0: [2023-05-25 13:37:55,326] [INFO] [utils.py:827:see_memory_usage] After Building Model 0: [2023-05-25 13:37:55,327] [INFO] [utils.py:828:see_memory_usage] MA 1.16 GB Max_MA 1.16 GB CA 1.19 GB Max_CA 1 GB 0: [2023-05-25 13:37:55,327] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 39.98 GB, percent = 7.9% 0: setting training iterations to 0 0: > learning rate decay style: cosine 
0: DeepSpeed is enabled. 0: [2023-05-25 13:37:55,329] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.5, git-hash=unknown, git-branch=unknown 0: [2023-05-25 13:37:56,124] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 0: [2023-05-25 13:37:56,125] [INFO] [logging.py:68:log_dist] [Rank 0] Removing param_group that has no 'params' in the client Optimizer 0: [2023-05-25 13:37:56,125] [INFO] [logging.py:68:log_dist] [Rank 0] Using client Optimizer as basic optimizer 0: [2023-05-25 13:37:56,128] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam 0: [2023-05-25 13:37:56,128] [INFO] [logging.py:68:log_dist] [Rank 0] Creating BF16 optimizer 8: ninja: no work to do. 8: Time to load utils op: 0.30178356170654297 seconds 0: [2023-05-25 13:37:56,266] [INFO] [utils.py:827:see_memory_usage] begin bf16_optimizer 0: [2023-05-25 13:37:56,267] [INFO] [utils.py:828:see_memory_usage] MA 1.15 GB Max_MA 1.18 GB CA 1.21 GB Max_CA 1 GB 0: [2023-05-25 13:37:56,267] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.57 GB, percent = 8.1% 4: ninja: no work to do. 
27: Time to load utils op: 0.23575830459594727 seconds 25: Time to load utils op: 0.23824620246887207 seconds 29: Time to load utils op: 0.23657011985778809 seconds 31: Time to load utils op: 0.23646783828735352 seconds 4: Time to load utils op: 0.2690882682800293 seconds 19: Time to load utils op: 0.438152551651001 seconds 27: Time to load utils op: 0.5091795921325684 seconds 27: Time to load utils op: 0.5093820095062256 seconds 4: Time to load utils op: 0.4035837650299072 seconds 4: Time to load utils op: 0.40418434143066406 seconds 4: Time to load utils op: 0.4037165641784668 seconds 4: Time to load utils op: 0.4037313461303711 seconds 4: Time to load utils op: 0.20209789276123047 seconds 4: Time to load utils op: 0.20212411880493164 seconds 4: Time to load utils op: 0.5114321708679199 seconds 8: Time to load utils op: 0.7036423683166504 seconds 8: Time to load utils op: 0.7027533054351807 seconds 27: Time to load utils op: 0.4047553539276123 seconds 31: Time to load utils op: 0.4048454761505127 seconds 27: Time to load utils op: 0.4050273895263672 seconds 27: Time to load utils op: 0.5087499618530273 seconds 25: Time to load utils op: 0.4051194190979004 seconds 25: Time to load utils op: 0.5113706588745117 seconds 29: Time to load utils op: 0.40415000915527344 seconds 31: Time to load utils op: 0.5094578266143799 seconds 19: Time to load utils op: 0.7111871242523193 seconds 31: Time to load utils op: 0.4053914546966553 seconds 25: Time to load utils op: 0.40567612648010254 seconds 25: Time to load utils op: 0.5124614238739014 seconds 29: Time to load utils op: 0.5098941326141357 seconds 25: Time to load utils op: 0.5127496719360352 seconds 29: Time to load utils op: 0.4054267406463623 seconds 1: Time to load utils op: 0.2576630115509033 seconds 1: Time to load utils op: 0.45353198051452637 seconds 1: Time to load utils op: 0.257843017578125 seconds 1: Time to load utils op: 0.45613670349121094 seconds 1: Time to load utils op: 0.562053918838501 seconds 1: Time
to load utils op: 0.4560363292694092 seconds 1: Time to load utils op: 0.5620720386505127 seconds 1: Time to load utils op: 0.45362186431884766 seconds 12: Time to load utils op: 0.7370617389678955 seconds 12: Time to load utils op: 0.660893440246582 seconds 12: Time to load utils op: 0.7377474308013916 seconds 19: Time to load utils op: 0.10208749771118164 seconds 19: Time to load utils op: 0.10207986831665039 seconds 31: Time to load utils op: 0.6068530082702637 seconds 0: Time to load utils op: 0.4332597255706787 seconds 0: Time to load utils op: 0.5711750984191895 seconds 0: Time to load utils op: 0.45577406883239746 seconds 0: Time to load utils op: 0.4585137367248535 seconds 31: Time to load utils op: 0.6070947647094727 seconds 0: Time to load utils op: 0.4612240791320801 seconds 10: Time to load utils op: 0.6634395122528076 seconds 10: Time to load utils op: 0.6613531112670898 seconds 10: Time to load utils op: 0.7375209331512451 seconds 10: Time to load utils op: 0.737938642501831 seconds 21: Time to load utils op: 0.6615009307861328 seconds 21: Time to load utils op: 0.10659575462341309 seconds 21: Time to load utils op: 0.10660076141357422 seconds 21: Time to load utils op: 0.6614029407501221 seconds 6: Time to load utils op: 0.25034117698669434 seconds 6: Time to load utils op: 0.45963382720947266 seconds 6: Time to load utils op: 0.25041985511779785 seconds 6: Time to load utils op: 0.45969414710998535 seconds 6: Time to load utils op: 0.4598839282989502 seconds 6: Time to load utils op: 0.4597182273864746 seconds 6: Time to load utils op: 0.5618906021118164 seconds 6: Time to load utils op: 0.5621991157531738 seconds 11: Time to load utils op: 0.7446906566619873 seconds 11: Time to load utils op: 0.7448182106018066 seconds 2: Time to load utils op: 0.46088242530822754 seconds 2: Time to load utils op: 0.25507473945617676 seconds 2: Time to load utils op: 0.4611179828643799 seconds 2: Time to load utils op: 0.4609088897705078 seconds 2: Time to load utils
op: 0.5657992362976074 secondsTime to load utils op: 0.2545464038848877 seconds 2: Time to load utils op: 0.5657992362976074 seconds 2: 2: Time to load utils op: 0.46126651763916016 seconds 5: Time to load utils op: 0.25250864028930664 seconds 5: Time to load utils op: 0.2522706985473633 seconds 5: Time to load utils op: 0.4619481563568115 seconds 5: Time to load utils op: 0.4620935916900635 seconds 5: Time to load utils op: 0.4618828296661377 secondsTime to load utils op: 0.4619159698486328 seconds 5: 5: Time to load utils op: 0.561814546585083 seconds 5: Time to load utils op: 0.561821699142456 seconds 21: Time to load utils op: 0.7082655429840088 seconds 7: Time to load utils op: 0.2521176338195801 seconds 7: Time to load utils op: 0.46318626403808594 secondsTime to load utils op: 0.5657186508178711 secondsTime to load utils op: 0.25212836265563965 seconds 7: 7: 7: Time to load utils op: 0.4626500606536865 seconds 7: Time to load utils op: 0.46317338943481445 secondsTime to load utils op: 0.4628410339355469 seconds 7: 7: Time to load utils op: 0.5657434463500977 seconds 14: Time to load utils op: 0.7455112934112549 seconds 14: Time to load utils op: 0.7455756664276123 seconds 21: Time to load utils op: 0.7084395885467529 seconds 9: Time to load utils op: 0.7485213279724121 seconds 9: Time to load utils op: 0.7485222816467285 seconds 29: Time to load utils op: 0.6059412956237793 seconds 21: Time to load utils op: 0.6028850078582764 seconds 21: Time to load utils op: 0.603020429611206 seconds 29: Time to load utils op: 0.6060101985931396 seconds 13: Time to load utils op: 0.7481284141540527 secondsTime to load utils op: 0.7480945587158203 seconds 13: 15: Time to load utils op: 0.7479209899902344 seconds 15: Time to load utils op: 0.7480747699737549 seconds 3: Time to load utils op: 0.26064062118530273 seconds 3: Time to load utils op: 0.26056909561157227 seconds 3: Time to load utils op: 0.4670083522796631 seconds 3: Time to load utils op: 0.5691032409667969 
secondsTime to load utils op: 0.569145917892456 seconds 3: 3: Time to load utils op: 0.4664738178253174 seconds 3: Time to load utils op: 0.46637892723083496 seconds 3: Time to load utils op: 0.46645474433898926 seconds 24: Time to load utils op: 0.5560553073883057 secondsTime to load utils op: 0.5558757781982422 secondsTime to load utils op: 0.4384150505065918 seconds 24: 24: Time to load utils op: 0.43471837043762207 seconds 24: 19: Time to load utils op: 0.6093926429748535 seconds 19: Time to load utils op: 0.6096105575561523 seconds 18: Time to load utils op: 0.12158584594726562 seconds 18: Time to load utils op: 0.12160086631774902 seconds 18: Time to load utils op: 0.6143319606781006 seconds 18: Time to load utils op: 0.7552504539489746 seconds 18: Time to load utils op: 0.6144566535949707 seconds 18: Time to load utils op: 0.7540421485900879 seconds 22: Time to load utils op: 0.12187433242797852 secondsTime to load utils op: 0.12182140350341797 seconds 22: 22: Time to load utils op: 0.7060589790344238 seconds 22: Time to load utils op: 0.6143419742584229 secondsTime to load utils op: 0.6147100925445557 secondsTime to load utils op: 0.7060697078704834 seconds 22: 22: 26: Time to load utils op: 0.44120168685913086 seconds 26: Time to load utils op: 0.5481042861938477 secondsTime to load utils op: 0.5480649471282959 seconds 26: 26: Time to load utils op: 0.589282751083374 seconds 26: Time to load utils op: 0.4412698745727539 seconds 26: Time to load utils op: 0.5893232822418213 seconds 8: Time to load utils op: 0.7042996883392334 seconds 20: Time to load utils op: 0.1278538703918457 seconds 20: Time to load utils op: 0.1278393268585205 seconds 20: Time to load utils op: 0.620067834854126 seconds 20: Time to load utils op: 0.6201231479644775 seconds 20: Time to load utils op: 0.7358834743499756 seconds 20: Time to load utils op: 0.7359130382537842 seconds 23: Time to load utils op: 0.13059258460998535 secondsTime to load utils op: 0.13062262535095215 seconds 23: 
23: Time to load utils op: 0.6215388774871826 seconds 23: Time to load utils op: 0.6220412254333496 seconds 23: Time to load utils op: 0.7129933834075928 seconds 9: Time to load utils op: 0.7032003402709961 seconds 23: Time to load utils op: 0.7130067348480225 seconds 12: Time to load utils op: 0.7025938034057617 seconds 9: Time to load utils op: 0.7030534744262695 seconds 8: Time to load utils op: 0.7033205032348633 seconds 16: Time to load utils op: 0.13652253150939941 secondsTime to load utils op: 0.13122868537902832 seconds 16: 16: Time to load utils op: 0.6476225852966309 secondsTime to load utils op: 0.6473546028137207 seconds 16: 13: Time to load utils op: 0.7033224105834961 seconds 11: Time to load utils op: 0.7037146091461182 seconds 16: Time to load utils op: 0.6476404666900635 seconds 16: Time to load utils op: 0.6138520240783691 secondsTime to load utils op: 0.613917350769043 seconds 16: 16: Time to load utils op: 0.647650957107544 seconds 15: Time to load utils op: 0.7038397789001465 seconds 12: Time to load utils op: 0.7036304473876953 seconds 13: Time to load utils op: 0.7037298679351807 seconds 15: Time to load utils op: 0.7037575244903564 seconds 11: Time to load utils op: 0.70395827293396 seconds 14: Time to load utils op: 0.7038009166717529 seconds 24: Time to load utils op: 0.202284574508667 seconds 14: Time to load utils op: 0.7040200233459473 seconds 8: Time to load utils op: 0.8194050788879395 seconds 17: Time to load utils op: 0.14082074165344238 seconds 17: Time to load utils op: 0.1407914161682129 seconds 17: Time to load utils op: 0.6432876586914062 secondsTime to load utils op: 0.6432888507843018 seconds 17: 17: Time to load utils op: 0.6433193683624268 secondsTime to load utils op: 0.6123011112213135 secondsTime to load utils op: 0.6431088447570801 seconds 17: 17: 17: Time to load utils op: 0.6123988628387451 seconds 8: Time to load utils op: 0.0005354881286621094 seconds 8: Time to load utils op: 0.0005872249603271484 seconds 8: Time 
to load utils op: 0.0005140304565429688 seconds 8: Time to load utils op: 0.0004918575286865234 seconds 8: Time to load utils op: 0.0008852481842041016 secondsTime to load utils op: 0.0008988380432128906 seconds 8: 31: Time to load utils op: 0.0005307197570800781 seconds 31: Time to load utils op: 0.0005800724029541016 seconds 31: Time to load utils op: 0.0005927085876464844 seconds 31: Time to load utils op: 0.0006144046783447266 secondsTime to load utils op: 0.0006015300750732422 seconds 31: 31: Time to load utils op: 0.0006074905395507812 seconds 4: Time to load utils op: 0.0005595684051513672 seconds 4: Time to load utils op: 0.0006244182586669922 secondsTime to load utils op: 0.0006279945373535156 secondsTime to load utils op: 0.0006160736083984375 seconds 4: Time to load utils op: 0.0006418228149414062 seconds 4: 4: 29: Time to load utils op: 0.00036144256591796875 seconds 4: Time to load utils op: 0.0006935596466064453 seconds 4: Time to load utils op: 0.0006651878356933594 seconds 4: Time to load utils op: 0.0005555152893066406 seconds 29: Time to load utils op: 0.0005433559417724609 seconds 19: Time to load utils op: 0.0005021095275878906 seconds 19: Time to load utils op: 0.0005240440368652344 seconds 27: Time to load utils op: 0.0005354881286621094 seconds 25: Time to load utils op: 0.0005118846893310547 seconds 27: Time to load utils op: 0.0005426406860351562 secondsTime to load utils op: 0.0005259513854980469 seconds 27: 25: Time to load utils op: 0.00041222572326660156 seconds 19: Time to load utils op: 0.0005705356597900391 seconds 29: Time to load utils op: 0.000476837158203125 seconds 27: Time to load utils op: 0.0006022453308105469 seconds 27: Time to load utils op: 0.0005571842193603516 seconds 27: Time to load utils op: 0.0005903244018554688 seconds 29: Time to load utils op: 0.0004334449768066406 seconds 29: Time to load utils op: 0.0004489421844482422 seconds 25: Time to load utils op: 0.0005373954772949219 seconds 25: Time to load utils op: 
0.0005571842193603516 seconds 25: Time to load utils op: 0.0005788803100585938 seconds 25: Time to load utils op: 0.000576019287109375 seconds 29: Time to load utils op: 0.0004904270172119141 seconds 19: Time to load utils op: 0.0004885196685791016 seconds 19: Time to load utils op: 0.0004863739013671875 seconds 19: Time to load utils op: 0.0005009174346923828 seconds 30: Time to load utils op: 0.4806244373321533 secondsTime to load utils op: 0.480745792388916 seconds 30: 30: Time to load utils op: 0.5842585563659668 seconds 28: Time to load utils op: 0.5876946449279785 seconds 28: Time to load utils op: 0.6461985111236572 seconds 28: Time to load utils op: 0.5877132415771484 secondsTime to load utils op: 0.6461911201477051 seconds 28: 28: Time to load utils op: 0.4819180965423584 secondsTime to load utils op: 0.48189759254455566 seconds 28: 30: Time to load utils op: 0.6484808921813965 seconds 30: Time to load utils op: 0.5842554569244385 seconds 30: Time to load utils op: 0.6485540866851807 seconds 0: [2023-05-25 13:37:56,818] [INFO] [utils.py:827:see_memory_usage] before initializing group 0 0: [2023-05-25 13:37:56,819] [INFO] [utils.py:828:see_memory_usage] MA 1.15 GB Max_MA 1.15 GB CA 1.21 GB Max_CA 1 GB 0: [2023-05-25 13:37:56,819] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.57 GB, percent = 8.1% 11: Time to load utils op: 0.0004999637603759766 seconds 11: Time to load utils op: 0.0004982948303222656 seconds 11: Time to load utils op: 0.0005078315734863281 seconds 11: Time to load utils op: 0.00044846534729003906 seconds 26: Time to load utils op: 0.0005154609680175781 seconds 26: Time to load utils op: 0.0005376338958740234 secondsTime to load utils op: 0.0005030632019042969 seconds 26: 26: Time to load utils op: 0.0008871555328369141 seconds 26: Time to load utils op: 0.0008716583251953125 seconds 26: Time to load utils op: 0.0009043216705322266 seconds 0: Time to load utils op: 0.0005297660827636719 seconds 0: Time to load utils 
op: 0.0005872249603271484 seconds 0: Time to load utils op: 0.0005693435668945312 seconds 14: Time to load utils op: 0.00042724609375 seconds 14: Time to load utils op: 0.00031876564025878906 seconds 14: Time to load utils op: 0.0003883838653564453 seconds 22: Time to load utils op: 0.0005276203155517578 seconds 14: Time to load utils op: 0.0005440711975097656 seconds 22: Time to load utils op: 0.0005440711975097656 seconds 22: Time to load utils op: 0.0005826950073242188 seconds 22: Time to load utils op: 0.0005676746368408203 seconds 22: Time to load utils op: 0.0005846023559570312 seconds 0: Time to load utils op: 0.0005750656127929688 seconds 22: Time to load utils op: 0.0006031990051269531 seconds 7: Time to load utils op: 0.000568389892578125 seconds 7: Time to load utils op: 0.0005848407745361328 seconds 7: Time to load utils op: 0.00041174888610839844 secondsTime to load utils op: 0.00041103363037109375 seconds 7: 7: Time to load utils op: 0.0006411075592041016 secondsTime to load utils op: 0.0006239414215087891 secondsTime to load utils op: 0.0006394386291503906 seconds 7: 7: 7: Time to load utils op: 0.0006632804870605469 seconds 1: Time to load utils op: 0.0005211830139160156 seconds 1: Time to load utils op: 0.0005395412445068359 secondsTime to load utils op: 0.0005125999450683594 seconds 1: 1: Time to load utils op: 0.0005517005920410156 seconds 1: Time to load utils op: 0.0005724430084228516 seconds 1: Time to load utils op: 0.0006003379821777344 seconds 1: Time to load utils op: 0.0006380081176757812 seconds 1: Time to load utils op: 0.0006444454193115234 seconds 5: Time to load utils op: 0.0004296302795410156 seconds 5: Time to load utils op: 0.00043320655822753906 seconds 5: Time to load utils op: 0.0004394054412841797 seconds 5: Time to load utils op: 0.0005700588226318359 seconds 5: Time to load utils op: 0.0004134178161621094 seconds 5: Time to load utils op: 0.0006062984466552734 secondsTime to load utils op: 0.0005769729614257812 seconds 5: 5: 
Time to load utils op: 0.0006070137023925781 seconds 15: Time to load utils op: 0.00045561790466308594 seconds 15: Time to load utils op: 0.0005035400390625 seconds 16: Time to load utils op: 0.0005047321319580078 seconds 15: Time to load utils op: 0.0006430149078369141 seconds 16: Time to load utils op: 0.0004985332489013672 seconds 15: Time to load utils op: 0.0005288124084472656 seconds 16: Time to load utils op: 0.0004291534423828125 seconds 16: Time to load utils op: 0.0004525184631347656 secondsTime to load utils op: 0.00044345855712890625 secondsTime to load utils op: 0.0004315376281738281 seconds 16: 16: 16: Time to load utils op: 0.0005676746368408203 seconds 16: Time to load utils op: 0.0005621910095214844 seconds 18: Time to load utils op: 0.0005075931549072266 secondsTime to load utils op: 0.0004265308380126953 seconds 18: Time to load utils op: 0.0004279613494873047 seconds 18: 18: Time to load utils op: 0.0005202293395996094 seconds 18: Time to load utils op: 0.0005521774291992188 seconds 18: Time to load utils op: 0.00040721893310546875 seconds 17: Time to load utils op: 0.0009181499481201172 seconds 17: Time to load utils op: 0.0012173652648925781 seconds 17: Time to load utils op: 0.0011553764343261719 seconds 17: Time to load utils op: 0.0011568069458007812 seconds 17: Time to load utils op: 0.00116729736328125 seconds 17: Time to load utils op: 0.0011680126190185547 seconds 17: Time to load utils op: 0.0011627674102783203 seconds 17: Time to load utils op: 0.0012166500091552734 seconds 28: Time to load utils op: 0.0005033016204833984 seconds 28: Time to load utils op: 0.00052642822265625 seconds 28: Time to load utils op: 0.0004949569702148438 seconds 28: Time to load utils op: 0.0005240440368652344 seconds 28: Time to load utils op: 0.0005254745483398438 seconds 28: Time to load utils op: 0.0005373954772949219 seconds 21: Time to load utils op: 0.0005102157592773438 seconds 21: Time to load utils op: 0.0005254745483398438 seconds 21: Time to 
load utils op: 0.0005786418914794922 seconds 21: Time to load utils op: 0.0005681514739990234 seconds 21: Time to load utils op: 0.0006284713745117188 seconds 21: Time to load utils op: 0.0006606578826904297 secondsTime to load utils op: 0.0006785392761230469 seconds 21: Time to load utils op: 0.0006337165832519531 seconds 21: 12: Time to load utils op: 0.0005335807800292969 seconds 12: Time to load utils op: 0.0005369186401367188 seconds 12: Time to load utils op: 0.00037288665771484375 seconds 12: Time to load utils op: 0.0009567737579345703 seconds 12: Time to load utils op: 0.001031637191772461 seconds 13: Time to load utils op: 0.0004525184631347656 seconds 9: Time to load utils op: 0.0004379749298095703 seconds 13: Time to load utils op: 0.0004355907440185547 seconds 13: Time to load utils op: 0.0004734992980957031 seconds 13: Time to load utils op: 0.0005164146423339844 seconds 9: Time to load utils op: 0.0004668235778808594 seconds 9: Time to load utils op: 0.00047016143798828125 seconds 9: Time to load utils op: 0.0004749298095703125 seconds 20: Time to load utils op: 0.0004754066467285156 seconds 20: Time to load utils op: 0.0005810260772705078 seconds 20: Time to load utils op: 0.0005903244018554688 seconds 30: Time to load utils op: 0.0005154609680175781 seconds 30: Time to load utils op: 0.0005581378936767578 seconds 20: Time to load utils op: 0.0008459091186523438 seconds 30: Time to load utils op: 0.0004153251647949219 seconds 20: Time to load utils op: 0.0008363723754882812 seconds 20: Time to load utils op: 0.0008089542388916016 seconds 30: Time to load utils op: 0.0004482269287109375 seconds 30: Time to load utils op: 0.00048279762268066406 seconds 30: Time to load utils op: 0.0005724430084228516 seconds 10: Time to load utils op: 0.00045037269592285156 seconds 10: Time to load utils op: 0.0003428459167480469 seconds 10: Time to load utils op: 0.0004811286926269531 seconds 10: Time to load utils op: 0.0005216598510742188 seconds 23: Time to load 
utils op: 0.000377655029296875 seconds 23: Time to load utils op: 0.00048351287841796875 seconds 23: Time to load utils op: 0.0004029273986816406 seconds 23: Time to load utils op: 0.00048804283142089844 seconds 23: Time to load utils op: 0.0004127025604248047 seconds 3: Time to load utils op: 0.0005886554718017578 seconds 3: Time to load utils op: 0.0005872249603271484 seconds 3: Time to load utils op: 0.0005955696105957031 seconds 23: Time to load utils op: 0.0007665157318115234 seconds 3: Time to load utils op: 0.0005393028259277344 seconds 3: Time to load utils op: 0.0005140304565429688 seconds 3: Time to load utils op: 0.0004215240478515625 seconds 3: Time to load utils op: 0.00042366981506347656 seconds 3: Time to load utils op: 0.00044608116149902344 seconds 24: Time to load utils op: 0.0005261898040771484 seconds 24: Time to load utils op: 0.0005981922149658203 seconds 24: Time to load utils op: 0.0006041526794433594 seconds 24: Time to load utils op: 0.0006668567657470703 seconds 24: Time to load utils op: 0.0007162094116210938 seconds 2: Time to load utils op: 0.00041174888610839844 seconds 2: Time to load utils op: 0.00041556358337402344 seconds 2: Time to load utils op: 0.0005459785461425781 seconds 2: Time to load utils op: 0.0004086494445800781 seconds 6: Time to load utils op: 0.0005621910095214844 seconds 2: Time to load utils op: 0.0005383491516113281 seconds 6: Time to load utils op: 0.0005805492401123047 secondsTime to load utils op: 0.000553131103515625 seconds 6: 6: Time to load utils op: 0.0006177425384521484 seconds 6: Time to load utils op: 0.0006079673767089844 seconds 6: Time to load utils op: 0.0005972385406494141 seconds 2: Time to load utils op: 0.0005815029144287109 seconds 6: Time to load utils op: 0.0005533695220947266 seconds 6: Time to load utils op: 0.0006730556488037109 seconds 2: Time to load utils op: 0.0006244182586669922 secondsTime to load utils op: 0.0006330013275146484 seconds 2: 0: [2023-05-25 13:37:57,005] [INFO] 
[utils.py:827:see_memory_usage] after initializing group 0 0: [2023-05-25 13:37:57,006] [INFO] [utils.py:828:see_memory_usage] MA 2.43 GB Max_MA 2.43 GB CA 3.14 GB Max_CA 3 GB 0: [2023-05-25 13:37:57,006] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.58 GB, percent = 8.1% 0: [2023-05-25 13:37:57,112] [INFO] [utils.py:827:see_memory_usage] before initializing group 1 0: [2023-05-25 13:37:57,112] [INFO] [utils.py:828:see_memory_usage] MA 2.43 GB Max_MA 2.43 GB CA 3.14 GB Max_CA 3 GB 0: [2023-05-25 13:37:57,113] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.57 GB, percent = 8.1% 0: [2023-05-25 13:37:57,219] [INFO] [utils.py:827:see_memory_usage] after initializing group 1 0: [2023-05-25 13:37:57,220] [INFO] [utils.py:828:see_memory_usage] MA 3.58 GB Max_MA 3.58 GB CA 4.76 GB Max_CA 5 GB 0: [2023-05-25 13:37:57,220] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.62 GB, percent = 8.1% 0: [2023-05-25 13:37:57,324] [INFO] [utils.py:827:see_memory_usage] before initializing group 2 0: [2023-05-25 13:37:57,325] [INFO] [utils.py:828:see_memory_usage] MA 3.58 GB Max_MA 3.58 GB CA 4.76 GB Max_CA 5 GB 0: [2023-05-25 13:37:57,325] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.61 GB, percent = 8.1% 0: [2023-05-25 13:37:57,431] [INFO] [utils.py:827:see_memory_usage] after initializing group 2 0: [2023-05-25 13:37:57,431] [INFO] [utils.py:828:see_memory_usage] MA 3.58 GB Max_MA 3.58 GB CA 4.76 GB Max_CA 5 GB 0: [2023-05-25 13:37:57,432] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.59 GB, percent = 8.1% 24: Time to load utils op: 1.3585577011108398 seconds 24: Time to load utils op: 0.9049663543701172 seconds 24: Time to load utils op: 1.3587603569030762 seconds 9: Time to load utils op: 1.5204792022705078 secondsTime to load utils op: 1.5204503536224365 seconds 9: 30: Time to load utils op: 0.9090657234191895 seconds 13: Time to load utils op: 1.5154027938842773 
seconds 12: Time to load utils op: 1.519517421722412 seconds 27: Time to load utils op: 0.9128096103668213 seconds 10: Time to load utils op: 1.4154348373413086 seconds 31: Time to load utils op: 0.9112515449523926 seconds 8: Time to load utils op: 1.4076459407806396 seconds 14: Time to load utils op: 1.5218493938446045 seconds 11: Time to load utils op: 1.5232417583465576 seconds 15: Time to load utils op: 1.5217669010162354 seconds 26: Time to load utils op: 0.9139127731323242 seconds 25: Time to load utils op: 0.9124491214752197 seconds 30: Time to load utils op: 0.9135839939117432 seconds 28: Time to load utils op: 0.9142982959747314 seconds 20: Time to load utils op: 1.4124619960784912 seconds 29: Time to load utils op: 0.9148545265197754 seconds 20: Time to load utils op: 1.4125392436981201 seconds 18: Time to load utils op: 1.416581630706787 seconds 9: Time to load utils op: 1.4094302654266357 seconds 0: Time to load utils op: 1.2117903232574463 seconds 19: Time to load utils op: 1.412179946899414 seconds 13: Time to load utils op: 1.522350549697876 seconds 12: Time to load utils op: 1.409986972808838 seconds 15: Time to load utils op: 1.411417007446289 seconds 27: Time to load utils op: 0.9187188148498535 seconds 23: Time to load utils op: 1.4150562286376953 seconds 9: Time to load utils op: 1.4122142791748047 seconds 31: Time to load utils op: 0.9175219535827637 seconds 11: Time to load utils op: 1.4125797748565674 seconds 14: Time to load utils op: 1.4122824668884277 seconds 26: Time to load utils op: 0.9198484420776367 seconds 25: Time to load utils op: 0.9182353019714355 seconds 28: Time to load utils op: 0.9202065467834473 seconds 0: Time to load utils op: 1.012251615524292 seconds 29: Time to load utils op: 0.9204561710357666 seconds 18: Time to load utils op: 1.42279052734375 seconds 22: Time to load utils op: 1.417161226272583 seconds 8: Time to load utils op: 1.4147624969482422 seconds 15: Time to load utils op: 1.4162187576293945 seconds 13: Time 
to load utils op: 1.416511058807373 seconds 19: Time to load utils op: 1.4203555583953857 seconds 10: Time to load utils op: 1.426513433456421 seconds 12: Time to load utils op: 1.532006025314331 seconds 23: Time to load utils op: 1.4213123321533203 seconds 24: Time to load utils op: 0.0005617141723632812 seconds 24: Time to load utils op: 0.00038051605224609375 seconds 24: Time to load utils op: 0.00035643577575683594 seconds 11: Time to load utils op: 1.535393476486206 seconds 14: Time to load utils op: 1.5340511798858643 seconds 0: Time to load utils op: 1.0168066024780273 seconds 22: Time to load utils op: 1.4235260486602783 seconds 15: Time to load utils op: 1.53786039352417 seconds 13: Time to load utils op: 1.422839879989624 seconds 11: Time to load utils op: 1.4239988327026367 seconds 14: Time to load utils op: 1.4244880676269531 seconds 10: Time to load utils op: 1.5378572940826416 seconds 10: Time to load utils op: 1.5438995361328125 seconds 8: Time to load utils op: 0.0005128383636474609 seconds 8: Time to load utils op: 0.0003561973571777344 seconds 9: Time to load utils op: 0.005045413970947266 seconds 9: Time to load utils op: 0.0003998279571533203 secondsTime to load utils op: 0.00043892860412597656 seconds 9: 9: Time to load utils op: 0.0003948211669921875 seconds 0: [2023-05-25 13:37:57,546] [INFO] [utils.py:827:see_memory_usage] before initialize_optimizer 0: [2023-05-25 13:37:57,547] [INFO] [utils.py:828:see_memory_usage] MA 3.58 GB Max_MA 3.58 GB CA 4.76 GB Max_CA 5 GB 0: [2023-05-25 13:37:57,547] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.64 GB, percent = 8.1% 30: Time to load utils op: 0.0031511783599853516 seconds 30: Time to load utils op: 0.0024242401123046875 seconds 10: Time to load utils op: 0.006439208984375 seconds 10: Time to load utils op: 0.0065593719482421875 seconds 10: Time to load utils op: 0.006856441497802734 secondsTime to load utils op: 0.0058116912841796875 seconds 10: 15: Time to load utils op: 
0.005354642868041992 seconds 15: Time to load utils op: 0.005326509475708008 seconds 15: Time to load utils op: 0.005168437957763672 seconds 15: Time to load utils op: 0.004804134368896484 seconds 20: Time to load utils op: 0.004908323287963867 seconds 20: Time to load utils op: 0.0048220157623291016 seconds 13: Time to load utils op: 0.005181074142456055 seconds 13: Time to load utils op: 0.004738569259643555 seconds 13: Time to load utils op: 0.005239725112915039 seconds 13: Time to load utils op: 0.005003452301025391 seconds 0: Time to load utils op: 0.005186796188354492 secondsTime to load utils op: 0.0054874420166015625 seconds 0: 0: Time to load utils op: 0.005579471588134766 seconds 0: [2023-05-25 13:37:57,812] [INFO] [utils.py:827:see_memory_usage] end initialize_optimizer 0: [2023-05-25 13:37:57,813] [INFO] [utils.py:828:see_memory_usage] MA 3.87 GB Max_MA 3.87 GB CA 5.04 GB Max_CA 5 GB 0: [2023-05-25 13:37:57,813] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.59 GB, percent = 8.1% 23: Time to load utils op: 0.003995418548583984 seconds 12: Time to load utils op: 0.00435638427734375 seconds 12: Time to load utils op: 0.00035381317138671875 seconds 27: Time to load utils op: 0.004480123519897461 seconds 23: Time to load utils op: 0.00046753883361816406 seconds 27: Time to load utils op: 0.00036025047302246094 seconds 31: Time to load utils op: 0.004678249359130859 seconds 31: Time to load utils op: 0.00401616096496582 seconds 28: Time to load utils op: 0.004118442535400391 seconds 28: Time to load utils op: 0.00035262107849121094 seconds 19: Time to load utils op: 0.003877878189086914 seconds 11: Time to load utils op: 0.005124330520629883 seconds 19: Time to load utils op: 0.00040912628173828125 seconds 25: Time to load utils op: 0.0038955211639404297 secondsTime to load utils op: 0.003854036331176758 seconds 25: 11: Time to load utils op: 0.0004703998565673828 seconds 12: Time to load utils op: 0.0004875659942626953 seconds 18: Time 
to load utils op: 0.0043582916259765625 seconds 18: Time to load utils op: 0.0036308765411376953 seconds 29: Time to load utils op: 0.004399299621582031 seconds 14: Time to load utils op: 0.003698587417602539 seconds 29: Time to load utils op: 0.00035262107849121094 seconds 26: Time to load utils op: 0.0039052963256835938 seconds 26: Time to load utils op: 0.003925800323486328 seconds 14: Time to load utils op: 0.0004837512969970703 seconds 11: Time to load utils op: 0.0004680156707763672 seconds 22: Time to load utils op: 0.0045375823974609375 seconds 14: Time to load utils op: 0.00046563148498535156 seconds 11: Time to load utils op: 0.0003523826599121094 seconds 22: Time to load utils op: 0.00047659873962402344 seconds 14: Time to load utils op: 0.0004787445068359375 seconds 0: [2023-05-25 13:37:57,926] [INFO] [utils.py:827:see_memory_usage] end bf16_optimizer 0: [2023-05-25 13:37:57,927] [INFO] [utils.py:828:see_memory_usage] MA 3.87 GB Max_MA 3.87 GB CA 5.04 GB Max_CA 5 GB 0: [2023-05-25 13:37:57,927] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 40.63 GB, percent = 8.1% 0: [2023-05-25 13:37:57,927] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam 0: [2023-05-25 13:37:57,927] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed using client LR scheduler 0: [2023-05-25 13:37:57,927] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = 0: [2023-05-25 13:37:57,927] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0002, 0.0002, 0.0002], mom=[(0.9, 0.999), (0.9, 0.999), (0.9, 0.999)] 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1007:print] DeepSpeedEngine configuration: 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] activation_checkpointing_config { 0: "partition_activations": false, 0: "contiguous_memory_optimization": false, 0: "cpu_checkpointing": false, 0: "number_checkpoints": null, 0: "synchronize_checkpoint_boundary": false, 0: "profile": false 0: } 0: [2023-05-25 
13:37:57,928] [INFO] [config.py:1011:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] amp_enabled .................. False 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] amp_params ................... False 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] autotuning_config ............ { 0: "enabled": false, 0: "start_step": null, 0: "end_step": null, 0: "metric_path": null, 0: "arg_mappings": null, 0: "metric": "throughput", 0: "model_info": null, 0: "results_dir": "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/autotuning_results", 0: "exps_dir": "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/autotuning_exps", 0: "overwrite": true, 0: "fast": true, 0: "start_profile_step": 3, 0: "end_profile_step": 5, 0: "tuner_type": "gridsearch", 0: "tuner_early_stopping": 5, 0: "tuner_num_trials": 50, 0: "model_info_path": null, 0: "mp_size": 1, 0: "max_train_batch_size": null, 0: "min_train_batch_size": 1, 0: "max_train_micro_batch_size_per_gpu": 1.024000e+03, 0: "min_train_micro_batch_size_per_gpu": 1, 0: "num_tuning_micro_batch_sizes": 3 0: } 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] bfloat16_enabled ............. True 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] checkpoint_parallel_write_pipeline False 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] checkpoint_tag_validation_enabled True 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] checkpoint_tag_validation_fail False 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] comms_config ................. 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] communication_data_type ...... None 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_pa 0: rameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} 0: [2023-05-25 13:37:57,928] [INFO] [config.py:1011:print] curriculum_enabled ........... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] curriculum_params ............ False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] dataloader_drop_last ......... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] disable_allgather ............ False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] dump_state ................... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] dynamic_loss_scale_args ...... None 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_enabled ........... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_gas_boundary_resolution 1 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_layer_name ........ 
bert.encoder.layer 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_layer_num ......... 0 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_max_iter .......... 100 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_stability ......... 1e-06 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_tol ............... 0.01 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] eigenvalue_verbose ........... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] elasticity_enabled ........... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] flops_profiler_config ........ { 0: "enabled": false, 0: "profile_step": 1, 0: "module_depth": -1, 0: "top_modules": 1, 0: "detailed": true, 0: "output_file": null 0: } 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] fp16_auto_cast ............... None 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] fp16_enabled ................. False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] fp16_master_weights_and_gradients False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] global_rank .................. 0 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] gradient_accumulation_steps .. 32 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] gradient_clipping ............ 1.0 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] gradient_predivide_factor .... 1.0 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] initial_dynamic_scale ........ 1 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] load_universal_checkpoint .... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] loss_scale ................... 1.0 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] memory_breakdown ............. False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] monitor_config ............... 
0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] nebula_config ................ { 0: "enabled": false, 0: "persistent_storage_path": null, 0: "persistent_time_interval": 100, 0: "num_of_version_in_retention": 2, 0: "enable_nebula_load": true, 0: "load_path": null 0: } 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] optimizer_legacy_fusion ...... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] optimizer_name ............... None 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] optimizer_params ............. None 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] pld_enabled .................. False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] pld_params ................... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] prescale_gradients ........... False 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] scheduler_name ............... None 0: [2023-05-25 13:37:57,929] [INFO] [config.py:1011:print] scheduler_params ............. None 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] sparse_attention ............. None 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] sparse_gradients_enabled ..... False 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] steps_per_print .............. 2000 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] train_batch_size ............. 512 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] train_micro_batch_size_per_gpu 1 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] use_node_local_storage ....... False 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] wall_clock_breakdown ......... False 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] world_size ................... 
16 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] zero_allow_untested_optimizer False 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] zero_config .................. stage=0 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=500000000 allgather_partitions=True allgather_bucket_size=500000000 overlap_comm=False load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] zero_enabled ................. False 0: [2023-05-25 13:37:57,930] [INFO] [config.py:1011:print] zero_optimization_stage ...... 
0 0: [2023-05-25 13:37:57,930] [INFO] [config.py:996:print_user_config] json = { 0: "train_micro_batch_size_per_gpu": 1, 0: "train_batch_size": 512, 0: "gradient_clipping": 1.0, 0: "zero_optimization": { 0: "stage": 0 0: }, 0: "bf16": { 0: "enabled": true 0: }, 0: "steps_per_print": 2.000000e+03, 0: "wall_clock_breakdown": false 0: } 0: Time to load utils op: 0.0004336833953857422 seconds 0: [2023-05-25 13:37:57,930] [INFO] [engine.py:87:__init__] CONFIG: micro_batches=32 micro_batch_size=1 0: [2023-05-25 13:37:58,360] [INFO] [engine.py:145:__init__] RANK=0 STAGE=0 LAYERS=14 [0, 14) STAGE_PARAMS=614290432 (614.290M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 0: [2023-05-25 13:37:58,360] [INFO] [engine.py:145:__init__] RANK=1 STAGE=0 LAYERS=14 [0, 14) STAGE_PARAMS=614290432 (614.290M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 0: [2023-05-25 13:37:58,360] [INFO] [engine.py:145:__init__] RANK=2 STAGE=0 LAYERS=14 [0, 14) STAGE_PARAMS=614290432 (614.290M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 0: [2023-05-25 13:37:58,360] [INFO] [engine.py:145:__init__] RANK=3 STAGE=0 LAYERS=14 [0, 14) STAGE_PARAMS=614290432 (614.290M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 16: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=131 STAGE=2 LAYERS=11 [25, 36) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 16: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=129 STAGE=2 LAYERS=11 [25, 36) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 24: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=195 STAGE=3 LAYERS=13 [36, 49) STAGE_PARAMS=513571840 (513.572M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 24: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=192 STAGE=3 
LAYERS=13 [36, 49) STAGE_PARAMS=513571840 (513.572M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 24: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=193 STAGE=3 LAYERS=13 [36, 49) STAGE_PARAMS=513571840 (513.572M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 24: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=194 STAGE=3 LAYERS=13 [36, 49) STAGE_PARAMS=513571840 (513.572M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 16: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=130 STAGE=2 LAYERS=11 [25, 36) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 16: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=128 STAGE=2 LAYERS=11 [25, 36) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 8: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=65 STAGE=1 LAYERS=11 [14, 25) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 8: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=64 STAGE=1 LAYERS=11 [14, 25) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 8: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=66 STAGE=1 LAYERS=11 [14, 25) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 8: [2023-05-25 13:37:58,361] [INFO] [engine.py:145:__init__] RANK=67 STAGE=1 LAYERS=11 [14, 25) STAGE_PARAMS=553997312 (553.997M) TOTAL_PARAMS=8943427584 (8943.428M) UNIQUE_PARAMS=8702255104 (8702.255M) 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 2: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 16: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
2: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 1: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 24: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 27: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 30: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
4: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 9: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 27: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 29: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 20: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
14: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 26: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 28: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 7: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 23: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
21: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 21: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 25: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 13: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
12: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 12: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 19: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 11: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 18: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 22: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 12: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 8: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
12: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 12: [2023-05-25 13:37:59,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 12: [2023-05-25 13:37:59,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 12: [2023-05-25 13:37:59,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 12: [2023-05-25 13:37:59,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 
0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 0: [2023-05-25 13:37:59,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 0: [2023-05-25 13:37:59,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 0: [2023-05-25 13:37:59,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 
24: [2023-05-25 13:37:59,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 
18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 18: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 
5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 
7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 
31: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 
11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 8: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 
7: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 17: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 3: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 5: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 
9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 
6: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 2: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
23: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 2: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 23: [2023-05-25 13:37:59,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 
3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 
1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 2: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 2: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 2: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 4: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 4: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 
2: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 4: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 
4: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 7: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 5: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 
17: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 5: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 
23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 10: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt... 7: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 
27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 
9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 6: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 6: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 9: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 
16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 11: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 23: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 27: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 
15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 14: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 16: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
11: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 1: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 13: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 14: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 
21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
19: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 28: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 29: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 15: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 
12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 25: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 26: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 21: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 2: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 12: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 22: [2023-05-25 13:37:59,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 4: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 3: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 4: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 
27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 7: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 27: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 
16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 2: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 1: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt... 19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 7: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt... 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 
19: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 25: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 
1: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 13: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 16: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 
21: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 
28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 28: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 20: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 
22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt... 30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 
30: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 29: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt... 3: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 22: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt... 3: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 12: [2023-05-25 13:37:59,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_00_model_states.pt. 
1: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 12: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt... 6: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 1: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 6: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 4: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 4: [2023-05-25 13:37:59,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 23: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 
18: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 17: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 18: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 20: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 19: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 23: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 17: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 16: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 20: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 19: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 
22: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 21: [2023-05-25 13:37:59,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 16: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 21: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 23: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 18: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 17: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 23: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 17: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 20: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
18: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 19: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_09_model_states.pt. 16: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 19: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 16: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 20: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
22: [2023-05-25 13:37:59,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 11: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 11: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 9: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 10: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 15: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 10: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 13: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 15: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 8: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 
14: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 8: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 13: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 14: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 11: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 11: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 9: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 15: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 15: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
10: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 12: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 13: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 8: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 8: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 14: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 14: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
12: [2023-05-25 13:37:59,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_06_model_states.pt. 12: [2023-05-25 13:37:59,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 24: [2023-05-25 13:37:59,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 12: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 25: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 31: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 24: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 27: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 26: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 28: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 
31: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 25: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 29: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 26: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 28: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 29: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 24: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 24: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 27: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 25: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 
31: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 27: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 26: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 26: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 11: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 
30: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_13_model_states.pt. 29: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 29: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 11: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 14: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 27: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 13: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 15: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 9: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 14: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 
10: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 15: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 8: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 13: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 12: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 9: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 11: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 11: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 18: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 
14: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 15: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 30: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 14: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 15: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 18: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 13: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
30: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 8: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 23: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 12: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 17: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 23: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 21: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 20: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 
18: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 19: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 17: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 12: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 24: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 16: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 20: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 18: [2023-05-25 13:37:59,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 24: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 
16: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 22: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 21: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 23: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 19: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_11_model_states.pt. 23: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 17: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 20: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 25: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 
28: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 22: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 19: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 17: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 29: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 31: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 26: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 27: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 28: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 16: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
24: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 29: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 31: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 20: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 27: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 24: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 16: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 8: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_05_model_states.pt. 22: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
26: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 21: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 11: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 19: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 28: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 26: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 12: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
27: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 29: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 11: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 25: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 15: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 29: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 8: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 27: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 14: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 
26: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 10: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 11: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 9: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 10: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 30: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_12_model_states.pt. 13: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 14: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 
11: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 15: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 12: [2023-05-25 13:37:59,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 14: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
14: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 30: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 9: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 0: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 30: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 15: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 8: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 12: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 18: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 5: [2023-05-25 13:37:59,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 
0: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 16: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 8: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 17: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 23: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 0: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 20: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 22: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 6: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 
7: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 3: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 1: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 17: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 16: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 4: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 21: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 7: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 23: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 2: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 
15: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 19: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 20: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 8: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 18: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 3: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 6: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 21: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 5: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
19: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 4: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 16: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 0: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 1: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 8: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 17: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 23: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 20: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 2: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
7: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 22: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 6: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 3: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 16: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 4: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 1: [2023-05-25 13:37:59,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 5: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_01_model_states.pt. 17: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 18: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_10_model_states.pt. 
23: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 7: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 21: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 2: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 20: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 19: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 3: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 6: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 19: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
4: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 21: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 1: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 12: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_07_model_states.pt. 5: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 18: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 0: [2023-05-25 13:37:59,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 0: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 5: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 5: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 
0: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 0: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 6: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 12: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 5: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 5: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 4: [2023-05-25 13:37:59,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 7: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 4: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 6: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
7: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 6: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 4: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 1: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 2: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 7: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 4: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 24: [2023-05-25 13:37:59,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 25: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 
24: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 25: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 6: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 28: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 31: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 2: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 1: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 7: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 27: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 2: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
18: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 26: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 31: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 28: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 24: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 29: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 26: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 27: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 23: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 
1: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 24: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 29: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 25: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 23: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 30: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 0: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 28: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 3: [2023-05-25 13:37:59,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 
27: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 0: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 18: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 26: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 3: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_03_model_states.pt. 26: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 5: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 27: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 
29: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 23: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 30: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_14_model_states.pt. 5: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 17: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 29: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 1: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 23: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 0: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 17: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 
3: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 0: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 19: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 20: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 3: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 7: [2023-05-25 13:37:59,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 16: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 5: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 30: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 19: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 
5: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 7: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 17: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 18: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 6: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 20: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 16: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 17: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 6: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 20: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
22: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 19: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 7: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 22: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 19: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 16: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 30: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 7: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 21: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_08_model_states.pt. 
18: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 20: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 6: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 16: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 4: [2023-05-25 13:37:59,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 11: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 6: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 22: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 2: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 
1: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 22: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 3: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 4: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 11: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 10: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 9: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 3: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 15: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 21: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 
1: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 9: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 10: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 15: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 14: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 2: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_02_model_states.pt. 4: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 11: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 8: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 
14: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 12: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 2: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 13: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 8: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 11: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 1: [2023-05-25 13:37:59,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 3: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 10: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 4: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
9: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 3: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 15: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 1: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 15: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 12: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_04_model_states.pt. 14: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 2: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
13: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 8: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 24: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 14: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 12: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 8: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 24: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 27: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 31: [2023-05-25 13:37:59,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 
27: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 31: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 26: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 12: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 24: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 25: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 29: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 24: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 
25: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 29: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 26: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 27: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 27: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 26: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/mp_rank_15_model_states.pt. 
29: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 29: [2023-05-25 13:37:59,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 26: [2023-05-25 13:37:59,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 4: [2023-05-25 13:37:59,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 
4: [2023-05-25 13:37:59,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 4: [2023-05-25 13:37:59,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 4: [2023-05-25 13:37:59,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 4: [2023-05-25 13:37:59,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 
4: [2023-05-25 13:37:59,661] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 4: [2023-05-25 13:37:59,662] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 4: [2023-05-25 13:37:59,662] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 4: [2023-05-25 13:37:59,662] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 4: [2023-05-25 13:37:59,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 1: [2023-05-25 13:37:59,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 
1: [2023-05-25 13:37:59,682] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,684] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 18: [2023-05-25 13:37:59,684] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 18: [2023-05-25 13:37:59,684] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 18: [2023-05-25 13:37:59,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 18: [2023-05-25 13:37:59,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 18: [2023-05-25 13:37:59,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 1: [2023-05-25 13:37:59,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 18: [2023-05-25 13:37:59,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 18: [2023-05-25 13:37:59,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 
18: [2023-05-25 13:37:59,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 1: [2023-05-25 13:37:59,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 1: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 1: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 1: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 1: [2023-05-25 13:37:59,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 
5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 7: [2023-05-25 13:37:59,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 7: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 
7: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 7: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 5: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 5: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 5: [2023-05-25 13:37:59,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 5: [2023-05-25 13:37:59,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 
5: [2023-05-25 13:37:59,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 0: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 0: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 
3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 3: [2023-05-25 13:37:59,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 3: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 
7: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 0: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 0: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 0: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 0: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 0: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 0: [2023-05-25 13:37:59,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 7: [2023-05-25 13:37:59,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 
7: [2023-05-25 13:37:59,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 7: [2023-05-25 13:37:59,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 7: [2023-05-25 13:37:59,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 0: [2023-05-25 13:37:59,698] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,699] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,699] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 2: [2023-05-25 13:37:59,699] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 17: [2023-05-25 13:37:59,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 17: [2023-05-25 13:37:59,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 17: [2023-05-25 13:37:59,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
17: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 17: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 17: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 17: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 2: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 4: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 17: [2023-05-25 13:37:59,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 3: [2023-05-25 13:37:59,701] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 3: [2023-05-25 13:37:59,701] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 0: [2023-05-25 13:37:59,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 
2: [2023-05-25 13:37:59,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 2: [2023-05-25 13:37:59,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 2: [2023-05-25 13:37:59,703] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 2: [2023-05-25 13:37:59,703] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 17: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 17: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 17: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 17: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 4: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 10: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 3: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 0: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 0: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 0: [2023-05-25 13:37:59,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 
17: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 0: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 10: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 3: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 3: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 0: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 0: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 17: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 17: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 3: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 
3: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 3: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 17: [2023-05-25 13:37:59,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 10: [2023-05-25 13:37:59,707] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,707] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 10: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 10: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 10: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 10: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 10: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 
14: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 10: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 14: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 24: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 24: [2023-05-25 13:37:59,711] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 23: [2023-05-25 13:37:59,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 23: [2023-05-25 13:37:59,711] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 14: [2023-05-25 13:37:59,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 14: [2023-05-25 13:37:59,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 14: [2023-05-25 13:37:59,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 14: [2023-05-25 13:37:59,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 14: [2023-05-25 13:37:59,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 
14: [2023-05-25 13:37:59,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 14: [2023-05-25 13:37:59,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 24: [2023-05-25 13:37:59,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 23: [2023-05-25 13:37:59,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 23: [2023-05-25 13:37:59,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 23: [2023-05-25 13:37:59,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 14: [2023-05-25 13:37:59,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 23: [2023-05-25 13:37:59,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 24: [2023-05-25 13:37:59,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 24: [2023-05-25 13:37:59,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 
23: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 23: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 24: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 23: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 23: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 24: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 24: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 24: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 24: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 22: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
22: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
16: [2023-05-25 13:37:59,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 18: [2023-05-25 13:37:59,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 4: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 18: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 9: [2023-05-25 13:37:59,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 9: [2023-05-25 13:37:59,719] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 16: [2023-05-25 13:37:59,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 22: [2023-05-25 13:37:59,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 
16: [2023-05-25 13:37:59,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 16: [2023-05-25 13:37:59,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 9: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 16: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 9: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 16: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 16: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 16: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 
16: [2023-05-25 13:37:59,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 22: [2023-05-25 13:37:59,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 9: [2023-05-25 13:37:59,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 22: [2023-05-25 13:37:59,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 22: [2023-05-25 13:37:59,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 22: [2023-05-25 13:37:59,724] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 9: [2023-05-25 13:37:59,724] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 9: [2023-05-25 13:37:59,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 9: [2023-05-25 13:37:59,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 9: [2023-05-25 13:37:59,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
9: [2023-05-25 13:37:59,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 1: [2023-05-25 13:37:59,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 5: [2023-05-25 13:37:59,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
20: [2023-05-25 13:37:59,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 20: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 5: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
26: [2023-05-25 13:37:59,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 18: [2023-05-25 13:37:59,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 20: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 18: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 0: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 20: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 11: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 20: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 27: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 27: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 27: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 27: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 27: [2023-05-25 13:37:59,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
27: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 26: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 27: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 20: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 27: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 20: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 20: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 26: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 
20: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 26: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 20: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 7: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 7: [2023-05-25 13:37:59,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 26: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 26: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 25: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 25: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 26: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 29: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
29: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 2: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 2: [2023-05-25 13:37:59,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 25: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 25: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 25: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
11: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 25: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 25: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 25: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 17: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 11: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 11: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 
28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 11: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 28: [2023-05-25 13:37:59,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 11: [2023-05-25 13:37:59,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 11: [2023-05-25 13:37:59,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 17: [2023-05-25 13:37:59,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 29: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 11: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 
27: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 29: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 11: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 29: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 3: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 1: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 1: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 
6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 6: [2023-05-25 13:37:59,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 29: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 29: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 29: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 6: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 29: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 31: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 
10: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 29: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 10: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 31: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 31: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 28: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 28: [2023-05-25 13:37:59,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 28: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 28: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 
28: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 25: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 30: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 25: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 31: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 27: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 27: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 25: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 25: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 
28: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 31: [2023-05-25 13:37:59,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 28: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 27: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 27: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 25: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 28: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 31: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 5: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 31: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 
31: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 25: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 27: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 27: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 25: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 27: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 5: [2023-05-25 13:37:59,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 14: [2023-05-25 13:37:59,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 6: [2023-05-25 13:37:59,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 6: [2023-05-25 13:37:59,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 
30: [2023-05-25 13:37:59,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 6: [2023-05-25 13:37:59,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt... 6: [2023-05-25 13:37:59,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt... 6: [2023-05-25 13:37:59,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 6: [2023-05-25 13:37:59,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt... 0: [2023-05-25 13:37:59,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 6: [2023-05-25 13:37:59,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 6: [2023-05-25 13:37:59,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt... 30: [2023-05-25 13:37:59,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt... 30: [2023-05-25 13:37:59,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt... 
30: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 30: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt... 30: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt... 8: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 14: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
8: [2023-05-25 13:37:59,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 23: [2023-05-25 13:37:59,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 0: [2023-05-25 13:37:59,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 23: [2023-05-25 13:37:59,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 24: [2023-05-25 13:37:59,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 24: [2023-05-25 13:37:59,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 17: [2023-05-25 13:37:59,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 2: [2023-05-25 13:37:59,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 2: [2023-05-25 13:37:59,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
17: [2023-05-25 13:37:59,751] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 3: [2023-05-25 13:37:59,751] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 8: [2023-05-25 13:37:59,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 7: [2023-05-25 13:37:59,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 8: [2023-05-25 13:37:59,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 8: [2023-05-25 13:37:59,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 7: [2023-05-25 13:37:59,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 10: [2023-05-25 13:37:59,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 10: [2023-05-25 13:37:59,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 8: [2023-05-25 13:37:59,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 
8: [2023-05-25 13:37:59,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 8: [2023-05-25 13:37:59,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 8: [2023-05-25 13:37:59,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 8: [2023-05-25 13:37:59,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 22: [2023-05-25 13:37:59,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 14: [2023-05-25 13:37:59,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 0: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 19: [2023-05-25 13:37:59,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 19: [2023-05-25 13:37:59,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
19: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 19: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 19: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 19: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 19: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 19: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 16: [2023-05-25 13:37:59,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 22: [2023-05-25 13:37:59,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 14: [2023-05-25 13:37:59,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 3: [2023-05-25 13:37:59,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 13: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 9: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
9: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 19: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 19: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 23: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 13: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 13: [2023-05-25 13:37:59,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 19: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 19: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 19: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 19: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 
19: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 19: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 13: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 13: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 13: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 13: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 13: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 23: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 24: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 24: [2023-05-25 13:37:59,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
21: [2023-05-25 13:37:59,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt... 21: [2023-05-25 13:37:59,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 21: [2023-05-25 13:37:59,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 21: [2023-05-25 13:37:59,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt... 21: [2023-05-25 13:37:59,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt... 21: [2023-05-25 13:37:59,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 13: [2023-05-25 13:37:59,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 21: [2023-05-25 13:37:59,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt... 13: [2023-05-25 13:37:59,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 
13: [2023-05-25 13:37:59,765] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 13: [2023-05-25 13:37:59,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 13: [2023-05-25 13:37:59,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 13: [2023-05-25 13:37:59,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 13: [2023-05-25 13:37:59,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 13: [2023-05-25 13:37:59,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 16: [2023-05-25 13:37:59,768] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 22: [2023-05-25 13:37:59,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 16: [2023-05-25 13:37:59,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 11: [2023-05-25 13:37:59,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 26: [2023-05-25 13:37:59,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 20: [2023-05-25 13:37:59,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
26: [2023-05-25 13:37:59,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 9: [2023-05-25 13:37:59,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 9: [2023-05-25 13:37:59,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 29: [2023-05-25 13:37:59,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 3: [2023-05-25 13:37:59,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 15: [2023-05-25 13:37:59,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 11: [2023-05-25 13:37:59,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 
25: [2023-05-25 13:37:59,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 20: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 15: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 12: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 15: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 
12: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 22: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 28: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 28: [2023-05-25 13:37:59,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 29: [2023-05-25 13:37:59,776] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 15: [2023-05-25 13:37:59,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 25: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 12: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 12: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 12: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 
12: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 15: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 15: [2023-05-25 13:37:59,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 27: [2023-05-25 13:37:59,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 31: [2023-05-25 13:37:59,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 15: [2023-05-25 13:37:59,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 15: [2023-05-25 13:37:59,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt... 15: [2023-05-25 13:37:59,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt... 15: [2023-05-25 13:37:59,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt... 15: [2023-05-25 13:37:59,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt... 
11: [2023-05-25 13:37:59,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 30: [2023-05-25 13:37:59,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 27: [2023-05-25 13:37:59,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 30: [2023-05-25 13:37:59,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_00-model_states.pt. 6: [2023-05-25 13:37:59,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 20: [2023-05-25 13:37:59,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 6: [2023-05-25 13:37:59,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_00-model_states.pt. 26: [2023-05-25 13:37:59,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 26: [2023-05-25 13:37:59,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 31: [2023-05-25 13:37:59,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
11: [2023-05-25 13:37:59,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 8: [2023-05-25 13:37:59,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 8: [2023-05-25 13:37:59,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 29: [2023-05-25 13:37:59,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 20: [2023-05-25 13:37:59,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 28: [2023-05-25 13:37:59,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:37:59,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 29: [2023-05-25 13:37:59,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 25: [2023-05-25 13:37:59,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 31: [2023-05-25 13:37:59,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
19: [2023-05-25 13:37:59,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 25: [2023-05-25 13:37:59,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 13: [2023-05-25 13:37:59,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 21: [2023-05-25 13:37:59,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 30: [2023-05-25 13:37:59,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 27: [2023-05-25 13:37:59,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 30: [2023-05-25 13:37:59,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 27: [2023-05-25 13:37:59,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 19: [2023-05-25 13:37:59,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 21: [2023-05-25 13:37:59,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_00-model_states.pt. 
8: [2023-05-25 13:37:59,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 8: [2023-05-25 13:37:59,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 13: [2023-05-25 13:37:59,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 19: [2023-05-25 13:37:59,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 21: [2023-05-25 13:37:59,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 13: [2023-05-25 13:37:59,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 6: [2023-05-25 13:37:59,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 6: [2023-05-25 13:37:59,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 21: [2023-05-25 13:37:59,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 19: [2023-05-25 13:37:59,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
15: [2023-05-25 13:37:59,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 13: [2023-05-25 13:37:59,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 12: [2023-05-25 13:37:59,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 12: [2023-05-25 13:37:59,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_00-model_states.pt. 15: [2023-05-25 13:37:59,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 15: [2023-05-25 13:37:59,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 12: [2023-05-25 13:37:59,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 12: [2023-05-25 13:37:59,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 4: [2023-05-25 13:37:59,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 
4: [2023-05-25 13:37:59,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 4: [2023-05-25 13:37:59,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:37:59,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:37:59,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 4: [2023-05-25 13:37:59,900] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 4: [2023-05-25 13:37:59,912] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:37:59,912] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 1: [2023-05-25 13:37:59,923] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 1: [2023-05-25 13:37:59,923] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 5: [2023-05-25 13:37:59,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 
5: [2023-05-25 13:37:59,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 1: [2023-05-25 13:37:59,928] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 1: [2023-05-25 13:37:59,928] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 2: [2023-05-25 13:37:59,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 2: [2023-05-25 13:37:59,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 1: [2023-05-25 13:37:59,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 1: [2023-05-25 13:37:59,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 7: [2023-05-25 13:37:59,940] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 5: [2023-05-25 13:37:59,942] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 5: [2023-05-25 13:37:59,942] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
7: [2023-05-25 13:37:59,943] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 5: [2023-05-25 13:37:59,945] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 5: [2023-05-25 13:37:59,945] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 1: [2023-05-25 13:37:59,947] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 1: [2023-05-25 13:37:59,947] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 2: [2023-05-25 13:37:59,948] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 0: [2023-05-25 13:37:59,948] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 0: [2023-05-25 13:37:59,948] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 17: [2023-05-25 13:37:59,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 17: [2023-05-25 13:37:59,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 
0: [2023-05-25 13:37:59,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 0: [2023-05-25 13:37:59,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 2: [2023-05-25 13:37:59,950] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 7: [2023-05-25 13:37:59,953] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 3: [2023-05-25 13:37:59,954] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 3: [2023-05-25 13:37:59,954] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 28: [2023-05-25 13:37:59,956] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 28: [2023-05-25 13:37:59,957] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 23: [2023-05-25 13:37:59,958] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 7: [2023-05-25 13:37:59,958] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
23: [2023-05-25 13:37:59,958] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 5: [2023-05-25 13:37:59,958] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 5: [2023-05-25 13:37:59,958] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 17: [2023-05-25 13:37:59,961] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 17: [2023-05-25 13:37:59,961] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 0: [2023-05-25 13:37:59,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 0: [2023-05-25 13:37:59,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 0: [2023-05-25 13:37:59,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:37:59,964] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 4: [2023-05-25 13:37:59,965] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 
0: [2023-05-25 13:37:59,966] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 28: [2023-05-25 13:37:59,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:37:59,969] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 23: [2023-05-25 13:37:59,969] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 23: [2023-05-25 13:37:59,971] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 6: [2023-05-25 13:37:59,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 6: [2023-05-25 13:37:59,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 3: [2023-05-25 13:37:59,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 7: [2023-05-25 13:37:59,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 7: [2023-05-25 13:37:59,973] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 
2: [2023-05-25 13:37:59,974] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 2: [2023-05-25 13:37:59,975] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 8: [2023-05-25 13:37:59,976] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 8: [2023-05-25 13:37:59,976] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 14: [2023-05-25 13:37:59,976] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 14: [2023-05-25 13:37:59,976] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 3: [2023-05-25 13:37:59,976] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 27: [2023-05-25 13:37:59,978] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 4: [2023-05-25 13:37:59,978] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:37:59,979] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
11: [2023-05-25 13:37:59,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 11: [2023-05-25 13:37:59,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 13: [2023-05-25 13:37:59,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 7: [2023-05-25 13:37:59,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 27: [2023-05-25 13:37:59,980] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 13: [2023-05-25 13:37:59,980] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 9: [2023-05-25 13:37:59,980] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 9: [2023-05-25 13:37:59,981] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 7: [2023-05-25 13:37:59,981] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 3: [2023-05-25 13:37:59,982] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 
3: [2023-05-25 13:37:59,982] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_03-model_states.pt. 16: [2023-05-25 13:37:59,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 24: [2023-05-25 13:37:59,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 16: [2023-05-25 13:37:59,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 24: [2023-05-25 13:37:59,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 15: [2023-05-25 13:37:59,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 15: [2023-05-25 13:37:59,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 10: [2023-05-25 13:37:59,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 10: [2023-05-25 13:37:59,985] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 24: [2023-05-25 13:37:59,986] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
24: [2023-05-25 13:37:59,986] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 18: [2023-05-25 13:37:59,986] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 18: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 8: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 30: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 12: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 2: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 7: [2023-05-25 13:37:59,987] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 7: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
27: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 12: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 8: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 6: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 5: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 6: [2023-05-25 13:37:59,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 17: [2023-05-25 13:37:59,989] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 2: [2023-05-25 13:37:59,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 5: [2023-05-25 13:37:59,989] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 14: [2023-05-25 13:37:59,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
17: [2023-05-25 13:37:59,989] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 14: [2023-05-25 13:37:59,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:37:59,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:37:59,990] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 16: [2023-05-25 13:37:59,990] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 11: [2023-05-25 13:37:59,991] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:37:59,991] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 11: [2023-05-25 13:37:59,991] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 9: [2023-05-25 13:37:59,991] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 13: [2023-05-25 13:37:59,992] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 
13: [2023-05-25 13:37:59,992] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 9: [2023-05-25 13:37:59,992] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 13: [2023-05-25 13:37:59,992] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:37:59,992] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 24: [2023-05-25 13:37:59,992] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 13: [2023-05-25 13:37:59,993] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 24: [2023-05-25 13:37:59,993] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 7: [2023-05-25 13:37:59,994] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 10: [2023-05-25 13:37:59,995] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 10: [2023-05-25 13:37:59,995] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
19: [2023-05-25 13:37:59,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 19: [2023-05-25 13:37:59,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 7: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 6: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 14: [2023-05-25 13:37:59,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 26: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 6: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_02-model_states.pt. 26: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 14: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_02-model_states.pt. 27: [2023-05-25 13:37:59,996] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
15: [2023-05-25 13:37:59,997] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 30: [2023-05-25 13:37:59,997] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 28: [2023-05-25 13:37:59,997] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 28: [2023-05-25 13:37:59,997] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 3: [2023-05-25 13:37:59,998] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 16: [2023-05-25 13:37:59,998] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 15: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 23: [2023-05-25 13:37:59,998] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 23: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 26: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
16: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 30: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 26: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 18: [2023-05-25 13:37:59,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 28: [2023-05-25 13:38:00,000] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:38:00,000] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 22: [2023-05-25 13:38:00,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 22: [2023-05-25 13:38:00,001] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 12: [2023-05-25 13:38:00,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 17: [2023-05-25 13:38:00,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,001] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 17: [2023-05-25 13:38:00,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 31: [2023-05-25 13:38:00,001] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 5: [2023-05-25 13:38:00,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 5: [2023-05-25 13:38:00,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 13: [2023-05-25 13:38:00,002] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 20: [2023-05-25 13:38:00,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 20: [2023-05-25 13:38:00,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 20: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 20: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 
31: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 3: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 31: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 14: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 24: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 16: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 13: [2023-05-25 13:38:00,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 12: [2023-05-25 13:38:00,005] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 29: [2023-05-25 13:38:00,005] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 
24: [2023-05-25 13:38:00,005] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 21: [2023-05-25 13:38:00,005] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 31: [2023-05-25 13:38:00,005] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 31: [2023-05-25 13:38:00,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 29: [2023-05-25 13:38:00,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 18: [2023-05-25 13:38:00,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 13: [2023-05-25 13:38:00,007] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 14: [2023-05-25 13:38:00,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 13: [2023-05-25 13:38:00,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 19: [2023-05-25 13:38:00,008] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
19: [2023-05-25 13:38:00,008] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 14: [2023-05-25 13:38:00,009] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 28: [2023-05-25 13:38:00,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 28: [2023-05-25 13:38:00,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 23: [2023-05-25 13:38:00,010] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 23: [2023-05-25 13:38:00,010] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 14: [2023-05-25 13:38:00,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 6: [2023-05-25 13:38:00,011] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 30: [2023-05-25 13:38:00,011] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:38:00,012] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 
28: [2023-05-25 13:38:00,012] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 6: [2023-05-25 13:38:00,012] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 22: [2023-05-25 13:38:00,014] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 30: [2023-05-25 13:38:00,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 30: [2023-05-25 13:38:00,015] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 21: [2023-05-25 13:38:00,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 30: [2023-05-25 13:38:00,015] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 30: [2023-05-25 13:38:00,015] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 21: [2023-05-25 13:38:00,016] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 13: [2023-05-25 13:38:00,016] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
22: [2023-05-25 13:38:00,016] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 29: [2023-05-25 13:38:00,016] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 21: [2023-05-25 13:38:00,016] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_02-model_states.pt. 14: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 21: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 20: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 9: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 31: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 20: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 29: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
29: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 9: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 31: [2023-05-25 13:38:00,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 20: [2023-05-25 13:38:00,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 20: [2023-05-25 13:38:00,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 30: [2023-05-25 13:38:00,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 30: [2023-05-25 13:38:00,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 29: [2023-05-25 13:38:00,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 29: [2023-05-25 13:38:00,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 29: [2023-05-25 13:38:00,020] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 
29: [2023-05-25 13:38:00,020] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 13: [2023-05-25 13:38:00,020] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 29: [2023-05-25 13:38:00,020] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 8: [2023-05-25 13:38:00,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 8: [2023-05-25 13:38:00,022] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 25: [2023-05-25 13:38:00,023] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 14: [2023-05-25 13:38:00,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 3: [2023-05-25 13:38:00,024] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 3: [2023-05-25 13:38:00,025] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 25: [2023-05-25 13:38:00,026] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
24: [2023-05-25 13:38:00,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 24: [2023-05-25 13:38:00,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 24: [2023-05-25 13:38:00,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 25: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 15: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 25: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 15: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 26: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 25: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 26: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 
24: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 2: [2023-05-25 13:38:00,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 2: [2023-05-25 13:38:00,028] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 28: [2023-05-25 13:38:00,029] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 9: [2023-05-25 13:38:00,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 21: [2023-05-25 13:38:00,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 19: [2023-05-25 13:38:00,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 25: [2023-05-25 13:38:00,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 19: [2023-05-25 13:38:00,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 30: [2023-05-25 13:38:00,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
9: [2023-05-25 13:38:00,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 30: [2023-05-25 13:38:00,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 21: [2023-05-25 13:38:00,032] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 21: [2023-05-25 13:38:00,032] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 8: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 27: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 28: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 8: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
29: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 29: [2023-05-25 13:38:00,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:38:00,034] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 10: [2023-05-25 13:38:00,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 28: [2023-05-25 13:38:00,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 10: [2023-05-25 13:38:00,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 10: [2023-05-25 13:38:00,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 10: [2023-05-25 13:38:00,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 27: [2023-05-25 13:38:00,036] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 27: [2023-05-25 13:38:00,036] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
18: [2023-05-25 13:38:00,036] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 27: [2023-05-25 13:38:00,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 18: [2023-05-25 13:38:00,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 28: [2023-05-25 13:38:00,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 15: [2023-05-25 13:38:00,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 26: [2023-05-25 13:38:00,038] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 15: [2023-05-25 13:38:00,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:38:00,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 24: [2023-05-25 13:38:00,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:38:00,040] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 
24: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 24: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 22: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 24: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 22: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 25: [2023-05-25 13:38:00,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 27: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 30: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 30: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 
19: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 2: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 2: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 25: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 19: [2023-05-25 13:38:00,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 25: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 26: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 25: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 24: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
26: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 26: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 29: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 26: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 29: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 26: [2023-05-25 13:38:00,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 24: [2023-05-25 13:38:00,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 22: [2023-05-25 13:38:00,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 28: [2023-05-25 13:38:00,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 22: [2023-05-25 13:38:00,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
3: [2023-05-25 13:38:00,046] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 10: [2023-05-25 13:38:00,046] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 3: [2023-05-25 13:38:00,046] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 28: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 10: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 6: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 10: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 6: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 10: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:38:00,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 
28: [2023-05-25 13:38:00,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:38:00,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 27: [2023-05-25 13:38:00,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 18: [2023-05-25 13:38:00,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 24: [2023-05-25 13:38:00,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 18: [2023-05-25 13:38:00,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 18: [2023-05-25 13:38:00,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 31: [2023-05-25 13:38:00,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 31: [2023-05-25 13:38:00,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 18: [2023-05-25 13:38:00,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 
18: [2023-05-25 13:38:00,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 24: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 28: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 8: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 11: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 8: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 11: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 11: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 1: [2023-05-25 13:38:00,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 11: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
1: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 27: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 1: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 1: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 1: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 1: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 18: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 12: [2023-05-25 13:38:00,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 12: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 18: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 
18: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 15: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 11: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 28: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 18: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 15: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 1: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 1: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 18: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 27: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 12: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 12: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 11: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 0: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 31: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_03-model_states.pt. 0: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_01-model_01-model_states.pt. 1: [2023-05-25 13:38:00,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 16: [2023-05-25 13:38:00,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 
16: [2023-05-25 13:38:00,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 16: [2023-05-25 13:38:00,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 11: [2023-05-25 13:38:00,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 29: [2023-05-25 13:38:00,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 1: [2023-05-25 13:38:00,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 11: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 30: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 30: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 1: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 
16: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 1: [2023-05-25 13:38:00,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 1: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 1: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 30: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 11: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 29: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 11: [2023-05-25 13:38:00,057] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 16: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 16: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 27: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 26: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 26: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 26: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 26: [2023-05-25 13:38:00,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 16: [2023-05-25 13:38:00,059] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 
30: [2023-05-25 13:38:00,059] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 16: [2023-05-25 13:38:00,059] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 25: [2023-05-25 13:38:00,060] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 25: [2023-05-25 13:38:00,060] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 30: [2023-05-25 13:38:00,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 30: [2023-05-25 13:38:00,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 20: [2023-05-25 13:38:00,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 20: [2023-05-25 13:38:00,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 26: [2023-05-25 13:38:00,062] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 8: [2023-05-25 13:38:00,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 8: [2023-05-25 13:38:00,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 26: [2023-05-25 13:38:00,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 21: [2023-05-25 13:38:00,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 21: [2023-05-25 13:38:00,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 6: [2023-05-25 13:38:00,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 31: [2023-05-25 13:38:00,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 31: [2023-05-25 13:38:00,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 20: [2023-05-25 13:38:00,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 20: [2023-05-25 13:38:00,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
6: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 30: [2023-05-25 13:38:00,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 21: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 21: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 29: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 22: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 31: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 22: [2023-05-25 13:38:00,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 29: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 15: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
15: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 17: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 20: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 17: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 27: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 17: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 17: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 17: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 29: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 31: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 17: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 27: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 20: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 20: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 20: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 25: [2023-05-25 13:38:00,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 29: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 31: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 12: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
29: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 23: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 1: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 1: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 3: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 
3: [2023-05-25 13:38:00,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 25: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 21: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 25: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 3: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 22: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 3: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 3: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 3: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 20: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 
24: [2023-05-25 13:38:00,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 17: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 24: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_02-model_states.pt. 29: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 20: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 23: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 20: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 20: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 13: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 23: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 
17: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 4: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 4: [2023-05-25 13:38:00,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 17: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 22: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 17: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 4: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 4: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 
17: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 25: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 4: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 4: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 17: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 4: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 12: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 23: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 23: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 12: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 23: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 23: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 4: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 21: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 13: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 21: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 
21: [2023-05-25 13:38:00,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 13: [2023-05-25 13:38:00,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 30: [2023-05-25 13:38:00,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 19: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 19: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 11: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 19: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 19: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 11: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 3: [2023-05-25 13:38:00,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
3: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 13: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 3: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 3: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 13: [2023-05-25 13:38:00,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 3: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 31: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 31: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_36-model_01-model_states.pt. 0: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 0: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
3: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 4: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 4: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 4: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 4: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 4: [2023-05-25 13:38:00,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 4: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 21: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 21: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 
4: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 13: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 21: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 19: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 21: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 22: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 19: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 13: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 13: [2023-05-25 13:38:00,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 30: [2023-05-25 13:38:00,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 
13: [2023-05-25 13:38:00,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 13: [2023-05-25 13:38:00,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 30: [2023-05-25 13:38:00,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 0: [2023-05-25 13:38:00,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 12: [2023-05-25 13:38:00,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 
22: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 30: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 9: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 19: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 19: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt... 9: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_01-model_states.pt. 0: [2023-05-25 13:38:00,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 22: [2023-05-25 13:38:00,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 22: [2023-05-25 13:38:00,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_03-model_states.pt. 17: [2023-05-25 13:38:00,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 
17: [2023-05-25 13:38:00,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 24: [2023-05-25 13:38:00,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 0: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 0: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 24: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 24: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 24: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 29: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 23: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 29: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 
11: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 0: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 23: [2023-05-25 13:38:00,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 19: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 19: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 11: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_14-model_03-model_states.pt. 0: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 0: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 26: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 20: [2023-05-25 13:38:00,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 
20: [2023-05-25 13:38:00,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 29: [2023-05-25 13:38:00,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 29: [2023-05-25 13:38:00,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 18: [2023-05-25 13:38:00,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 25: [2023-05-25 13:38:00,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 19: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 19: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 19: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 19: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 24: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 
24: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 25: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 25: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 5: [2023-05-25 13:38:00,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 
9: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 9: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 25: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 26: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 21: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 5: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 21: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 29: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 9: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 9: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
9: [2023-05-25 13:38:00,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 29: [2023-05-25 13:38:00,086] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 26: [2023-05-25 13:38:00,086] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 9: [2023-05-25 13:38:00,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 25: [2023-05-25 13:38:00,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 11: [2023-05-25 13:38:00,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 31: [2023-05-25 13:38:00,086] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 28: [2023-05-25 13:38:00,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 28: [2023-05-25 13:38:00,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 5: [2023-05-25 13:38:00,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
9: [2023-05-25 13:38:00,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 26: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 31: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt... 18: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 18: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 16: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 16: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 28: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 3: [2023-05-25 13:38:00,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 25: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 22: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 9: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 10: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 10: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 
5: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 10: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 9: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 10: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 10: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 2: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 2: [2023-05-25 13:38:00,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 10: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 25: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 10: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
10: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 9: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 9: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 9: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 25: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 9: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 30: [2023-05-25 13:38:00,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 2: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 2: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 28: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
28: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 28: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 28: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 9: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 2: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 2: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 2: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 18: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 2: [2023-05-25 13:38:00,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 30: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 14: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 14: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 14: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 14: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 10: [2023-05-25 13:38:00,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 14: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 14: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 14: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 22: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 14: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 2: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 22: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 27: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
28: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 17: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 17: [2023-05-25 13:38:00,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 2: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 2: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 11: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 3: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 23: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 23: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 2: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
3: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 11: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 30: [2023-05-25 13:38:00,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 20: [2023-05-25 13:38:00,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 22: [2023-05-25 13:38:00,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 21: [2023-05-25 13:38:00,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 14: [2023-05-25 13:38:00,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 14: [2023-05-25 13:38:00,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 1: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 14: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 
18: [2023-05-25 13:38:00,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 14: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 14: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 11: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 19: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 20: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 1: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 14: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 14: [2023-05-25 13:38:00,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 14: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 
2: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 2: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 21: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 2: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 2: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 21: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 27: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 18: [2023-05-25 13:38:00,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 19: [2023-05-25 13:38:00,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 26: [2023-05-25 13:38:00,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
13: [2023-05-25 13:38:00,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 27: [2023-05-25 13:38:00,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 18: [2023-05-25 13:38:00,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 15: [2023-05-25 13:38:00,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 26: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 3: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 26: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 26: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 11: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
29: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 29: [2023-05-25 13:38:00,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 27: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 23: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 16: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 15: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 15: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
15: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 29: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 16: [2023-05-25 13:38:00,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 29: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 26: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 26: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 26: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 26: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 16: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 15: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 
16: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 8: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 17: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 25: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 25: [2023-05-25 13:38:00,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 8: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 18: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 8: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 8: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 
21: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 15: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 8: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 31: [2023-05-25 13:38:00,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 27: [2023-05-25 13:38:00,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 15: [2023-05-25 13:38:00,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 27: [2023-05-25 13:38:00,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 15: [2023-05-25 13:38:00,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 20: [2023-05-25 13:38:00,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 8: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
8: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 13: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 30: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 8: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 8: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 8: [2023-05-25 13:38:00,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 18: [2023-05-25 13:38:00,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 8: [2023-05-25 13:38:00,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 8: [2023-05-25 13:38:00,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 18: [2023-05-25 13:38:00,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 
31: [2023-05-25 13:38:00,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 1: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 1: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 17: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 8: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 8: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 25: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 25: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 20: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 30: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
3: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 27: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 27: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 4: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 4: [2023-05-25 13:38:00,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 8: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 8: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 3: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 30: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 19: [2023-05-25 13:38:00,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 
21: [2023-05-25 13:38:00,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 30: [2023-05-25 13:38:00,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 11: [2023-05-25 13:38:00,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 1: [2023-05-25 13:38:00,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 31: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 12: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 12: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
12: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 12: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 12: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 12: [2023-05-25 13:38:00,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 1: [2023-05-25 13:38:00,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 1: [2023-05-25 13:38:00,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 12: [2023-05-25 13:38:00,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 15: [2023-05-25 13:38:00,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 31: [2023-05-25 13:38:00,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt... 
13: [2023-05-25 13:38:00,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 0: [2023-05-25 13:38:00,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 19: [2023-05-25 13:38:00,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 12: [2023-05-25 13:38:00,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 12: [2023-05-25 13:38:00,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 12: [2023-05-25 13:38:00,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt... 12: [2023-05-25 13:38:00,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 16: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 12: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 12: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 
12: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 12: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt... 16: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 11: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 22: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 22: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_25-model_01-model_states.pt. 
24: [2023-05-25 13:38:00,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 23: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 23: [2023-05-25 13:38:00,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 12: [2023-05-25 13:38:00,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 21: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 25: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 
20: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 18: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 24: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 24: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 11: [2023-05-25 13:38:00,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 17: [2023-05-25 13:38:00,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 0: [2023-05-25 13:38:00,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 0: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 
24: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt... 18: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 11: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 6: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 6: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 5: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 6: [2023-05-25 13:38:00,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 6: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 6: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 6: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 
6: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 10: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 4: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 25: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 13: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 25: [2023-05-25 13:38:00,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 10: [2023-05-25 13:38:00,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 11: [2023-05-25 13:38:00,122] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 5: [2023-05-25 13:38:00,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 4: [2023-05-25 13:38:00,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
20: [2023-05-25 13:38:00,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 14: [2023-05-25 13:38:00,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 0: [2023-05-25 13:38:00,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 0: [2023-05-25 13:38:00,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 25: [2023-05-25 13:38:00,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 3: [2023-05-25 13:38:00,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 17: [2023-05-25 13:38:00,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 31: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 3: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 9: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
9: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 19: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 0: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 14: [2023-05-25 13:38:00,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 9: [2023-05-25 13:38:00,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 31: [2023-05-25 13:38:00,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_00-model_states.pt. 19: [2023-05-25 13:38:00,127] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 9: [2023-05-25 13:38:00,127] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 2: [2023-05-25 13:38:00,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 2: [2023-05-25 13:38:00,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 
22: [2023-05-25 13:38:00,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 31: [2023-05-25 13:38:00,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 31: [2023-05-25 13:38:00,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt... 22: [2023-05-25 13:38:00,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt... 23: [2023-05-25 13:38:00,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 17: [2023-05-25 13:38:00,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 17: [2023-05-25 13:38:00,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 15: [2023-05-25 13:38:00,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 23: [2023-05-25 13:38:00,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 23: [2023-05-25 13:38:00,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 
17: [2023-05-25 13:38:00,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 0: [2023-05-25 13:38:00,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 17: [2023-05-25 13:38:00,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 23: [2023-05-25 13:38:00,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 20: [2023-05-25 13:38:00,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 22: [2023-05-25 13:38:00,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 7: [2023-05-25 13:38:00,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 7: [2023-05-25 13:38:00,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 10: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 11: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 
7: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 10: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 7: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 7: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 21: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 5: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 21: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 7: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 7: [2023-05-25 13:38:00,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 7: [2023-05-25 13:38:00,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 
14: [2023-05-25 13:38:00,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 20: [2023-05-25 13:38:00,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 5: [2023-05-25 13:38:00,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 11: [2023-05-25 13:38:00,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 22: [2023-05-25 13:38:00,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 15: [2023-05-25 13:38:00,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 20: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 22: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 21: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 21: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 
7: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 19: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 11: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 7: [2023-05-25 13:38:00,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 7: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt... 7: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt... 19: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 7: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 7: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt... 9: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
14: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 7: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 9: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 7: [2023-05-25 13:38:00,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt... 8: [2023-05-25 13:38:00,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 9: [2023-05-25 13:38:00,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 19: [2023-05-25 13:38:00,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 8: [2023-05-25 13:38:00,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 11: [2023-05-25 13:38:00,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt... 20: [2023-05-25 13:38:00,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 
22: [2023-05-25 13:38:00,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt... 19: [2023-05-25 13:38:00,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 9: [2023-05-25 13:38:00,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt... 2: [2023-05-25 13:38:00,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 2: [2023-05-25 13:38:00,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 12: [2023-05-25 13:38:00,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 12: [2023-05-25 13:38:00,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_00-model_states.pt. 15: [2023-05-25 13:38:00,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 16: [2023-05-25 13:38:00,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 
8: [2023-05-25 13:38:00,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 16: [2023-05-25 13:38:00,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 18: [2023-05-25 13:38:00,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 6: [2023-05-25 13:38:00,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 16: [2023-05-25 13:38:00,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 16: [2023-05-25 13:38:00,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 6: [2023-05-25 13:38:00,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 18: [2023-05-25 13:38:00,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 18: [2023-05-25 13:38:00,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 
18: [2023-05-25 13:38:00,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 12: [2023-05-25 13:38:00,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 12: [2023-05-25 13:38:00,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 22: [2023-05-25 13:38:00,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 22: [2023-05-25 13:38:00,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 22: [2023-05-25 13:38:00,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_00-model_states.pt. 6: [2023-05-25 13:38:00,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 6: [2023-05-25 13:38:00,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 7: [2023-05-25 13:38:00,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 7: [2023-05-25 13:38:00,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_00-model_states.pt. 
22: [2023-05-25 13:38:00,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt... 7: [2023-05-25 13:38:00,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 7: [2023-05-25 13:38:00,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 30: [2023-05-25 13:38:00,248] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 27: [2023-05-25 13:38:00,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 27: [2023-05-25 13:38:00,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 4: [2023-05-25 13:38:00,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 4: [2023-05-25 13:38:00,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 30: [2023-05-25 13:38:00,257] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 29: [2023-05-25 13:38:00,257] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 
29: [2023-05-25 13:38:00,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 30: [2023-05-25 13:38:00,260] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 27: [2023-05-25 13:38:00,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 28: [2023-05-25 13:38:00,264] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 28: [2023-05-25 13:38:00,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 4: [2023-05-25 13:38:00,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 4: [2023-05-25 13:38:00,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 27: [2023-05-25 13:38:00,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 1: [2023-05-25 13:38:00,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 1: [2023-05-25 13:38:00,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 
5: [2023-05-25 13:38:00,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 5: [2023-05-25 13:38:00,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 30: [2023-05-25 13:38:00,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 29: [2023-05-25 13:38:00,272] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 29: [2023-05-25 13:38:00,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 25: [2023-05-25 13:38:00,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 25: [2023-05-25 13:38:00,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 28: [2023-05-25 13:38:00,278] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 28: [2023-05-25 13:38:00,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 1: [2023-05-25 13:38:00,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
1: [2023-05-25 13:38:00,283] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 5: [2023-05-25 13:38:00,283] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 24: [2023-05-25 13:38:00,284] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 24: [2023-05-25 13:38:00,285] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 26: [2023-05-25 13:38:00,285] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 26: [2023-05-25 13:38:00,285] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 5: [2023-05-25 13:38:00,285] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 31: [2023-05-25 13:38:00,285] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 31: [2023-05-25 13:38:00,288] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_02-model_states.pt. 25: [2023-05-25 13:38:00,289] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 
25: [2023-05-25 13:38:00,289] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 31: [2023-05-25 13:38:00,289] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 31: [2023-05-25 13:38:00,290] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 31: [2023-05-25 13:38:00,292] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 31: [2023-05-25 13:38:00,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 24: [2023-05-25 13:38:00,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 31: [2023-05-25 13:38:00,299] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 24: [2023-05-25 13:38:00,299] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,299] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,300] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 
0: [2023-05-25 13:38:00,301] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 0: [2023-05-25 13:38:00,301] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 31: [2023-05-25 13:38:00,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 20: [2023-05-25 13:38:00,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 20: [2023-05-25 13:38:00,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 28: [2023-05-25 13:38:00,305] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 28: [2023-05-25 13:38:00,305] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 28: [2023-05-25 13:38:00,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 28: [2023-05-25 13:38:00,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 2: [2023-05-25 13:38:00,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 
2: [2023-05-25 13:38:00,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 16: [2023-05-25 13:38:00,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 16: [2023-05-25 13:38:00,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 26: [2023-05-25 13:38:00,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 26: [2023-05-25 13:38:00,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 11: [2023-05-25 13:38:00,312] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 26: [2023-05-25 13:38:00,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 27: [2023-05-25 13:38:00,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 27: [2023-05-25 13:38:00,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 
27: [2023-05-25 13:38:00,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 27: [2023-05-25 13:38:00,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 20: [2023-05-25 13:38:00,316] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 28: [2023-05-25 13:38:00,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 28: [2023-05-25 13:38:00,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 23: [2023-05-25 13:38:00,316] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 23: [2023-05-25 13:38:00,316] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 20: [2023-05-25 13:38:00,316] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 0: [2023-05-25 13:38:00,316] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 0: [2023-05-25 13:38:00,316] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
27: [2023-05-25 13:38:00,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 27: [2023-05-25 13:38:00,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 27: [2023-05-25 13:38:00,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 27: [2023-05-25 13:38:00,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 11: [2023-05-25 13:38:00,317] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 28: [2023-05-25 13:38:00,318] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 28: [2023-05-25 13:38:00,318] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 13: [2023-05-25 13:38:00,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 13: [2023-05-25 13:38:00,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 6: [2023-05-25 13:38:00,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 
0: [2023-05-25 13:38:00,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 12: [2023-05-25 13:38:00,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 0: [2023-05-25 13:38:00,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 6: [2023-05-25 13:38:00,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 16: [2023-05-25 13:38:00,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 2: [2023-05-25 13:38:00,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 16: [2023-05-25 13:38:00,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 12: [2023-05-25 13:38:00,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 24: [2023-05-25 13:38:00,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 24: [2023-05-25 13:38:00,324] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 
2: [2023-05-25 13:38:00,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 11: [2023-05-25 13:38:00,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 24: [2023-05-25 13:38:00,326] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 24: [2023-05-25 13:38:00,326] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 21: [2023-05-25 13:38:00,327] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 21: [2023-05-25 13:38:00,328] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 23: [2023-05-25 13:38:00,329] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 23: [2023-05-25 13:38:00,330] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 19: [2023-05-25 13:38:00,330] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 19: [2023-05-25 13:38:00,330] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 
11: [2023-05-25 13:38:00,331] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 3: [2023-05-25 13:38:00,331] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 3: [2023-05-25 13:38:00,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 17: [2023-05-25 13:38:00,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 17: [2023-05-25 13:38:00,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 31: [2023-05-25 13:38:00,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 31: [2023-05-25 13:38:00,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 25: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 25: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 13: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 
30: [2023-05-25 13:38:00,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 30: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 30: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 25: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 30: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 13: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 25: [2023-05-25 13:38:00,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 13: [2023-05-25 13:38:00,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 13: [2023-05-25 13:38:00,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 29: [2023-05-25 13:38:00,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 
29: [2023-05-25 13:38:00,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 29: [2023-05-25 13:38:00,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 29: [2023-05-25 13:38:00,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 18: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 12: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 31: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 18: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 25: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 4: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 30: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 
25: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 4: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 30: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 25: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 25: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 24: [2023-05-25 13:38:00,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 30: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 30: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 24: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 29: [2023-05-25 13:38:00,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 
29: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 29: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 29: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 22: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 0: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 0: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 22: [2023-05-25 13:38:00,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_02-model_states.pt. 6: [2023-05-25 13:38:00,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 6: [2023-05-25 13:38:00,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 24: [2023-05-25 13:38:00,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 
26: [2023-05-25 13:38:00,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 26: [2023-05-25 13:38:00,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 24: [2023-05-25 13:38:00,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 12: [2023-05-25 13:38:00,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 0: [2023-05-25 13:38:00,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 0: [2023-05-25 13:38:00,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 31: [2023-05-25 13:38:00,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 21: [2023-05-25 13:38:00,341] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 28: [2023-05-25 13:38:00,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 6: [2023-05-25 13:38:00,342] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 
28: [2023-05-25 13:38:00,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 3: [2023-05-25 13:38:00,342] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 3: [2023-05-25 13:38:00,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 6: [2023-05-25 13:38:00,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 26: [2023-05-25 13:38:00,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 26: [2023-05-25 13:38:00,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 2: [2023-05-25 13:38:00,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 21: [2023-05-25 13:38:00,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 19: [2023-05-25 13:38:00,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 19: [2023-05-25 13:38:00,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 
2: [2023-05-25 13:38:00,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 17: [2023-05-25 13:38:00,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 17: [2023-05-25 13:38:00,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 6: [2023-05-25 13:38:00,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 6: [2023-05-25 13:38:00,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 9: [2023-05-25 13:38:00,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 9: [2023-05-25 13:38:00,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 31: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 5: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 5: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 
5: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 5: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 31: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 26: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 31: [2023-05-25 13:38:00,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 31: [2023-05-25 13:38:00,348] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt... 27: [2023-05-25 13:38:00,348] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 27: [2023-05-25 13:38:00,348] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 3: [2023-05-25 13:38:00,349] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 18: [2023-05-25 13:38:00,349] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 
13: [2023-05-25 13:38:00,349] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 20: [2023-05-25 13:38:00,349] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 20: [2023-05-25 13:38:00,349] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 13: [2023-05-25 13:38:00,350] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 4: [2023-05-25 13:38:00,350] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 8: [2023-05-25 13:38:00,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 20: [2023-05-25 13:38:00,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 7: [2023-05-25 13:38:00,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 7: [2023-05-25 13:38:00,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_02-model_states.pt. 4: [2023-05-25 13:38:00,352] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
8: [2023-05-25 13:38:00,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 20: [2023-05-25 13:38:00,352] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 22: [2023-05-25 13:38:00,353] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 18: [2023-05-25 13:38:00,353] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 22: [2023-05-25 13:38:00,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 28: [2023-05-25 13:38:00,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,355] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 26: [2023-05-25 13:38:00,355] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 30: [2023-05-25 13:38:00,356] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 
0: [2023-05-25 13:38:00,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 0: [2023-05-25 13:38:00,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 20: [2023-05-25 13:38:00,356] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 20: [2023-05-25 13:38:00,357] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 10: [2023-05-25 13:38:00,358] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 8: [2023-05-25 13:38:00,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 24: [2023-05-25 13:38:00,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 24: [2023-05-25 13:38:00,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 24: [2023-05-25 13:38:00,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 2: [2023-05-25 13:38:00,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
2: [2023-05-25 13:38:00,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 9: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 20: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 20: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 3: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 16: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 16: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 24: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 16: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
28: [2023-05-25 13:38:00,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 28: [2023-05-25 13:38:00,361] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 9: [2023-05-25 13:38:00,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 10: [2023-05-25 13:38:00,361] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 16: [2023-05-25 13:38:00,361] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 26: [2023-05-25 13:38:00,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 5: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 5: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 6: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 
1: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 9: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 16: [2023-05-25 13:38:00,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 9: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 16: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 21: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 8: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 27: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 21: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
6: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 5: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 5: [2023-05-25 13:38:00,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 27: [2023-05-25 13:38:00,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 3: [2023-05-25 13:38:00,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 23: [2023-05-25 13:38:00,365] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,365] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 21: [2023-05-25 13:38:00,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 4: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 16: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 
21: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 4: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 16: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 6: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 1: [2023-05-25 13:38:00,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 3: [2023-05-25 13:38:00,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 6: [2023-05-25 13:38:00,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 23: [2023-05-25 13:38:00,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 23: [2023-05-25 13:38:00,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 
23: [2023-05-25 13:38:00,368] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,368] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 29: [2023-05-25 13:38:00,368] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 25: [2023-05-25 13:38:00,368] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 11: [2023-05-25 13:38:00,369] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 30: [2023-05-25 13:38:00,369] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,369] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 10: [2023-05-25 13:38:00,370] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 25: [2023-05-25 13:38:00,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 29: [2023-05-25 13:38:00,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 
29: [2023-05-25 13:38:00,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 11: [2023-05-25 13:38:00,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 30: [2023-05-25 13:38:00,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 24: [2023-05-25 13:38:00,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 30: [2023-05-25 13:38:00,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 7: [2023-05-25 13:38:00,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 7: [2023-05-25 13:38:00,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 10: [2023-05-25 13:38:00,372] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 10: [2023-05-25 13:38:00,373] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,373] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
29: [2023-05-25 13:38:00,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 10: [2023-05-25 13:38:00,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 19: [2023-05-25 13:38:00,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 19: [2023-05-25 13:38:00,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 14: [2023-05-25 13:38:00,374] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 28: [2023-05-25 13:38:00,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 28: [2023-05-25 13:38:00,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 14: [2023-05-25 13:38:00,374] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 8: [2023-05-25 13:38:00,375] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 23: [2023-05-25 13:38:00,375] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 
24: [2023-05-25 13:38:00,375] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 23: [2023-05-25 13:38:00,375] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 19: [2023-05-25 13:38:00,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 24: [2023-05-25 13:38:00,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 19: [2023-05-25 13:38:00,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 24: [2023-05-25 13:38:00,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 9: [2023-05-25 13:38:00,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 30: [2023-05-25 13:38:00,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 3: [2023-05-25 13:38:00,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 3: [2023-05-25 13:38:00,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 
2: [2023-05-25 13:38:00,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 26: [2023-05-25 13:38:00,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 9: [2023-05-25 13:38:00,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 26: [2023-05-25 13:38:00,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 7: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 2: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 1: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 7: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_03-model_states.pt. 21: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
21: [2023-05-25 13:38:00,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 4: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 4: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 17: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 1: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 11: [2023-05-25 13:38:00,382] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 22: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 22: [2023-05-25 13:38:00,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
25: [2023-05-25 13:38:00,382] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 20: [2023-05-25 13:38:00,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 28: [2023-05-25 13:38:00,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 10: [2023-05-25 13:38:00,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 19: [2023-05-25 13:38:00,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 19: [2023-05-25 13:38:00,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 10: [2023-05-25 13:38:00,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 29: [2023-05-25 13:38:00,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 20: [2023-05-25 13:38:00,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
30: [2023-05-25 13:38:00,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 17: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 18: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 11: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 17: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 17: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 18: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 29: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 21: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 
21: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 22: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 22: [2023-05-25 13:38:00,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 29: [2023-05-25 13:38:00,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 29: [2023-05-25 13:38:00,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 29: [2023-05-25 13:38:00,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 25: [2023-05-25 13:38:00,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 19: [2023-05-25 13:38:00,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 19: [2023-05-25 13:38:00,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 10: [2023-05-25 13:38:00,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
25: [2023-05-25 13:38:00,386] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 18: [2023-05-25 13:38:00,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 17: [2023-05-25 13:38:00,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 17: [2023-05-25 13:38:00,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 25: [2023-05-25 13:38:00,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 18: [2023-05-25 13:38:00,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 14: [2023-05-25 13:38:00,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 14: [2023-05-25 13:38:00,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 29: [2023-05-25 13:38:00,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 
31: [2023-05-25 13:38:00,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 27: [2023-05-25 13:38:00,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 27: [2023-05-25 13:38:00,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 10: [2023-05-25 13:38:00,389] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 13: [2023-05-25 13:38:00,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
26: [2023-05-25 13:38:00,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 26: [2023-05-25 13:38:00,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 26: [2023-05-25 13:38:00,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 14: [2023-05-25 13:38:00,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 14: [2023-05-25 13:38:00,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 18: [2023-05-25 13:38:00,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 30: [2023-05-25 13:38:00,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 13: [2023-05-25 13:38:00,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 13: [2023-05-25 13:38:00,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 
26: [2023-05-25 13:38:00,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 21: [2023-05-25 13:38:00,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 24: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 13: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 16: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 18: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 24: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 13: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 13: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 14: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
23: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 13: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 28: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 13: [2023-05-25 13:38:00,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 16: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 11: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 11: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 11: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 11: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 18: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 
14: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 20: [2023-05-25 13:38:00,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 28: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 15: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 10: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 22: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 3: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 22: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 18: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 
11: [2023-05-25 13:38:00,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 10: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 3: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 2: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 14: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 2: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 8: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 23: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 8: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 11: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
14: [2023-05-25 13:38:00,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_02-model_states.pt. 11: [2023-05-25 13:38:00,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 11: [2023-05-25 13:38:00,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 29: [2023-05-25 13:38:00,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 22: [2023-05-25 13:38:00,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 15: [2023-05-25 13:38:00,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 22: [2023-05-25 13:38:00,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt... 15: [2023-05-25 13:38:00,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 29: [2023-05-25 13:38:00,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 20: [2023-05-25 13:38:00,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 31: [2023-05-25 13:38:00,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 15: [2023-05-25 13:38:00,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 25: [2023-05-25 13:38:00,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 7: [2023-05-25 13:38:00,403] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 25: [2023-05-25 13:38:00,403] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 9: [2023-05-25 13:38:00,403] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 21: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 9: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
9: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 7: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_03-model_01-model_states.pt. 7: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 27: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 7: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 26: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 30: [2023-05-25 13:38:00,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 27: [2023-05-25 13:38:00,405] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 19: [2023-05-25 13:38:00,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 9: [2023-05-25 13:38:00,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
9: [2023-05-25 13:38:00,406] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 9: [2023-05-25 13:38:00,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 24: [2023-05-25 13:38:00,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 9: [2023-05-25 13:38:00,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 21: [2023-05-25 13:38:00,406] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 30: [2023-05-25 13:38:00,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 19: [2023-05-25 13:38:00,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 9: [2023-05-25 13:38:00,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 9: [2023-05-25 13:38:00,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 26: [2023-05-25 13:38:00,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 
23: [2023-05-25 13:38:00,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 28: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 28: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 30: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 24: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 24: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 9: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 13: [2023-05-25 13:38:00,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 13: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 16: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
8: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 23: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 16: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 14: [2023-05-25 13:38:00,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 24: [2023-05-25 13:38:00,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 30: [2023-05-25 13:38:00,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 15: [2023-05-25 13:38:00,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 
28: [2023-05-25 13:38:00,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 28: [2023-05-25 13:38:00,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 15: [2023-05-25 13:38:00,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 17: [2023-05-25 13:38:00,412] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 24: [2023-05-25 13:38:00,412] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 15: [2023-05-25 13:38:00,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 22: [2023-05-25 13:38:00,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 14: [2023-05-25 13:38:00,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 29: [2023-05-25 13:38:00,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 15: [2023-05-25 13:38:00,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 
24: [2023-05-25 13:38:00,414] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 30: [2023-05-25 13:38:00,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 11: [2023-05-25 13:38:00,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 30: [2023-05-25 13:38:00,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 12: [2023-05-25 13:38:00,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 12: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 8: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 29: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 8: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 8: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
8: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 8: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 8: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 18: [2023-05-25 13:38:00,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 11: [2023-05-25 13:38:00,416] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 11: [2023-05-25 13:38:00,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 11: [2023-05-25 13:38:00,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 29: [2023-05-25 13:38:00,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 11: [2023-05-25 13:38:00,417] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 8: [2023-05-25 13:38:00,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 
21: [2023-05-25 13:38:00,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 14: [2023-05-25 13:38:00,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 18: [2023-05-25 13:38:00,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 14: [2023-05-25 13:38:00,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_03-model_states.pt. 11: [2023-05-25 13:38:00,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 17: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 22: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 9: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 8: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
8: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 8: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 29: [2023-05-25 13:38:00,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 19: [2023-05-25 13:38:00,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 9: [2023-05-25 13:38:00,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 19: [2023-05-25 13:38:00,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 9: [2023-05-25 13:38:00,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 9: [2023-05-25 13:38:00,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
10: [2023-05-25 13:38:00,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 28: [2023-05-25 13:38:00,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 28: [2023-05-25 13:38:00,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 9: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 25: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 25: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 7: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
7: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 10: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 17: [2023-05-25 13:38:00,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 10: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 10: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 9: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 15: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 22: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 10: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 11: [2023-05-25 13:38:00,426] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
26: [2023-05-25 13:38:00,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 12: [2023-05-25 13:38:00,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 12: [2023-05-25 13:38:00,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_15-model_01-model_states.pt. 30: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 28: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 28: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 18: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 30: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 31: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 12: [2023-05-25 13:38:00,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 
15: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 15: [2023-05-25 13:38:00,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 26: [2023-05-25 13:38:00,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 29: [2023-05-25 13:38:00,429] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 29: [2023-05-25 13:38:00,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 26: [2023-05-25 13:38:00,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 31: [2023-05-25 13:38:00,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 10: [2023-05-25 13:38:00,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 15: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 16: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 
16: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 11: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 11: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 31: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
4: [2023-05-25 13:38:00,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,432] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 18: [2023-05-25 13:38:00,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 14: [2023-05-25 13:38:00,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 26: [2023-05-25 13:38:00,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 22: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 12: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 14: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 27: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 27: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_03-model_states.pt. 
17: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 4: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 31: [2023-05-25 13:38:00,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 4: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 4: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 4: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 27: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 4: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 14: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 14: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
4: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 10: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 4: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 4: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 11: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 25: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 29: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 29: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 
27: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 10: [2023-05-25 13:38:00,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 31: [2023-05-25 13:38:00,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 10: [2023-05-25 13:38:00,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 10: [2023-05-25 13:38:00,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 31: [2023-05-25 13:38:00,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_37-model_01-model_states.pt. 10: [2023-05-25 13:38:00,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 27: [2023-05-25 13:38:00,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 27: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 14: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 
13: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 26: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 14: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 11: [2023-05-25 13:38:00,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 25: [2023-05-25 13:38:00,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 25: [2023-05-25 13:38:00,439] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 14: [2023-05-25 13:38:00,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 25: [2023-05-25 13:38:00,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 26: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 
24: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 25: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 14: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 25: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 1: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
1: [2023-05-25 13:38:00,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 8: [2023-05-25 13:38:00,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 0: [2023-05-25 13:38:00,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 24: [2023-05-25 13:38:00,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 16: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 1: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 0: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 1: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 1: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 1: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 1: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 
1: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 16: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 12: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 1: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 8: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 24: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 13: [2023-05-25 13:38:00,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 12: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 26: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 3: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
3: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 9: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 9: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 15: [2023-05-25 13:38:00,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 3: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 3: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 3: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 3: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 3: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
3: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 9: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 0: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 24: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 19: [2023-05-25 13:38:00,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 19: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 26: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 8: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 14: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 0: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 
27: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 27: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 12: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 0: [2023-05-25 13:38:00,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 0: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 0: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 0: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 11: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 13: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 13: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 
12: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 12: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 12: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 31: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 3: [2023-05-25 13:38:00,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 9: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 
8: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 31: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt... 3: [2023-05-25 13:38:00,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 12: [2023-05-25 13:38:00,451] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 8: [2023-05-25 13:38:00,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 8: [2023-05-25 13:38:00,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 15: [2023-05-25 13:38:00,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 14: [2023-05-25 13:38:00,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
13: [2023-05-25 13:38:00,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 12: [2023-05-25 13:38:00,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt... 12: [2023-05-25 13:38:00,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 12: [2023-05-25 13:38:00,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 13: [2023-05-25 13:38:00,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 14: [2023-05-25 13:38:00,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 15: [2023-05-25 13:38:00,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 10: [2023-05-25 13:38:00,454] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,454] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 5: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
5: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
2: [2023-05-25 13:38:00,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 2: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 14: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 14: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 15: [2023-05-25 13:38:00,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 9: [2023-05-25 13:38:00,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 2: [2023-05-25 13:38:00,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 
2: [2023-05-25 13:38:00,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 2: [2023-05-25 13:38:00,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 5: [2023-05-25 13:38:00,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 30: [2023-05-25 13:38:00,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 5: [2023-05-25 13:38:00,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 5: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 5: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 23: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 14: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 2: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 
5: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 5: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 5: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 5: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 9: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 2: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 15: [2023-05-25 13:38:00,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 2: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 2: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 2: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 
19: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 9: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 15: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 30: [2023-05-25 13:38:00,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 23: [2023-05-25 13:38:00,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 19: [2023-05-25 13:38:00,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 20: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 30: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 6: [2023-05-25 13:38:00,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 15: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 4: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 4: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 6: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
20: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 8: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 19: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 8: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 19: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 17: [2023-05-25 13:38:00,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 14: [2023-05-25 13:38:00,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 6: [2023-05-25 13:38:00,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 30: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 12: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
6: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 12: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 6: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 6: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 6: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 6: [2023-05-25 13:38:00,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 6: [2023-05-25 13:38:00,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 6: [2023-05-25 13:38:00,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 14: [2023-05-25 13:38:00,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 25: [2023-05-25 13:38:00,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 
11: [2023-05-25 13:38:00,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 10: [2023-05-25 13:38:00,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 10: [2023-05-25 13:38:00,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 15: [2023-05-25 13:38:00,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 12: [2023-05-25 13:38:00,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 12: [2023-05-25 13:38:00,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 14: [2023-05-25 13:38:00,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 25: [2023-05-25 13:38:00,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 11: [2023-05-25 13:38:00,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 11: [2023-05-25 13:38:00,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 
17: [2023-05-25 13:38:00,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 25: [2023-05-25 13:38:00,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 14: [2023-05-25 13:38:00,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 25: [2023-05-25 13:38:00,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 11: [2023-05-25 13:38:00,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 15: [2023-05-25 13:38:00,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt... 23: [2023-05-25 13:38:00,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 17: [2023-05-25 13:38:00,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 17: [2023-05-25 13:38:00,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 15: [2023-05-25 13:38:00,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
18: [2023-05-25 13:38:00,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 1: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 21: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 18: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 0: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 21: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 16: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 0: [2023-05-25 13:38:00,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 15: [2023-05-25 13:38:00,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 
4: [2023-05-25 13:38:00,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 20: [2023-05-25 13:38:00,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 18: [2023-05-25 13:38:00,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 18: [2023-05-25 13:38:00,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 23: [2023-05-25 13:38:00,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 16: [2023-05-25 13:38:00,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 22: [2023-05-25 13:38:00,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 16: [2023-05-25 13:38:00,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 16: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 
4: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 16: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 19: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 23: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 15: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 22: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_03-model_states.pt. 3: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 3: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 1: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 20: [2023-05-25 13:38:00,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 
15: [2023-05-25 13:38:00,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 31: [2023-05-25 13:38:00,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 19: [2023-05-25 13:38:00,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 16: [2023-05-25 13:38:00,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 20: [2023-05-25 13:38:00,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 20: [2023-05-25 13:38:00,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 12: [2023-05-25 13:38:00,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 15: [2023-05-25 13:38:00,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt... 12: [2023-05-25 13:38:00,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 31: [2023-05-25 13:38:00,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 
27: [2023-05-25 13:38:00,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 27: [2023-05-25 13:38:00,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 12: [2023-05-25 13:38:00,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 15: [2023-05-25 13:38:00,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 12: [2023-05-25 13:38:00,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 27: [2023-05-25 13:38:00,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 27: [2023-05-25 13:38:00,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt... 15: [2023-05-25 13:38:00,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 12: [2023-05-25 13:38:00,486] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt... 21: [2023-05-25 13:38:00,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 
5: [2023-05-25 13:38:00,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 22: [2023-05-25 13:38:00,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 22: [2023-05-25 13:38:00,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 31: [2023-05-25 13:38:00,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_00-model_states.pt. 21: [2023-05-25 13:38:00,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_26-model_01-model_states.pt. 17: [2023-05-25 13:38:00,487] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 0: [2023-05-25 13:38:00,487] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 17: [2023-05-25 13:38:00,487] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 17: [2023-05-25 13:38:00,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 2: [2023-05-25 13:38:00,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
2: [2023-05-25 13:38:00,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 12: [2023-05-25 13:38:00,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_00-model_states.pt. 5: [2023-05-25 13:38:00,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 0: [2023-05-25 13:38:00,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 31: [2023-05-25 13:38:00,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt... 1: [2023-05-25 13:38:00,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 18: [2023-05-25 13:38:00,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 21: [2023-05-25 13:38:00,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 21: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 
7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 18: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 7: [2023-05-25 13:38:00,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 18: [2023-05-25 13:38:00,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 16: [2023-05-25 13:38:00,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 
18: [2023-05-25 13:38:00,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 23: [2023-05-25 13:38:00,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 3: [2023-05-25 13:38:00,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 1: [2023-05-25 13:38:00,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 6: [2023-05-25 13:38:00,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 23: [2023-05-25 13:38:00,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 16: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 20: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 7: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 19: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
7: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 7: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 12: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 3: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 7: [2023-05-25 13:38:00,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt... 7: [2023-05-25 13:38:00,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt... 7: [2023-05-25 13:38:00,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 22: [2023-05-25 13:38:00,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 7: [2023-05-25 13:38:00,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt... 7: [2023-05-25 13:38:00,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt... 
22: [2023-05-25 13:38:00,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 20: [2023-05-25 13:38:00,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 19: [2023-05-25 13:38:00,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 6: [2023-05-25 13:38:00,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 5: [2023-05-25 13:38:00,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 21: [2023-05-25 13:38:00,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 21: [2023-05-25 13:38:00,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 19: [2023-05-25 13:38:00,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 2: [2023-05-25 13:38:00,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 12: [2023-05-25 13:38:00,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
2: [2023-05-25 13:38:00,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 22: [2023-05-25 13:38:00,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 22: [2023-05-25 13:38:00,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt... 5: [2023-05-25 13:38:00,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 19: [2023-05-25 13:38:00,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 6: [2023-05-25 13:38:00,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 23: [2023-05-25 13:38:00,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 19: [2023-05-25 13:38:00,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 
23: [2023-05-25 13:38:00,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 19: [2023-05-25 13:38:00,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 6: [2023-05-25 13:38:00,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 19: [2023-05-25 13:38:00,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 19: [2023-05-25 13:38:00,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 17: [2023-05-25 13:38:00,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 20: [2023-05-25 13:38:00,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 20: [2023-05-25 13:38:00,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
17: [2023-05-25 13:38:00,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 20: [2023-05-25 13:38:00,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 20: [2023-05-25 13:38:00,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 17: [2023-05-25 13:38:00,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 16: [2023-05-25 13:38:00,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 16: [2023-05-25 13:38:00,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 7: [2023-05-25 13:38:00,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 22: [2023-05-25 13:38:00,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
23: [2023-05-25 13:38:00,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 17: [2023-05-25 13:38:00,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 17: [2023-05-25 13:38:00,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 20: [2023-05-25 13:38:00,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 7: [2023-05-25 13:38:00,528] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_00-model_states.pt. 16: [2023-05-25 13:38:00,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 16: [2023-05-25 13:38:00,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 21: [2023-05-25 13:38:00,528] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 23: [2023-05-25 13:38:00,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 23: [2023-05-25 13:38:00,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 
22: [2023-05-25 13:38:00,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 20: [2023-05-25 13:38:00,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 18: [2023-05-25 13:38:00,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 21: [2023-05-25 13:38:00,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 18: [2023-05-25 13:38:00,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 18: [2023-05-25 13:38:00,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 21: [2023-05-25 13:38:00,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 22: [2023-05-25 13:38:00,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 22: [2023-05-25 13:38:00,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 22: [2023-05-25 13:38:00,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
18: [2023-05-25 13:38:00,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 21: [2023-05-25 13:38:00,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 20: [2023-05-25 13:38:00,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 20: [2023-05-25 13:38:00,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 18: [2023-05-25 13:38:00,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 18: [2023-05-25 13:38:00,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 22: [2023-05-25 13:38:00,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 22: [2023-05-25 13:38:00,538] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 22: [2023-05-25 13:38:00,538] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 21: [2023-05-25 13:38:00,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 
18: [2023-05-25 13:38:00,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 21: [2023-05-25 13:38:00,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_00-model_states.pt. 21: [2023-05-25 13:38:00,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 21: [2023-05-25 13:38:00,543] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt... 18: [2023-05-25 13:38:00,543] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt... 7: [2023-05-25 13:38:00,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 7: [2023-05-25 13:38:00,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 18: [2023-05-25 13:38:00,581] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 18: [2023-05-25 13:38:00,581] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 18: [2023-05-25 13:38:00,596] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
20: [2023-05-25 13:38:00,596] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 20: [2023-05-25 13:38:00,596] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 18: [2023-05-25 13:38:00,599] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 16: [2023-05-25 13:38:00,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 16: [2023-05-25 13:38:00,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 23: [2023-05-25 13:38:00,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 23: [2023-05-25 13:38:00,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 17: [2023-05-25 13:38:00,609] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 17: [2023-05-25 13:38:00,609] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 20: [2023-05-25 13:38:00,609] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
19: [2023-05-25 13:38:00,609] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 20: [2023-05-25 13:38:00,609] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 19: [2023-05-25 13:38:00,609] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 21: [2023-05-25 13:38:00,611] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 21: [2023-05-25 13:38:00,612] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 16: [2023-05-25 13:38:00,617] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 16: [2023-05-25 13:38:00,618] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,619] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 24: [2023-05-25 13:38:00,621] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 24: [2023-05-25 13:38:00,621] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 
17: [2023-05-25 13:38:00,622] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 23: [2023-05-25 13:38:00,622] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 17: [2023-05-25 13:38:00,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 23: [2023-05-25 13:38:00,622] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 19: [2023-05-25 13:38:00,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 19: [2023-05-25 13:38:00,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 21: [2023-05-25 13:38:00,625] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 24: [2023-05-25 13:38:00,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 24: [2023-05-25 13:38:00,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 21: [2023-05-25 13:38:00,626] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
22: [2023-05-25 13:38:00,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,632] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_02-model_states.pt. 28: [2023-05-25 13:38:00,635] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 28: [2023-05-25 13:38:00,636] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 24: [2023-05-25 13:38:00,638] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,638] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,639] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 13: [2023-05-25 13:38:00,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 13: [2023-05-25 13:38:00,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 
22: [2023-05-25 13:38:00,646] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 28: [2023-05-25 13:38:00,650] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,650] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 12: [2023-05-25 13:38:00,651] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 10: [2023-05-25 13:38:00,651] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 10: [2023-05-25 13:38:00,652] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 8: [2023-05-25 13:38:00,652] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 8: [2023-05-25 13:38:00,652] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 13: [2023-05-25 13:38:00,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 13: [2023-05-25 13:38:00,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
12: [2023-05-25 13:38:00,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 26: [2023-05-25 13:38:00,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 26: [2023-05-25 13:38:00,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 31: [2023-05-25 13:38:00,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 31: [2023-05-25 13:38:00,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 13: [2023-05-25 13:38:00,656] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 14: [2023-05-25 13:38:00,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 13: [2023-05-25 13:38:00,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 14: [2023-05-25 13:38:00,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 9: [2023-05-25 13:38:00,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 
9: [2023-05-25 13:38:00,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 29: [2023-05-25 13:38:00,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 29: [2023-05-25 13:38:00,658] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 26: [2023-05-25 13:38:00,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 26: [2023-05-25 13:38:00,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 30: [2023-05-25 13:38:00,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 30: [2023-05-25 13:38:00,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 12: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 15: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 10: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
28: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 10: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 15: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 28: [2023-05-25 13:38:00,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 11: [2023-05-25 13:38:00,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 11: [2023-05-25 13:38:00,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_02-model_states.pt. 8: [2023-05-25 13:38:00,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 8: [2023-05-25 13:38:00,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 31: [2023-05-25 13:38:00,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 31: [2023-05-25 13:38:00,666] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 
0: [2023-05-25 13:38:00,666] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 0: [2023-05-25 13:38:00,666] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 31: [2023-05-25 13:38:00,667] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,668] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 26: [2023-05-25 13:38:00,668] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,668] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 31: [2023-05-25 13:38:00,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 14: [2023-05-25 13:38:00,670] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 12: [2023-05-25 13:38:00,670] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,670] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
9: [2023-05-25 13:38:00,670] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 13: [2023-05-25 13:38:00,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 14: [2023-05-25 13:38:00,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 26: [2023-05-25 13:38:00,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 13: [2023-05-25 13:38:00,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 25: [2023-05-25 13:38:00,672] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 25: [2023-05-25 13:38:00,672] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_02-model_states.pt. 30: [2023-05-25 13:38:00,675] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 30: [2023-05-25 13:38:00,675] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 26: [2023-05-25 13:38:00,675] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 
30: [2023-05-25 13:38:00,675] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 30: [2023-05-25 13:38:00,675] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 26: [2023-05-25 13:38:00,676] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 30: [2023-05-25 13:38:00,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 28: [2023-05-25 13:38:00,678] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 29: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 15: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 11: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 11: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
0: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 28: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 0: [2023-05-25 13:38:00,680] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 24: [2023-05-25 13:38:00,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 1: [2023-05-25 13:38:00,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 15: [2023-05-25 13:38:00,680] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 1: [2023-05-25 13:38:00,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 31: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,682] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 
31: [2023-05-25 13:38:00,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 11: [2023-05-25 13:38:00,683] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 8: [2023-05-25 13:38:00,683] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 11: [2023-05-25 13:38:00,683] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 8: [2023-05-25 13:38:00,684] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 27: [2023-05-25 13:38:00,685] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 25: [2023-05-25 13:38:00,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
26: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 30: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 29: [2023-05-25 13:38:00,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 25: [2023-05-25 13:38:00,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 25: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 25: [2023-05-25 13:38:00,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 29: [2023-05-25 13:38:00,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
29: [2023-05-25 13:38:00,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 27: [2023-05-25 13:38:00,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 30: [2023-05-25 13:38:00,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 28: [2023-05-25 13:38:00,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 26: [2023-05-25 13:38:00,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
25: [2023-05-25 13:38:00,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 25: [2023-05-25 13:38:00,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 31: [2023-05-25 13:38:00,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 29: [2023-05-25 13:38:00,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 26: [2023-05-25 13:38:00,693] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 1: [2023-05-25 13:38:00,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
29: [2023-05-25 13:38:00,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 26: [2023-05-25 13:38:00,693] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 31: [2023-05-25 13:38:00,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 30: [2023-05-25 13:38:00,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 1: [2023-05-25 13:38:00,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
28: [2023-05-25 13:38:00,697] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 28: [2023-05-25 13:38:00,697] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 24: [2023-05-25 13:38:00,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 8: [2023-05-25 13:38:00,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 8: [2023-05-25 13:38:00,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 11: [2023-05-25 13:38:00,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 11: [2023-05-25 13:38:00,698] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 18: [2023-05-25 13:38:00,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 18: [2023-05-25 13:38:00,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 18: [2023-05-25 13:38:00,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
23: [2023-05-25 13:38:00,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 18: [2023-05-25 13:38:00,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 23: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 23: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 23: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 29: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 
18: [2023-05-25 13:38:00,701] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 18: [2023-05-25 13:38:00,701] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 29: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 1: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 18: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 18: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 1: [2023-05-25 13:38:00,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 29: [2023-05-25 13:38:00,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 29: [2023-05-25 13:38:00,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 31: [2023-05-25 13:38:00,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 
31: [2023-05-25 13:38:00,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 25: [2023-05-25 13:38:00,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 25: [2023-05-25 13:38:00,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 15: [2023-05-25 13:38:00,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 15: [2023-05-25 13:38:00,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 27: [2023-05-25 13:38:00,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 27: [2023-05-25 13:38:00,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 22: [2023-05-25 13:38:00,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 22: [2023-05-25 13:38:00,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 22: [2023-05-25 13:38:00,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
22: [2023-05-25 13:38:00,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 27: [2023-05-25 13:38:00,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 25: [2023-05-25 13:38:00,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 27: [2023-05-25 13:38:00,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_03-model_states.pt. 25: [2023-05-25 13:38:00,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_38-model_01-model_states.pt. 26: [2023-05-25 13:38:00,707] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 22: [2023-05-25 13:38:00,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 22: [2023-05-25 13:38:00,708] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 
26: [2023-05-25 13:38:00,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 31: [2023-05-25 13:38:00,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,710] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 26: [2023-05-25 13:38:00,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 31: [2023-05-25 13:38:00,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,711] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 31: [2023-05-25 13:38:00,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 26: [2023-05-25 13:38:00,713] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
26: [2023-05-25 13:38:00,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 31: [2023-05-25 13:38:00,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 24: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 24: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 24: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 24: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 24: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 24: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
28: [2023-05-25 13:38:00,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 1: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 21: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 4: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 1: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 21: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 21: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 4: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 
24: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 24: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 28: [2023-05-25 13:38:00,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 19: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 19: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 26: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 26: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 19: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 19: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 24: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 
24: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 21: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 24: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 21: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 28: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 28: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 21: [2023-05-25 13:38:00,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 16: [2023-05-25 13:38:00,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 16: [2023-05-25 13:38:00,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 
16: [2023-05-25 13:38:00,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 16: [2023-05-25 13:38:00,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 19: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 16: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 15: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 21: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 21: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 15: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 16: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 19: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
30: [2023-05-25 13:38:00,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 19: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 19: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 26: [2023-05-25 13:38:00,719] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 31: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 17: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 17: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 17: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 31: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 17: [2023-05-25 13:38:00,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
16: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 16: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 16: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 30: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 30: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 16: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 25: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 25: [2023-05-25 13:38:00,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 17: [2023-05-25 13:38:00,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
17: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 30: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 29: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 17: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 17: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 25: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 25: [2023-05-25 13:38:00,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 27: [2023-05-25 13:38:00,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 28: [2023-05-25 13:38:00,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
29: [2023-05-25 13:38:00,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 29: [2023-05-25 13:38:00,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 29: [2023-05-25 13:38:00,727] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 29: [2023-05-25 13:38:00,727] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 31: [2023-05-25 13:38:00,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 27: [2023-05-25 13:38:00,727] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 25: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 25: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
4: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 29: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 27: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt... 24: [2023-05-25 13:38:00,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 25: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 28: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 27: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 4: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 25: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 4: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
27: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 27: [2023-05-25 13:38:00,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 4: [2023-05-25 13:38:00,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 23: [2023-05-25 13:38:00,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 25: [2023-05-25 13:38:00,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 25: [2023-05-25 13:38:00,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 18: [2023-05-25 13:38:00,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 24: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 22: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 
18: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 30: [2023-05-25 13:38:00,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 16: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 18: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 29: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 18: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 20: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 30: [2023-05-25 13:38:00,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 20: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
20: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 0: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 20: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 16: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 29: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 23: [2023-05-25 13:38:00,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 27: [2023-05-25 13:38:00,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 23: [2023-05-25 13:38:00,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 27: [2023-05-25 13:38:00,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 6: [2023-05-25 13:38:00,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 
6: [2023-05-25 13:38:00,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 26: [2023-05-25 13:38:00,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 20: [2023-05-25 13:38:00,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 0: [2023-05-25 13:38:00,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 22: [2023-05-25 13:38:00,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 5: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 24: [2023-05-25 13:38:00,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 31: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 5: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 5: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 
5: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 30: [2023-05-25 13:38:00,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 11: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 11: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 11: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 11: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 20: [2023-05-25 13:38:00,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 11: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 20: [2023-05-25 13:38:00,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt... 11: [2023-05-25 13:38:00,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
27: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 30: [2023-05-25 13:38:00,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 20: [2023-05-25 13:38:00,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 24: [2023-05-25 13:38:00,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 22: [2023-05-25 13:38:00,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 21: [2023-05-25 13:38:00,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 30: [2023-05-25 13:38:00,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 29: [2023-05-25 13:38:00,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 11: [2023-05-25 13:38:00,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 11: [2023-05-25 13:38:00,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 
11: [2023-05-25 13:38:00,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 31: [2023-05-25 13:38:00,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 11: [2023-05-25 13:38:00,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 14: [2023-05-25 13:38:00,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 14: [2023-05-25 13:38:00,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 11: [2023-05-25 13:38:00,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 26: [2023-05-25 13:38:00,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 11: [2023-05-25 13:38:00,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 26: [2023-05-25 13:38:00,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 26: [2023-05-25 13:38:00,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 
4: [2023-05-25 13:38:00,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 21: [2023-05-25 13:38:00,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 28: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 28: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 26: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 10: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 28: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 29: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt... 25: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 10: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 
21: [2023-05-25 13:38:00,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 4: [2023-05-25 13:38:00,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 23: [2023-05-25 13:38:00,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 18: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 28: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 30: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 23: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 30: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 30: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
26: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 0: [2023-05-25 13:38:00,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 25: [2023-05-25 13:38:00,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 18: [2023-05-25 13:38:00,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 17: [2023-05-25 13:38:00,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 1: [2023-05-25 13:38:00,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 1: [2023-05-25 13:38:00,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 30: [2023-05-25 13:38:00,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 22: [2023-05-25 13:38:00,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
6: [2023-05-25 13:38:00,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 6: [2023-05-25 13:38:00,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 19: [2023-05-25 13:38:00,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 24: [2023-05-25 13:38:00,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 23: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 23: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 24: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 0: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 21: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 18: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
18: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 29: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 17: [2023-05-25 13:38:00,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 31: [2023-05-25 13:38:00,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 17: [2023-05-25 13:38:00,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 5: [2023-05-25 13:38:00,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 27: [2023-05-25 13:38:00,751] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 5: [2023-05-25 13:38:00,751] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 22: [2023-05-25 13:38:00,751] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 19: [2023-05-25 13:38:00,751] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
5: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 22: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 5: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 30: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 12: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 12: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 12: [2023-05-25 13:38:00,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 14: [2023-05-25 13:38:00,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,753] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 12: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 
12: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 2: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 14: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 3: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 3: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 12: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 30: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 12: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 2: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 
21: [2023-05-25 13:38:00,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 17: [2023-05-25 13:38:00,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 31: [2023-05-25 13:38:00,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 10: [2023-05-25 13:38:00,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 16: [2023-05-25 13:38:00,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 10: [2023-05-25 13:38:00,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 31: [2023-05-25 13:38:00,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 25: [2023-05-25 13:38:00,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 25: [2023-05-25 13:38:00,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 12: [2023-05-25 13:38:00,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 
16: [2023-05-25 13:38:00,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 21: [2023-05-25 13:38:00,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 25: [2023-05-25 13:38:00,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 15: [2023-05-25 13:38:00,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 15: [2023-05-25 13:38:00,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 31: [2023-05-25 13:38:00,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 15: [2023-05-25 13:38:00,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 15: [2023-05-25 13:38:00,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 15: [2023-05-25 13:38:00,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 15: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
27: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 21: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 31: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 30: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 25: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 25: [2023-05-25 13:38:00,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 25: [2023-05-25 13:38:00,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 29: [2023-05-25 13:38:00,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 10: [2023-05-25 13:38:00,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
10: [2023-05-25 13:38:00,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,759] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 15: [2023-05-25 13:38:00,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 15: [2023-05-25 13:38:00,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 15: [2023-05-25 13:38:00,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 25: [2023-05-25 13:38:00,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 15: [2023-05-25 13:38:00,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 15: [2023-05-25 13:38:00,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 29: [2023-05-25 13:38:00,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 10: [2023-05-25 13:38:00,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
15: [2023-05-25 13:38:00,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 10: [2023-05-25 13:38:00,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 24: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 19: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 10: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 10: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 24: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 1: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 1: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 25: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 
0: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 0: [2023-05-25 13:38:00,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 29: [2023-05-25 13:38:00,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 21: [2023-05-25 13:38:00,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 17: [2023-05-25 13:38:00,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 17: [2023-05-25 13:38:00,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 16: [2023-05-25 13:38:00,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 19: [2023-05-25 13:38:00,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 19: [2023-05-25 13:38:00,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 11: [2023-05-25 13:38:00,765] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
17: [2023-05-25 13:38:00,765] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 27: [2023-05-25 13:38:00,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 3: [2023-05-25 13:38:00,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 3: [2023-05-25 13:38:00,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 3: [2023-05-25 13:38:00,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 3: [2023-05-25 13:38:00,767] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 9: [2023-05-25 13:38:00,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 29: [2023-05-25 13:38:00,767] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 29: [2023-05-25 13:38:00,767] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 19: [2023-05-25 13:38:00,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
7: [2023-05-25 13:38:00,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 11: [2023-05-25 13:38:00,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 7: [2023-05-25 13:38:00,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_03-model_states.pt. 29: [2023-05-25 13:38:00,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 16: [2023-05-25 13:38:00,769] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 16: [2023-05-25 13:38:00,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 5: [2023-05-25 13:38:00,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 5: [2023-05-25 13:38:00,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 9: [2023-05-25 13:38:00,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 2: [2023-05-25 13:38:00,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
16: [2023-05-25 13:38:00,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 27: [2023-05-25 13:38:00,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 17: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 27: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 27: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 3: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 2: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 27: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_00-model_states.pt. 16: [2023-05-25 13:38:00,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 16: [2023-05-25 13:38:00,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
20: [2023-05-25 13:38:00,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 20: [2023-05-25 13:38:00,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_03-model_states.pt. 29: [2023-05-25 13:38:00,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 29: [2023-05-25 13:38:00,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 20: [2023-05-25 13:38:00,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 29: [2023-05-25 13:38:00,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 3: [2023-05-25 13:38:00,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 2: [2023-05-25 13:38:00,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 22: [2023-05-25 13:38:00,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 20: [2023-05-25 13:38:00,776] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
0: [2023-05-25 13:38:00,776] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 23: [2023-05-25 13:38:00,776] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 27: [2023-05-25 13:38:00,776] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt... 27: [2023-05-25 13:38:00,776] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 27: [2023-05-25 13:38:00,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt... 2: [2023-05-25 13:38:00,776] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 0: [2023-05-25 13:38:00,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 19: [2023-05-25 13:38:00,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 18: [2023-05-25 13:38:00,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 
19: [2023-05-25 13:38:00,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 11: [2023-05-25 13:38:00,779] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 18: [2023-05-25 13:38:00,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 23: [2023-05-25 13:38:00,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 18: [2023-05-25 13:38:00,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 18: [2023-05-25 13:38:00,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 3: [2023-05-25 13:38:00,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 0: [2023-05-25 13:38:00,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 
0: [2023-05-25 13:38:00,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 0: [2023-05-25 13:38:00,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 0: [2023-05-25 13:38:00,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 12: [2023-05-25 13:38:00,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 3: [2023-05-25 13:38:00,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 22: [2023-05-25 13:38:00,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 10: [2023-05-25 13:38:00,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 0: [2023-05-25 13:38:00,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 11: [2023-05-25 13:38:00,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 3: [2023-05-25 13:38:00,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
3: [2023-05-25 13:38:00,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 21: [2023-05-25 13:38:00,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 22: [2023-05-25 13:38:00,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 10: [2023-05-25 13:38:00,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 17: [2023-05-25 13:38:00,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 0: [2023-05-25 13:38:00,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 10: [2023-05-25 13:38:00,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 0: [2023-05-25 13:38:00,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 7: [2023-05-25 13:38:00,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 0: [2023-05-25 13:38:00,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
10: [2023-05-25 13:38:00,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 21: [2023-05-25 13:38:00,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 0: [2023-05-25 13:38:00,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 2: [2023-05-25 13:38:00,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 17: [2023-05-25 13:38:00,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 5: [2023-05-25 13:38:00,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 5: [2023-05-25 13:38:00,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 7: [2023-05-25 13:38:00,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 12: [2023-05-25 13:38:00,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 0: [2023-05-25 13:38:00,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 
15: [2023-05-25 13:38:00,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 21: [2023-05-25 13:38:00,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 15: [2023-05-25 13:38:00,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 21: [2023-05-25 13:38:00,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 12: [2023-05-25 13:38:00,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 12: [2023-05-25 13:38:00,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_01-model_states.pt. 2: [2023-05-25 13:38:00,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 20: [2023-05-25 13:38:00,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 20: [2023-05-25 13:38:00,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
12: [2023-05-25 13:38:00,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 8: [2023-05-25 13:38:00,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 17: [2023-05-25 13:38:00,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 8: [2023-05-25 13:38:00,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 0: [2023-05-25 13:38:00,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 
8: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 17: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 8: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 20: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 8: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 8: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 4: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 4: [2023-05-25 13:38:00,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 8: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 8: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
8: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 13: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 20: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 8: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 13: [2023-05-25 13:38:00,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 0: [2023-05-25 13:38:00,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 
13: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 13: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 13: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 14: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 14: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 14: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 12: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 14: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 14: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
14: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 4: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 4: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 4: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 4: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 4: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 13: [2023-05-25 13:38:00,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 4: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 13: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 13: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 
13: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 9: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 9: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 9: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 9: [2023-05-25 13:38:00,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 14: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 14: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 
14: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 15: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 14: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 14: [2023-05-25 13:38:00,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 14: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 19: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 9: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 4: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 4: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 4: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 
12: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 12: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 4: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 4: [2023-05-25 13:38:00,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 2: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 2: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 6: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 4: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 9: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 
9: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 6: [2023-05-25 13:38:00,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 14: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 15: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 14: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 6: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 6: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 11: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 11: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 9: [2023-05-25 13:38:00,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 
9: [2023-05-25 13:38:00,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 19: [2023-05-25 13:38:00,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 9: [2023-05-25 13:38:00,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt... 10: [2023-05-25 13:38:00,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 19: [2023-05-25 13:38:00,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 19: [2023-05-25 13:38:00,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 10: [2023-05-25 13:38:00,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 10: [2023-05-25 13:38:00,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 15: [2023-05-25 13:38:00,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 
15: [2023-05-25 13:38:00,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 0: [2023-05-25 13:38:00,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 8: [2023-05-25 13:38:00,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 4: [2023-05-25 13:38:00,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 8: [2023-05-25 13:38:00,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 4: [2023-05-25 13:38:00,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 7: [2023-05-25 13:38:00,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 7: [2023-05-25 13:38:00,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_02-model_states.pt. 20: [2023-05-25 13:38:00,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 13: [2023-05-25 13:38:00,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
13: [2023-05-25 13:38:00,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 20: [2023-05-25 13:38:00,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 0: [2023-05-25 13:38:00,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 0: [2023-05-25 13:38:00,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 6: [2023-05-25 13:38:00,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 11: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 0: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 0: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 14: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 2: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
11: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 9: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 6: [2023-05-25 13:38:00,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 6: [2023-05-25 13:38:00,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 14: [2023-05-25 13:38:00,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 2: [2023-05-25 13:38:00,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 10: [2023-05-25 13:38:00,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 10: [2023-05-25 13:38:00,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 6: [2023-05-25 13:38:00,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 9: [2023-05-25 13:38:00,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
0: [2023-05-25 13:38:00,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 15: [2023-05-25 13:38:00,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 7: [2023-05-25 13:38:00,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 0: [2023-05-25 13:38:00,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 15: [2023-05-25 13:38:00,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 7: [2023-05-25 13:38:00,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_04-model_01-model_states.pt. 20: [2023-05-25 13:38:00,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 7: [2023-05-25 13:38:00,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 20: [2023-05-25 13:38:00,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt... 
7: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 6: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 13: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 6: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 6: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 6: [2023-05-25 13:38:00,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 7: [2023-05-25 13:38:00,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
7: [2023-05-25 13:38:00,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 6: [2023-05-25 13:38:00,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 6: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 2: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 2: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 2: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 2: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 14: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 2: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 2: [2023-05-25 13:38:00,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 
6: [2023-05-25 13:38:00,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 6: [2023-05-25 13:38:00,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 7: [2023-05-25 13:38:00,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 7: [2023-05-25 13:38:00,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 2: [2023-05-25 13:38:00,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 2: [2023-05-25 13:38:00,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 2: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 2: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 2: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 2: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 
12: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 12: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 19: [2023-05-25 13:38:00,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 19: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 21: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 8: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 
1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 21: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 14: [2023-05-25 13:38:00,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 12: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 
12: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt... 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 3: [2023-05-25 13:38:00,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 8: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
1: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 1: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 1: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 5: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 5: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 5: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 5: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 5: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 5: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
5: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 5: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 1: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 4: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 4: [2023-05-25 13:38:00,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 18: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 4: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 18: [2023-05-25 13:38:00,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 
0: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 13: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 3: [2023-05-25 13:38:00,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
4: [2023-05-25 13:38:00,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 5: [2023-05-25 13:38:00,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 5: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt... 7: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 
13: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 14: [2023-05-25 13:38:00,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 8: [2023-05-25 13:38:00,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 7: [2023-05-25 13:38:00,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt... 13: [2023-05-25 13:38:00,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 9: [2023-05-25 13:38:00,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 9: [2023-05-25 13:38:00,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 8: [2023-05-25 13:38:00,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 
11: [2023-05-25 13:38:00,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 19: [2023-05-25 13:38:00,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 21: [2023-05-25 13:38:00,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 21: [2023-05-25 13:38:00,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 8: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 12: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 13: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 8: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 23: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 23: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 
6: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 14: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 4: [2023-05-25 13:38:00,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 19: [2023-05-25 13:38:00,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 14: [2023-05-25 13:38:00,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 11: [2023-05-25 13:38:00,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 12: [2023-05-25 13:38:00,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_16-model_03-model_states.pt. 10: [2023-05-25 13:38:00,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 13: [2023-05-25 13:38:00,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
6: [2023-05-25 13:38:00,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 17: [2023-05-25 13:38:00,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 14: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 17: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 6: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 2: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 2: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 18: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 4: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 14: [2023-05-25 13:38:00,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
6: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 4: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 9: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 10: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 10: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 4: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 18: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 15: [2023-05-25 13:38:00,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 6: [2023-05-25 13:38:00,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 11: [2023-05-25 13:38:00,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 
14: [2023-05-25 13:38:00,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 6: [2023-05-25 13:38:00,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 15: [2023-05-25 13:38:00,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 9: [2023-05-25 13:38:00,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 11: [2023-05-25 13:38:00,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 2: [2023-05-25 13:38:00,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 2: [2023-05-25 13:38:00,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 16: [2023-05-25 13:38:00,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 16: [2023-05-25 13:38:00,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 15: [2023-05-25 13:38:00,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 
15: [2023-05-25 13:38:00,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 9: [2023-05-25 13:38:00,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 6: [2023-05-25 13:38:00,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 9: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 6: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 9: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 9: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 22: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 6: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 12: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 
22: [2023-05-25 13:38:00,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 6: [2023-05-25 13:38:00,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 23: [2023-05-25 13:38:00,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 23: [2023-05-25 13:38:00,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 2: [2023-05-25 13:38:00,866] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 2: [2023-05-25 13:38:00,866] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 12: [2023-05-25 13:38:00,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt... 17: [2023-05-25 13:38:00,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 17: [2023-05-25 13:38:00,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 
7: [2023-05-25 13:38:00,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 3: [2023-05-25 13:38:00,871] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 1: [2023-05-25 13:38:00,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 16: [2023-05-25 13:38:00,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 16: [2023-05-25 13:38:00,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 7: [2023-05-25 13:38:00,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 7: [2023-05-25 13:38:00,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt... 5: [2023-05-25 13:38:00,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 
3: [2023-05-25 13:38:00,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 6: [2023-05-25 13:38:00,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 1: [2023-05-25 13:38:00,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 19: [2023-05-25 13:38:00,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 22: [2023-05-25 13:38:00,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 22: [2023-05-25 13:38:00,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 7: [2023-05-25 13:38:00,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 7: [2023-05-25 13:38:00,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 6: [2023-05-25 13:38:00,879] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 19: [2023-05-25 13:38:00,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 
21: [2023-05-25 13:38:00,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 21: [2023-05-25 13:38:00,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 2: [2023-05-25 13:38:00,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 2: [2023-05-25 13:38:00,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 19: [2023-05-25 13:38:00,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 5: [2023-05-25 13:38:00,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_00-model_states.pt. 18: [2023-05-25 13:38:00,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 7: [2023-05-25 13:38:00,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 7: [2023-05-25 13:38:00,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt... 21: [2023-05-25 13:38:00,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 
21: [2023-05-25 13:38:00,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 19: [2023-05-25 13:38:00,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 18: [2023-05-25 13:38:00,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 7: [2023-05-25 13:38:00,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 1: [2023-05-25 13:38:00,885] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 18: [2023-05-25 13:38:00,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 18: [2023-05-25 13:38:00,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 3: [2023-05-25 13:38:00,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 1: [2023-05-25 13:38:00,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 7: [2023-05-25 13:38:00,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
3: [2023-05-25 13:38:00,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 12: [2023-05-25 13:38:00,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 23: [2023-05-25 13:38:00,893] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 5: [2023-05-25 13:38:00,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 12: [2023-05-25 13:38:00,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 23: [2023-05-25 13:38:00,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 12: [2023-05-25 13:38:00,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_00-model_states.pt. 17: [2023-05-25 13:38:00,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 17: [2023-05-25 13:38:00,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 20: [2023-05-25 13:38:00,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 
20: [2023-05-25 13:38:00,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_27-model_01-model_states.pt. 23: [2023-05-25 13:38:00,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 23: [2023-05-25 13:38:00,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 5: [2023-05-25 13:38:00,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 12: [2023-05-25 13:38:00,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt... 16: [2023-05-25 13:38:00,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 16: [2023-05-25 13:38:00,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 17: [2023-05-25 13:38:00,903] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 17: [2023-05-25 13:38:00,903] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 16: [2023-05-25 13:38:00,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 
16: [2023-05-25 13:38:00,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 22: [2023-05-25 13:38:00,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 22: [2023-05-25 13:38:00,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 18: [2023-05-25 13:38:00,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 18: [2023-05-25 13:38:00,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 22: [2023-05-25 13:38:00,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 22: [2023-05-25 13:38:00,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 20: [2023-05-25 13:38:00,916] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 20: [2023-05-25 13:38:00,921] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt... 18: [2023-05-25 13:38:00,923] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
18: [2023-05-25 13:38:00,927] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 20: [2023-05-25 13:38:00,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 20: [2023-05-25 13:38:00,947] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 22: [2023-05-25 13:38:00,947] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 22: [2023-05-25 13:38:00,947] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 21: [2023-05-25 13:38:00,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 21: [2023-05-25 13:38:00,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 20: [2023-05-25 13:38:00,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_00-model_states.pt. 23: [2023-05-25 13:38:00,952] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 23: [2023-05-25 13:38:00,952] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 
20: [2023-05-25 13:38:00,953] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt... 22: [2023-05-25 13:38:00,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 22: [2023-05-25 13:38:00,961] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 21: [2023-05-25 13:38:00,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 21: [2023-05-25 13:38:00,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 23: [2023-05-25 13:38:00,965] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 23: [2023-05-25 13:38:00,966] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:00,969] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 28: [2023-05-25 13:38:00,969] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 28: [2023-05-25 13:38:00,982] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
28: [2023-05-25 13:38:00,982] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 19: [2023-05-25 13:38:01,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 19: [2023-05-25 13:38:01,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 31: [2023-05-25 13:38:01,002] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 31: [2023-05-25 13:38:01,002] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 31: [2023-05-25 13:38:01,007] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 31: [2023-05-25 13:38:01,007] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 25: [2023-05-25 13:38:01,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 25: [2023-05-25 13:38:01,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 17: [2023-05-25 13:38:01,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 
17: [2023-05-25 13:38:01,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 19: [2023-05-25 13:38:01,013] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 19: [2023-05-25 13:38:01,014] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 24: [2023-05-25 13:38:01,015] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 24: [2023-05-25 13:38:01,015] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 31: [2023-05-25 13:38:01,016] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 10: [2023-05-25 13:38:01,016] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 10: [2023-05-25 13:38:01,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 25: [2023-05-25 13:38:01,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 25: [2023-05-25 13:38:01,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 
10: [2023-05-25 13:38:01,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 31: [2023-05-25 13:38:01,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 10: [2023-05-25 13:38:01,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 25: [2023-05-25 13:38:01,019] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 25: [2023-05-25 13:38:01,020] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 16: [2023-05-25 13:38:01,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 31: [2023-05-25 13:38:01,022] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 27: [2023-05-25 13:38:01,022] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 27: [2023-05-25 13:38:01,023] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 25: [2023-05-25 13:38:01,023] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
25: [2023-05-25 13:38:01,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 16: [2023-05-25 13:38:01,024] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 31: [2023-05-25 13:38:01,025] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 17: [2023-05-25 13:38:01,025] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:01,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 28: [2023-05-25 13:38:01,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 17: [2023-05-25 13:38:01,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 29: [2023-05-25 13:38:01,028] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 24: [2023-05-25 13:38:01,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 29: [2023-05-25 13:38:01,028] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 
24: [2023-05-25 13:38:01,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 24: [2023-05-25 13:38:01,029] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 24: [2023-05-25 13:38:01,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 30: [2023-05-25 13:38:01,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 30: [2023-05-25 13:38:01,032] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 26: [2023-05-25 13:38:01,032] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 26: [2023-05-25 13:38:01,032] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_02-model_states.pt. 25: [2023-05-25 13:38:01,032] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 16: [2023-05-25 13:38:01,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 25: [2023-05-25 13:38:01,034] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
25: [2023-05-25 13:38:01,034] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 15: [2023-05-25 13:38:01,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 25: [2023-05-25 13:38:01,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 15: [2023-05-25 13:38:01,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 15: [2023-05-25 13:38:01,037] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 28: [2023-05-25 13:38:01,037] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 11: [2023-05-25 13:38:01,037] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 15: [2023-05-25 13:38:01,037] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 28: [2023-05-25 13:38:01,038] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 11: [2023-05-25 13:38:01,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
16: [2023-05-25 13:38:01,038] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:01,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 29: [2023-05-25 13:38:01,039] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 11: [2023-05-25 13:38:01,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 29: [2023-05-25 13:38:01,040] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 11: [2023-05-25 13:38:01,040] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 2: [2023-05-25 13:38:01,040] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 2: [2023-05-25 13:38:01,040] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 26: [2023-05-25 13:38:01,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 24: [2023-05-25 13:38:01,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
30: [2023-05-25 13:38:01,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 27: [2023-05-25 13:38:01,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 9: [2023-05-25 13:38:01,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 9: [2023-05-25 13:38:01,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 26: [2023-05-25 13:38:01,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 12: [2023-05-25 13:38:01,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 11: [2023-05-25 13:38:01,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 11: [2023-05-25 13:38:01,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 12: [2023-05-25 13:38:01,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 27: [2023-05-25 13:38:01,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
30: [2023-05-25 13:38:01,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 24: [2023-05-25 13:38:01,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 30: [2023-05-25 13:38:01,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 12: [2023-05-25 13:38:01,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 12: [2023-05-25 13:38:01,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 28: [2023-05-25 13:38:01,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 30: [2023-05-25 13:38:01,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 29: [2023-05-25 13:38:01,046] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 26: [2023-05-25 13:38:01,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 28: [2023-05-25 13:38:01,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
26: [2023-05-25 13:38:01,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 29: [2023-05-25 13:38:01,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 20: [2023-05-25 13:38:01,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 24: [2023-05-25 13:38:01,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 24: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 2: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 2: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 29: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 9: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 20: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_02-model_states.pt. 
10: [2023-05-25 13:38:01,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 10: [2023-05-25 13:38:01,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 11: [2023-05-25 13:38:01,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 11: [2023-05-25 13:38:01,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 26: [2023-05-25 13:38:01,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 26: [2023-05-25 13:38:01,059] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 4: [2023-05-25 13:38:01,059] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 4: [2023-05-25 13:38:01,059] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 30: [2023-05-25 13:38:01,059] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 
30: [2023-05-25 13:38:01,060] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 30: [2023-05-25 13:38:01,060] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 28: [2023-05-25 13:38:01,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 9: [2023-05-25 13:38:01,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 9: [2023-05-25 13:38:01,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 30: [2023-05-25 13:38:01,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 29: [2023-05-25 13:38:01,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 27: [2023-05-25 13:38:01,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 27: [2023-05-25 13:38:01,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 31: [2023-05-25 13:38:01,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 
28: [2023-05-25 13:38:01,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 28: [2023-05-25 13:38:01,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 28: [2023-05-25 13:38:01,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 28: [2023-05-25 13:38:01,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 10: [2023-05-25 13:38:01,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 30: [2023-05-25 13:38:01,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 20: [2023-05-25 13:38:01,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 30: [2023-05-25 13:38:01,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 10: [2023-05-25 13:38:01,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
28: [2023-05-25 13:38:01,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 28: [2023-05-25 13:38:01,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 15: [2023-05-25 13:38:01,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 15: [2023-05-25 13:38:01,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 28: [2023-05-25 13:38:01,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 28: [2023-05-25 13:38:01,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 28: [2023-05-25 13:38:01,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 30: [2023-05-25 13:38:01,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 24: [2023-05-25 13:38:01,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 27: [2023-05-25 13:38:01,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 
27: [2023-05-25 13:38:01,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_01-model_states.pt. 24: [2023-05-25 13:38:01,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 4: [2023-05-25 13:38:01,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 4: [2023-05-25 13:38:01,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 10: [2023-05-25 13:38:01,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 0: [2023-05-25 13:38:01,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 28: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 4: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 0: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 27: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
29: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 30: [2023-05-25 13:38:01,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 4: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 29: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 27: [2023-05-25 13:38:01,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 10: [2023-05-25 13:38:01,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 20: [2023-05-25 13:38:01,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 30: [2023-05-25 13:38:01,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 23: [2023-05-25 13:38:01,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 23: [2023-05-25 13:38:01,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 
23: [2023-05-25 13:38:01,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 9: [2023-05-25 13:38:01,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 9: [2023-05-25 13:38:01,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 23: [2023-05-25 13:38:01,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 27: [2023-05-25 13:38:01,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 23: [2023-05-25 13:38:01,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:01,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 27: [2023-05-25 13:38:01,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 23: [2023-05-25 13:38:01,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 0: [2023-05-25 13:38:01,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 
31: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 26: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 26: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_39-model_03-model_states.pt. 28: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 1: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 0: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 23: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 23: [2023-05-25 13:38:01,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 31: [2023-05-25 13:38:01,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 1: [2023-05-25 13:38:01,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 
12: [2023-05-25 13:38:01,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 14: [2023-05-25 13:38:01,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 25: [2023-05-25 13:38:01,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 14: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 27: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 7: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 25: [2023-05-25 13:38:01,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 13: [2023-05-25 13:38:01,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 7: [2023-05-25 13:38:01,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 3: [2023-05-25 13:38:01,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 3: [2023-05-25 13:38:01,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 13: [2023-05-25 13:38:01,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 14: [2023-05-25 13:38:01,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 25: [2023-05-25 13:38:01,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
30: [2023-05-25 13:38:01,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 14: [2023-05-25 13:38:01,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 27: [2023-05-25 13:38:01,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 25: [2023-05-25 13:38:01,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 25: [2023-05-25 13:38:01,083] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 25: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 25: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 25: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 25: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 25: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 
15: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 15: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 28: [2023-05-25 13:38:01,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 4: [2023-05-25 13:38:01,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 6: [2023-05-25 13:38:01,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 12: [2023-05-25 13:38:01,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 10: [2023-05-25 13:38:01,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 6: [2023-05-25 13:38:01,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 15: [2023-05-25 13:38:01,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 4: [2023-05-25 13:38:01,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
28: [2023-05-25 13:38:01,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 1: [2023-05-25 13:38:01,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 1: [2023-05-25 13:38:01,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 10: [2023-05-25 13:38:01,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 27: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 11: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 11: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 7: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 27: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 7: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 
5: [2023-05-25 13:38:01,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 5: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 5: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 5: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 27: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 0: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 31: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 11: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
24: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 24: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 24: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 24: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 24: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 24: [2023-05-25 13:38:01,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 2: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 31: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
2: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_02-model_states.pt. 10: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 8: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 0: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 8: [2023-05-25 13:38:01,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 10: [2023-05-25 13:38:01,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 11: [2023-05-25 13:38:01,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 29: [2023-05-25 13:38:01,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 3: [2023-05-25 13:38:01,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 29: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 
3: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 27: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 15: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 24: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 30: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 27: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 27: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 13: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 13: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 24: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 
24: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 24: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 10: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 10: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 24: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 12: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 13: [2023-05-25 13:38:01,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 11: [2023-05-25 13:38:01,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 11: [2023-05-25 13:38:01,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 30: [2023-05-25 13:38:01,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 
24: [2023-05-25 13:38:01,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 1: [2023-05-25 13:38:01,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 13: [2023-05-25 13:38:01,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 1: [2023-05-25 13:38:01,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 13: [2023-05-25 13:38:01,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 26: [2023-05-25 13:38:01,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 0: [2023-05-25 13:38:01,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 6: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 6: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 23: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 
23: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 12: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 13: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 19: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 17: [2023-05-25 13:38:01,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 6: [2023-05-25 13:38:01,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 17: [2023-05-25 13:38:01,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 28: [2023-05-25 13:38:01,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 31: [2023-05-25 13:38:01,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 
8: [2023-05-25 13:38:01,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 31: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 8: [2023-05-25 13:38:01,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 31: [2023-05-25 13:38:01,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 8: [2023-05-25 13:38:01,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 19: [2023-05-25 13:38:01,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 31: [2023-05-25 13:38:01,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 8: [2023-05-25 13:38:01,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 31: [2023-05-25 13:38:01,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 4: [2023-05-25 13:38:01,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
4: [2023-05-25 13:38:01,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 0: [2023-05-25 13:38:01,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 7: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 28: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 6: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 28: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 15: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 30: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 7: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 1: [2023-05-25 13:38:01,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
20: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 28: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 20: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 11: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 3: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 3: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 8: [2023-05-25 13:38:01,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 11: [2023-05-25 13:38:01,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 8: [2023-05-25 13:38:01,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 4: [2023-05-25 13:38:01,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
4: [2023-05-25 13:38:01,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 2: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 22: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 22: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 22: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 22: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 26: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 26: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 26: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 26: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
26: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 26: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 30: [2023-05-25 13:38:01,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 2: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 22: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 22: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 24: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 24: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 1: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 4: [2023-05-25 13:38:01,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 
4: [2023-05-25 13:38:01,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 5: [2023-05-25 13:38:01,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 10: [2023-05-25 13:38:01,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 22: [2023-05-25 13:38:01,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 10: [2023-05-25 13:38:01,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 22: [2023-05-25 13:38:01,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 22: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 7: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 22: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 10: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
10: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 26: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 30: [2023-05-25 13:38:01,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 18: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 18: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 26: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 26: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 13: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 5: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 7: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
15: [2023-05-25 13:38:01,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 5: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 24: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 24: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 26: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 26: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 13: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 26: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 8: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 8: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
5: [2023-05-25 13:38:01,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 3: [2023-05-25 13:38:01,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 26: [2023-05-25 13:38:01,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 12: [2023-05-25 13:38:01,109] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 3: [2023-05-25 13:38:01,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 21: [2023-05-25 13:38:01,109] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 21: [2023-05-25 13:38:01,109] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 30: [2023-05-25 13:38:01,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 30: [2023-05-25 13:38:01,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 19: [2023-05-25 13:38:01,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
12: [2023-05-25 13:38:01,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 31: [2023-05-25 13:38:01,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 23: [2023-05-25 13:38:01,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 23: [2023-05-25 13:38:01,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 21: [2023-05-25 13:38:01,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 21: [2023-05-25 13:38:01,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 30: [2023-05-25 13:38:01,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 6: [2023-05-25 13:38:01,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 27: [2023-05-25 13:38:01,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 6: [2023-05-25 13:38:01,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
16: [2023-05-25 13:38:01,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 16: [2023-05-25 13:38:01,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_03-model_states.pt. 6: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 6: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 23: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 21: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 19: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 31: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 8: [2023-05-25 13:38:01,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
17: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 17: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 21: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 30: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 25: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 23: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 12: [2023-05-25 13:38:01,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 31: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 16: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 16: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 
16: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 12: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 6: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 16: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 6: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 20: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 4: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 28: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 21: [2023-05-25 13:38:01,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 31: [2023-05-25 13:38:01,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 
8: [2023-05-25 13:38:01,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 30: [2023-05-25 13:38:01,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 25: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 11: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 11: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 16: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 5: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 16: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 21: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 16: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 
21: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 5: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 16: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 15: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 4: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 4: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 15: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 4: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 4: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 15: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
14: [2023-05-25 13:38:01,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 22: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 4: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 20: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 4: [2023-05-25 13:38:01,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 22: [2023-05-25 13:38:01,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 0: [2023-05-25 13:38:01,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 4: [2023-05-25 13:38:01,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 5: [2023-05-25 13:38:01,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 0: [2023-05-25 13:38:01,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
30: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 4: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 18: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 4: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 15: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 18: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 5: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 14: [2023-05-25 13:38:01,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 21: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 14: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 
27: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 14: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 7: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 7: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 0: [2023-05-25 13:38:01,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 18: [2023-05-25 13:38:01,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 12: [2023-05-25 13:38:01,122] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 13: [2023-05-25 13:38:01,122] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 18: [2023-05-25 13:38:01,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 30: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 
21: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 14: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 11: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 14: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 21: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 15: [2023-05-25 13:38:01,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 13: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 8: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 11: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 8: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 
27: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 0: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 7: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 7: [2023-05-25 13:38:01,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 23: [2023-05-25 13:38:01,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 27: [2023-05-25 13:38:01,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 26: [2023-05-25 13:38:01,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 31: [2023-05-25 13:38:01,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 30: [2023-05-25 13:38:01,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 15: [2023-05-25 13:38:01,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 
16: [2023-05-25 13:38:01,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 14: [2023-05-25 13:38:01,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 12: [2023-05-25 13:38:01,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 14: [2023-05-25 13:38:01,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_01-model_states.pt. 31: [2023-05-25 13:38:01,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 23: [2023-05-25 13:38:01,127] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 25: [2023-05-25 13:38:01,127] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 27: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 26: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 13: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 
10: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 13: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_02-model_states.pt. 15: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 27: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 12: [2023-05-25 13:38:01,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 15: [2023-05-25 13:38:01,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 12: [2023-05-25 13:38:01,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 16: [2023-05-25 13:38:01,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 18: [2023-05-25 13:38:01,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 24: [2023-05-25 13:38:01,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
18: [2023-05-25 13:38:01,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 18: [2023-05-25 13:38:01,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 18: [2023-05-25 13:38:01,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 10: [2023-05-25 13:38:01,130] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 27: [2023-05-25 13:38:01,130] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 19: [2023-05-25 13:38:01,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 11: [2023-05-25 13:38:01,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 19: [2023-05-25 13:38:01,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 24: [2023-05-25 13:38:01,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 30: [2023-05-25 13:38:01,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
27: [2023-05-25 13:38:01,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 25: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 18: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 18: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 19: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 19: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 10: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 15: [2023-05-25 13:38:01,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 18: [2023-05-25 13:38:01,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 11: [2023-05-25 13:38:01,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
18: [2023-05-25 13:38:01,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 14: [2023-05-25 13:38:01,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 30: [2023-05-25 13:38:01,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 10: [2023-05-25 13:38:01,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 19: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 19: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 6: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 27: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 6: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 2: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 
5: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 19: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 2: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 5: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 19: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 15: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 10: [2023-05-25 13:38:01,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 26: [2023-05-25 13:38:01,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 27: [2023-05-25 13:38:01,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 14: [2023-05-25 13:38:01,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
18: [2023-05-25 13:38:01,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 10: [2023-05-25 13:38:01,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 13: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 4: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 20: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 20: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 20: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 20: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 18: [2023-05-25 13:38:01,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 14: [2023-05-25 13:38:01,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
11: [2023-05-25 13:38:01,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 27: [2023-05-25 13:38:01,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 14: [2023-05-25 13:38:01,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 26: [2023-05-25 13:38:01,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 4: [2023-05-25 13:38:01,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 0: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 20: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 20: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
13: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 30: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 31: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 20: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 20: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 0: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 0: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 0: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 10: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 22: [2023-05-25 13:38:01,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 6: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 19: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 7: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 10: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 10: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 11: [2023-05-25 13:38:01,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 31: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 7: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_01-model_states.pt. 6: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
10: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 14: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 14: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 3: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 27: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 8: [2023-05-25 13:38:01,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 26: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 19: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 15: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 7: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 13: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 8: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 7: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 7: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 6: [2023-05-25 13:38:01,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 7: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 0: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 10: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 24: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 
10: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 0: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 6: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 17: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 17: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 17: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 17: [2023-05-25 13:38:01,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 24: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 22: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 3: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_05-model_03-model_states.pt. 
23: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 6: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 0: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 17: [2023-05-25 13:38:01,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 15: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 13: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 7: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 17: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 26: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 14: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 
14: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 7: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 7: [2023-05-25 13:38:01,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 7: [2023-05-25 13:38:01,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 13: [2023-05-25 13:38:01,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 17: [2023-05-25 13:38:01,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 17: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt... 17: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 17: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 6: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 
23: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 2: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 0: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 1: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 17: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 17: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 4: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 2: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
5: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 5: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 5: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 5: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 1: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 4: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 29: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 29: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
13: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 29: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 29: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 13: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 6: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 20: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 23: [2023-05-25 13:38:01,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 9: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 29: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
29: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 20: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 29: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 9: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 29: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 4: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 1: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 9: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
6: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 6: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 1: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 4: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 1: [2023-05-25 13:38:01,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 4: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 1: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 6: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 21: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 
1: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 5: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 23: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 5: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 1: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 5: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 0: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 6: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 16: [2023-05-25 13:38:01,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 5: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 
19: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 4: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 1: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 9: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 9: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 22: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 29: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 6: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 13: [2023-05-25 13:38:01,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 9: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 
9: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 20: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 20: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 8: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 8: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 8: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 29: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 9: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 29: [2023-05-25 13:38:01,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt... 
23: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 9: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 9: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 29: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 5: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 21: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 21: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 19: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 23: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 29: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 
29: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt... 29: [2023-05-25 13:38:01,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt... 29: [2023-05-25 13:38:01,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt... 22: [2023-05-25 13:38:01,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 22: [2023-05-25 13:38:01,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 22: [2023-05-25 13:38:01,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 21: [2023-05-25 13:38:01,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 2: [2023-05-25 13:38:01,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 2: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 2: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
2: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 16: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 2: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 2: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 0: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 5: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 5: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 8: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 7: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 18: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 
0: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 13: [2023-05-25 13:38:01,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 8: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 8: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 0: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 26: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 3: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 5: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 13: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 7: [2023-05-25 13:38:01,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 
2: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 8: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 8: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 12: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 2: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 13: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 21: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 2: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 7: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 21: [2023-05-25 13:38:01,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 
8: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 18: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 2: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 11: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 18: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 7: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 3: [2023-05-25 13:38:01,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 2: [2023-05-25 13:38:01,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 2: [2023-05-25 13:38:01,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 12: [2023-05-25 13:38:01,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 0: [2023-05-25 13:38:01,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 12: [2023-05-25 13:38:01,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 16: [2023-05-25 13:38:01,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 8: [2023-05-25 13:38:01,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 
11: [2023-05-25 13:38:01,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 12: [2023-05-25 13:38:01,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 18: [2023-05-25 13:38:01,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 3: [2023-05-25 13:38:01,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 12: [2023-05-25 13:38:01,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 3: [2023-05-25 13:38:01,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt... 3: [2023-05-25 13:38:01,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 3: [2023-05-25 13:38:01,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 3: [2023-05-25 13:38:01,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 3: [2023-05-25 13:38:01,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt... 
12: [2023-05-25 13:38:01,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 6: [2023-05-25 13:38:01,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 21: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 16: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 11: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 0: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 19: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 18: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 12: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 19: [2023-05-25 13:38:01,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 
26: [2023-05-25 13:38:01,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 12: [2023-05-25 13:38:01,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 12: [2023-05-25 13:38:01,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 12: [2023-05-25 13:38:01,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_17-model_03-model_states.pt. 23: [2023-05-25 13:38:01,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 19: [2023-05-25 13:38:01,167] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 14: [2023-05-25 13:38:01,167] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 11: [2023-05-25 13:38:01,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 6: [2023-05-25 13:38:01,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 14: [2023-05-25 13:38:01,167] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
21: [2023-05-25 13:38:01,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 19: [2023-05-25 13:38:01,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 16: [2023-05-25 13:38:01,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 23: [2023-05-25 13:38:01,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 11: [2023-05-25 13:38:01,169] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 9: [2023-05-25 13:38:01,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 18: [2023-05-25 13:38:01,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 14: [2023-05-25 13:38:01,170] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 14: [2023-05-25 13:38:01,170] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 
14: [2023-05-25 13:38:01,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 13: [2023-05-25 13:38:01,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 18: [2023-05-25 13:38:01,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 20: [2023-05-25 13:38:01,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 16: [2023-05-25 13:38:01,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 5: [2023-05-25 13:38:01,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 15: [2023-05-25 13:38:01,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 18: [2023-05-25 13:38:01,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 5: [2023-05-25 13:38:01,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 0: [2023-05-25 13:38:01,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
18: [2023-05-25 13:38:01,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 11: [2023-05-25 13:38:01,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 14: [2023-05-25 13:38:01,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 11: [2023-05-25 13:38:01,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 18: [2023-05-25 13:38:01,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 15: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 13: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 20: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 13: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 8: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
8: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 11: [2023-05-25 13:38:01,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 14: [2023-05-25 13:38:01,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 7: [2023-05-25 13:38:01,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 6: [2023-05-25 13:38:01,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 15: [2023-05-25 13:38:01,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 7: [2023-05-25 13:38:01,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 15: [2023-05-25 13:38:01,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 15: [2023-05-25 13:38:01,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 14: [2023-05-25 13:38:01,177] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
14: [2023-05-25 13:38:01,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 18: [2023-05-25 13:38:01,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 21: [2023-05-25 13:38:01,178] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 2: [2023-05-25 13:38:01,178] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 21: [2023-05-25 13:38:01,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 15: [2023-05-25 13:38:01,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 1: [2023-05-25 13:38:01,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 6: [2023-05-25 13:38:01,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 16: [2023-05-25 13:38:01,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 16: [2023-05-25 13:38:01,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt... 
19: [2023-05-25 13:38:01,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 2: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 17: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 12: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 14: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 8: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 8: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 17: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 1: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 5: [2023-05-25 13:38:01,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 
17: [2023-05-25 13:38:01,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 17: [2023-05-25 13:38:01,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 14: [2023-05-25 13:38:01,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 15: [2023-05-25 13:38:01,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 14: [2023-05-25 13:38:01,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 19: [2023-05-25 13:38:01,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 8: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 8: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt... 12: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt... 19: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
19: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 2: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 13: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 15: [2023-05-25 13:38:01,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 2: [2023-05-25 13:38:01,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 5: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 6: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 3: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 7: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 18: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 
1: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 1: [2023-05-25 13:38:01,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 20: [2023-05-25 13:38:01,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 13: [2023-05-25 13:38:01,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 0: [2023-05-25 13:38:01,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 6: [2023-05-25 13:38:01,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 0: [2023-05-25 13:38:01,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 3: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 20: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 16: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 
7: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 0: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 16: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 13: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 1: [2023-05-25 13:38:01,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 22: [2023-05-25 13:38:01,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 3: [2023-05-25 13:38:01,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 22: [2023-05-25 13:38:01,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 29: [2023-05-25 13:38:01,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 29: [2023-05-25 13:38:01,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 5: [2023-05-25 13:38:01,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 13: [2023-05-25 13:38:01,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt... 1: [2023-05-25 13:38:01,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 3: [2023-05-25 13:38:01,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 9: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 2: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 3: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 21: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 7: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt... 
21: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 5: [2023-05-25 13:38:01,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 3: [2023-05-25 13:38:01,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 0: [2023-05-25 13:38:01,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 23: [2023-05-25 13:38:01,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 9: [2023-05-25 13:38:01,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 17: [2023-05-25 13:38:01,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 0: [2023-05-25 13:38:01,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt... 17: [2023-05-25 13:38:01,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 17: [2023-05-25 13:38:01,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 
17: [2023-05-25 13:38:01,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 2: [2023-05-25 13:38:01,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_00-model_states.pt. 23: [2023-05-25 13:38:01,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 1: [2023-05-25 13:38:01,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 23: [2023-05-25 13:38:01,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 16: [2023-05-25 13:38:01,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 16: [2023-05-25 13:38:01,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 23: [2023-05-25 13:38:01,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 1: [2023-05-25 13:38:01,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 9: [2023-05-25 13:38:01,205] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 22: [2023-05-25 13:38:01,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 29: [2023-05-25 13:38:01,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 12: [2023-05-25 13:38:01,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 3: [2023-05-25 13:38:01,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 3: [2023-05-25 13:38:01,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 9: [2023-05-25 13:38:01,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 9: [2023-05-25 13:38:01,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 9: [2023-05-25 13:38:01,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 
29: [2023-05-25 13:38:01,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 12: [2023-05-25 13:38:01,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 12: [2023-05-25 13:38:01,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_00-model_states.pt. 9: [2023-05-25 13:38:01,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 19: [2023-05-25 13:38:01,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 12: [2023-05-25 13:38:01,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt... 2: [2023-05-25 13:38:01,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 19: [2023-05-25 13:38:01,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 19: [2023-05-25 13:38:01,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 19: [2023-05-25 13:38:01,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 
2: [2023-05-25 13:38:01,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 21: [2023-05-25 13:38:01,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 21: [2023-05-25 13:38:01,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 17: [2023-05-25 13:38:01,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 17: [2023-05-25 13:38:01,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 20: [2023-05-25 13:38:01,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 20: [2023-05-25 13:38:01,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_28-model_01-model_states.pt. 21: [2023-05-25 13:38:01,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 17: [2023-05-25 13:38:01,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 17: [2023-05-25 13:38:01,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 
21: [2023-05-25 13:38:01,230] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 16: [2023-05-25 13:38:01,234] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 22: [2023-05-25 13:38:01,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 16: [2023-05-25 13:38:01,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 22: [2023-05-25 13:38:01,238] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 16: [2023-05-25 13:38:01,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 22: [2023-05-25 13:38:01,241] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 16: [2023-05-25 13:38:01,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 20: [2023-05-25 13:38:01,243] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 22: [2023-05-25 13:38:01,243] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 
20: [2023-05-25 13:38:01,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt... 28: [2023-05-25 13:38:01,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 28: [2023-05-25 13:38:01,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 20: [2023-05-25 13:38:01,273] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 28: [2023-05-25 13:38:01,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 28: [2023-05-25 13:38:01,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 20: [2023-05-25 13:38:01,277] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_00-model_states.pt. 20: [2023-05-25 13:38:01,278] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 20: [2023-05-25 13:38:01,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt... 12: [2023-05-25 13:38:01,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
12: [2023-05-25 13:38:01,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 15: [2023-05-25 13:38:01,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 15: [2023-05-25 13:38:01,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 12: [2023-05-25 13:38:01,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 15: [2023-05-25 13:38:01,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 15: [2023-05-25 13:38:01,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 12: [2023-05-25 13:38:01,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 25: [2023-05-25 13:38:01,308] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 25: [2023-05-25 13:38:01,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 10: [2023-05-25 13:38:01,314] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
10: [2023-05-25 13:38:01,314] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 10: [2023-05-25 13:38:01,315] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 10: [2023-05-25 13:38:01,315] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 31: [2023-05-25 13:38:01,319] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 31: [2023-05-25 13:38:01,319] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 4: [2023-05-25 13:38:01,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 31: [2023-05-25 13:38:01,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 14: [2023-05-25 13:38:01,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 4: [2023-05-25 13:38:01,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 14: [2023-05-25 13:38:01,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
31: [2023-05-25 13:38:01,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 25: [2023-05-25 13:38:01,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 25: [2023-05-25 13:38:01,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 12: [2023-05-25 13:38:01,324] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 14: [2023-05-25 13:38:01,325] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 21: [2023-05-25 13:38:01,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 21: [2023-05-25 13:38:01,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 16: [2023-05-25 13:38:01,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 16: [2023-05-25 13:38:01,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 15: [2023-05-25 13:38:01,327] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
14: [2023-05-25 13:38:01,327] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 15: [2023-05-25 13:38:01,327] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 24: [2023-05-25 13:38:01,330] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 24: [2023-05-25 13:38:01,331] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 30: [2023-05-25 13:38:01,331] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 30: [2023-05-25 13:38:01,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 31: [2023-05-25 13:38:01,334] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 31: [2023-05-25 13:38:01,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 31: [2023-05-25 13:38:01,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 4: [2023-05-25 13:38:01,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 
4: [2023-05-25 13:38:01,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 12: [2023-05-25 13:38:01,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 26: [2023-05-25 13:38:01,337] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 26: [2023-05-25 13:38:01,337] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 12: [2023-05-25 13:38:01,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 21: [2023-05-25 13:38:01,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 17: [2023-05-25 13:38:01,339] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 17: [2023-05-25 13:38:01,339] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 16: [2023-05-25 13:38:01,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 27: [2023-05-25 13:38:01,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 
29: [2023-05-25 13:38:01,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 27: [2023-05-25 13:38:01,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 15: [2023-05-25 13:38:01,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 16: [2023-05-25 13:38:01,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 21: [2023-05-25 13:38:01,341] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 31: [2023-05-25 13:38:01,341] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 29: [2023-05-25 13:38:01,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_03-model_states.pt. 15: [2023-05-25 13:38:01,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 30: [2023-05-25 13:38:01,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 24: [2023-05-25 13:38:01,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 
24: [2023-05-25 13:38:01,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 10: [2023-05-25 13:38:01,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 10: [2023-05-25 13:38:01,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 30: [2023-05-25 13:38:01,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 19: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 12: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 27: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 18: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 19: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 23: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 
4: [2023-05-25 13:38:01,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 4: [2023-05-25 13:38:01,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 18: [2023-05-25 13:38:01,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 27: [2023-05-25 13:38:01,352] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 17: [2023-05-25 13:38:01,353] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 17: [2023-05-25 13:38:01,353] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 22: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 30: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 26: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 23: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 
30: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 22: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 27: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 27: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 29: [2023-05-25 13:38:01,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 26: [2023-05-25 13:38:01,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 30: [2023-05-25 13:38:01,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 29: [2023-05-25 13:38:01,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 30: [2023-05-25 13:38:01,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 26: [2023-05-25 13:38:01,357] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 
26: [2023-05-25 13:38:01,358] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 10: [2023-05-25 13:38:01,358] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 14: [2023-05-25 13:38:01,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 10: [2023-05-25 13:38:01,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 19: [2023-05-25 13:38:01,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 14: [2023-05-25 13:38:01,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 4: [2023-05-25 13:38:01,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 19: [2023-05-25 13:38:01,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 4: [2023-05-25 13:38:01,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 27: [2023-05-25 13:38:01,364] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
23: [2023-05-25 13:38:01,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 18: [2023-05-25 13:38:01,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 27: [2023-05-25 13:38:01,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 13: [2023-05-25 13:38:01,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 18: [2023-05-25 13:38:01,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 13: [2023-05-25 13:38:01,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 22: [2023-05-25 13:38:01,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 23: [2023-05-25 13:38:01,368] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 20: [2023-05-25 13:38:01,369] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 20: [2023-05-25 13:38:01,369] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_02-model_states.pt. 
1: [2023-05-25 13:38:01,369] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 1: [2023-05-25 13:38:01,369] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 22: [2023-05-25 13:38:01,370] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 26: [2023-05-25 13:38:01,371] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 30: [2023-05-25 13:38:01,371] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 30: [2023-05-25 13:38:01,371] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 14: [2023-05-25 13:38:01,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 26: [2023-05-25 13:38:01,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 14: [2023-05-25 13:38:01,375] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 25: [2023-05-25 13:38:01,375] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
11: [2023-05-25 13:38:01,375] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 25: [2023-05-25 13:38:01,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 25: [2023-05-25 13:38:01,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 11: [2023-05-25 13:38:01,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 25: [2023-05-25 13:38:01,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 18: [2023-05-25 13:38:01,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 25: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 18: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 11: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 11: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
9: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 28: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 28: [2023-05-25 13:38:01,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 9: [2023-05-25 13:38:01,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 25: [2023-05-25 13:38:01,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 25: [2023-05-25 13:38:01,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 25: [2023-05-25 13:38:01,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 8: [2023-05-25 13:38:01,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 8: [2023-05-25 13:38:01,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 27: [2023-05-25 13:38:01,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 
27: [2023-05-25 13:38:01,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 25: [2023-05-25 13:38:01,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 25: [2023-05-25 13:38:01,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 8: [2023-05-25 13:38:01,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 8: [2023-05-25 13:38:01,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 13: [2023-05-25 13:38:01,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 1: [2023-05-25 13:38:01,382] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 13: [2023-05-25 13:38:01,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 13: [2023-05-25 13:38:01,382] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 13: [2023-05-25 13:38:01,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
20: [2023-05-25 13:38:01,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 30: [2023-05-25 13:38:01,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 13: [2023-05-25 13:38:01,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 1: [2023-05-25 13:38:01,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 30: [2023-05-25 13:38:01,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 24: [2023-05-25 13:38:01,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 24: [2023-05-25 13:38:01,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 5: [2023-05-25 13:38:01,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 13: [2023-05-25 13:38:01,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 5: [2023-05-25 13:38:01,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 
30: [2023-05-25 13:38:01,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 24: [2023-05-25 13:38:01,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 20: [2023-05-25 13:38:01,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 30: [2023-05-25 13:38:01,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 24: [2023-05-25 13:38:01,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 24: [2023-05-25 13:38:01,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 24: [2023-05-25 13:38:01,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 28: [2023-05-25 13:38:01,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 26: [2023-05-25 13:38:01,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 26: [2023-05-25 13:38:01,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 
24: [2023-05-25 13:38:01,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 28: [2023-05-25 13:38:01,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 24: [2023-05-25 13:38:01,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 24: [2023-05-25 13:38:01,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 24: [2023-05-25 13:38:01,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 9: [2023-05-25 13:38:01,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 30: [2023-05-25 13:38:01,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 18: [2023-05-25 13:38:01,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 12: [2023-05-25 13:38:01,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 12: [2023-05-25 13:38:01,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 
18: [2023-05-25 13:38:01,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 9: [2023-05-25 13:38:01,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 25: [2023-05-25 13:38:01,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 27: [2023-05-25 13:38:01,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 25: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 
31: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 27: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 30: [2023-05-25 13:38:01,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 31: [2023-05-25 13:38:01,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 30: [2023-05-25 13:38:01,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 31: [2023-05-25 13:38:01,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 10: [2023-05-25 13:38:01,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 
10: [2023-05-25 13:38:01,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 31: [2023-05-25 13:38:01,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 31: [2023-05-25 13:38:01,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 31: [2023-05-25 13:38:01,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 30: [2023-05-25 13:38:01,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 27: [2023-05-25 13:38:01,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 5: [2023-05-25 13:38:01,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 27: [2023-05-25 13:38:01,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 
2: [2023-05-25 13:38:01,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 2: [2023-05-25 13:38:01,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 5: [2023-05-25 13:38:01,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 25: [2023-05-25 13:38:01,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,403] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 31: [2023-05-25 13:38:01,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 12: [2023-05-25 13:38:01,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 12: [2023-05-25 13:38:01,405] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 24: [2023-05-25 13:38:01,405] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 15: [2023-05-25 13:38:01,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 
24: [2023-05-25 13:38:01,406] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 15: [2023-05-25 13:38:01,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 11: [2023-05-25 13:38:01,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 26: [2023-05-25 13:38:01,406] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 29: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 29: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_02-model_states.pt. 26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 26: [2023-05-25 13:38:01,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 30: [2023-05-25 13:38:01,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 30: [2023-05-25 13:38:01,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 10: [2023-05-25 13:38:01,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 11: [2023-05-25 13:38:01,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 11: [2023-05-25 13:38:01,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 
26: [2023-05-25 13:38:01,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 26: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 26: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 30: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 10: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 26: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 27: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 26: [2023-05-25 13:38:01,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 11: [2023-05-25 13:38:01,412] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,412] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 22: [2023-05-25 13:38:01,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 25: [2023-05-25 13:38:01,414] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 8: [2023-05-25 13:38:01,415] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 8: [2023-05-25 13:38:01,416] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 30: [2023-05-25 13:38:01,416] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 7: [2023-05-25 13:38:01,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 7: [2023-05-25 13:38:01,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 24: [2023-05-25 13:38:01,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 9: [2023-05-25 13:38:01,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
9: [2023-05-25 13:38:01,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 0: [2023-05-25 13:38:01,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 31: [2023-05-25 13:38:01,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 31: [2023-05-25 13:38:01,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 6: [2023-05-25 13:38:01,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 6: [2023-05-25 13:38:01,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 8: [2023-05-25 13:38:01,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 8: [2023-05-25 13:38:01,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 5: [2023-05-25 13:38:01,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 2: [2023-05-25 13:38:01,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 
11: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 9: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 5: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 9: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 15: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 27: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 15: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 27: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 13: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
13: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 13: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 13: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 0: [2023-05-25 13:38:01,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 25: [2023-05-25 13:38:01,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 29: [2023-05-25 13:38:01,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 11: [2023-05-25 13:38:01,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 3: [2023-05-25 13:38:01,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 11: [2023-05-25 13:38:01,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 10: [2023-05-25 13:38:01,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 
14: [2023-05-25 13:38:01,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 14: [2023-05-25 13:38:01,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 29: [2023-05-25 13:38:01,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 11: [2023-05-25 13:38:01,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 2: [2023-05-25 13:38:01,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 3: [2023-05-25 13:38:01,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_02-model_states.pt. 13: [2023-05-25 13:38:01,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 11: [2023-05-25 13:38:01,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 10: [2023-05-25 13:38:01,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 30: [2023-05-25 13:38:01,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
25: [2023-05-25 13:38:01,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 25: [2023-05-25 13:38:01,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 13: [2023-05-25 13:38:01,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 22: [2023-05-25 13:38:01,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 30: [2023-05-25 13:38:01,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 15: [2023-05-25 13:38:01,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 22: [2023-05-25 13:38:01,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 11: [2023-05-25 13:38:01,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 15: [2023-05-25 13:38:01,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 24: [2023-05-25 13:38:01,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
25: [2023-05-25 13:38:01,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 25: [2023-05-25 13:38:01,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 31: [2023-05-25 13:38:01,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 23: [2023-05-25 13:38:01,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 9: [2023-05-25 13:38:01,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 23: [2023-05-25 13:38:01,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 9: [2023-05-25 13:38:01,429] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 28: [2023-05-25 13:38:01,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 28: [2023-05-25 13:38:01,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 28: [2023-05-25 13:38:01,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
28: [2023-05-25 13:38:01,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 28: [2023-05-25 13:38:01,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 28: [2023-05-25 13:38:01,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 9: [2023-05-25 13:38:01,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 9: [2023-05-25 13:38:01,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 8: [2023-05-25 13:38:01,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 28: [2023-05-25 13:38:01,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 0: [2023-05-25 13:38:01,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 6: [2023-05-25 13:38:01,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 
28: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 28: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 28: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 28: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 28: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 13: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 13: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 27: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 7: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 27: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 
31: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 7: [2023-05-25 13:38:01,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 14: [2023-05-25 13:38:01,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 6: [2023-05-25 13:38:01,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 5: [2023-05-25 13:38:01,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 14: [2023-05-25 13:38:01,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_01-model_states.pt. 8: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 5: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 
13: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 24: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 21: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 12: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 12: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 21: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 24: [2023-05-25 13:38:01,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 29: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 10: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 14: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
29: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 24: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 0: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 11: [2023-05-25 13:38:01,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 10: [2023-05-25 13:38:01,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 10: [2023-05-25 13:38:01,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 21: [2023-05-25 13:38:01,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 21: [2023-05-25 13:38:01,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 21: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 12: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 
8: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 21: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 8: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 12: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 3: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 21: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 14: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 24: [2023-05-25 13:38:01,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 21: [2023-05-25 13:38:01,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 10: [2023-05-25 13:38:01,439] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
10: [2023-05-25 13:38:01,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 3: [2023-05-25 13:38:01,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 13: [2023-05-25 13:38:01,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 11: [2023-05-25 13:38:01,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 26: [2023-05-25 13:38:01,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 15: [2023-05-25 13:38:01,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 5: [2023-05-25 13:38:01,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 5: [2023-05-25 13:38:01,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 10: [2023-05-25 13:38:01,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 7: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 
31: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 23: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 7: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 21: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 21: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 4: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 24: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 4: [2023-05-25 13:38:01,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 23: [2023-05-25 13:38:01,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 15: [2023-05-25 13:38:01,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
26: [2023-05-25 13:38:01,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 26: [2023-05-25 13:38:01,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 31: [2023-05-25 13:38:01,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 19: [2023-05-25 13:38:01,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 19: [2023-05-25 13:38:01,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 31: [2023-05-25 13:38:01,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 0: [2023-05-25 13:38:01,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 0: [2023-05-25 13:38:01,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 27: [2023-05-25 13:38:01,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 27: [2023-05-25 13:38:01,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 
14: [2023-05-25 13:38:01,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 26: [2023-05-25 13:38:01,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 26: [2023-05-25 13:38:01,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 26: [2023-05-25 13:38:01,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 24: [2023-05-25 13:38:01,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 24: [2023-05-25 13:38:01,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 24: [2023-05-25 13:38:01,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 0: [2023-05-25 13:38:01,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 0: [2023-05-25 13:38:01,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 21: [2023-05-25 13:38:01,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 
21: [2023-05-25 13:38:01,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 13: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 29: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 13: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 14: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 24: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 16: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 16: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 16: [2023-05-25 13:38:01,450] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 16: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
15: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 31: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 17: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 8: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 17: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 8: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 16: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 11: [2023-05-25 13:38:01,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 16: [2023-05-25 13:38:01,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 16: [2023-05-25 13:38:01,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 
11: [2023-05-25 13:38:01,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 16: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 16: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 31: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 9: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 29: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 16: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 15: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 21: [2023-05-25 13:38:01,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 11: [2023-05-25 13:38:01,454] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 
11: [2023-05-25 13:38:01,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 4: [2023-05-25 13:38:01,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 26: [2023-05-25 13:38:01,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 10: [2023-05-25 13:38:01,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 10: [2023-05-25 13:38:01,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 5: [2023-05-25 13:38:01,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 5: [2023-05-25 13:38:01,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 18: [2023-05-25 13:38:01,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 18: [2023-05-25 13:38:01,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 18: [2023-05-25 13:38:01,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
18: [2023-05-25 13:38:01,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 18: [2023-05-25 13:38:01,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 18: [2023-05-25 13:38:01,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 4: [2023-05-25 13:38:01,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 18: [2023-05-25 13:38:01,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 11: [2023-05-25 13:38:01,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 18: [2023-05-25 13:38:01,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 18: [2023-05-25 13:38:01,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 7: [2023-05-25 13:38:01,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 18: [2023-05-25 13:38:01,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 
18: [2023-05-25 13:38:01,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 18: [2023-05-25 13:38:01,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 19: [2023-05-25 13:38:01,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 7: [2023-05-25 13:38:01,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 20: [2023-05-25 13:38:01,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 20: [2023-05-25 13:38:01,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_01-model_states.pt. 9: [2023-05-25 13:38:01,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 9: [2023-05-25 13:38:01,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 15: [2023-05-25 13:38:01,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 9: [2023-05-25 13:38:01,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 
11: [2023-05-25 13:38:01,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 30: [2023-05-25 13:38:01,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 19: [2023-05-25 13:38:01,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 30: [2023-05-25 13:38:01,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 26: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 12: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 21: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 6: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 28: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 28: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
6: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 12: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 15: [2023-05-25 13:38:01,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 15: [2023-05-25 13:38:01,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 25: [2023-05-25 13:38:01,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 15: [2023-05-25 13:38:01,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 17: [2023-05-25 13:38:01,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 17: [2023-05-25 13:38:01,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 11: [2023-05-25 13:38:01,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 25: [2023-05-25 13:38:01,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 
8: [2023-05-25 13:38:01,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 8: [2023-05-25 13:38:01,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 0: [2023-05-25 13:38:01,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 0: [2023-05-25 13:38:01,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 0: [2023-05-25 13:38:01,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 14: [2023-05-25 13:38:01,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 1: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 14: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 1: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 8: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
16: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 6: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 16: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 6: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 23: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 23: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 23: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 14: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 23: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 23: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
29: [2023-05-25 13:38:01,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 29: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 23: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 11: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 0: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 23: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 14: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 9: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 21: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 8: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 
29: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 14: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 8: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 10: [2023-05-25 13:38:01,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 10: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 14: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 8: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 9: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 4: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 4: [2023-05-25 13:38:01,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
4: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 4: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 4: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 9: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_03-model_states.pt. 4: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 23: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 23: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 23: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 2: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 2: [2023-05-25 13:38:01,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 
22: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 22: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 22: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 22: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 22: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 17: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 17: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 17: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 17: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 13: [2023-05-25 13:38:01,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
23: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 23: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 4: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 4: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 29: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 17: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 11: [2023-05-25 13:38:01,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 17: [2023-05-25 13:38:01,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 4: [2023-05-25 13:38:01,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 4: [2023-05-25 13:38:01,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 
4: [2023-05-25 13:38:01,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 4: [2023-05-25 13:38:01,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 22: [2023-05-25 13:38:01,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 22: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 10: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 10: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 17: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 17: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 22: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 29: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
29: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 29: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 29: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 27: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 21: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 30: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 24: [2023-05-25 13:38:01,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 3: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 22: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 3: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_03-model_states.pt. 
20: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 30: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 22: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 22: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 11: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 27: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 24: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 9: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 16: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 28: [2023-05-25 13:38:01,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
24: [2023-05-25 13:38:01,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 8: [2023-05-25 13:38:01,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 22: [2023-05-25 13:38:01,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 12: [2023-05-25 13:38:01,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 21: [2023-05-25 13:38:01,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 28: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 11: [2023-05-25 13:38:01,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 27: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 10: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 10: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 
27: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 29: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt... 13: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 12: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 29: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 25: [2023-05-25 13:38:01,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 29: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 29: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt... 20: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 24: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 
6: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 2: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 13: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 29: [2023-05-25 13:38:01,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 25: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 15: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 14: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 6: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 18: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 11: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 
8: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 1: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 1: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 9: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 21: [2023-05-25 13:38:01,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 15: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 19: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 19: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 19: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 19: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 7: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 1: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 15: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 1: [2023-05-25 13:38:01,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 13: [2023-05-25 13:38:01,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 29: [2023-05-25 13:38:01,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 9: [2023-05-25 13:38:01,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 19: [2023-05-25 13:38:01,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 21: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
19: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 19: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 6: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 4: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 14: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 19: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 3: [2023-05-25 13:38:01,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 6: [2023-05-25 13:38:01,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 14: [2023-05-25 13:38:01,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 
3: [2023-05-25 13:38:01,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_06-model_01-model_states.pt. 14: [2023-05-25 13:38:01,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 21: [2023-05-25 13:38:01,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 28: [2023-05-25 13:38:01,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 28: [2023-05-25 13:38:01,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_40-model_01-model_states.pt. 15: [2023-05-25 13:38:01,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 8: [2023-05-25 13:38:01,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 4: [2023-05-25 13:38:01,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 8: [2023-05-25 13:38:01,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 29: [2023-05-25 13:38:01,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
14: [2023-05-25 13:38:01,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 19: [2023-05-25 13:38:01,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 29: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 1: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 14: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 18: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 3: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 1: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 1: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 1: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
15: [2023-05-25 13:38:01,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 19: [2023-05-25 13:38:01,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 22: [2023-05-25 13:38:01,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 3: [2023-05-25 13:38:01,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 22: [2023-05-25 13:38:01,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 19: [2023-05-25 13:38:01,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 1: [2023-05-25 13:38:01,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 16: [2023-05-25 13:38:01,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 17: [2023-05-25 13:38:01,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 17: [2023-05-25 13:38:01,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 
0: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 15: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt... 11: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 0: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 1: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 9: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 29: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 9: [2023-05-25 13:38:01,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 1: [2023-05-25 13:38:01,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 1: [2023-05-25 13:38:01,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 
18: [2023-05-25 13:38:01,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 2: [2023-05-25 13:38:01,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 19: [2023-05-25 13:38:01,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 23: [2023-05-25 13:38:01,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 4: [2023-05-25 13:38:01,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 0: [2023-05-25 13:38:01,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 0: [2023-05-25 13:38:01,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 16: [2023-05-25 13:38:01,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 11: [2023-05-25 13:38:01,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 4: [2023-05-25 13:38:01,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 
3: [2023-05-25 13:38:01,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 6: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 6: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 6: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 1: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 1: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 7: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 
16: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 11: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 7: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 5: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 30: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 28: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 28: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt... 7: [2023-05-25 13:38:01,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 30: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 
5: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 6: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 6: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 6: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 16: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 0: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 3: [2023-05-25 13:38:01,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 6: [2023-05-25 13:38:01,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 
11: [2023-05-25 13:38:01,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 3: [2023-05-25 13:38:01,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 7: [2023-05-25 13:38:01,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 17: [2023-05-25 13:38:01,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 17: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 7: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 7: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 0: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 
7: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 10: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 10: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 5: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 3: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 3: [2023-05-25 13:38:01,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 5: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 5: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 3: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 5: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 
5: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 5: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 5: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 15: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 6: [2023-05-25 13:38:01,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 23: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 10: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 10: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 0: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 2: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 30: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 15: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 2: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 8: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 8: [2023-05-25 13:38:01,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 17: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 20: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
20: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 20: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 20: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 17: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 17: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 20: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 20: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 4: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 0: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 30: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 
2: [2023-05-25 13:38:01,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 18: [2023-05-25 13:38:01,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 20: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 20: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 20: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 20: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 17: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 8: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 8: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 22: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 
23: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 2: [2023-05-25 13:38:01,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 20: [2023-05-25 13:38:01,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 25: [2023-05-25 13:38:01,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 20: [2023-05-25 13:38:01,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt... 22: [2023-05-25 13:38:01,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 2: [2023-05-25 13:38:01,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 12: [2023-05-25 13:38:01,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 17: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 2: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt... 
16: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 2: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 9: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 2: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt... 22: [2023-05-25 13:38:01,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 16: [2023-05-25 13:38:01,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 29: [2023-05-25 13:38:01,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 25: [2023-05-25 13:38:01,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 16: [2023-05-25 13:38:01,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt... 
12: [2023-05-25 13:38:01,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 12: [2023-05-25 13:38:01,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 4: [2023-05-25 13:38:01,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 9: [2023-05-25 13:38:01,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 29: [2023-05-25 13:38:01,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 6: [2023-05-25 13:38:01,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 25: [2023-05-25 13:38:01,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 1: [2023-05-25 13:38:01,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 22: [2023-05-25 13:38:01,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 12: [2023-05-25 13:38:01,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 
0: [2023-05-25 13:38:01,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 19: [2023-05-25 13:38:01,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 15: [2023-05-25 13:38:01,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 17: [2023-05-25 13:38:01,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 14: [2023-05-25 13:38:01,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 1: [2023-05-25 13:38:01,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 6: [2023-05-25 13:38:01,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 0: [2023-05-25 13:38:01,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 6: [2023-05-25 13:38:01,512] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 14: [2023-05-25 13:38:01,512] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
8: [2023-05-25 13:38:01,512] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 15: [2023-05-25 13:38:01,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 14: [2023-05-25 13:38:01,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 23: [2023-05-25 13:38:01,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 6: [2023-05-25 13:38:01,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 14: [2023-05-25 13:38:01,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 8: [2023-05-25 13:38:01,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 8: [2023-05-25 13:38:01,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 0: [2023-05-25 13:38:01,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 0: [2023-05-25 13:38:01,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
0: [2023-05-25 13:38:01,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 19: [2023-05-25 13:38:01,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 16: [2023-05-25 13:38:01,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 16: [2023-05-25 13:38:01,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 4: [2023-05-25 13:38:01,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 1: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 21: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 21: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 8: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 
19: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 17: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 0: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 18: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 20: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 18: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 12: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 20: [2023-05-25 13:38:01,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 0: [2023-05-25 13:38:01,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 0: [2023-05-25 13:38:01,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 
23: [2023-05-25 13:38:01,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 5: [2023-05-25 13:38:01,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 23: [2023-05-25 13:38:01,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_29-model_03-model_states.pt. 12: [2023-05-25 13:38:01,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 1: [2023-05-25 13:38:01,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 9: [2023-05-25 13:38:01,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 6: [2023-05-25 13:38:01,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 6: [2023-05-25 13:38:01,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 9: [2023-05-25 13:38:01,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 3: [2023-05-25 13:38:01,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
17: [2023-05-25 13:38:01,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 22: [2023-05-25 13:38:01,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 13: [2023-05-25 13:38:01,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 13: [2023-05-25 13:38:01,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_18-model_02-model_states.pt. 9: [2023-05-25 13:38:01,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 6: [2023-05-25 13:38:01,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 6: [2023-05-25 13:38:01,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 3: [2023-05-25 13:38:01,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 4: [2023-05-25 13:38:01,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 0: [2023-05-25 13:38:01,522] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 9: [2023-05-25 13:38:01,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 17: [2023-05-25 13:38:01,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 7: [2023-05-25 13:38:01,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 1: [2023-05-25 13:38:01,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 9: [2023-05-25 13:38:01,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 19: [2023-05-25 13:38:01,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 7: [2023-05-25 13:38:01,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 28: [2023-05-25 13:38:01,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 
28: [2023-05-25 13:38:01,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_00-model_states.pt. 3: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 7: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 17: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 6: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 6: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 9: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt... 29: [2023-05-25 13:38:01,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 2: [2023-05-25 13:38:01,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 20: [2023-05-25 13:38:01,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 
5: [2023-05-25 13:38:01,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,528] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 7: [2023-05-25 13:38:01,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 12: [2023-05-25 13:38:01,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 1: [2023-05-25 13:38:01,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 19: [2023-05-25 13:38:01,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 16: [2023-05-25 13:38:01,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 21: [2023-05-25 13:38:01,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 18: [2023-05-25 13:38:01,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 7: [2023-05-25 13:38:01,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 18: [2023-05-25 13:38:01,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 16: [2023-05-25 13:38:01,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 1: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 23: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 23: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 3: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 20: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 
1: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 5: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 28: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 28: [2023-05-25 13:38:01,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt... 21: [2023-05-25 13:38:01,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 19: [2023-05-25 13:38:01,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 0: [2023-05-25 13:38:01,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 12: [2023-05-25 13:38:01,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 1: [2023-05-25 13:38:01,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 
3: [2023-05-25 13:38:01,534] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 19: [2023-05-25 13:38:01,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 22: [2023-05-25 13:38:01,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 22: [2023-05-25 13:38:01,536] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 1: [2023-05-25 13:38:01,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 0: [2023-05-25 13:38:01,536] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 13: [2023-05-25 13:38:01,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 20: [2023-05-25 13:38:01,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt... 13: [2023-05-25 13:38:01,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt... 1: [2023-05-25 13:38:01,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
22: [2023-05-25 13:38:01,538] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 22: [2023-05-25 13:38:01,538] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 2: [2023-05-25 13:38:01,538] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt... 3: [2023-05-25 13:38:01,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 2: [2023-05-25 13:38:01,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_00-model_states.pt. 3: [2023-05-25 13:38:01,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 5: [2023-05-25 13:38:01,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
3: [2023-05-25 13:38:01,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 20: [2023-05-25 13:38:01,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 6: [2023-05-25 13:38:01,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 17: [2023-05-25 13:38:01,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 6: [2023-05-25 13:38:01,543] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 2: [2023-05-25 13:38:01,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 17: [2023-05-25 13:38:01,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 1: [2023-05-25 13:38:01,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 20: [2023-05-25 13:38:01,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 7: [2023-05-25 13:38:01,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
7: [2023-05-25 13:38:01,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 3: [2023-05-25 13:38:01,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 2: [2023-05-25 13:38:01,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt... 0: [2023-05-25 13:38:01,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 12: [2023-05-25 13:38:01,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 19: [2023-05-25 13:38:01,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 18: [2023-05-25 13:38:01,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 12: [2023-05-25 13:38:01,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 18: [2023-05-25 13:38:01,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 
2: [2023-05-25 13:38:01,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 20: [2023-05-25 13:38:01,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 21: [2023-05-25 13:38:01,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 23: [2023-05-25 13:38:01,559] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 20: [2023-05-25 13:38:01,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 23: [2023-05-25 13:38:01,559] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 2: [2023-05-25 13:38:01,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 16: [2023-05-25 13:38:01,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 21: [2023-05-25 13:38:01,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 20: [2023-05-25 13:38:01,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 
16: [2023-05-25 13:38:01,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 18: [2023-05-25 13:38:01,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 21: [2023-05-25 13:38:01,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 16: [2023-05-25 13:38:01,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 18: [2023-05-25 13:38:01,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 16: [2023-05-25 13:38:01,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 21: [2023-05-25 13:38:01,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 19: [2023-05-25 13:38:01,564] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 23: [2023-05-25 13:38:01,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 12: [2023-05-25 13:38:01,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 
23: [2023-05-25 13:38:01,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 19: [2023-05-25 13:38:01,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 13: [2023-05-25 13:38:01,566] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 13: [2023-05-25 13:38:01,566] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_00-model_states.pt. 12: [2023-05-25 13:38:01,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 13: [2023-05-25 13:38:01,569] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 13: [2023-05-25 13:38:01,569] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt... 20: [2023-05-25 13:38:01,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_00-model_states.pt. 20: [2023-05-25 13:38:01,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt... 31: [2023-05-25 13:38:01,621] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
31: [2023-05-25 13:38:01,621] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 31: [2023-05-25 13:38:01,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 31: [2023-05-25 13:38:01,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,633] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 25: [2023-05-25 13:38:01,633] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 30: [2023-05-25 13:38:01,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 30: [2023-05-25 13:38:01,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 30: [2023-05-25 13:38:01,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 30: [2023-05-25 13:38:01,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 30: [2023-05-25 13:38:01,646] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
30: [2023-05-25 13:38:01,646] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,647] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,649] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 26: [2023-05-25 13:38:01,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 30: [2023-05-25 13:38:01,653] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,656] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 24: [2023-05-25 13:38:01,656] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 30: [2023-05-25 13:38:01,658] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
27: [2023-05-25 13:38:01,659] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 24: [2023-05-25 13:38:01,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 27: [2023-05-25 13:38:01,661] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 31: [2023-05-25 13:38:01,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 31: [2023-05-25 13:38:01,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 31: [2023-05-25 13:38:01,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 31: [2023-05-25 13:38:01,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 28: [2023-05-25 13:38:01,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 
27: [2023-05-25 13:38:01,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 24: [2023-05-25 13:38:01,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 26: [2023-05-25 13:38:01,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,667] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 11: [2023-05-25 13:38:01,670] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 11: [2023-05-25 13:38:01,670] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 25: [2023-05-25 13:38:01,671] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 25: [2023-05-25 13:38:01,671] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 25: [2023-05-25 13:38:01,671] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
25: [2023-05-25 13:38:01,671] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 11: [2023-05-25 13:38:01,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 11: [2023-05-25 13:38:01,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 25: [2023-05-25 13:38:01,673] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,674] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 28: [2023-05-25 13:38:01,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,676] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 28: [2023-05-25 13:38:01,677] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 28: [2023-05-25 13:38:01,677] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
28: [2023-05-25 13:38:01,677] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,678] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 24: [2023-05-25 13:38:01,678] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 31: [2023-05-25 13:38:01,676] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:01,676] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 31: [2023-05-25 13:38:01,677] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 24: [2023-05-25 13:38:01,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 31: [2023-05-25 13:38:01,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 26: [2023-05-25 13:38:01,684] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
13: [2023-05-25 13:38:01,684] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 26: [2023-05-25 13:38:01,684] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 13: [2023-05-25 13:38:01,685] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 25: [2023-05-25 13:38:01,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 25: [2023-05-25 13:38:01,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 25: [2023-05-25 13:38:01,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 26: [2023-05-25 13:38:01,687] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 26: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 
25: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 26: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 26: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 30: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 30: [2023-05-25 13:38:01,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 30: [2023-05-25 13:38:01,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 25: [2023-05-25 13:38:01,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 25: [2023-05-25 13:38:01,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 29: [2023-05-25 13:38:01,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 
26: [2023-05-25 13:38:01,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 29: [2023-05-25 13:38:01,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 30: [2023-05-25 13:38:01,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 24: [2023-05-25 13:38:01,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 13: [2023-05-25 13:38:01,699] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 13: [2023-05-25 13:38:01,699] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 10: [2023-05-25 13:38:01,701] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 24: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 27: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
27: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 24: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 24: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 25: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 24: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 30: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 12: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 10: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 26: [2023-05-25 13:38:01,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
12: [2023-05-25 13:38:01,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 30: [2023-05-25 13:38:01,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:01,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,705] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 29: [2023-05-25 13:38:01,706] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,706] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 14: [2023-05-25 13:38:01,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 30: [2023-05-25 13:38:01,706] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 14: [2023-05-25 13:38:01,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 
26: [2023-05-25 13:38:01,707] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 11: [2023-05-25 13:38:01,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 26: [2023-05-25 13:38:01,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 31: [2023-05-25 13:38:01,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 31: [2023-05-25 13:38:01,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 11: [2023-05-25 13:38:01,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 27: [2023-05-25 13:38:01,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
26: [2023-05-25 13:38:01,710] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 31: [2023-05-25 13:38:01,710] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 10: [2023-05-25 13:38:01,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 26: [2023-05-25 13:38:01,711] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 31: [2023-05-25 13:38:01,711] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 10: [2023-05-25 13:38:01,712] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 10: [2023-05-25 13:38:01,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 26: [2023-05-25 13:38:01,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 10: [2023-05-25 13:38:01,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 
10: [2023-05-25 13:38:01,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,714] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 10: [2023-05-25 13:38:01,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 12: [2023-05-25 13:38:01,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 24: [2023-05-25 13:38:01,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:01,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
24: [2023-05-25 13:38:01,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 5: [2023-05-25 13:38:01,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 25: [2023-05-25 13:38:01,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 5: [2023-05-25 13:38:01,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 24: [2023-05-25 13:38:01,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 12: [2023-05-25 13:38:01,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 25: [2023-05-25 13:38:01,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 27: [2023-05-25 13:38:01,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:01,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 28: [2023-05-25 13:38:01,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 
28: [2023-05-25 13:38:01,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 30: [2023-05-25 13:38:01,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 11: [2023-05-25 13:38:01,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 24: [2023-05-25 13:38:01,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,719] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 25: [2023-05-25 13:38:01,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 14: [2023-05-25 13:38:01,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 24: [2023-05-25 13:38:01,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 24: [2023-05-25 13:38:01,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 14: [2023-05-25 13:38:01,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
28: [2023-05-25 13:38:01,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 28: [2023-05-25 13:38:01,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 21: [2023-05-25 13:38:01,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 24: [2023-05-25 13:38:01,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 11: [2023-05-25 13:38:01,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 21: [2023-05-25 13:38:01,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 25: [2023-05-25 13:38:01,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 27: [2023-05-25 13:38:01,722] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 27: [2023-05-25 13:38:01,722] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 29: [2023-05-25 13:38:01,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
25: [2023-05-25 13:38:01,723] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 30: [2023-05-25 13:38:01,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 25: [2023-05-25 13:38:01,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 25: [2023-05-25 13:38:01,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 30: [2023-05-25 13:38:01,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 25: [2023-05-25 13:38:01,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 31: [2023-05-25 13:38:01,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 31: [2023-05-25 13:38:01,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 15: [2023-05-25 13:38:01,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
15: [2023-05-25 13:38:01,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 8: [2023-05-25 13:38:01,727] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,727] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 15: [2023-05-25 13:38:01,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 15: [2023-05-25 13:38:01,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 27: [2023-05-25 13:38:01,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 5: [2023-05-25 13:38:01,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 5: [2023-05-25 13:38:01,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 28: [2023-05-25 13:38:01,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:01,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 
30: [2023-05-25 13:38:01,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 15: [2023-05-25 13:38:01,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 20: [2023-05-25 13:38:01,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 15: [2023-05-25 13:38:01,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 20: [2023-05-25 13:38:01,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 26: [2023-05-25 13:38:01,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 27: [2023-05-25 13:38:01,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 30: [2023-05-25 13:38:01,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
28: [2023-05-25 13:38:01,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 27: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 24: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 21: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 24: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 15: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 15: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 26: [2023-05-25 13:38:01,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
21: [2023-05-25 13:38:01,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 26: [2023-05-25 13:38:01,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 24: [2023-05-25 13:38:01,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 25: [2023-05-25 13:38:01,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 25: [2023-05-25 13:38:01,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 26: [2023-05-25 13:38:01,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 30: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 27: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 27: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 31: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
26: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 9: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 29: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 29: [2023-05-25 13:38:01,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_03-model_states.pt. 30: [2023-05-25 13:38:01,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 0: [2023-05-25 13:38:01,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 9: [2023-05-25 13:38:01,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 31: [2023-05-25 13:38:01,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 0: [2023-05-25 13:38:01,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 0: [2023-05-25 13:38:01,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
16: [2023-05-25 13:38:01,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 16: [2023-05-25 13:38:01,743] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 27: [2023-05-25 13:38:01,744] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 0: [2023-05-25 13:38:01,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 24: [2023-05-25 13:38:01,744] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 26: [2023-05-25 13:38:01,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 16: [2023-05-25 13:38:01,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 29: [2023-05-25 13:38:01,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
16: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 4: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 20: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 27: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 20: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 24: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 4: [2023-05-25 13:38:01,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 26: [2023-05-25 13:38:01,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 26: [2023-05-25 13:38:01,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 26: [2023-05-25 13:38:01,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 
29: [2023-05-25 13:38:01,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 4: [2023-05-25 13:38:01,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 15: [2023-05-25 13:38:01,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 4: [2023-05-25 13:38:01,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 29: [2023-05-25 13:38:01,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 29: [2023-05-25 13:38:01,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 29: [2023-05-25 13:38:01,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 11: [2023-05-25 13:38:01,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 15: [2023-05-25 13:38:01,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 26: [2023-05-25 13:38:01,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 
11: [2023-05-25 13:38:01,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 29: [2023-05-25 13:38:01,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 18: [2023-05-25 13:38:01,751] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 24: [2023-05-25 13:38:01,751] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 18: [2023-05-25 13:38:01,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 12: [2023-05-25 13:38:01,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 12: [2023-05-25 13:38:01,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 15: [2023-05-25 13:38:01,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 8: [2023-05-25 13:38:01,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
12: [2023-05-25 13:38:01,753] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 13: [2023-05-25 13:38:01,753] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 13: [2023-05-25 13:38:01,753] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 12: [2023-05-25 13:38:01,753] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 24: [2023-05-25 13:38:01,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 15: [2023-05-25 13:38:01,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 13: [2023-05-25 13:38:01,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 14: [2023-05-25 13:38:01,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 14: [2023-05-25 13:38:01,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 10: [2023-05-25 13:38:01,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
27: [2023-05-25 13:38:01,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 10: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 12: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 12: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 13: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 26: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 10: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 10: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 13: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
12: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 12: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 13: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 14: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 1: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 9: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 13: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 1: [2023-05-25 13:38:01,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 29: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 14: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
27: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 13: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 9: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 9: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 1: [2023-05-25 13:38:01,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 8: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 28: [2023-05-25 13:38:01,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 28: [2023-05-25 13:38:01,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_01-model_states.pt. 9: [2023-05-25 13:38:01,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
1: [2023-05-25 13:38:01,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 9: [2023-05-25 13:38:01,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 9: [2023-05-25 13:38:01,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 10: [2023-05-25 13:38:01,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 29: [2023-05-25 13:38:01,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 29: [2023-05-25 13:38:01,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 17: [2023-05-25 13:38:01,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 30: [2023-05-25 13:38:01,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 17: [2023-05-25 13:38:01,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 10: [2023-05-25 13:38:01,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 
14: [2023-05-25 13:38:01,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 14: [2023-05-25 13:38:01,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 11: [2023-05-25 13:38:01,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 30: [2023-05-25 13:38:01,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 29: [2023-05-25 13:38:01,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 29: [2023-05-25 13:38:01,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 17: [2023-05-25 13:38:01,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 7: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 17: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 14: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 
7: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 11: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 28: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 14: [2023-05-25 13:38:01,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 18: [2023-05-25 13:38:01,765] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 19: [2023-05-25 13:38:01,765] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 19: [2023-05-25 13:38:01,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 27: [2023-05-25 13:38:01,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 4: [2023-05-25 13:38:01,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 
4: [2023-05-25 13:38:01,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 10: [2023-05-25 13:38:01,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:01,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 25: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 25: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 3: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 18: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 3: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 27: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 8: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
8: [2023-05-25 13:38:01,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 5: [2023-05-25 13:38:01,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 7: [2023-05-25 13:38:01,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 7: [2023-05-25 13:38:01,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 31: [2023-05-25 13:38:01,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 31: [2023-05-25 13:38:01,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_41-model_02-model_states.pt. 5: [2023-05-25 13:38:01,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 5: [2023-05-25 13:38:01,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 5: [2023-05-25 13:38:01,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 31: [2023-05-25 13:38:01,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
3: [2023-05-25 13:38:01,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 3: [2023-05-25 13:38:01,770] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 2: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 5: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 25: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 25: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 2: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 28: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 13: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 30: [2023-05-25 13:38:01,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
28: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 28: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 28: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 13: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 8: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 8: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 4: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 5: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 5: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 5: [2023-05-25 13:38:01,772] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 
7: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 0: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 0: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 7: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 0: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 4: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 2: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 0: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 15: [2023-05-25 13:38:01,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 30: [2023-05-25 13:38:01,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 
27: [2023-05-25 13:38:01,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 2: [2023-05-25 13:38:01,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 21: [2023-05-25 13:38:01,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 31: [2023-05-25 13:38:01,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 21: [2023-05-25 13:38:01,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 30: [2023-05-25 13:38:01,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 31: [2023-05-25 13:38:01,776] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 6: [2023-05-25 13:38:01,776] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 27: [2023-05-25 13:38:01,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 21: [2023-05-25 13:38:01,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 
31: [2023-05-25 13:38:01,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 6: [2023-05-25 13:38:01,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 30: [2023-05-25 13:38:01,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 21: [2023-05-25 13:38:01,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 6: [2023-05-25 13:38:01,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 15: [2023-05-25 13:38:01,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 19: [2023-05-25 13:38:01,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 6: [2023-05-25 13:38:01,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 27: [2023-05-25 13:38:01,780] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 15: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 
19: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 16: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 0: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 30: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 30: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 16: [2023-05-25 13:38:01,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 15: [2023-05-25 13:38:01,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 27: [2023-05-25 13:38:01,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 29: [2023-05-25 13:38:01,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 31: [2023-05-25 13:38:01,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 
10: [2023-05-25 13:38:01,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 10: [2023-05-25 13:38:01,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 4: [2023-05-25 13:38:01,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 31: [2023-05-25 13:38:01,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt... 21: [2023-05-25 13:38:01,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 21: [2023-05-25 13:38:01,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 4: [2023-05-25 13:38:01,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 14: [2023-05-25 13:38:01,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 4: [2023-05-25 13:38:01,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 7: [2023-05-25 13:38:01,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
10: [2023-05-25 13:38:01,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 10: [2023-05-25 13:38:01,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 1: [2023-05-25 13:38:01,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 23: [2023-05-25 13:38:01,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 11: [2023-05-25 13:38:01,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 23: [2023-05-25 13:38:01,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 11: [2023-05-25 13:38:01,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 12: [2023-05-25 13:38:01,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 12: [2023-05-25 13:38:01,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 13: [2023-05-25 13:38:01,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
4: [2023-05-25 13:38:01,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 13: [2023-05-25 13:38:01,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 13: [2023-05-25 13:38:01,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 13: [2023-05-25 13:38:01,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 1: [2023-05-25 13:38:01,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 4: [2023-05-25 13:38:01,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 7: [2023-05-25 13:38:01,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 4: [2023-05-25 13:38:01,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 15: [2023-05-25 13:38:01,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 8: [2023-05-25 13:38:01,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
1: [2023-05-25 13:38:01,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 29: [2023-05-25 13:38:01,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 9: [2023-05-25 13:38:01,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 20: [2023-05-25 13:38:01,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 20: [2023-05-25 13:38:01,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 9: [2023-05-25 13:38:01,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 0: [2023-05-25 13:38:01,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 21: [2023-05-25 13:38:01,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 21: [2023-05-25 13:38:01,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 0: [2023-05-25 13:38:01,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
15: [2023-05-25 13:38:01,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 22: [2023-05-25 13:38:01,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 22: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 14: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 8: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 1: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 4: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 4: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 0: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 29: [2023-05-25 13:38:01,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 
0: [2023-05-25 13:38:01,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 9: [2023-05-25 13:38:01,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 0: [2023-05-25 13:38:01,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 6: [2023-05-25 13:38:01,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 6: [2023-05-25 13:38:01,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 0: [2023-05-25 13:38:01,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 11: [2023-05-25 13:38:01,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 1: [2023-05-25 13:38:01,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 16: [2023-05-25 13:38:01,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 1: [2023-05-25 13:38:01,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 
15: [2023-05-25 13:38:01,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 15: [2023-05-25 13:38:01,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 16: [2023-05-25 13:38:01,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 27: [2023-05-25 13:38:01,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 15: [2023-05-25 13:38:01,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 15: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 4: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 10: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 29: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 10: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
19: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 11: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 19: [2023-05-25 13:38:01,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 27: [2023-05-25 13:38:01,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 11: [2023-05-25 13:38:01,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 14: [2023-05-25 13:38:01,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 29: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 5: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 3: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 
18: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 23: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 29: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 3: [2023-05-25 13:38:01,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 18: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 12: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 1: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 15: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 11: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 
23: [2023-05-25 13:38:01,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 11: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 15: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 10: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 28: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 28: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 9: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 4: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 5: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 
8: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 5: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 9: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 17: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 17: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 5: [2023-05-25 13:38:01,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 9: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 29: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt... 15: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 14: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 
2: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 12: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 19: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 14: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 12: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 2: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_01-model_states.pt. 19: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 13: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 13: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 9: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 
11: [2023-05-25 13:38:01,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 15: [2023-05-25 13:38:01,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 15: [2023-05-25 13:38:01,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 28: [2023-05-25 13:38:01,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 28: [2023-05-25 13:38:01,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt... 1: [2023-05-25 13:38:01,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 19: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 21: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
8: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 15: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 14: [2023-05-25 13:38:01,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 20: [2023-05-25 13:38:01,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 29: [2023-05-25 13:38:01,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 9: [2023-05-25 13:38:01,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 17: [2023-05-25 13:38:01,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 8: [2023-05-25 13:38:01,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 17: [2023-05-25 13:38:01,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 20: [2023-05-25 13:38:01,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 
22: [2023-05-25 13:38:01,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 0: [2023-05-25 13:38:01,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 6: [2023-05-25 13:38:01,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 8: [2023-05-25 13:38:01,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 22: [2023-05-25 13:38:01,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 12: [2023-05-25 13:38:01,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 6: [2023-05-25 13:38:01,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 8: [2023-05-25 13:38:01,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 1: [2023-05-25 13:38:01,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 3: [2023-05-25 13:38:01,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 
3: [2023-05-25 13:38:01,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 19: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 9: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 1: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 22: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 22: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 4: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 31: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 1: [2023-05-25 13:38:01,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
19: [2023-05-25 13:38:01,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 2: [2023-05-25 13:38:01,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 2: [2023-05-25 13:38:01,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 13: [2023-05-25 13:38:01,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 22: [2023-05-25 13:38:01,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 18: [2023-05-25 13:38:01,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 23: [2023-05-25 13:38:01,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 23: [2023-05-25 13:38:01,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 13: [2023-05-25 13:38:01,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_02-model_states.pt. 21: [2023-05-25 13:38:01,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 
5: [2023-05-25 13:38:01,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 18: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 22: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 18: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 21: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 31: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_00-model_states.pt. 4: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 3: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 7: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 21: [2023-05-25 13:38:01,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 17: [2023-05-25 13:38:01,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 22: [2023-05-25 13:38:01,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 18: [2023-05-25 13:38:01,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 31: [2023-05-25 13:38:01,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 14: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 12: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 17: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
19: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 31: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt... 5: [2023-05-25 13:38:01,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 5: [2023-05-25 13:38:01,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 5: [2023-05-25 13:38:01,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 9: [2023-05-25 13:38:01,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 22: [2023-05-25 13:38:01,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 3: [2023-05-25 13:38:01,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 16: [2023-05-25 13:38:01,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 16: [2023-05-25 13:38:01,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_01-model_states.pt. 
12: [2023-05-25 13:38:01,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 14: [2023-05-25 13:38:01,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 19: [2023-05-25 13:38:01,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 7: [2023-05-25 13:38:01,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 7: [2023-05-25 13:38:01,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 6: [2023-05-25 13:38:01,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 6: [2023-05-25 13:38:01,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 13: [2023-05-25 13:38:01,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 7: [2023-05-25 13:38:01,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 21: [2023-05-25 13:38:01,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
9: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 17: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 9: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 17: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 20: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 16: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 20: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 20: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 20: [2023-05-25 13:38:01,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 16: [2023-05-25 13:38:01,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 
8: [2023-05-25 13:38:01,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 19: [2023-05-25 13:38:01,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 5: [2023-05-25 13:38:01,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 22: [2023-05-25 13:38:01,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 5: [2023-05-25 13:38:01,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 4: [2023-05-25 13:38:01,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 4: [2023-05-25 13:38:01,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 13: [2023-05-25 13:38:01,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 2: [2023-05-25 13:38:01,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 20: [2023-05-25 13:38:01,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 
3: [2023-05-25 13:38:01,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 20: [2023-05-25 13:38:01,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 20: [2023-05-25 13:38:01,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 20: [2023-05-25 13:38:01,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 7: [2023-05-25 13:38:01,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 12: [2023-05-25 13:38:01,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 4: [2023-05-25 13:38:01,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 4: [2023-05-25 13:38:01,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 15: [2023-05-25 13:38:01,824] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 2: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
3: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 15: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 22: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 4: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 7: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 13: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 23: [2023-05-25 13:38:01,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 8: [2023-05-25 13:38:01,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 8: [2023-05-25 13:38:01,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 13: [2023-05-25 13:38:01,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
13: [2023-05-25 13:38:01,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 12: [2023-05-25 13:38:01,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 4: [2023-05-25 13:38:01,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 23: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 23: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 15: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 15: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 23: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 4: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 22: [2023-05-25 13:38:01,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 
21: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 21: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 13: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 21: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 18: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 22: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 4: [2023-05-25 13:38:01,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 18: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 2: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 23: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 
7: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 4: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 2: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 23: [2023-05-25 13:38:01,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 17: [2023-05-25 13:38:01,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 17: [2023-05-25 13:38:01,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 11: [2023-05-25 13:38:01,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 4: [2023-05-25 13:38:01,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 0: [2023-05-25 13:38:01,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 16: [2023-05-25 13:38:01,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 
6: [2023-05-25 13:38:01,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 16: [2023-05-25 13:38:01,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 16: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 4: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 14: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 11: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 18: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 14: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 18: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 6: [2023-05-25 13:38:01,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
7: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 18: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 18: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 10: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 10: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 10: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 10: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 1: [2023-05-25 13:38:01,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 4: [2023-05-25 13:38:01,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 
18: [2023-05-25 13:38:01,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 5: [2023-05-25 13:38:01,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 0: [2023-05-25 13:38:01,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 16: [2023-05-25 13:38:01,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 18: [2023-05-25 13:38:01,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 5: [2023-05-25 13:38:01,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 18: [2023-05-25 13:38:01,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 11: [2023-05-25 13:38:01,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 18: [2023-05-25 13:38:01,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 20: [2023-05-25 13:38:01,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 19: [2023-05-25 13:38:01,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 1: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 10: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 10: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 10: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 10: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 11: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 3: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 23: [2023-05-25 13:38:01,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 
23: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 0: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 0: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 23: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 0: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 23: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 19: [2023-05-25 13:38:01,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 22: [2023-05-25 13:38:01,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 9: [2023-05-25 13:38:01,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 7: [2023-05-25 13:38:01,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 
6: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 1: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 20: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 6: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 19: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 23: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 19: [2023-05-25 13:38:01,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 23: [2023-05-25 13:38:01,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 0: [2023-05-25 13:38:01,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 22: [2023-05-25 13:38:01,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 
0: [2023-05-25 13:38:01,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 0: [2023-05-25 13:38:01,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 9: [2023-05-25 13:38:01,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 19: [2023-05-25 13:38:01,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 1: [2023-05-25 13:38:01,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 20: [2023-05-25 13:38:01,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 17: [2023-05-25 13:38:01,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 19: [2023-05-25 13:38:01,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 17: [2023-05-25 13:38:01,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_02-model_states.pt. 6: [2023-05-25 13:38:01,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 
6: [2023-05-25 13:38:01,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 14: [2023-05-25 13:38:01,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 19: [2023-05-25 13:38:01,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 12: [2023-05-25 13:38:01,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 19: [2023-05-25 13:38:01,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 20: [2023-05-25 13:38:01,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 17: [2023-05-25 13:38:01,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 17: [2023-05-25 13:38:01,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 7: [2023-05-25 13:38:01,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 1: [2023-05-25 13:38:01,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 
1: [2023-05-25 13:38:01,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 14: [2023-05-25 13:38:01,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 1: [2023-05-25 13:38:01,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 12: [2023-05-25 13:38:01,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 1: [2023-05-25 13:38:01,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 3: [2023-05-25 13:38:01,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 14: [2023-05-25 13:38:01,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 18: [2023-05-25 13:38:01,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 14: [2023-05-25 13:38:01,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 22: [2023-05-25 13:38:01,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
1: [2023-05-25 13:38:01,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 1: [2023-05-25 13:38:01,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 3: [2023-05-25 13:38:01,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 1: [2023-05-25 13:38:01,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 8: [2023-05-25 13:38:01,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 23: [2023-05-25 13:38:01,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 12: [2023-05-25 13:38:01,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 18: [2023-05-25 13:38:01,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 3: [2023-05-25 13:38:01,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 13: [2023-05-25 13:38:01,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
8: [2023-05-25 13:38:01,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 14: [2023-05-25 13:38:01,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 8: [2023-05-25 13:38:01,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 23: [2023-05-25 13:38:01,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 18: [2023-05-25 13:38:01,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 12: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 13: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 9: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 23: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 8: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 
14: [2023-05-25 13:38:01,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 5: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 5: [2023-05-25 13:38:01,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 23: [2023-05-25 13:38:01,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 20: [2023-05-25 13:38:01,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 18: [2023-05-25 13:38:01,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 6: [2023-05-25 13:38:01,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 6: [2023-05-25 13:38:01,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 16: [2023-05-25 13:38:01,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 9: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 
3: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 5: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 5: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 23: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 23: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 16: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 22: [2023-05-25 13:38:01,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 19: [2023-05-25 13:38:01,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 19: [2023-05-25 13:38:01,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 19: [2023-05-25 13:38:01,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 
17: [2023-05-25 13:38:01,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 17: [2023-05-25 13:38:01,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 3: [2023-05-25 13:38:01,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 2: [2023-05-25 13:38:01,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 8: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 8: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 12: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 2: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 12: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 13: [2023-05-25 13:38:01,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 22: [2023-05-25 13:38:01,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 17: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 21: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 21: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 17: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 19: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 7: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 7: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 13: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 
17: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 17: [2023-05-25 13:38:01,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 20: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 9: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 12: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 12: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 8: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 8: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt... 9: [2023-05-25 13:38:01,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_03-model_states.pt. 22: [2023-05-25 13:38:01,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
2: [2023-05-25 13:38:01,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 2: [2023-05-25 13:38:01,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt... 22: [2023-05-25 13:38:01,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 21: [2023-05-25 13:38:01,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 21: [2023-05-25 13:38:01,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 11: [2023-05-25 13:38:01,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 3: [2023-05-25 13:38:01,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 11: [2023-05-25 13:38:01,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_19-model_01-model_states.pt. 3: [2023-05-25 13:38:01,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 18: [2023-05-25 13:38:01,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
22: [2023-05-25 13:38:01,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 22: [2023-05-25 13:38:01,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 5: [2023-05-25 13:38:01,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 18: [2023-05-25 13:38:01,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 3: [2023-05-25 13:38:01,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 3: [2023-05-25 13:38:01,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 18: [2023-05-25 13:38:01,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 23: [2023-05-25 13:38:01,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 18: [2023-05-25 13:38:01,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 5: [2023-05-25 13:38:01,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 
5: [2023-05-25 13:38:01,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 20: [2023-05-25 13:38:01,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 22: [2023-05-25 13:38:01,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 18: [2023-05-25 13:38:01,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 5: [2023-05-25 13:38:01,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 6: [2023-05-25 13:38:01,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 6: [2023-05-25 13:38:01,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 2: [2023-05-25 13:38:01,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 16: [2023-05-25 13:38:01,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 16: [2023-05-25 13:38:01,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
16: [2023-05-25 13:38:01,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 2: [2023-05-25 13:38:01,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 23: [2023-05-25 13:38:01,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 22: [2023-05-25 13:38:01,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 7: [2023-05-25 13:38:01,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 16: [2023-05-25 13:38:01,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 16: [2023-05-25 13:38:01,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 18: [2023-05-25 13:38:01,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 16: [2023-05-25 13:38:01,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt... 22: [2023-05-25 13:38:01,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 
7: [2023-05-25 13:38:01,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 3: [2023-05-25 13:38:01,875] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 1: [2023-05-25 13:38:01,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 11: [2023-05-25 13:38:01,875] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 9: [2023-05-25 13:38:01,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 20: [2023-05-25 13:38:01,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 14: [2023-05-25 13:38:01,876] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 3: [2023-05-25 13:38:01,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 14: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 11: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 
0: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 0: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 23: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 9: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt... 22: [2023-05-25 13:38:01,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 1: [2023-05-25 13:38:01,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 7: [2023-05-25 13:38:01,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 17: [2023-05-25 13:38:01,878] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 7: [2023-05-25 13:38:01,879] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 14: [2023-05-25 13:38:01,879] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 
14: [2023-05-25 13:38:01,879] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 2: [2023-05-25 13:38:01,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 2: [2023-05-25 13:38:01,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_03-model_states.pt. 18: [2023-05-25 13:38:01,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 22: [2023-05-25 13:38:01,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 1: [2023-05-25 13:38:01,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 17: [2023-05-25 13:38:01,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 1: [2023-05-25 13:38:01,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 22: [2023-05-25 13:38:01,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 23: [2023-05-25 13:38:01,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
18: [2023-05-25 13:38:01,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 7: [2023-05-25 13:38:01,884] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 17: [2023-05-25 13:38:01,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 22: [2023-05-25 13:38:01,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 19: [2023-05-25 13:38:01,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 23: [2023-05-25 13:38:01,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 7: [2023-05-25 13:38:01,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 2: [2023-05-25 13:38:01,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 17: [2023-05-25 13:38:01,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 
22: [2023-05-25 13:38:01,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 23: [2023-05-25 13:38:01,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 2: [2023-05-25 13:38:01,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,890] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 0: [2023-05-25 13:38:01,890] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 0: [2023-05-25 13:38:01,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 19: [2023-05-25 13:38:01,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 23: [2023-05-25 13:38:01,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 6: [2023-05-25 13:38:01,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 17: [2023-05-25 13:38:01,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
17: [2023-05-25 13:38:01,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 6: [2023-05-25 13:38:01,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_07-model_02-model_states.pt. 16: [2023-05-25 13:38:01,893] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 16: [2023-05-25 13:38:01,893] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 23: [2023-05-25 13:38:01,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 17: [2023-05-25 13:38:01,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 17: [2023-05-25 13:38:01,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt... 3: [2023-05-25 13:38:01,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 3: [2023-05-25 13:38:01,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 3: [2023-05-25 13:38:01,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 
9: [2023-05-25 13:38:01,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 3: [2023-05-25 13:38:01,898] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 2: [2023-05-25 13:38:01,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 3: [2023-05-25 13:38:01,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 3: [2023-05-25 13:38:01,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 2: [2023-05-25 13:38:01,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 6: [2023-05-25 13:38:01,900] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 9: [2023-05-25 13:38:01,901] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 6: [2023-05-25 13:38:01,903] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 11: [2023-05-25 13:38:01,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
7: [2023-05-25 13:38:01,904] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 3: [2023-05-25 13:38:01,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 6: [2023-05-25 13:38:01,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 7: [2023-05-25 13:38:01,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 11: [2023-05-25 13:38:01,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 9: [2023-05-25 13:38:01,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 3: [2023-05-25 13:38:01,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 6: [2023-05-25 13:38:01,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt... 6: [2023-05-25 13:38:01,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 11: [2023-05-25 13:38:01,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_00-model_states.pt. 
16: [2023-05-25 13:38:01,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 9: [2023-05-25 13:38:01,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt... 7: [2023-05-25 13:38:01,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 16: [2023-05-25 13:38:01,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 6: [2023-05-25 13:38:01,911] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 11: [2023-05-25 13:38:01,911] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt... 7: [2023-05-25 13:38:01,912] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 30: [2023-05-25 13:38:01,913] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 30: [2023-05-25 13:38:01,913] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 2: [2023-05-25 13:38:01,917] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 
0: [2023-05-25 13:38:01,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 0: [2023-05-25 13:38:01,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 2: [2023-05-25 13:38:01,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 2: [2023-05-25 13:38:01,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 30: [2023-05-25 13:38:01,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 2: [2023-05-25 13:38:01,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 0: [2023-05-25 13:38:01,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 0: [2023-05-25 13:38:01,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 21: [2023-05-25 13:38:01,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 21: [2023-05-25 13:38:01,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 
30: [2023-05-25 13:38:01,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:01,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 2: [2023-05-25 13:38:01,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 31: [2023-05-25 13:38:01,930] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:01,930] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 2: [2023-05-25 13:38:01,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 2: [2023-05-25 13:38:01,932] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 31: [2023-05-25 13:38:01,932] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 20: [2023-05-25 13:38:01,933] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 20: [2023-05-25 13:38:01,933] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_30-model_03-model_states.pt. 
2: [2023-05-25 13:38:01,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt... 6: [2023-05-25 13:38:01,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 6: [2023-05-25 13:38:01,937] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 16: [2023-05-25 13:38:01,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 21: [2023-05-25 13:38:01,939] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 21: [2023-05-25 13:38:01,939] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 6: [2023-05-25 13:38:01,940] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_00-model_states.pt. 16: [2023-05-25 13:38:01,940] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 16: [2023-05-25 13:38:01,942] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 6: [2023-05-25 13:38:01,943] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt... 
25: [2023-05-25 13:38:01,945] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 16: [2023-05-25 13:38:01,944] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 25: [2023-05-25 13:38:01,945] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:01,946] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 25: [2023-05-25 13:38:01,946] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 20: [2023-05-25 13:38:01,946] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 20: [2023-05-25 13:38:01,946] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt... 25: [2023-05-25 13:38:01,947] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:01,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 24: [2023-05-25 13:38:01,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 
27: [2023-05-25 13:38:01,951] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 28: [2023-05-25 13:38:01,952] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 28: [2023-05-25 13:38:01,952] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 25: [2023-05-25 13:38:01,953] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 25: [2023-05-25 13:38:01,953] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 24: [2023-05-25 13:38:01,957] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 24: [2023-05-25 13:38:01,957] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 24: [2023-05-25 13:38:01,958] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:01,958] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 29: [2023-05-25 13:38:01,958] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 
27: [2023-05-25 13:38:01,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:01,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:01,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 24: [2023-05-25 13:38:01,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 26: [2023-05-25 13:38:01,961] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 28: [2023-05-25 13:38:01,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:01,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:01,965] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:01,965] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 21: [2023-05-25 13:38:01,966] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
25: [2023-05-25 13:38:01,966] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:01,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:01,968] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 21: [2023-05-25 13:38:01,968] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 21: [2023-05-25 13:38:01,969] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 25: [2023-05-25 13:38:01,969] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 25: [2023-05-25 13:38:01,969] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 27: [2023-05-25 13:38:01,970] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 21: [2023-05-25 13:38:01,971] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 31: [2023-05-25 13:38:01,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 
29: [2023-05-25 13:38:01,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:01,973] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 26: [2023-05-25 13:38:01,976] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 26: [2023-05-25 13:38:01,976] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:01,976] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 30: [2023-05-25 13:38:01,977] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 30: [2023-05-25 13:38:01,977] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 30: [2023-05-25 13:38:01,977] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 20: [2023-05-25 13:38:01,978] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 20: [2023-05-25 13:38:01,978] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_00-model_states.pt. 
30: [2023-05-25 13:38:01,979] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 30: [2023-05-25 13:38:01,980] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 30: [2023-05-25 13:38:01,980] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:01,981] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 26: [2023-05-25 13:38:01,981] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 31: [2023-05-25 13:38:01,981] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 20: [2023-05-25 13:38:01,981] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 20: [2023-05-25 13:38:01,981] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt... 26: [2023-05-25 13:38:01,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 27: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 
25: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 27: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 27: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 25: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:01,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 27: [2023-05-25 13:38:01,986] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:01,986] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 25: [2023-05-25 13:38:01,986] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 
27: [2023-05-25 13:38:01,986] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:01,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 30: [2023-05-25 13:38:01,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 27: [2023-05-25 13:38:01,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 27: [2023-05-25 13:38:01,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 24: [2023-05-25 13:38:01,994] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 24: [2023-05-25 13:38:01,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 26: [2023-05-25 13:38:01,997] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 26: [2023-05-25 13:38:01,998] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 
28: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 24: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 24: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:01,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:02,000] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 24: [2023-05-25 13:38:02,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 
28: [2023-05-25 13:38:02,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:02,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:02,001] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:02,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 25: [2023-05-25 13:38:02,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 27: [2023-05-25 13:38:02,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:02,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,003] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 30: [2023-05-25 13:38:02,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 
25: [2023-05-25 13:38:02,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:02,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:02,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 24: [2023-05-25 13:38:02,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 27: [2023-05-25 13:38:02,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:02,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 27: [2023-05-25 13:38:02,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:02,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 
27: [2023-05-25 13:38:02,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 27: [2023-05-25 13:38:02,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 28: [2023-05-25 13:38:02,007] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 27: [2023-05-25 13:38:02,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 28: [2023-05-25 13:38:02,007] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 24: [2023-05-25 13:38:02,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:02,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 31: [2023-05-25 13:38:02,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 31: [2023-05-25 13:38:02,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 
31: [2023-05-25 13:38:02,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 24: [2023-05-25 13:38:02,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 24: [2023-05-25 13:38:02,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 26: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 29: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 29: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 15: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 29: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 
15: [2023-05-25 13:38:02,012] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 29: [2023-05-25 13:38:02,013] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 29: [2023-05-25 13:38:02,013] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_03-model_states.pt. 29: [2023-05-25 13:38:02,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 26: [2023-05-25 13:38:02,014] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:02,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 29: [2023-05-25 13:38:02,014] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 25: [2023-05-25 13:38:02,014] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 26: [2023-05-25 13:38:02,015] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,015] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 
30: [2023-05-25 13:38:02,015] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:02,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:02,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 26: [2023-05-25 13:38:02,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_01-model_states.pt. 29: [2023-05-25 13:38:02,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:02,019] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:02,019] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 29: [2023-05-25 13:38:02,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:02,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 
24: [2023-05-25 13:38:02,020] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 14: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 29: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 26: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 14: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 26: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 6: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 6: [2023-05-25 13:38:02,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 25: [2023-05-25 13:38:02,022] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 25: [2023-05-25 13:38:02,022] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 
28: [2023-05-25 13:38:02,023] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 27: [2023-05-25 13:38:02,023] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:02,023] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 30: [2023-05-25 13:38:02,023] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 31: [2023-05-25 13:38:02,023] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 16: [2023-05-25 13:38:02,024] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 26: [2023-05-25 13:38:02,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 26: [2023-05-25 13:38:02,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 30: [2023-05-25 13:38:02,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 
16: [2023-05-25 13:38:02,024] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 16: [2023-05-25 13:38:02,025] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 31: [2023-05-25 13:38:02,026] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 16: [2023-05-25 13:38:02,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 15: [2023-05-25 13:38:02,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 31: [2023-05-25 13:38:02,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,028] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,028] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 31: [2023-05-25 13:38:02,028] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:02,028] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 
15: [2023-05-25 13:38:02,028] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 19: [2023-05-25 13:38:02,028] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 19: [2023-05-25 13:38:02,029] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 29: [2023-05-25 13:38:02,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 29: [2023-05-25 13:38:02,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:02,030] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 29: [2023-05-25 13:38:02,030] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 19: [2023-05-25 13:38:02,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 19: [2023-05-25 13:38:02,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 10: [2023-05-25 13:38:02,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 
10: [2023-05-25 13:38:02,032] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 30: [2023-05-25 13:38:02,032] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:02,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:02,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:02,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 25: [2023-05-25 13:38:02,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 14: [2023-05-25 13:38:02,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 27: [2023-05-25 13:38:02,034] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 25: [2023-05-25 13:38:02,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 6: [2023-05-25 13:38:02,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
28: [2023-05-25 13:38:02,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 6: [2023-05-25 13:38:02,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 14: [2023-05-25 13:38:02,036] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 26: [2023-05-25 13:38:02,036] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:02,036] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:02,037] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 29: [2023-05-25 13:38:02,037] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 24: [2023-05-25 13:38:02,037] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 20: [2023-05-25 13:38:02,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 20: [2023-05-25 13:38:02,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 
30: [2023-05-25 13:38:02,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 27: [2023-05-25 13:38:02,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 27: [2023-05-25 13:38:02,039] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,040] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 26: [2023-05-25 13:38:02,040] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 24: [2023-05-25 13:38:02,040] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 11: [2023-05-25 13:38:02,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 28: [2023-05-25 13:38:02,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 11: [2023-05-25 13:38:02,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 8: [2023-05-25 13:38:02,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 
8: [2023-05-25 13:38:02,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 26: [2023-05-25 13:38:02,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 10: [2023-05-25 13:38:02,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 30: [2023-05-25 13:38:02,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 13: [2023-05-25 13:38:02,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 13: [2023-05-25 13:38:02,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 30: [2023-05-25 13:38:02,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 24: [2023-05-25 13:38:02,044] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 10: [2023-05-25 13:38:02,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 30: [2023-05-25 13:38:02,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 
30: [2023-05-25 13:38:02,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 22: [2023-05-25 13:38:02,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 27: [2023-05-25 13:38:02,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 27: [2023-05-25 13:38:02,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 22: [2023-05-25 13:38:02,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 24: [2023-05-25 13:38:02,046] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 30: [2023-05-25 13:38:02,046] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 22: [2023-05-25 13:38:02,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 9: [2023-05-25 13:38:02,047] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 9: [2023-05-25 13:38:02,047] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 
22: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 27: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 27: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 28: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 9: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 14: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 13: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 31: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 14: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 
30: [2023-05-25 13:38:02,048] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 31: [2023-05-25 13:38:02,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_42-model_02-model_states.pt. 13: [2023-05-25 13:38:02,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 9: [2023-05-25 13:38:02,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 20: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 30: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 29: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 21: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 21: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
10: [2023-05-25 13:38:02,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 10: [2023-05-25 13:38:02,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 29: [2023-05-25 13:38:02,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 20: [2023-05-25 13:38:02,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 13: [2023-05-25 13:38:02,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 13: [2023-05-25 13:38:02,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 27: [2023-05-25 13:38:02,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 21: [2023-05-25 13:38:02,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 16: [2023-05-25 13:38:02,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
11: [2023-05-25 13:38:02,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 28: [2023-05-25 13:38:02,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 11: [2023-05-25 13:38:02,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 21: [2023-05-25 13:38:02,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 8: [2023-05-25 13:38:02,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 8: [2023-05-25 13:38:02,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 27: [2023-05-25 13:38:02,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 12: [2023-05-25 13:38:02,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 28: [2023-05-25 13:38:02,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 29: [2023-05-25 13:38:02,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 
12: [2023-05-25 13:38:02,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 12: [2023-05-25 13:38:02,056] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 26: [2023-05-25 13:38:02,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 13: [2023-05-25 13:38:02,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:02,057] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_02-model_states.pt. 26: [2023-05-25 13:38:02,057] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 24: [2023-05-25 13:38:02,058] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 13: [2023-05-25 13:38:02,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 16: [2023-05-25 13:38:02,058] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
28: [2023-05-25 13:38:02,059] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 28: [2023-05-25 13:38:02,059] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 15: [2023-05-25 13:38:02,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 24: [2023-05-25 13:38:02,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 31: [2023-05-25 13:38:02,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 28: [2023-05-25 13:38:02,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 28: [2023-05-25 13:38:02,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 15: [2023-05-25 13:38:02,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 
31: [2023-05-25 13:38:02,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 15: [2023-05-25 13:38:02,062] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 30: [2023-05-25 13:38:02,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 14: [2023-05-25 13:38:02,062] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:02,062] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 14: [2023-05-25 13:38:02,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:02,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 15: [2023-05-25 13:38:02,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 31: [2023-05-25 13:38:02,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt... 24: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 
30: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 15: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 3: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 31: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 13: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 3: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 29: [2023-05-25 13:38:02,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 13: [2023-05-25 13:38:02,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 26: [2023-05-25 13:38:02,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 15: [2023-05-25 13:38:02,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 
24: [2023-05-25 13:38:02,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 9: [2023-05-25 13:38:02,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 29: [2023-05-25 13:38:02,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 9: [2023-05-25 13:38:02,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 19: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 31: [2023-05-25 13:38:02,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 31: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 31: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 18: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
29: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 26: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 10: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 19: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 18: [2023-05-25 13:38:02,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 26: [2023-05-25 13:38:02,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 16: [2023-05-25 13:38:02,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 18: [2023-05-25 13:38:02,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 12: [2023-05-25 13:38:02,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:02,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 
10: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 13: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 18: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 12: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 13: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 26: [2023-05-25 13:38:02,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 21: [2023-05-25 13:38:02,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 26: [2023-05-25 13:38:02,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 31: [2023-05-25 13:38:02,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 
31: [2023-05-25 13:38:02,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 31: [2023-05-25 13:38:02,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt... 29: [2023-05-25 13:38:02,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 9: [2023-05-25 13:38:02,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 9: [2023-05-25 13:38:02,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:02,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 29: [2023-05-25 13:38:02,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 29: [2023-05-25 13:38:02,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 16: [2023-05-25 13:38:02,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
4: [2023-05-25 13:38:02,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 4: [2023-05-25 13:38:02,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 17: [2023-05-25 13:38:02,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 17: [2023-05-25 13:38:02,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 5: [2023-05-25 13:38:02,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 5: [2023-05-25 13:38:02,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 5: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 5: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 29: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt... 4: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 
4: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 15: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 15: [2023-05-25 13:38:02,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 3: [2023-05-25 13:38:02,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 10: [2023-05-25 13:38:02,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 17: [2023-05-25 13:38:02,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 10: [2023-05-25 13:38:02,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 21: [2023-05-25 13:38:02,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 3: [2023-05-25 13:38:02,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 10: [2023-05-25 13:38:02,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
15: [2023-05-25 13:38:02,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 15: [2023-05-25 13:38:02,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 8: [2023-05-25 13:38:02,081] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 10: [2023-05-25 13:38:02,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 19: [2023-05-25 13:38:02,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 22: [2023-05-25 13:38:02,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 18: [2023-05-25 13:38:02,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 18: [2023-05-25 13:38:02,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 19: [2023-05-25 13:38:02,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
21: [2023-05-25 13:38:02,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 23: [2023-05-25 13:38:02,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 0: [2023-05-25 13:38:02,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 0: [2023-05-25 13:38:02,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 23: [2023-05-25 13:38:02,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 22: [2023-05-25 13:38:02,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 4: [2023-05-25 13:38:02,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 23: [2023-05-25 13:38:02,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 5: [2023-05-25 13:38:02,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 5: [2023-05-25 13:38:02,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
5: [2023-05-25 13:38:02,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 5: [2023-05-25 13:38:02,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 23: [2023-05-25 13:38:02,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 4: [2023-05-25 13:38:02,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:02,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 1: [2023-05-25 13:38:02,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 31: [2023-05-25 13:38:02,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 4: [2023-05-25 13:38:02,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 4: [2023-05-25 13:38:02,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 20: [2023-05-25 13:38:02,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
15: [2023-05-25 13:38:02,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 15: [2023-05-25 13:38:02,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 20: [2023-05-25 13:38:02,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 10: [2023-05-25 13:38:02,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 21: [2023-05-25 13:38:02,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 10: [2023-05-25 13:38:02,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 10: [2023-05-25 13:38:02,093] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 31: [2023-05-25 13:38:02,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 10: [2023-05-25 13:38:02,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 
20: [2023-05-25 13:38:02,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 10: [2023-05-25 13:38:02,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 11: [2023-05-25 13:38:02,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 11: [2023-05-25 13:38:02,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 8: [2023-05-25 13:38:02,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 8: [2023-05-25 13:38:02,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 20: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 31: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_00-model_states.pt. 15: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
18: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 20: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 20: [2023-05-25 13:38:02,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 23: [2023-05-25 13:38:02,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 20: [2023-05-25 13:38:02,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 20: [2023-05-25 13:38:02,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 31: [2023-05-25 13:38:02,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt... 23: [2023-05-25 13:38:02,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 22: [2023-05-25 13:38:02,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 0: [2023-05-25 13:38:02,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
18: [2023-05-25 13:38:02,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 10: [2023-05-25 13:38:02,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 1: [2023-05-25 13:38:02,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 0: [2023-05-25 13:38:02,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 22: [2023-05-25 13:38:02,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 1: [2023-05-25 13:38:02,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 3: [2023-05-25 13:38:02,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 18: [2023-05-25 13:38:02,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 18: [2023-05-25 13:38:02,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 3: [2023-05-25 13:38:02,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 
11: [2023-05-25 13:38:02,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 11: [2023-05-25 13:38:02,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 11: [2023-05-25 13:38:02,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 11: [2023-05-25 13:38:02,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 2: [2023-05-25 13:38:02,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 2: [2023-05-25 13:38:02,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 22: [2023-05-25 13:38:02,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 22: [2023-05-25 13:38:02,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 1: [2023-05-25 13:38:02,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
1: [2023-05-25 13:38:02,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 11: [2023-05-25 13:38:02,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 16: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 19: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 11: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 19: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 16: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 11: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 11: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 18: [2023-05-25 13:38:02,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
17: [2023-05-25 13:38:02,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 7: [2023-05-25 13:38:02,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 7: [2023-05-25 13:38:02,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_01-model_states.pt. 18: [2023-05-25 13:38:02,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 17: [2023-05-25 13:38:02,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 5: [2023-05-25 13:38:02,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 5: [2023-05-25 13:38:02,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 7: [2023-05-25 13:38:02,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 5: [2023-05-25 13:38:02,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
11: [2023-05-25 13:38:02,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 10: [2023-05-25 13:38:02,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 5: [2023-05-25 13:38:02,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 11: [2023-05-25 13:38:02,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 17: [2023-05-25 13:38:02,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 15: [2023-05-25 13:38:02,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 10: [2023-05-25 13:38:02,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 23: [2023-05-25 13:38:02,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 1: [2023-05-25 13:38:02,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 17: [2023-05-25 13:38:02,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 
15: [2023-05-25 13:38:02,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 17: [2023-05-25 13:38:02,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 17: [2023-05-25 13:38:02,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 1: [2023-05-25 13:38:02,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 18: [2023-05-25 13:38:02,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 23: [2023-05-25 13:38:02,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 21: [2023-05-25 13:38:02,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 21: [2023-05-25 13:38:02,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_02-model_states.pt. 15: [2023-05-25 13:38:02,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 3: [2023-05-25 13:38:02,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
22: [2023-05-25 13:38:02,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 18: [2023-05-25 13:38:02,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 15: [2023-05-25 13:38:02,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 3: [2023-05-25 13:38:02,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 2: [2023-05-25 13:38:02,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 22: [2023-05-25 13:38:02,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 23: [2023-05-25 13:38:02,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 19: [2023-05-25 13:38:02,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 16: [2023-05-25 13:38:02,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 19: [2023-05-25 13:38:02,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
2: [2023-05-25 13:38:02,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 18: [2023-05-25 13:38:02,120] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 8: [2023-05-25 13:38:02,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 20: [2023-05-25 13:38:02,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 8: [2023-05-25 13:38:02,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 8: [2023-05-25 13:38:02,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 16: [2023-05-25 13:38:02,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 18: [2023-05-25 13:38:02,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,122] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 
5: [2023-05-25 13:38:02,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 5: [2023-05-25 13:38:02,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 5: [2023-05-25 13:38:02,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 8: [2023-05-25 13:38:02,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 8: [2023-05-25 13:38:02,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 5: [2023-05-25 13:38:02,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 8: [2023-05-25 13:38:02,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 17: [2023-05-25 13:38:02,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
5: [2023-05-25 13:38:02,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 5: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 5: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 22: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 23: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 7: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 22: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 5: [2023-05-25 13:38:02,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 8: [2023-05-25 13:38:02,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 8: [2023-05-25 13:38:02,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
17: [2023-05-25 13:38:02,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 20: [2023-05-25 13:38:02,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 6: [2023-05-25 13:38:02,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 6: [2023-05-25 13:38:02,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 17: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 7: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 18: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 7: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 21: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 14: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
14: [2023-05-25 13:38:02,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 21: [2023-05-25 13:38:02,130] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 14: [2023-05-25 13:38:02,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 17: [2023-05-25 13:38:02,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 18: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 8: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 8: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 17: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 14: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
15: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 14: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 14: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 14: [2023-05-25 13:38:02,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 14: [2023-05-25 13:38:02,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 18: [2023-05-25 13:38:02,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 23: [2023-05-25 13:38:02,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 11: [2023-05-25 13:38:02,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 14: [2023-05-25 13:38:02,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 14: [2023-05-25 13:38:02,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 
15: [2023-05-25 13:38:02,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 15: [2023-05-25 13:38:02,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 20: [2023-05-25 13:38:02,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 14: [2023-05-25 13:38:02,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 14: [2023-05-25 13:38:02,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 5: [2023-05-25 13:38:02,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 0: [2023-05-25 13:38:02,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 12: [2023-05-25 13:38:02,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 18: [2023-05-25 13:38:02,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 
0: [2023-05-25 13:38:02,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 12: [2023-05-25 13:38:02,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:02,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 23: [2023-05-25 13:38:02,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 12: [2023-05-25 13:38:02,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 22: [2023-05-25 13:38:02,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 11: [2023-05-25 13:38:02,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
22: [2023-05-25 13:38:02,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 12: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 12: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 9: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 9: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 11: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
9: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 9: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 9: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 9: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 18: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 13: [2023-05-25 13:38:02,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
10: [2023-05-25 13:38:02,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 13: [2023-05-25 13:38:02,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 2: [2023-05-25 13:38:02,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 5: [2023-05-25 13:38:02,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 20: [2023-05-25 13:38:02,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 11: [2023-05-25 13:38:02,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 13: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 4: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 
4: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 12: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 6: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 2: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_03-model_states.pt. 4: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 12: [2023-05-25 13:38:02,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 13: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 13: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 10: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 
9: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 9: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 18: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 13: [2023-05-25 13:38:02,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 0: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 13: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 4: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 4: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 13: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 13: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 
19: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 19: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 9: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 4: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 9: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 13: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt... 8: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 4: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 9: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 4: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
8: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 0: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 0: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 14: [2023-05-25 13:38:02,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 6: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 0: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 14: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 9: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 17: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
17: [2023-05-25 13:38:02,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 0: [2023-05-25 13:38:02,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 23: [2023-05-25 13:38:02,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 19: [2023-05-25 13:38:02,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 19: [2023-05-25 13:38:02,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 22: [2023-05-25 13:38:02,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 15: [2023-05-25 13:38:02,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 19: [2023-05-25 13:38:02,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 19: [2023-05-25 13:38:02,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 0: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 
0: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 0: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 16: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 16: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 23: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 8: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 15: [2023-05-25 13:38:02,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 11: [2023-05-25 13:38:02,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 23: [2023-05-25 13:38:02,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 5: [2023-05-25 13:38:02,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
23: [2023-05-25 13:38:02,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 22: [2023-05-25 13:38:02,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 18: [2023-05-25 13:38:02,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 11: [2023-05-25 13:38:02,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 0: [2023-05-25 13:38:02,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 18: [2023-05-25 13:38:02,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 9: [2023-05-25 13:38:02,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 10: [2023-05-25 13:38:02,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 0: [2023-05-25 13:38:02,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 9: [2023-05-25 13:38:02,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_03-model_states.pt. 
23: [2023-05-25 13:38:02,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 11: [2023-05-25 13:38:02,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 11: [2023-05-25 13:38:02,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 5: [2023-05-25 13:38:02,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 22: [2023-05-25 13:38:02,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 7: [2023-05-25 13:38:02,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 10: [2023-05-25 13:38:02,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 23: [2023-05-25 13:38:02,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 
7: [2023-05-25 13:38:02,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 12: [2023-05-25 13:38:02,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 16: [2023-05-25 13:38:02,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 2: [2023-05-25 13:38:02,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 12: [2023-05-25 13:38:02,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 21: [2023-05-25 13:38:02,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 21: [2023-05-25 13:38:02,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 22: [2023-05-25 13:38:02,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 19: [2023-05-25 13:38:02,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 16: [2023-05-25 13:38:02,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 
7: [2023-05-25 13:38:02,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 8: [2023-05-25 13:38:02,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 7: [2023-05-25 13:38:02,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 7: [2023-05-25 13:38:02,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 7: [2023-05-25 13:38:02,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 2: [2023-05-25 13:38:02,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 19: [2023-05-25 13:38:02,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 16: [2023-05-25 13:38:02,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 16: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 16: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
8: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 3: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 14: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 3: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 17: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 17: [2023-05-25 13:38:02,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
7: [2023-05-25 13:38:02,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 14: [2023-05-25 13:38:02,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 3: [2023-05-25 13:38:02,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 21: [2023-05-25 13:38:02,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 3: [2023-05-25 13:38:02,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 3: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 23: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 17: [2023-05-25 13:38:02,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 16: [2023-05-25 13:38:02,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 14: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
17: [2023-05-25 13:38:02,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 16: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 16: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 7: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 17: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 3: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 3: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 3: [2023-05-25 13:38:02,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 22: [2023-05-25 13:38:02,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 8: [2023-05-25 13:38:02,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
19: [2023-05-25 13:38:02,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 8: [2023-05-25 13:38:02,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 23: [2023-05-25 13:38:02,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 21: [2023-05-25 13:38:02,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 19: [2023-05-25 13:38:02,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 14: [2023-05-25 13:38:02,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 17: [2023-05-25 13:38:02,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 22: [2023-05-25 13:38:02,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 22: [2023-05-25 13:38:02,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 13: [2023-05-25 13:38:02,169] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
21: [2023-05-25 13:38:02,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 22: [2023-05-25 13:38:02,170] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 7: [2023-05-25 13:38:02,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 0: [2023-05-25 13:38:02,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 0: [2023-05-25 13:38:02,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 9: [2023-05-25 13:38:02,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 21: [2023-05-25 13:38:02,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 19: [2023-05-25 13:38:02,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 16: [2023-05-25 13:38:02,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 21: [2023-05-25 13:38:02,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
19: [2023-05-25 13:38:02,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 9: [2023-05-25 13:38:02,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 12: [2023-05-25 13:38:02,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 7: [2023-05-25 13:38:02,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 22: [2023-05-25 13:38:02,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 22: [2023-05-25 13:38:02,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 21: [2023-05-25 13:38:02,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt... 13: [2023-05-25 13:38:02,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 16: [2023-05-25 13:38:02,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
4: [2023-05-25 13:38:02,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 4: [2023-05-25 13:38:02,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 8: [2023-05-25 13:38:02,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 14: [2023-05-25 13:38:02,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 12: [2023-05-25 13:38:02,178] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 14: [2023-05-25 13:38:02,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 12: [2023-05-25 13:38:02,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
0: [2023-05-25 13:38:02,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 2: [2023-05-25 13:38:02,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 2: [2023-05-25 13:38:02,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 12: [2023-05-25 13:38:02,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 3: [2023-05-25 13:38:02,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 13: [2023-05-25 13:38:02,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 2: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 2: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 2: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 10: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 
2: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 9: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 9: [2023-05-25 13:38:02,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 1: [2023-05-25 13:38:02,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 1: [2023-05-25 13:38:02,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 2: [2023-05-25 13:38:02,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 11: [2023-05-25 13:38:02,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 11: [2023-05-25 13:38:02,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_20-model_01-model_states.pt. 10: [2023-05-25 13:38:02,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 2: [2023-05-25 13:38:02,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 
2: [2023-05-25 13:38:02,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 1: [2023-05-25 13:38:02,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 12: [2023-05-25 13:38:02,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 1: [2023-05-25 13:38:02,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 1: [2023-05-25 13:38:02,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 2: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 2: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 8: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 12: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 10: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 
10: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 1: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 1: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 0: [2023-05-25 13:38:02,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 22: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 0: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 22: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 8: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 
2: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 8: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 1: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 4: [2023-05-25 13:38:02,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 4: [2023-05-25 13:38:02,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:02,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 4: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 4: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 8: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 
14: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 13: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 21: [2023-05-25 13:38:02,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 20: [2023-05-25 13:38:02,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 21: [2023-05-25 13:38:02,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 23: [2023-05-25 13:38:02,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 7: [2023-05-25 13:38:02,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 20: [2023-05-25 13:38:02,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_03-model_states.pt. 14: [2023-05-25 13:38:02,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 3: [2023-05-25 13:38:02,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 
0: [2023-05-25 13:38:02,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 14: [2023-05-25 13:38:02,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 19: [2023-05-25 13:38:02,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 6: [2023-05-25 13:38:02,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 6: [2023-05-25 13:38:02,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 6: [2023-05-25 13:38:02,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 6: [2023-05-25 13:38:02,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 6: [2023-05-25 13:38:02,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 19: [2023-05-25 13:38:02,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 6: [2023-05-25 13:38:02,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 
14: [2023-05-25 13:38:02,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 19: [2023-05-25 13:38:02,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 6: [2023-05-25 13:38:02,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 0: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 6: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 6: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 6: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 23: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 23: [2023-05-25 13:38:02,196] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 
6: [2023-05-25 13:38:02,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 19: [2023-05-25 13:38:02,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 0: [2023-05-25 13:38:02,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 6: [2023-05-25 13:38:02,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt... 7: [2023-05-25 13:38:02,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 11: [2023-05-25 13:38:02,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 11: [2023-05-25 13:38:02,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt... 23: [2023-05-25 13:38:02,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 7: [2023-05-25 13:38:02,199] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 7: [2023-05-25 13:38:02,199] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 
16: [2023-05-25 13:38:02,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 16: [2023-05-25 13:38:02,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 0: [2023-05-25 13:38:02,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 9: [2023-05-25 13:38:02,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 19: [2023-05-25 13:38:02,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 9: [2023-05-25 13:38:02,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 1: [2023-05-25 13:38:02,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 1: [2023-05-25 13:38:02,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 19: [2023-05-25 13:38:02,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 19: [2023-05-25 13:38:02,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
16: [2023-05-25 13:38:02,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 16: [2023-05-25 13:38:02,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 20: [2023-05-25 13:38:02,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 19: [2023-05-25 13:38:02,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 20: [2023-05-25 13:38:02,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 21: [2023-05-25 13:38:02,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 3: [2023-05-25 13:38:02,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 9: [2023-05-25 13:38:02,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 16: [2023-05-25 13:38:02,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 16: [2023-05-25 13:38:02,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
21: [2023-05-25 13:38:02,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 7: [2023-05-25 13:38:02,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 17: [2023-05-25 13:38:02,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 17: [2023-05-25 13:38:02,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 16: [2023-05-25 13:38:02,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 16: [2023-05-25 13:38:02,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 9: [2023-05-25 13:38:02,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 0: [2023-05-25 13:38:02,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 9: [2023-05-25 13:38:02,210] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 0: [2023-05-25 13:38:02,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 
25: [2023-05-25 13:38:02,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 22: [2023-05-25 13:38:02,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 22: [2023-05-25 13:38:02,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 9: [2023-05-25 13:38:02,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt... 0: [2023-05-25 13:38:02,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt... 25: [2023-05-25 13:38:02,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 4: [2023-05-25 13:38:02,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 25: [2023-05-25 13:38:02,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 7: [2023-05-25 13:38:02,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
7: [2023-05-25 13:38:02,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 22: [2023-05-25 13:38:02,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 22: [2023-05-25 13:38:02,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 2: [2023-05-25 13:38:02,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 2: [2023-05-25 13:38:02,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 4: [2023-05-25 13:38:02,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 25: [2023-05-25 13:38:02,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 1: [2023-05-25 13:38:02,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 1: [2023-05-25 13:38:02,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 
4: [2023-05-25 13:38:02,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 11: [2023-05-25 13:38:02,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 20: [2023-05-25 13:38:02,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 20: [2023-05-25 13:38:02,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 11: [2023-05-25 13:38:02,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 23: [2023-05-25 13:38:02,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 23: [2023-05-25 13:38:02,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 21: [2023-05-25 13:38:02,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 17: [2023-05-25 13:38:02,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 17: [2023-05-25 13:38:02,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
18: [2023-05-25 13:38:02,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 21: [2023-05-25 13:38:02,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 1: [2023-05-25 13:38:02,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 1: [2023-05-25 13:38:02,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 18: [2023-05-25 13:38:02,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_31-model_01-model_states.pt. 6: [2023-05-25 13:38:02,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 6: [2023-05-25 13:38:02,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 0: [2023-05-25 13:38:02,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 24: [2023-05-25 13:38:02,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 24: [2023-05-25 13:38:02,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
11: [2023-05-25 13:38:02,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_00-model_states.pt. 0: [2023-05-25 13:38:02,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 24: [2023-05-25 13:38:02,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 0: [2023-05-25 13:38:02,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 24: [2023-05-25 13:38:02,230] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,230] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 26: [2023-05-25 13:38:02,230] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 11: [2023-05-25 13:38:02,230] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt... 2: [2023-05-25 13:38:02,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 20: [2023-05-25 13:38:02,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
2: [2023-05-25 13:38:02,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 0: [2023-05-25 13:38:02,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 26: [2023-05-25 13:38:02,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 20: [2023-05-25 13:38:02,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 20: [2023-05-25 13:38:02,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 26: [2023-05-25 13:38:02,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 23: [2023-05-25 13:38:02,234] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 18: [2023-05-25 13:38:02,234] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 23: [2023-05-25 13:38:02,234] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 21: [2023-05-25 13:38:02,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 
21: [2023-05-25 13:38:02,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 18: [2023-05-25 13:38:02,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt... 1: [2023-05-25 13:38:02,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 6: [2023-05-25 13:38:02,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 20: [2023-05-25 13:38:02,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 6: [2023-05-25 13:38:02,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 1: [2023-05-25 13:38:02,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 20: [2023-05-25 13:38:02,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 20: [2023-05-25 13:38:02,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt... 5: [2023-05-25 13:38:02,245] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 
5: [2023-05-25 13:38:02,245] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 1: [2023-05-25 13:38:02,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 3: [2023-05-25 13:38:02,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 7: [2023-05-25 13:38:02,248] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 1: [2023-05-25 13:38:02,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 1: [2023-05-25 13:38:02,248] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 7: [2023-05-25 13:38:02,249] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 29: [2023-05-25 13:38:02,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
29: [2023-05-25 13:38:02,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 7: [2023-05-25 13:38:02,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 25: [2023-05-25 13:38:02,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 1: [2023-05-25 13:38:02,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 29: [2023-05-25 13:38:02,251] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,251] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 17: [2023-05-25 13:38:02,253] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 18: [2023-05-25 13:38:02,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 17: [2023-05-25 13:38:02,255] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 
20: [2023-05-25 13:38:02,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 23: [2023-05-25 13:38:02,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 18: [2023-05-25 13:38:02,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 20: [2023-05-25 13:38:02,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 23: [2023-05-25 13:38:02,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 17: [2023-05-25 13:38:02,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 5: [2023-05-25 13:38:02,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 5: [2023-05-25 13:38:02,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 17: [2023-05-25 13:38:02,260] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 26: [2023-05-25 13:38:02,260] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
2: [2023-05-25 13:38:02,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 27: [2023-05-25 13:38:02,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 23: [2023-05-25 13:38:02,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 27: [2023-05-25 13:38:02,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 2: [2023-05-25 13:38:02,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 3: [2023-05-25 13:38:02,261] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 28: [2023-05-25 13:38:02,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 28: [2023-05-25 13:38:02,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 31: [2023-05-25 13:38:02,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 20: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 
23: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 27: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 3: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 27: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,264] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 30: [2023-05-25 13:38:02,263] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 30: [2023-05-25 13:38:02,264] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
28: [2023-05-25 13:38:02,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 20: [2023-05-25 13:38:02,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 21: [2023-05-25 13:38:02,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 21: [2023-05-25 13:38:02,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 31: [2023-05-25 13:38:02,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 31: [2023-05-25 13:38:02,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 31: [2023-05-25 13:38:02,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 31: [2023-05-25 13:38:02,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 18: [2023-05-25 13:38:02,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_00-model_states.pt. 30: [2023-05-25 13:38:02,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 
30: [2023-05-25 13:38:02,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 21: [2023-05-25 13:38:02,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 21: [2023-05-25 13:38:02,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 26: [2023-05-25 13:38:02,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 18: [2023-05-25 13:38:02,269] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt... 24: [2023-05-25 13:38:02,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 24: [2023-05-25 13:38:02,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 6: [2023-05-25 13:38:02,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 6: [2023-05-25 13:38:02,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_08-model_02-model_states.pt. 26: [2023-05-25 13:38:02,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
2: [2023-05-25 13:38:02,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 2: [2023-05-25 13:38:02,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 31: [2023-05-25 13:38:02,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,281] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
26: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 26: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 29: [2023-05-25 13:38:02,283] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 29: [2023-05-25 13:38:02,283] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 
24: [2023-05-25 13:38:02,283] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 6: [2023-05-25 13:38:02,284] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 6: [2023-05-25 13:38:02,287] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt... 5: [2023-05-25 13:38:02,288] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 24: [2023-05-25 13:38:02,290] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,290] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,290] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,290] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,290] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,290] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
3: [2023-05-25 13:38:02,291] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 5: [2023-05-25 13:38:02,291] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 25: [2023-05-25 13:38:02,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_12_optim_states.pt... 25: [2023-05-25 13:38:02,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_12_optim_states.pt... 29: [2023-05-25 13:38:02,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 3: [2023-05-25 13:38:02,294] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 29: [2023-05-25 13:38:02,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 5: [2023-05-25 13:38:02,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 3: [2023-05-25 13:38:02,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 5: [2023-05-25 13:38:02,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 
29: [2023-05-25 13:38:02,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_12_optim_states.pt... 26: [2023-05-25 13:38:02,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_12_optim_states.pt... 3: [2023-05-25 13:38:02,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 29: [2023-05-25 13:38:02,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,299] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 28: [2023-05-25 13:38:02,299] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 24: [2023-05-25 13:38:02,299] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_12_optim_states.pt... 24: [2023-05-25 13:38:02,299] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_12_optim_states.pt... 
26: [2023-05-25 13:38:02,300] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 2: [2023-05-25 13:38:02,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 26: [2023-05-25 13:38:02,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 24: [2023-05-25 13:38:02,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 24: [2023-05-25 13:38:02,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 30: [2023-05-25 13:38:02,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 30: [2023-05-25 13:38:02,304] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,304] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 25: [2023-05-25 13:38:02,304] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 29: [2023-05-25 13:38:02,305] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
2: [2023-05-25 13:38:02,305] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 31: [2023-05-25 13:38:02,306] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 28: [2023-05-25 13:38:02,306] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 28: [2023-05-25 13:38:02,306] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 28: [2023-05-25 13:38:02,306] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 28: [2023-05-25 13:38:02,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 27: [2023-05-25 13:38:02,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 27: [2023-05-25 13:38:02,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 27: [2023-05-25 13:38:02,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 24: [2023-05-25 13:38:02,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 
2: [2023-05-25 13:38:02,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 29: [2023-05-25 13:38:02,308] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,308] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 27: [2023-05-25 13:38:02,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 2: [2023-05-25 13:38:02,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 31: [2023-05-25 13:38:02,310] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 28: [2023-05-25 13:38:02,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 28: [2023-05-25 13:38:02,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 31: [2023-05-25 13:38:02,312] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 6: [2023-05-25 13:38:02,313] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 
6: [2023-05-25 13:38:02,313] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_00-model_states.pt. 26: [2023-05-25 13:38:02,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 6: [2023-05-25 13:38:02,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 28: [2023-05-25 13:38:02,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 28: [2023-05-25 13:38:02,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,314] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 6: [2023-05-25 13:38:02,315] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt... 31: [2023-05-25 13:38:02,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 31: [2023-05-25 13:38:02,314] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 24: [2023-05-25 13:38:02,316] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 
31: [2023-05-25 13:38:02,316] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 30: [2023-05-25 13:38:02,317] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 24: [2023-05-25 13:38:02,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 30: [2023-05-25 13:38:02,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,317] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 20: [2023-05-25 13:38:02,318] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 29: [2023-05-25 13:38:02,318] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 29: [2023-05-25 13:38:02,319] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,319] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,319] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 
29: [2023-05-25 13:38:02,319] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 29: [2023-05-25 13:38:02,319] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 27: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 30: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 15: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 26: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 15: [2023-05-25 13:38:02,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 29: [2023-05-25 13:38:02,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
20: [2023-05-25 13:38:02,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 15: [2023-05-25 13:38:02,321] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 28: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 27: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 30: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 27: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 28: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
24: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 30: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 15: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 28: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 30: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 30: [2023-05-25 13:38:02,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 28: [2023-05-25 13:38:02,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
28: [2023-05-25 13:38:02,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 30: [2023-05-25 13:38:02,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 24: [2023-05-25 13:38:02,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,324] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,324] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 26: [2023-05-25 13:38:02,324] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 31: [2023-05-25 13:38:02,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,325] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,326] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
27: [2023-05-25 13:38:02,326] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 31: [2023-05-25 13:38:02,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,326] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,328] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 28: [2023-05-25 13:38:02,328] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_12_optim_states.pt... 29: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_12_optim_states.pt... 31: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 
27: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 27: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 29: [2023-05-25 13:38:02,329] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 29: [2023-05-25 13:38:02,330] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 27: [2023-05-25 13:38:02,330] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 27: [2023-05-25 13:38:02,330] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 27: [2023-05-25 13:38:02,331] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 20: [2023-05-25 13:38:02,331] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 27: [2023-05-25 13:38:02,331] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
25: [2023-05-25 13:38:02,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 25: [2023-05-25 13:38:02,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 28: [2023-05-25 13:38:02,332] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_12_optim_states.pt... 28: [2023-05-25 13:38:02,332] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_12_optim_states.pt... 30: [2023-05-25 13:38:02,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 30: [2023-05-25 13:38:02,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 31: [2023-05-25 13:38:02,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_12_optim_states.pt... 31: [2023-05-25 13:38:02,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_12_optim_states.pt... 29: [2023-05-25 13:38:02,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
20: [2023-05-25 13:38:02,334] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 30: [2023-05-25 13:38:02,334] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 22: [2023-05-25 13:38:02,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 4: [2023-05-25 13:38:02,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 4: [2023-05-25 13:38:02,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 29: [2023-05-25 13:38:02,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 22: [2023-05-25 13:38:02,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 30: [2023-05-25 13:38:02,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 29: [2023-05-25 13:38:02,337] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 
30: [2023-05-25 13:38:02,337] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 30: [2023-05-25 13:38:02,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_12_optim_states.pt... 30: [2023-05-25 13:38:02,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_12_optim_states.pt... 27: [2023-05-25 13:38:02,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_12_optim_states.pt... 27: [2023-05-25 13:38:02,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_12_optim_states.pt... 31: [2023-05-25 13:38:02,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 26: [2023-05-25 13:38:02,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 27: [2023-05-25 13:38:02,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 
27: [2023-05-25 13:38:02,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_01-model_states.pt. 30: [2023-05-25 13:38:02,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 31: [2023-05-25 13:38:02,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_03-model_states.pt. 11: [2023-05-25 13:38:02,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 11: [2023-05-25 13:38:02,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 31: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 24: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 26: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 
25: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 26: [2023-05-25 13:38:02,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 11: [2023-05-25 13:38:02,345] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 31: [2023-05-25 13:38:02,345] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,345] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,345] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,345] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 11: [2023-05-25 13:38:02,345] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 27: [2023-05-25 13:38:02,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 24: [2023-05-25 13:38:02,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
27: [2023-05-25 13:38:02,346] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 24: [2023-05-25 13:38:02,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 24: [2023-05-25 13:38:02,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 24: [2023-05-25 13:38:02,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_43-model_02-model_states.pt. 22: [2023-05-25 13:38:02,348] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 24: [2023-05-25 13:38:02,348] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 24: [2023-05-25 13:38:02,348] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 30: [2023-05-25 13:38:02,349] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 29: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 
31: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 24: [2023-05-25 13:38:02,350] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 30: [2023-05-25 13:38:02,350] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 1: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 18: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 4: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 26: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 1: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 
4: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 15: [2023-05-25 13:38:02,351] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 10: [2023-05-25 13:38:02,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 22: [2023-05-25 13:38:02,352] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 18: [2023-05-25 13:38:02,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 10: [2023-05-25 13:38:02,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 10: [2023-05-25 13:38:02,353] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 19: [2023-05-25 13:38:02,353] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 19: [2023-05-25 13:38:02,353] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 28: [2023-05-25 13:38:02,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
28: [2023-05-25 13:38:02,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 27: [2023-05-25 13:38:02,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 25: [2023-05-25 13:38:02,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 10: [2023-05-25 13:38:02,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 26: [2023-05-25 13:38:02,355] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 27: [2023-05-25 13:38:02,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 17: [2023-05-25 13:38:02,355] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 17: [2023-05-25 13:38:02,355] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 28: [2023-05-25 13:38:02,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 
28: [2023-05-25 13:38:02,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 30: [2023-05-25 13:38:02,357] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 31: [2023-05-25 13:38:02,357] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 27: [2023-05-25 13:38:02,358] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 15: [2023-05-25 13:38:02,358] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 26: [2023-05-25 13:38:02,358] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 24: [2023-05-25 13:38:02,358] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 26: [2023-05-25 13:38:02,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 14: [2023-05-25 13:38:02,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
14: [2023-05-25 13:38:02,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 28: [2023-05-25 13:38:02,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 28: [2023-05-25 13:38:02,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 25: [2023-05-25 13:38:02,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 24: [2023-05-25 13:38:02,360] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 26: [2023-05-25 13:38:02,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 14: [2023-05-25 13:38:02,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 24: [2023-05-25 13:38:02,361] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 28: [2023-05-25 13:38:02,361] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
24: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 15: [2023-05-25 13:38:02,361] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 15: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 14: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 27: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 13: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 28: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 13: [2023-05-25 13:38:02,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 28: [2023-05-25 13:38:02,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 28: [2023-05-25 13:38:02,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 
27: [2023-05-25 13:38:02,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 27: [2023-05-25 13:38:02,363] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 13: [2023-05-25 13:38:02,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 13: [2023-05-25 13:38:02,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 28: [2023-05-25 13:38:02,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 1: [2023-05-25 13:38:02,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 28: [2023-05-25 13:38:02,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 27: [2023-05-25 13:38:02,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt... 27: [2023-05-25 13:38:02,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 18: [2023-05-25 13:38:02,365] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
26: [2023-05-25 13:38:02,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 26: [2023-05-25 13:38:02,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 1: [2023-05-25 13:38:02,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 18: [2023-05-25 13:38:02,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 19: [2023-05-25 13:38:02,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,367] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 19: [2023-05-25 13:38:02,368] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 15: [2023-05-25 13:38:02,368] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
18: [2023-05-25 13:38:02,368] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 18: [2023-05-25 13:38:02,368] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,369] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,369] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,369] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 9: [2023-05-25 13:38:02,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 9: [2023-05-25 13:38:02,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 26: [2023-05-25 13:38:02,370] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 26: [2023-05-25 13:38:02,370] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 8: [2023-05-25 13:38:02,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
30: [2023-05-25 13:38:02,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 15: [2023-05-25 13:38:02,371] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 31: [2023-05-25 13:38:02,372] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 8: [2023-05-25 13:38:02,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 23: [2023-05-25 13:38:02,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 30: [2023-05-25 13:38:02,373] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 23: [2023-05-25 13:38:02,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 31: [2023-05-25 13:38:02,373] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 23: [2023-05-25 13:38:02,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 23: [2023-05-25 13:38:02,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
29: [2023-05-25 13:38:02,374] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,375] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 21: [2023-05-25 13:38:02,375] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 21: [2023-05-25 13:38:02,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 15: [2023-05-25 13:38:02,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 21: [2023-05-25 13:38:02,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 29: [2023-05-25 13:38:02,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 21: [2023-05-25 13:38:02,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 30: [2023-05-25 13:38:02,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,379] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 
31: [2023-05-25 13:38:02,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 15: [2023-05-25 13:38:02,380] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 31: [2023-05-25 13:38:02,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 10: [2023-05-25 13:38:02,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 30: [2023-05-25 13:38:02,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 10: [2023-05-25 13:38:02,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 30: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 12: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 31: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 12: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
29: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 29: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 6: [2023-05-25 13:38:02,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 29: [2023-05-25 13:38:02,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 31: [2023-05-25 13:38:02,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 6: [2023-05-25 13:38:02,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 12: [2023-05-25 13:38:02,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 10: [2023-05-25 13:38:02,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 26: [2023-05-25 13:38:02,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 
30: [2023-05-25 13:38:02,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 11: [2023-05-25 13:38:02,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 11: [2023-05-25 13:38:02,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 12: [2023-05-25 13:38:02,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 29: [2023-05-25 13:38:02,384] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 9: [2023-05-25 13:38:02,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 25: [2023-05-25 13:38:02,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 9: [2023-05-25 13:38:02,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 13: [2023-05-25 13:38:02,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 8: [2023-05-25 13:38:02,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 
13: [2023-05-25 13:38:02,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 8: [2023-05-25 13:38:02,386] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 26: [2023-05-25 13:38:02,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 29: [2023-05-25 13:38:02,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 29: [2023-05-25 13:38:02,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 30: [2023-05-25 13:38:02,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 27: [2023-05-25 13:38:02,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 22: [2023-05-25 13:38:02,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 22: [2023-05-25 13:38:02,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 22: [2023-05-25 13:38:02,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
27: [2023-05-25 13:38:02,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 10: [2023-05-25 13:38:02,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 22: [2023-05-25 13:38:02,389] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 27: [2023-05-25 13:38:02,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 31: [2023-05-25 13:38:02,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 23: [2023-05-25 13:38:02,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 30: [2023-05-25 13:38:02,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 22: [2023-05-25 13:38:02,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 30: [2023-05-25 13:38:02,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 27: [2023-05-25 13:38:02,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt... 
22: [2023-05-25 13:38:02,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 26: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 23: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 31: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 24: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 24: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 16: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 7: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 16: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 22: [2023-05-25 13:38:02,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
30: [2023-05-25 13:38:02,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 25: [2023-05-25 13:38:02,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 25: [2023-05-25 13:38:02,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 14: [2023-05-25 13:38:02,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 7: [2023-05-25 13:38:02,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 14: [2023-05-25 13:38:02,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 26: [2023-05-25 13:38:02,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 16: [2023-05-25 13:38:02,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 16: [2023-05-25 13:38:02,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 22: [2023-05-25 13:38:02,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 
30: [2023-05-25 13:38:02,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 10: [2023-05-25 13:38:02,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 2: [2023-05-25 13:38:02,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 2: [2023-05-25 13:38:02,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 11: [2023-05-25 13:38:02,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 30: [2023-05-25 13:38:02,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt... 6: [2023-05-25 13:38:02,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 25: [2023-05-25 13:38:02,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 25: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 10: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
20: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 10: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 11: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 20: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 14: [2023-05-25 13:38:02,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
6: [2023-05-25 13:38:02,399] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 14: [2023-05-25 13:38:02,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 5: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 8: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 5: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 8: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 20: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 27: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 18: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
13: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 18: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 18: [2023-05-25 13:38:02,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 20: [2023-05-25 13:38:02,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 20: [2023-05-25 13:38:02,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 13: [2023-05-25 13:38:02,401] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 16: [2023-05-25 13:38:02,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 27: [2023-05-25 13:38:02,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_00-model_states.pt. 16: [2023-05-25 13:38:02,402] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 10: [2023-05-25 13:38:02,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
13: [2023-05-25 13:38:02,402] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 24: [2023-05-25 13:38:02,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 24: [2023-05-25 13:38:02,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 27: [2023-05-25 13:38:02,402] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 23: [2023-05-25 13:38:02,403] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 9: [2023-05-25 13:38:02,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 9: [2023-05-25 13:38:02,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 27: [2023-05-25 13:38:02,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt... 12: [2023-05-25 13:38:02,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 12: [2023-05-25 13:38:02,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 
3: [2023-05-25 13:38:02,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 3: [2023-05-25 13:38:02,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 23: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 0: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 0: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_03-model_states.pt. 13: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 15: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 14: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 15: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 13: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
13: [2023-05-25 13:38:02,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 14: [2023-05-25 13:38:02,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 9: [2023-05-25 13:38:02,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 9: [2023-05-25 13:38:02,407] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 14: [2023-05-25 13:38:02,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 15: [2023-05-25 13:38:02,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 15: [2023-05-25 13:38:02,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 2: [2023-05-25 13:38:02,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 15: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 21: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 
15: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 19: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 19: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 19: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 7: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 19: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 14: [2023-05-25 13:38:02,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 7: [2023-05-25 13:38:02,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 18: [2023-05-25 13:38:02,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 2: [2023-05-25 13:38:02,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
22: [2023-05-25 13:38:02,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 15: [2023-05-25 13:38:02,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 18: [2023-05-25 13:38:02,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 19: [2023-05-25 13:38:02,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 19: [2023-05-25 13:38:02,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 15: [2023-05-25 13:38:02,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 19: [2023-05-25 13:38:02,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 19: [2023-05-25 13:38:02,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 21: [2023-05-25 13:38:02,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_02-model_states.pt. 21: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
16: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 12: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 23: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 21: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 9: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 23: [2023-05-25 13:38:02,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 5: [2023-05-25 13:38:02,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 5: [2023-05-25 13:38:02,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 14: [2023-05-25 13:38:02,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 16: [2023-05-25 13:38:02,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
13: [2023-05-25 13:38:02,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 14: [2023-05-25 13:38:02,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 12: [2023-05-25 13:38:02,416] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 9: [2023-05-25 13:38:02,417] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 12: [2023-05-25 13:38:02,417] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 8: [2023-05-25 13:38:02,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 8: [2023-05-25 13:38:02,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 12: [2023-05-25 13:38:02,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 0: [2023-05-25 13:38:02,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 0: [2023-05-25 13:38:02,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
3: [2023-05-25 13:38:02,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 14: [2023-05-25 13:38:02,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 3: [2023-05-25 13:38:02,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 9: [2023-05-25 13:38:02,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 15: [2023-05-25 13:38:02,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 13: [2023-05-25 13:38:02,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 15: [2023-05-25 13:38:02,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 22: [2023-05-25 13:38:02,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 13: [2023-05-25 13:38:02,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 
21: [2023-05-25 13:38:02,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 14: [2023-05-25 13:38:02,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 22: [2023-05-25 13:38:02,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 9: [2023-05-25 13:38:02,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 15: [2023-05-25 13:38:02,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 23: [2023-05-25 13:38:02,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 15: [2023-05-25 13:38:02,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 12: [2023-05-25 13:38:02,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 12: [2023-05-25 13:38:02,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 23: [2023-05-25 13:38:02,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
21: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 21: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 21: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 11: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 11: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 10: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 8: [2023-05-25 13:38:02,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,431] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 8: [2023-05-25 13:38:02,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 8: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
13: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 17: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 9: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 21: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 17: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 21: [2023-05-25 13:38:02,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 23: [2023-05-25 13:38:02,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 10: [2023-05-25 13:38:02,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 10: [2023-05-25 13:38:02,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 16: [2023-05-25 13:38:02,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
22: [2023-05-25 13:38:02,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 8: [2023-05-25 13:38:02,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 16: [2023-05-25 13:38:02,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 10: [2023-05-25 13:38:02,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 20: [2023-05-25 13:38:02,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 23: [2023-05-25 13:38:02,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 13: [2023-05-25 13:38:02,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 23: [2023-05-25 13:38:02,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 14: [2023-05-25 13:38:02,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
13: [2023-05-25 13:38:02,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 8: [2023-05-25 13:38:02,440] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 23: [2023-05-25 13:38:02,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 14: [2023-05-25 13:38:02,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 19: [2023-05-25 13:38:02,441] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 8: [2023-05-25 13:38:02,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 16: [2023-05-25 13:38:02,442] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 16: [2023-05-25 13:38:02,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 14: [2023-05-25 13:38:02,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 16: [2023-05-25 13:38:02,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 
19: [2023-05-25 13:38:02,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 11: [2023-05-25 13:38:02,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 16: [2023-05-25 13:38:02,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 14: [2023-05-25 13:38:02,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 11: [2023-05-25 13:38:02,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 21: [2023-05-25 13:38:02,446] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 16: [2023-05-25 13:38:02,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 17: [2023-05-25 13:38:02,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 21: [2023-05-25 13:38:02,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
9: [2023-05-25 13:38:02,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 9: [2023-05-25 13:38:02,448] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 20: [2023-05-25 13:38:02,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 17: [2023-05-25 13:38:02,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 17: [2023-05-25 13:38:02,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 17: [2023-05-25 13:38:02,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 17: [2023-05-25 13:38:02,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 16: [2023-05-25 13:38:02,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 12: [2023-05-25 13:38:02,450] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
12: [2023-05-25 13:38:02,450] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 14: [2023-05-25 13:38:02,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 17: [2023-05-25 13:38:02,451] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 10: [2023-05-25 13:38:02,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 17: [2023-05-25 13:38:02,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 10: [2023-05-25 13:38:02,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 17: [2023-05-25 13:38:02,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 12: [2023-05-25 13:38:02,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 12: [2023-05-25 13:38:02,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 
14: [2023-05-25 13:38:02,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 13: [2023-05-25 13:38:02,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 15: [2023-05-25 13:38:02,454] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 19: [2023-05-25 13:38:02,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 14: [2023-05-25 13:38:02,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 15: [2023-05-25 13:38:02,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 14: [2023-05-25 13:38:02,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 13: [2023-05-25 13:38:02,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 15: [2023-05-25 13:38:02,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 21: [2023-05-25 13:38:02,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
21: [2023-05-25 13:38:02,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 8: [2023-05-25 13:38:02,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 8: [2023-05-25 13:38:02,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 13: [2023-05-25 13:38:02,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 15: [2023-05-25 13:38:02,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 19: [2023-05-25 13:38:02,462] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 15: [2023-05-25 13:38:02,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 15: [2023-05-25 13:38:02,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 9: [2023-05-25 13:38:02,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 21: [2023-05-25 13:38:02,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 
21: [2023-05-25 13:38:02,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt... 13: [2023-05-25 13:38:02,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 15: [2023-05-25 13:38:02,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 10: [2023-05-25 13:38:02,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 10: [2023-05-25 13:38:02,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 15: [2023-05-25 13:38:02,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 10: [2023-05-25 13:38:02,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 18: [2023-05-25 13:38:02,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 10: [2023-05-25 13:38:02,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 18: [2023-05-25 13:38:02,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 
14: [2023-05-25 13:38:02,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 14: [2023-05-25 13:38:02,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 8: [2023-05-25 13:38:02,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 0: [2023-05-25 13:38:02,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 22: [2023-05-25 13:38:02,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 8: [2023-05-25 13:38:02,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 0: [2023-05-25 13:38:02,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 0: [2023-05-25 13:38:02,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 0: [2023-05-25 13:38:02,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 11: [2023-05-25 13:38:02,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 
11: [2023-05-25 13:38:02,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 12: [2023-05-25 13:38:02,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 23: [2023-05-25 13:38:02,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 23: [2023-05-25 13:38:02,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 12: [2023-05-25 13:38:02,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 8: [2023-05-25 13:38:02,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 22: [2023-05-25 13:38:02,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 21: [2023-05-25 13:38:02,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 11: [2023-05-25 13:38:02,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 5: [2023-05-25 13:38:02,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 
11: [2023-05-25 13:38:02,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_01-model_states.pt. 5: [2023-05-25 13:38:02,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 8: [2023-05-25 13:38:02,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 11: [2023-05-25 13:38:02,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 4: [2023-05-25 13:38:02,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 17: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 4: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 13: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 21: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 0: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 
0: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 13: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 0: [2023-05-25 13:38:02,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 17: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 5: [2023-05-25 13:38:02,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 5: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 5: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 6: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 0: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 21: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
18: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 6: [2023-05-25 13:38:02,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 5: [2023-05-25 13:38:02,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 18: [2023-05-25 13:38:02,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 14: [2023-05-25 13:38:02,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 5: [2023-05-25 13:38:02,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 21: [2023-05-25 13:38:02,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 5: [2023-05-25 13:38:02,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 5: [2023-05-25 13:38:02,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 14: [2023-05-25 13:38:02,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 
10: [2023-05-25 13:38:02,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 10: [2023-05-25 13:38:02,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 5: [2023-05-25 13:38:02,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 11: [2023-05-25 13:38:02,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 8: [2023-05-25 13:38:02,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 19: [2023-05-25 13:38:02,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 17: [2023-05-25 13:38:02,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 12: [2023-05-25 13:38:02,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 22: [2023-05-25 13:38:02,486] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
17: [2023-05-25 13:38:02,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 23: [2023-05-25 13:38:02,486] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,486] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 19: [2023-05-25 13:38:02,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 12: [2023-05-25 13:38:02,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 12: [2023-05-25 13:38:02,487] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 17: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 22: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 11: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 7: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 
23: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 12: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 7: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 20: [2023-05-25 13:38:02,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 20: [2023-05-25 13:38:02,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 11: [2023-05-25 13:38:02,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 11: [2023-05-25 13:38:02,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 5: [2023-05-25 13:38:02,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 5: [2023-05-25 13:38:02,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 4: [2023-05-25 13:38:02,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
11: [2023-05-25 13:38:02,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 4: [2023-05-25 13:38:02,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 11: [2023-05-25 13:38:02,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 23: [2023-05-25 13:38:02,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 11: [2023-05-25 13:38:02,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 23: [2023-05-25 13:38:02,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 13: [2023-05-25 13:38:02,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 16: [2023-05-25 13:38:02,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 13: [2023-05-25 13:38:02,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 16: [2023-05-25 13:38:02,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 
18: [2023-05-25 13:38:02,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 18: [2023-05-25 13:38:02,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 0: [2023-05-25 13:38:02,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 0: [2023-05-25 13:38:02,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 0: [2023-05-25 13:38:02,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 6: [2023-05-25 13:38:02,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 6: [2023-05-25 13:38:02,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 3: [2023-05-25 13:38:02,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,496] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
3: [2023-05-25 13:38:02,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 1: [2023-05-25 13:38:02,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 1: [2023-05-25 13:38:02,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 22: [2023-05-25 13:38:02,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 22: [2023-05-25 13:38:02,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 3: [2023-05-25 13:38:02,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 12: [2023-05-25 13:38:02,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 19: [2023-05-25 13:38:02,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 17: [2023-05-25 13:38:02,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
3: [2023-05-25 13:38:02,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 3: [2023-05-25 13:38:02,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 17: [2023-05-25 13:38:02,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 19: [2023-05-25 13:38:02,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 12: [2023-05-25 13:38:02,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 20: [2023-05-25 13:38:02,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 7: [2023-05-25 13:38:02,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 20: [2023-05-25 13:38:02,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 9: [2023-05-25 13:38:02,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 9: [2023-05-25 13:38:02,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_02-model_states.pt. 
10: [2023-05-25 13:38:02,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 10: [2023-05-25 13:38:02,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 5: [2023-05-25 13:38:02,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 7: [2023-05-25 13:38:02,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 19: [2023-05-25 13:38:02,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 9: [2023-05-25 13:38:02,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 19: [2023-05-25 13:38:02,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 16: [2023-05-25 13:38:02,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 9: [2023-05-25 13:38:02,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_21-model_03-model_states.pt. 16: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
18: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 18: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 23: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 10: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 10: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 18: [2023-05-25 13:38:02,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 7: [2023-05-25 13:38:02,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 7: [2023-05-25 13:38:02,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 7: [2023-05-25 13:38:02,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 7: [2023-05-25 13:38:02,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
0: [2023-05-25 13:38:02,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 14: [2023-05-25 13:38:02,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 0: [2023-05-25 13:38:02,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 0: [2023-05-25 13:38:02,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 8: [2023-05-25 13:38:02,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 18: [2023-05-25 13:38:02,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 23: [2023-05-25 13:38:02,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 8: [2023-05-25 13:38:02,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 8: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 7: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 
7: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 7: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 7: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 14: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 14: [2023-05-25 13:38:02,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 22: [2023-05-25 13:38:02,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 23: [2023-05-25 13:38:02,512] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 1: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 1: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 16: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 
3: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 3: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 8: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 1: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 14: [2023-05-25 13:38:02,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 1: [2023-05-25 13:38:02,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 1: [2023-05-25 13:38:02,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 1: [2023-05-25 13:38:02,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 16: [2023-05-25 13:38:02,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 23: [2023-05-25 13:38:02,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 
18: [2023-05-25 13:38:02,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 22: [2023-05-25 13:38:02,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 22: [2023-05-25 13:38:02,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 1: [2023-05-25 13:38:02,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 0: [2023-05-25 13:38:02,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 10: [2023-05-25 13:38:02,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 1: [2023-05-25 13:38:02,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 10: [2023-05-25 13:38:02,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 21: [2023-05-25 13:38:02,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 1: [2023-05-25 13:38:02,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
21: [2023-05-25 13:38:02,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 1: [2023-05-25 13:38:02,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 12: [2023-05-25 13:38:02,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 18: [2023-05-25 13:38:02,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 5: [2023-05-25 13:38:02,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 9: [2023-05-25 13:38:02,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 10: [2023-05-25 13:38:02,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 10: [2023-05-25 13:38:02,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 12: [2023-05-25 13:38:02,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 8: [2023-05-25 13:38:02,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
8: [2023-05-25 13:38:02,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 22: [2023-05-25 13:38:02,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 5: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 19: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 9: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 19: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 11: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 11: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 6: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 6: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
6: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 6: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 23: [2023-05-25 13:38:02,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 4: [2023-05-25 13:38:02,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 4: [2023-05-25 13:38:02,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 4: [2023-05-25 13:38:02,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 4: [2023-05-25 13:38:02,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 4: [2023-05-25 13:38:02,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 4: [2023-05-25 13:38:02,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 12: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 
8: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 9: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 6: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 6: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 6: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 9: [2023-05-25 13:38:02,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt... 6: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 23: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 5: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 5: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
13: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 13: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 11: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 11: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 11: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 4: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 4: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 4: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 4: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 11: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 
4: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 2: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 8: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 2: [2023-05-25 13:38:02,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_01-model_states.pt. 4: [2023-05-25 13:38:02,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 12: [2023-05-25 13:38:02,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 11: [2023-05-25 13:38:02,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 13: [2023-05-25 13:38:02,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 13: [2023-05-25 13:38:02,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 11: [2023-05-25 13:38:02,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt... 
6: [2023-05-25 13:38:02,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 20: [2023-05-25 13:38:02,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 5: [2023-05-25 13:38:02,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 5: [2023-05-25 13:38:02,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 22: [2023-05-25 13:38:02,526] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 3: [2023-05-25 13:38:02,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 3: [2023-05-25 13:38:02,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 16: [2023-05-25 13:38:02,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 20: [2023-05-25 13:38:02,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 
19: [2023-05-25 13:38:02,528] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 22: [2023-05-25 13:38:02,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 19: [2023-05-25 13:38:02,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 12: [2023-05-25 13:38:02,528] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 16: [2023-05-25 13:38:02,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 19: [2023-05-25 13:38:02,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 21: [2023-05-25 13:38:02,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 23: [2023-05-25 13:38:02,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 21: [2023-05-25 13:38:02,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 0: [2023-05-25 13:38:02,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
3: [2023-05-25 13:38:02,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 6: [2023-05-25 13:38:02,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 19: [2023-05-25 13:38:02,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 12: [2023-05-25 13:38:02,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 7: [2023-05-25 13:38:02,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 23: [2023-05-25 13:38:02,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 5: [2023-05-25 13:38:02,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 6: [2023-05-25 13:38:02,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 20: [2023-05-25 13:38:02,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
12: [2023-05-25 13:38:02,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 7: [2023-05-25 13:38:02,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 6: [2023-05-25 13:38:02,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 20: [2023-05-25 13:38:02,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 7: [2023-05-25 13:38:02,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 18: [2023-05-25 13:38:02,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 12: [2023-05-25 13:38:02,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 7: [2023-05-25 13:38:02,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 2: [2023-05-25 13:38:02,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 3: [2023-05-25 13:38:02,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
18: [2023-05-25 13:38:02,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 18: [2023-05-25 13:38:02,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 9: [2023-05-25 13:38:02,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 19: [2023-05-25 13:38:02,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 20: [2023-05-25 13:38:02,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_03-model_states.pt. 22: [2023-05-25 13:38:02,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 7: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 16: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 16: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
9: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 1: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 1: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 2: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 7: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 0: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 0: [2023-05-25 13:38:02,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 22: [2023-05-25 13:38:02,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 16: [2023-05-25 13:38:02,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 
16: [2023-05-25 13:38:02,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 23: [2023-05-25 13:38:02,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 0: [2023-05-25 13:38:02,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 1: [2023-05-25 13:38:02,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 1: [2023-05-25 13:38:02,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 1: [2023-05-25 13:38:02,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 0: [2023-05-25 13:38:02,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 19: [2023-05-25 13:38:02,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 1: [2023-05-25 13:38:02,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 16: [2023-05-25 13:38:02,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
23: [2023-05-25 13:38:02,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 6: [2023-05-25 13:38:02,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 16: [2023-05-25 13:38:02,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 19: [2023-05-25 13:38:02,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 17: [2023-05-25 13:38:02,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 17: [2023-05-25 13:38:02,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_32-model_01-model_states.pt. 6: [2023-05-25 13:38:02,552] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 9: [2023-05-25 13:38:02,552] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 19: [2023-05-25 13:38:02,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 21: [2023-05-25 13:38:02,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 
4: [2023-05-25 13:38:02,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 9: [2023-05-25 13:38:02,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 4: [2023-05-25 13:38:02,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 9: [2023-05-25 13:38:02,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 21: [2023-05-25 13:38:02,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 9: [2023-05-25 13:38:02,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt... 22: [2023-05-25 13:38:02,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 2: [2023-05-25 13:38:02,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 20: [2023-05-25 13:38:02,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 20: [2023-05-25 13:38:02,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 
2: [2023-05-25 13:38:02,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 2: [2023-05-25 13:38:02,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 2: [2023-05-25 13:38:02,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 22: [2023-05-25 13:38:02,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 3: [2023-05-25 13:38:02,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 2: [2023-05-25 13:38:02,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 2: [2023-05-25 13:38:02,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt... 2: [2023-05-25 13:38:02,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 2: [2023-05-25 13:38:02,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
9: [2023-05-25 13:38:02,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_00-model_states.pt. 7: [2023-05-25 13:38:02,561] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 3: [2023-05-25 13:38:02,561] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 16: [2023-05-25 13:38:02,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 3: [2023-05-25 13:38:02,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 9: [2023-05-25 13:38:02,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt... 1: [2023-05-25 13:38:02,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 7: [2023-05-25 13:38:02,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 6: [2023-05-25 13:38:02,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 16: [2023-05-25 13:38:02,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 
21: [2023-05-25 13:38:02,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 1: [2023-05-25 13:38:02,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 17: [2023-05-25 13:38:02,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 6: [2023-05-25 13:38:02,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 21: [2023-05-25 13:38:02,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 17: [2023-05-25 13:38:02,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt... 2: [2023-05-25 13:38:02,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 15: [2023-05-25 13:38:02,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 15: [2023-05-25 13:38:02,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 4: [2023-05-25 13:38:02,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
4: [2023-05-25 13:38:02,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 15: [2023-05-25 13:38:02,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 15: [2023-05-25 13:38:02,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 2: [2023-05-25 13:38:02,574] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 2: [2023-05-25 13:38:02,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 2: [2023-05-25 13:38:02,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt... 4: [2023-05-25 13:38:02,586] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 4: [2023-05-25 13:38:02,586] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 7: [2023-05-25 13:38:02,587] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 7: [2023-05-25 13:38:02,587] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 
0: [2023-05-25 13:38:02,587] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 0: [2023-05-25 13:38:02,588] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 17: [2023-05-25 13:38:02,588] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,588] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 20: [2023-05-25 13:38:02,589] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 17: [2023-05-25 13:38:02,589] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 20: [2023-05-25 13:38:02,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 20: [2023-05-25 13:38:02,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt... 2: [2023-05-25 13:38:02,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 2: [2023-05-25 13:38:02,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
1: [2023-05-25 13:38:02,595] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 1: [2023-05-25 13:38:02,595] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 3: [2023-05-25 13:38:02,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 3: [2023-05-25 13:38:02,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 15: [2023-05-25 13:38:02,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 4: [2023-05-25 13:38:02,598] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 4: [2023-05-25 13:38:02,599] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 11: [2023-05-25 13:38:02,599] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 11: [2023-05-25 13:38:02,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 11: [2023-05-25 13:38:02,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
2: [2023-05-25 13:38:02,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 11: [2023-05-25 13:38:02,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 2: [2023-05-25 13:38:02,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 0: [2023-05-25 13:38:02,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 10: [2023-05-25 13:38:02,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 10: [2023-05-25 13:38:02,602] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 0: [2023-05-25 13:38:02,603] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 10: [2023-05-25 13:38:02,603] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 15: [2023-05-25 13:38:02,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 10: [2023-05-25 13:38:02,604] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
6: [2023-05-25 13:38:02,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 6: [2023-05-25 13:38:02,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 7: [2023-05-25 13:38:02,604] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 17: [2023-05-25 13:38:02,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_00-model_states.pt. 28: [2023-05-25 13:38:02,605] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 28: [2023-05-25 13:38:02,605] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 7: [2023-05-25 13:38:02,605] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 17: [2023-05-25 13:38:02,606] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt... 2: [2023-05-25 13:38:02,607] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:02,608] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
1: [2023-05-25 13:38:02,608] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 15: [2023-05-25 13:38:02,610] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 3: [2023-05-25 13:38:02,612] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 3: [2023-05-25 13:38:02,612] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 1: [2023-05-25 13:38:02,612] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 2: [2023-05-25 13:38:02,615] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 2: [2023-05-25 13:38:02,616] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 6: [2023-05-25 13:38:02,617] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 15: [2023-05-25 13:38:02,617] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 28: [2023-05-25 13:38:02,618] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
28: [2023-05-25 13:38:02,618] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 6: [2023-05-25 13:38:02,619] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 28: [2023-05-25 13:38:02,620] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 28: [2023-05-25 13:38:02,620] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 13: [2023-05-25 13:38:02,622] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 13: [2023-05-25 13:38:02,622] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 13: [2023-05-25 13:38:02,624] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,624] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 4: [2023-05-25 13:38:02,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 25: [2023-05-25 13:38:02,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 
7: [2023-05-25 13:38:02,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 25: [2023-05-25 13:38:02,628] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 30: [2023-05-25 13:38:02,628] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 19: [2023-05-25 13:38:02,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 4: [2023-05-25 13:38:02,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 19: [2023-05-25 13:38:02,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 30: [2023-05-25 13:38:02,628] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 14: [2023-05-25 13:38:02,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 14: [2023-05-25 13:38:02,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 11: [2023-05-25 13:38:02,631] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
4: [2023-05-25 13:38:02,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 4: [2023-05-25 13:38:02,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 14: [2023-05-25 13:38:02,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 14: [2023-05-25 13:38:02,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 7: [2023-05-25 13:38:02,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 0: [2023-05-25 13:38:02,633] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 11: [2023-05-25 13:38:02,633] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 0: [2023-05-25 13:38:02,636] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 26: [2023-05-25 13:38:02,637] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 3: [2023-05-25 13:38:02,637] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
26: [2023-05-25 13:38:02,638] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 0: [2023-05-25 13:38:02,638] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 7: [2023-05-25 13:38:02,638] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 3: [2023-05-25 13:38:02,639] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 28: [2023-05-25 13:38:02,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 28: [2023-05-25 13:38:02,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 30: [2023-05-25 13:38:02,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 7: [2023-05-25 13:38:02,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 0: [2023-05-25 13:38:02,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 1: [2023-05-25 13:38:02,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 
30: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 10: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 25: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 10: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 25: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 30: [2023-05-25 13:38:02,641] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 25: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 
30: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 30: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 30: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 19: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 19: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 1: [2023-05-25 13:38:02,642] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 1: [2023-05-25 13:38:02,643] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 8: [2023-05-25 13:38:02,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 8: [2023-05-25 13:38:02,644] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 8: [2023-05-25 13:38:02,644] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
3: [2023-05-25 13:38:02,644] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 24: [2023-05-25 13:38:02,644] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 8: [2023-05-25 13:38:02,646] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 1: [2023-05-25 13:38:02,645] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 24: [2023-05-25 13:38:02,646] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 3: [2023-05-25 13:38:02,646] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 2: [2023-05-25 13:38:02,647] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 11: [2023-05-25 13:38:02,647] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 11: [2023-05-25 13:38:02,647] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 2: [2023-05-25 13:38:02,649] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 
23: [2023-05-25 13:38:02,649] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 23: [2023-05-25 13:38:02,649] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 23: [2023-05-25 13:38:02,650] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 23: [2023-05-25 13:38:02,650] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 28: [2023-05-25 13:38:02,652] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 10: [2023-05-25 13:38:02,652] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 26: [2023-05-25 13:38:02,652] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,653] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 18: [2023-05-25 13:38:02,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 28: [2023-05-25 13:38:02,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 
10: [2023-05-25 13:38:02,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 18: [2023-05-25 13:38:02,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 30: [2023-05-25 13:38:02,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 30: [2023-05-25 13:38:02,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 6: [2023-05-25 13:38:02,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 26: [2023-05-25 13:38:02,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 26: [2023-05-25 13:38:02,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 6: [2023-05-25 13:38:02,657] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 28: [2023-05-25 13:38:02,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 
2: [2023-05-25 13:38:02,657] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 24: [2023-05-25 13:38:02,658] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,658] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 12: [2023-05-25 13:38:02,658] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 2: [2023-05-25 13:38:02,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 26: [2023-05-25 13:38:02,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 26: [2023-05-25 13:38:02,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 28: [2023-05-25 13:38:02,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_13_optim_states.pt... 28: [2023-05-25 13:38:02,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_13_optim_states.pt... 12: [2023-05-25 13:38:02,660] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
12: [2023-05-25 13:38:02,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 24: [2023-05-25 13:38:02,661] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 6: [2023-05-25 13:38:02,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 12: [2023-05-25 13:38:02,662] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 25: [2023-05-25 13:38:02,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 20: [2023-05-25 13:38:02,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 20: [2023-05-25 13:38:02,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 25: [2023-05-25 13:38:02,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 27: [2023-05-25 13:38:02,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 
20: [2023-05-25 13:38:02,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 6: [2023-05-25 13:38:02,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 24: [2023-05-25 13:38:02,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 30: [2023-05-25 13:38:02,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 24: [2023-05-25 13:38:02,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 20: [2023-05-25 13:38:02,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 21: [2023-05-25 13:38:02,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 27: [2023-05-25 13:38:02,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 21: [2023-05-25 13:38:02,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 30: [2023-05-25 13:38:02,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 
21: [2023-05-25 13:38:02,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 21: [2023-05-25 13:38:02,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 5: [2023-05-25 13:38:02,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 13: [2023-05-25 13:38:02,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 5: [2023-05-25 13:38:02,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_09-model_02-model_states.pt. 18: [2023-05-25 13:38:02,667] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 13: [2023-05-25 13:38:02,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 30: [2023-05-25 13:38:02,668] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,668] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,668] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
30: [2023-05-25 13:38:02,668] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 30: [2023-05-25 13:38:02,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 18: [2023-05-25 13:38:02,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 28: [2023-05-25 13:38:02,669] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 26: [2023-05-25 13:38:02,670] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 26: [2023-05-25 13:38:02,670] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 28: [2023-05-25 13:38:02,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 14: [2023-05-25 13:38:02,672] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
14: [2023-05-25 13:38:02,672] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 28: [2023-05-25 13:38:02,672] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_14_optim_states.pt... 25: [2023-05-25 13:38:02,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_14_optim_states.pt... 25: [2023-05-25 13:38:02,673] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 30: [2023-05-25 13:38:02,673] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_14_optim_states.pt... 30: [2023-05-25 13:38:02,673] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_14_optim_states.pt... 25: [2023-05-25 13:38:02,673] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 30: [2023-05-25 13:38:02,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 
28: [2023-05-25 13:38:02,674] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 28: [2023-05-25 13:38:02,674] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 30: [2023-05-25 13:38:02,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 10: [2023-05-25 13:38:02,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 10: [2023-05-25 13:38:02,675] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 8: [2023-05-25 13:38:02,675] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 24: [2023-05-25 13:38:02,675] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 28: [2023-05-25 13:38:02,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 28: [2023-05-25 13:38:02,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 24: [2023-05-25 13:38:02,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 
9: [2023-05-25 13:38:02,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 9: [2023-05-25 13:38:02,678] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 8: [2023-05-25 13:38:02,678] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 27: [2023-05-25 13:38:02,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 27: [2023-05-25 13:38:02,679] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 27: [2023-05-25 13:38:02,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 27: [2023-05-25 13:38:02,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 5: [2023-05-25 13:38:02,680] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 9: [2023-05-25 13:38:02,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 5: [2023-05-25 13:38:02,680] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt... 
27: [2023-05-25 13:38:02,681] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 27: [2023-05-25 13:38:02,681] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 25: [2023-05-25 13:38:02,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 25: [2023-05-25 13:38:02,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 23: [2023-05-25 13:38:02,681] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 23: [2023-05-25 13:38:02,682] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 9: [2023-05-25 13:38:02,683] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 14: [2023-05-25 13:38:02,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 22: [2023-05-25 13:38:02,686] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 22: [2023-05-25 13:38:02,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 
21: [2023-05-25 13:38:02,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 14: [2023-05-25 13:38:02,687] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 18: [2023-05-25 13:38:02,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 27: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 25: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 10: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 25: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 18: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
25: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 10: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 25: [2023-05-25 13:38:02,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 8: [2023-05-25 13:38:02,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 31: [2023-05-25 13:38:02,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 18: [2023-05-25 13:38:02,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 22: [2023-05-25 13:38:02,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 19: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 
19: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 26: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_13_optim_states.pt... 26: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_13_optim_states.pt... 19: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 13: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 19: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 24: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_13_optim_states.pt... 24: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_13_optim_states.pt... 8: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
24: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 16: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 16: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 24: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 22: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 16: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 16: [2023-05-25 13:38:02,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 19: [2023-05-25 13:38:02,692] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 27: [2023-05-25 13:38:02,692] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 
19: [2023-05-25 13:38:02,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 19: [2023-05-25 13:38:02,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 28: [2023-05-25 13:38:02,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_14_optim_states.pt... 28: [2023-05-25 13:38:02,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_14_optim_states.pt... 19: [2023-05-25 13:38:02,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 16: [2023-05-25 13:38:02,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 25: [2023-05-25 13:38:02,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 16: [2023-05-25 13:38:02,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 30: [2023-05-25 13:38:02,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_13_optim_states.pt... 
30: [2023-05-25 13:38:02,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_13_optim_states.pt... 20: [2023-05-25 13:38:02,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 31: [2023-05-25 13:38:02,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 31: [2023-05-25 13:38:02,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 12: [2023-05-25 13:38:02,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 12: [2023-05-25 13:38:02,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 14: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 23: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 9: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 14: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 
9: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 23: [2023-05-25 13:38:02,697] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 20: [2023-05-25 13:38:02,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 21: [2023-05-25 13:38:02,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 28: [2023-05-25 13:38:02,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 28: [2023-05-25 13:38:02,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 20: [2023-05-25 13:38:02,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 26: [2023-05-25 13:38:02,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 26: [2023-05-25 13:38:02,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 20: [2023-05-25 13:38:02,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 
22: [2023-05-25 13:38:02,700] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 22: [2023-05-25 13:38:02,700] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 31: [2023-05-25 13:38:02,701] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 27: [2023-05-25 13:38:02,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_14_optim_states.pt... 27: [2023-05-25 13:38:02,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_14_optim_states.pt... 21: [2023-05-25 13:38:02,702] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 31: [2023-05-25 13:38:02,701] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 21: [2023-05-25 13:38:02,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 21: [2023-05-25 13:38:02,703] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 31: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 
25: [2023-05-25 13:38:02,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 25: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,697] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,697] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 25: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 25: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 31: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 16: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
31: [2023-05-25 13:38:02,704] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 26: [2023-05-25 13:38:02,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 24: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 16: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 23: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 23: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 31: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 24: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 17: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 
24: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 17: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_02-model_states.pt. 8: [2023-05-25 13:38:02,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 8: [2023-05-25 13:38:02,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 24: [2023-05-25 13:38:02,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 5: [2023-05-25 13:38:02,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 29: [2023-05-25 13:38:02,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 24: [2023-05-25 13:38:02,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 12: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
25: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 12: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 14: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 9: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 18: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 25: [2023-05-25 13:38:02,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 20: [2023-05-25 13:38:02,710] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 24: [2023-05-25 13:38:02,710] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 24: [2023-05-25 13:38:02,710] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 
5: [2023-05-25 13:38:02,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_00-model_states.pt. 20: [2023-05-25 13:38:02,711] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 31: [2023-05-25 13:38:02,711] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 5: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 30: [2023-05-25 13:38:02,711] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 30: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 31: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 14: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 9: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 27: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 
29: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 27: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 5: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt... 31: [2023-05-25 13:38:02,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 20: [2023-05-25 13:38:02,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 31: [2023-05-25 13:38:02,713] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 20: [2023-05-25 13:38:02,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 18: [2023-05-25 13:38:02,713] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 31: [2023-05-25 13:38:02,713] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 
18: [2023-05-25 13:38:02,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 31: [2023-05-25 13:38:02,714] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 21: [2023-05-25 13:38:02,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 31: [2023-05-25 13:38:02,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 28: [2023-05-25 13:38:02,715] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 28: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 28: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 28: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
26: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 12: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 9: [2023-05-25 13:38:02,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 31: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 26: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 10: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 24: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 12: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 26: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 
26: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 21: [2023-05-25 13:38:02,718] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 24: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 10: [2023-05-25 13:38:02,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 18: [2023-05-25 13:38:02,719] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 17: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 21: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 26: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 26: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 9: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
23: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 26: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,717] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 8: [2023-05-25 13:38:02,719] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 31: [2023-05-25 13:38:02,718] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 26: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 17: [2023-05-25 13:38:02,720] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 25: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 
31: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 26: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 26: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 26: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 10: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 26: [2023-05-25 13:38:02,721] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 23: [2023-05-25 13:38:02,722] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 8: [2023-05-25 13:38:02,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 10: [2023-05-25 13:38:02,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 28: [2023-05-25 13:38:02,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 
28: [2023-05-25 13:38:02,723] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 19: [2023-05-25 13:38:02,723] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 15: [2023-05-25 13:38:02,723] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 26: [2023-05-25 13:38:02,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 26: [2023-05-25 13:38:02,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 15: [2023-05-25 13:38:02,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 29: [2023-05-25 13:38:02,724] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 29: [2023-05-25 13:38:02,725] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
31: [2023-05-25 13:38:02,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 19: [2023-05-25 13:38:02,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 20: [2023-05-25 13:38:02,725] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 27: [2023-05-25 13:38:02,726] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 20: [2023-05-25 13:38:02,726] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 18: [2023-05-25 13:38:02,726] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 27: [2023-05-25 13:38:02,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 12: [2023-05-25 13:38:02,726] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 27: [2023-05-25 13:38:02,726] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 27: [2023-05-25 13:38:02,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
12: [2023-05-25 13:38:02,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 24: [2023-05-25 13:38:02,728] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,728] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 25: [2023-05-25 13:38:02,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 24: [2023-05-25 13:38:02,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,728] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 27: [2023-05-25 13:38:02,728] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 27: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 30: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 24: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 
24: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 30: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 31: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_14_optim_states.pt... 31: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_14_optim_states.pt... 30: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 30: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 24: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_14_optim_states.pt... 24: [2023-05-25 13:38:02,729] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_14_optim_states.pt... 
12: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 12: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 30: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 30: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 31: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 31: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 14: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 31: [2023-05-25 13:38:02,730] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 27: [2023-05-25 13:38:02,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 25: [2023-05-25 13:38:02,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_13_optim_states.pt... 
25: [2023-05-25 13:38:02,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_13_optim_states.pt... 24: [2023-05-25 13:38:02,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 14: [2023-05-25 13:38:02,731] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 24: [2023-05-25 13:38:02,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 25: [2023-05-25 13:38:02,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_15_optim_states.pt... 29: [2023-05-25 13:38:02,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 29: [2023-05-25 13:38:02,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 18: [2023-05-25 13:38:02,732] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 9: [2023-05-25 13:38:02,732] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
26: [2023-05-25 13:38:02,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_15_optim_states.pt... 26: [2023-05-25 13:38:02,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_15_optim_states.pt... 27: [2023-05-25 13:38:02,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 26: [2023-05-25 13:38:02,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_14_optim_states.pt... 26: [2023-05-25 13:38:02,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_14_optim_states.pt... 25: [2023-05-25 13:38:02,733] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_15_optim_states.pt... 16: [2023-05-25 13:38:02,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 16: [2023-05-25 13:38:02,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 9: [2023-05-25 13:38:02,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
9: [2023-05-25 13:38:02,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 16: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 31: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_13_optim_states.pt... 31: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_15_optim_states.pt... 31: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_13_optim_states.pt... 31: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_15_optim_states.pt... 20: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 9: [2023-05-25 13:38:02,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 20: [2023-05-25 13:38:02,736] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 
22: [2023-05-25 13:38:02,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 29: [2023-05-25 13:38:02,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 16: [2023-05-25 13:38:02,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 15: [2023-05-25 13:38:02,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 15: [2023-05-25 13:38:02,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 29: [2023-05-25 13:38:02,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 22: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 
24: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_15_optim_states.pt... 30: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 30: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 24: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_15_optim_states.pt... 22: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 19: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 13: [2023-05-25 13:38:02,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 16: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 4: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 8: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 
4: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 13: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 9: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 8: [2023-05-25 13:38:02,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 14: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 19: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 9: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 12: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 28: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 16: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 
13: [2023-05-25 13:38:02,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 28: [2023-05-25 13:38:02,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 13: [2023-05-25 13:38:02,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 29: [2023-05-25 13:38:02,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 29: [2023-05-25 13:38:02,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_02-model_states.pt. 12: [2023-05-25 13:38:02,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 14: [2023-05-25 13:38:02,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 17: [2023-05-25 13:38:02,744] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 17: [2023-05-25 13:38:02,744] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 21: [2023-05-25 13:38:02,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 
16: [2023-05-25 13:38:02,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 21: [2023-05-25 13:38:02,748] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 23: [2023-05-25 13:38:02,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 16: [2023-05-25 13:38:02,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 22: [2023-05-25 13:38:02,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 15: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 17: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 21: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 17: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
23: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 15: [2023-05-25 13:38:02,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 17: [2023-05-25 13:38:02,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 23: [2023-05-25 13:38:02,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 23: [2023-05-25 13:38:02,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 21: [2023-05-25 13:38:02,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 23: [2023-05-25 13:38:02,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 4: [2023-05-25 13:38:02,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 17: [2023-05-25 13:38:02,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 4: [2023-05-25 13:38:02,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
29: [2023-05-25 13:38:02,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_13_optim_states.pt... 29: [2023-05-25 13:38:02,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_13_optim_states.pt... 8: [2023-05-25 13:38:02,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 8: [2023-05-25 13:38:02,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 23: [2023-05-25 13:38:02,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 29: [2023-05-25 13:38:02,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,756] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 29: [2023-05-25 13:38:02,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 8: [2023-05-25 13:38:02,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 29: [2023-05-25 13:38:02,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 
8: [2023-05-25 13:38:02,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 29: [2023-05-25 13:38:02,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 29: [2023-05-25 13:38:02,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt... 8: [2023-05-25 13:38:02,757] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 17: [2023-05-25 13:38:02,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 11: [2023-05-25 13:38:02,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 17: [2023-05-25 13:38:02,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 11: [2023-05-25 13:38:02,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_01-model_states.pt. 8: [2023-05-25 13:38:02,759] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 29: [2023-05-25 13:38:02,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 
13: [2023-05-25 13:38:02,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 29: [2023-05-25 13:38:02,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_02-model_states.pt. 13: [2023-05-25 13:38:02,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 15: [2023-05-25 13:38:02,762] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 17: [2023-05-25 13:38:02,763] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 17: [2023-05-25 13:38:02,763] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 13: [2023-05-25 13:38:02,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 2: [2023-05-25 13:38:02,763] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 2: [2023-05-25 13:38:02,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 23: [2023-05-25 13:38:02,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
2: [2023-05-25 13:38:02,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 2: [2023-05-25 13:38:02,764] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 12: [2023-05-25 13:38:02,765] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 12: [2023-05-25 13:38:02,765] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 15: [2023-05-25 13:38:02,765] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 23: [2023-05-25 13:38:02,765] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 13: [2023-05-25 13:38:02,765] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 15: [2023-05-25 13:38:02,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 15: [2023-05-25 13:38:02,767] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 12: [2023-05-25 13:38:02,768] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 
12: [2023-05-25 13:38:02,768] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 14: [2023-05-25 13:38:02,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 14: [2023-05-25 13:38:02,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 12: [2023-05-25 13:38:02,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 11: [2023-05-25 13:38:02,771] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 17: [2023-05-25 13:38:02,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 5: [2023-05-25 13:38:02,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 27: [2023-05-25 13:38:02,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_15_optim_states.pt... 27: [2023-05-25 13:38:02,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_15_optim_states.pt... 5: [2023-05-25 13:38:02,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 
0: [2023-05-25 13:38:02,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 17: [2023-05-25 13:38:02,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 0: [2023-05-25 13:38:02,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 12: [2023-05-25 13:38:02,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 11: [2023-05-25 13:38:02,774] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 17: [2023-05-25 13:38:02,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 17: [2023-05-25 13:38:02,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt... 2: [2023-05-25 13:38:02,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:02,779] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 12: [2023-05-25 13:38:02,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
15: [2023-05-25 13:38:02,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 15: [2023-05-25 13:38:02,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 3: [2023-05-25 13:38:02,780] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 3: [2023-05-25 13:38:02,780] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 11: [2023-05-25 13:38:02,780] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 11: [2023-05-25 13:38:02,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 2: [2023-05-25 13:38:02,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:02,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 12: [2023-05-25 13:38:02,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 3: [2023-05-25 13:38:02,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
3: [2023-05-25 13:38:02,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 4: [2023-05-25 13:38:02,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 8: [2023-05-25 13:38:02,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 14: [2023-05-25 13:38:02,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 3: [2023-05-25 13:38:02,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 15: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 29: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 3: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 15: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 
15: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 15: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 29: [2023-05-25 13:38:02,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_03-model_states.pt. 27: [2023-05-25 13:38:02,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 27: [2023-05-25 13:38:02,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_44-model_01-model_states.pt. 0: [2023-05-25 13:38:02,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 0: [2023-05-25 13:38:02,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 1: [2023-05-25 13:38:02,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 1: [2023-05-25 13:38:02,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 6: [2023-05-25 13:38:02,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 
14: [2023-05-25 13:38:02,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 17: [2023-05-25 13:38:02,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 14: [2023-05-25 13:38:02,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 5: [2023-05-25 13:38:02,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 23: [2023-05-25 13:38:02,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 5: [2023-05-25 13:38:02,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 6: [2023-05-25 13:38:02,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 14: [2023-05-25 13:38:02,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 23: [2023-05-25 13:38:02,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 7: [2023-05-25 13:38:02,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 
0: [2023-05-25 13:38:02,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 7: [2023-05-25 13:38:02,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 17: [2023-05-25 13:38:02,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 8: [2023-05-25 13:38:02,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 11: [2023-05-25 13:38:02,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 17: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 17: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 3: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 3: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_01-model_states.pt. 8: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
17: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 11: [2023-05-25 13:38:02,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 0: [2023-05-25 13:38:02,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 15: [2023-05-25 13:38:02,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 13: [2023-05-25 13:38:02,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 8: [2023-05-25 13:38:02,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 3: [2023-05-25 13:38:02,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 9: [2023-05-25 13:38:02,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 3: [2023-05-25 13:38:02,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 11: [2023-05-25 13:38:02,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
15: [2023-05-25 13:38:02,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 4: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 23: [2023-05-25 13:38:02,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 11: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 4: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 9: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 13: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 12: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 13: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 12: [2023-05-25 13:38:02,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 
13: [2023-05-25 13:38:02,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 23: [2023-05-25 13:38:02,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 23: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 13: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 27: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 29: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 13: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 13: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 23: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 29: [2023-05-25 13:38:02,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
27: [2023-05-25 13:38:02,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 6: [2023-05-25 13:38:02,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 14: [2023-05-25 13:38:02,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 0: [2023-05-25 13:38:02,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 14: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 27: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 0: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 29: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 17: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 27: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 
6: [2023-05-25 13:38:02,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 13: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 27: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 22: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 19: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 27: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt... 29: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt... 22: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 19: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 15: [2023-05-25 13:38:02,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
29: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_00-model_states.pt. 22: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 29: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt... 22: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 16: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 10: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 10: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 29: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_14_optim_states.pt... 29: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_14_optim_states.pt... 10: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_03-model_states.pt. 
10: [2023-05-25 13:38:02,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 16: [2023-05-25 13:38:02,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 20: [2023-05-25 13:38:02,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 20: [2023-05-25 13:38:02,803] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 17: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 17: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 15: [2023-05-25 13:38:02,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 5: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 5: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 1: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
1: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 17: [2023-05-25 13:38:02,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 7: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 29: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 29: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_03-model_states.pt. 20: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 20: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 19: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 3: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 15: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
11: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 11: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 1: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 1: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 19: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 21: [2023-05-25 13:38:02,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 21: [2023-05-25 13:38:02,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 27: [2023-05-25 13:38:02,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 11: [2023-05-25 13:38:02,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 15: [2023-05-25 13:38:02,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 
9: [2023-05-25 13:38:02,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 12: [2023-05-25 13:38:02,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 12: [2023-05-25 13:38:02,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 3: [2023-05-25 13:38:02,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 11: [2023-05-25 13:38:02,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt... 27: [2023-05-25 13:38:02,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_46-model_01-model_states.pt. 18: [2023-05-25 13:38:02,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 18: [2023-05-25 13:38:02,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_03-model_states.pt. 
9: [2023-05-25 13:38:02,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 7: [2023-05-25 13:38:02,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 23: [2023-05-25 13:38:02,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 14: [2023-05-25 13:38:02,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 6: [2023-05-25 13:38:02,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 6: [2023-05-25 13:38:02,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 23: [2023-05-25 13:38:02,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 19: [2023-05-25 13:38:02,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 14: [2023-05-25 13:38:02,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 18: [2023-05-25 13:38:02,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 
22: [2023-05-25 13:38:02,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 10: [2023-05-25 13:38:02,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 19: [2023-05-25 13:38:02,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 7: [2023-05-25 13:38:02,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 7: [2023-05-25 13:38:02,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_03-model_states.pt. 16: [2023-05-25 13:38:02,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 16: [2023-05-25 13:38:02,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 22: [2023-05-25 13:38:02,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 18: [2023-05-25 13:38:02,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 22: [2023-05-25 13:38:02,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
20: [2023-05-25 13:38:02,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 20: [2023-05-25 13:38:02,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 22: [2023-05-25 13:38:02,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 5: [2023-05-25 13:38:02,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 5: [2023-05-25 13:38:02,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 17: [2023-05-25 13:38:02,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 21: [2023-05-25 13:38:02,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 10: [2023-05-25 13:38:02,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 11: [2023-05-25 13:38:02,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 14: [2023-05-25 13:38:02,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
10: [2023-05-25 13:38:02,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 15: [2023-05-25 13:38:02,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 11: [2023-05-25 13:38:02,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 30: [2023-05-25 13:38:02,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_15_optim_states.pt... 10: [2023-05-25 13:38:02,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 30: [2023-05-25 13:38:02,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_15_optim_states.pt... 21: [2023-05-25 13:38:02,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 19: [2023-05-25 13:38:02,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 5: [2023-05-25 13:38:02,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
14: [2023-05-25 13:38:02,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 5: [2023-05-25 13:38:02,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 20: [2023-05-25 13:38:02,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 20: [2023-05-25 13:38:02,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 15: [2023-05-25 13:38:02,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 19: [2023-05-25 13:38:02,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 5: [2023-05-25 13:38:02,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 16: [2023-05-25 13:38:02,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 16: [2023-05-25 13:38:02,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 1: [2023-05-25 13:38:02,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
5: [2023-05-25 13:38:02,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 1: [2023-05-25 13:38:02,824] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 1: [2023-05-25 13:38:02,824] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 5: [2023-05-25 13:38:02,824] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 5: [2023-05-25 13:38:02,824] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 18: [2023-05-25 13:38:02,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 8: [2023-05-25 13:38:02,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 8: [2023-05-25 13:38:02,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 1: [2023-05-25 13:38:02,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 18: [2023-05-25 13:38:02,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
13: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 13: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 3: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 3: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 3: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 11: [2023-05-25 13:38:02,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 6: [2023-05-25 13:38:02,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 5: [2023-05-25 13:38:02,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 5: [2023-05-25 13:38:02,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 
1: [2023-05-25 13:38:02,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 1: [2023-05-25 13:38:02,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 18: [2023-05-25 13:38:02,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 3: [2023-05-25 13:38:02,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 18: [2023-05-25 13:38:02,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 11: [2023-05-25 13:38:02,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 11: [2023-05-25 13:38:02,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 14: [2023-05-25 13:38:02,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 12: [2023-05-25 13:38:02,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 14: [2023-05-25 13:38:02,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
12: [2023-05-25 13:38:02,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 12: [2023-05-25 13:38:02,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 3: [2023-05-25 13:38:02,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 7: [2023-05-25 13:38:02,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 7: [2023-05-25 13:38:02,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 14: [2023-05-25 13:38:02,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 12: [2023-05-25 13:38:02,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 7: [2023-05-25 13:38:02,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 14: [2023-05-25 13:38:02,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 11: [2023-05-25 13:38:02,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 
3: [2023-05-25 13:38:02,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 17: [2023-05-25 13:38:02,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 17: [2023-05-25 13:38:02,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 27: [2023-05-25 13:38:02,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_13_optim_states.pt... 27: [2023-05-25 13:38:02,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_13_optim_states.pt... 7: [2023-05-25 13:38:02,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 7: [2023-05-25 13:38:02,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 7: [2023-05-25 13:38:02,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 19: [2023-05-25 13:38:02,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 1: [2023-05-25 13:38:02,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
1: [2023-05-25 13:38:02,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 19: [2023-05-25 13:38:02,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 17: [2023-05-25 13:38:02,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 17: [2023-05-25 13:38:02,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 16: [2023-05-25 13:38:02,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 1: [2023-05-25 13:38:02,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 7: [2023-05-25 13:38:02,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 1: [2023-05-25 13:38:02,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 8: [2023-05-25 13:38:02,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 16: [2023-05-25 13:38:02,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 
22: [2023-05-25 13:38:02,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 7: [2023-05-25 13:38:02,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 8: [2023-05-25 13:38:02,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 3: [2023-05-25 13:38:02,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 9: [2023-05-25 13:38:02,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 23: [2023-05-25 13:38:02,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 3: [2023-05-25 13:38:02,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 21: [2023-05-25 13:38:02,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 21: [2023-05-25 13:38:02,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_33-model_01-model_states.pt. 
3: [2023-05-25 13:38:02,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 11: [2023-05-25 13:38:02,842] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 23: [2023-05-25 13:38:02,842] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 9: [2023-05-25 13:38:02,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 7: [2023-05-25 13:38:02,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 11: [2023-05-25 13:38:02,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 9: [2023-05-25 13:38:02,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 3: [2023-05-25 13:38:02,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 20: [2023-05-25 13:38:02,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 23: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 
23: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 5: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 9: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 9: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_22-model_02-model_states.pt. 10: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 22: [2023-05-25 13:38:02,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 9: [2023-05-25 13:38:02,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 7: [2023-05-25 13:38:02,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 20: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 16: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 
21: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 10: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 22: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 3: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 1: [2023-05-25 13:38:02,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 28: [2023-05-25 13:38:02,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_15_optim_states.pt... 28: [2023-05-25 13:38:02,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_15_optim_states.pt... 1: [2023-05-25 13:38:02,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 3: [2023-05-25 13:38:02,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 21: [2023-05-25 13:38:02,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 
5: [2023-05-25 13:38:02,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 5: [2023-05-25 13:38:02,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 19: [2023-05-25 13:38:02,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 20: [2023-05-25 13:38:02,850] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 19: [2023-05-25 13:38:02,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 19: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 20: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 19: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 29: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_15_optim_states.pt... 5: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 
5: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 22: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 29: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_15_optim_states.pt... 22: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 16: [2023-05-25 13:38:02,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 20: [2023-05-25 13:38:02,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 5: [2023-05-25 13:38:02,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 19: [2023-05-25 13:38:02,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 11: [2023-05-25 13:38:02,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 19: [2023-05-25 13:38:02,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 
10: [2023-05-25 13:38:02,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 10: [2023-05-25 13:38:02,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 21: [2023-05-25 13:38:02,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 22: [2023-05-25 13:38:02,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 10: [2023-05-25 13:38:02,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 20: [2023-05-25 13:38:02,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 21: [2023-05-25 13:38:02,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 21: [2023-05-25 13:38:02,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt... 22: [2023-05-25 13:38:02,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 7: [2023-05-25 13:38:02,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
11: [2023-05-25 13:38:02,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 5: [2023-05-25 13:38:02,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:02,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 10: [2023-05-25 13:38:02,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 13: [2023-05-25 13:38:02,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 21: [2023-05-25 13:38:02,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 10: [2023-05-25 13:38:02,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 16: [2023-05-25 13:38:02,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 10: [2023-05-25 13:38:02,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt... 9: [2023-05-25 13:38:02,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 
18: [2023-05-25 13:38:02,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 9: [2023-05-25 13:38:02,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt... 16: [2023-05-25 13:38:02,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 13: [2023-05-25 13:38:02,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 1: [2023-05-25 13:38:02,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 1: [2023-05-25 13:38:02,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 18: [2023-05-25 13:38:02,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 
18: [2023-05-25 13:38:02,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 13: [2023-05-25 13:38:02,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 1: [2023-05-25 13:38:02,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 20: [2023-05-25 13:38:02,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 18: [2023-05-25 13:38:02,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt... 18: [2023-05-25 13:38:02,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 20: [2023-05-25 13:38:02,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 13: [2023-05-25 13:38:02,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 5: [2023-05-25 13:38:02,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 8: [2023-05-25 13:38:02,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 
1: [2023-05-25 13:38:02,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 7: [2023-05-25 13:38:02,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 7: [2023-05-25 13:38:02,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 8: [2023-05-25 13:38:02,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 8: [2023-05-25 13:38:02,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 0: [2023-05-25 13:38:02,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 0: [2023-05-25 13:38:02,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 0: [2023-05-25 13:38:02,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 0: [2023-05-25 13:38:02,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 0: [2023-05-25 13:38:02,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
0: [2023-05-25 13:38:02,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 1: [2023-05-25 13:38:02,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 8: [2023-05-25 13:38:02,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 0: [2023-05-25 13:38:02,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 0: [2023-05-25 13:38:02,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 16: [2023-05-25 13:38:02,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 0: [2023-05-25 13:38:02,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 0: [2023-05-25 13:38:02,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 0: [2023-05-25 13:38:02,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 0: [2023-05-25 13:38:02,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 
16: [2023-05-25 13:38:02,878] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 7: [2023-05-25 13:38:02,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 16: [2023-05-25 13:38:02,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 16: [2023-05-25 13:38:02,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 21: [2023-05-25 13:38:02,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 21: [2023-05-25 13:38:02,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 7: [2023-05-25 13:38:02,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 9: [2023-05-25 13:38:02,884] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 1: [2023-05-25 13:38:02,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:02,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 
7: [2023-05-25 13:38:02,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 9: [2023-05-25 13:38:02,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 21: [2023-05-25 13:38:02,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_00-model_states.pt. 21: [2023-05-25 13:38:02,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt... 9: [2023-05-25 13:38:02,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_00-model_states.pt. 9: [2023-05-25 13:38:02,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt... 4: [2023-05-25 13:38:02,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
4: [2023-05-25 13:38:02,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 0: [2023-05-25 13:38:02,906] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 0: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 4: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 4: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 4: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 4: [2023-05-25 13:38:02,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 
6: [2023-05-25 13:38:02,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 0: [2023-05-25 13:38:02,917] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 6: [2023-05-25 13:38:02,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 6: [2023-05-25 13:38:02,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 6: [2023-05-25 13:38:02,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 
6: [2023-05-25 13:38:02,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 6: [2023-05-25 13:38:02,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 6: [2023-05-25 13:38:02,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 0: [2023-05-25 13:38:02,920] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:02,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 2: [2023-05-25 13:38:02,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 2: [2023-05-25 13:38:02,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 2: [2023-05-25 13:38:02,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 2: [2023-05-25 13:38:02,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 2: [2023-05-25 13:38:02,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
2: [2023-05-25 13:38:02,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 2: [2023-05-25 13:38:02,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:02,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt... 2: [2023-05-25 13:38:02,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:02,930] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 2: [2023-05-25 13:38:02,930] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt... 4: [2023-05-25 13:38:02,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 4: [2023-05-25 13:38:02,944] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 6: [2023-05-25 13:38:02,945] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
13: [2023-05-25 13:38:02,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 13: [2023-05-25 13:38:02,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:02,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 8: [2023-05-25 13:38:02,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 4: [2023-05-25 13:38:02,952] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 15: [2023-05-25 13:38:02,952] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 15: [2023-05-25 13:38:02,952] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 13: [2023-05-25 13:38:02,953] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 15: [2023-05-25 13:38:02,953] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 15: [2023-05-25 13:38:02,954] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
6: [2023-05-25 13:38:02,954] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 13: [2023-05-25 13:38:02,954] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 18: [2023-05-25 13:38:02,956] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 18: [2023-05-25 13:38:02,956] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 2: [2023-05-25 13:38:02,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:02,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:02,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 7: [2023-05-25 13:38:02,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 8: [2023-05-25 13:38:02,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 2: [2023-05-25 13:38:02,964] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
8: [2023-05-25 13:38:02,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 6: [2023-05-25 13:38:02,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 18: [2023-05-25 13:38:02,969] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 18: [2023-05-25 13:38:02,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 2: [2023-05-25 13:38:02,974] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:02,975] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 7: [2023-05-25 13:38:02,976] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:02,977] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 14: [2023-05-25 13:38:02,980] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 14: [2023-05-25 13:38:02,980] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 
15: [2023-05-25 13:38:02,984] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 15: [2023-05-25 13:38:02,986] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 9: [2023-05-25 13:38:02,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 9: [2023-05-25 13:38:02,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 14: [2023-05-25 13:38:02,990] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:02,992] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:02,992] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 10: [2023-05-25 13:38:02,992] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 14: [2023-05-25 13:38:02,993] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:02,993] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
14: [2023-05-25 13:38:02,994] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 13: [2023-05-25 13:38:02,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:02,995] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 13: [2023-05-25 13:38:02,996] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 20: [2023-05-25 13:38:02,998] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 20: [2023-05-25 13:38:02,998] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 20: [2023-05-25 13:38:02,998] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 11: [2023-05-25 13:38:02,998] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 11: [2023-05-25 13:38:02,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 20: [2023-05-25 13:38:03,000] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
20: [2023-05-25 13:38:03,000] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 20: [2023-05-25 13:38:03,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 9: [2023-05-25 13:38:03,003] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 9: [2023-05-25 13:38:03,003] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:03,003] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 3: [2023-05-25 13:38:03,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 3: [2023-05-25 13:38:03,003] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 10: [2023-05-25 13:38:03,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 10: [2023-05-25 13:38:03,004] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 10: [2023-05-25 13:38:03,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
10: [2023-05-25 13:38:03,005] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:03,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 7: [2023-05-25 13:38:03,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 11: [2023-05-25 13:38:03,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 11: [2023-05-25 13:38:03,008] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 7: [2023-05-25 13:38:03,008] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 7: [2023-05-25 13:38:03,009] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 27: [2023-05-25 13:38:03,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_12_optim_states.pt. 27: [2023-05-25 13:38:03,010] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 216 11: [2023-05-25 13:38:03,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
7: [2023-05-25 13:38:03,010] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 20: [2023-05-25 13:38:03,011] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 11: [2023-05-25 13:38:03,012] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 11: [2023-05-25 13:38:03,012] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 11: [2023-05-25 13:38:03,013] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 23: [2023-05-25 13:38:03,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 19: [2023-05-25 13:38:03,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 19: [2023-05-25 13:38:03,015] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 19: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
19: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 14: [2023-05-25 13:38:03,016] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 20: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 15: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 15: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 12: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 12: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 15: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_04_optim_states.pt... 15: [2023-05-25 13:38:03,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_04_optim_states.pt... 
3: [2023-05-25 13:38:03,018] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 12: [2023-05-25 13:38:03,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 3: [2023-05-25 13:38:03,019] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 12: [2023-05-25 13:38:03,020] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,020] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 8: [2023-05-25 13:38:03,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 8: [2023-05-25 13:38:03,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:03,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:03,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:03,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
12: [2023-05-25 13:38:03,022] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 12: [2023-05-25 13:38:03,022] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 8: [2023-05-25 13:38:03,023] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 8: [2023-05-25 13:38:03,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 15: [2023-05-25 13:38:03,025] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 14: [2023-05-25 13:38:03,025] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 12: [2023-05-25 13:38:03,025] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 15: [2023-05-25 13:38:03,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 
12: [2023-05-25 13:38:03,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 23: [2023-05-25 13:38:03,028] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 14: [2023-05-25 13:38:03,029] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 14: [2023-05-25 13:38:03,029] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:03,030] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 15: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 15: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 14: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 18: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
18: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 18: [2023-05-25 13:38:03,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 18: [2023-05-25 13:38:03,032] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 18: [2023-05-25 13:38:03,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 18: [2023-05-25 13:38:03,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 14: [2023-05-25 13:38:03,034] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,034] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 14: [2023-05-25 13:38:03,034] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 23: [2023-05-25 13:38:03,034] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 10: [2023-05-25 13:38:03,034] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
12: [2023-05-25 13:38:03,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 12: [2023-05-25 13:38:03,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 4: [2023-05-25 13:38:03,035] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 4: [2023-05-25 13:38:03,036] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 18: [2023-05-25 13:38:03,036] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 18: [2023-05-25 13:38:03,036] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 10: [2023-05-25 13:38:03,036] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 10: [2023-05-25 13:38:03,037] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 16: [2023-05-25 13:38:03,038] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
16: [2023-05-25 13:38:03,039] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 16: [2023-05-25 13:38:03,040] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 15: [2023-05-25 13:38:03,040] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:03,040] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 10: [2023-05-25 13:38:03,040] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 16: [2023-05-25 13:38:03,041] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 15: [2023-05-25 13:38:03,041] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 20: [2023-05-25 13:38:03,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 20: [2023-05-25 13:38:03,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 20: [2023-05-25 13:38:03,041] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
12: [2023-05-25 13:38:03,041] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:03,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 11: [2023-05-25 13:38:03,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 12: [2023-05-25 13:38:03,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 11: [2023-05-25 13:38:03,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 10: [2023-05-25 13:38:03,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 20: [2023-05-25 13:38:03,044] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 10: [2023-05-25 13:38:03,043] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 13: [2023-05-25 13:38:03,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_04_optim_states.pt... 13: [2023-05-25 13:38:03,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_04_optim_states.pt... 
22: [2023-05-25 13:38:03,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,046] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 22: [2023-05-25 13:38:03,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 23: [2023-05-25 13:38:03,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 23: [2023-05-25 13:38:03,047] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 2: [2023-05-25 13:38:03,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 4: [2023-05-25 13:38:03,049] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
4: [2023-05-25 13:38:03,049] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 2: [2023-05-25 13:38:03,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 0: [2023-05-25 13:38:03,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 0: [2023-05-25 13:38:03,049] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 6: [2023-05-25 13:38:03,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 3: [2023-05-25 13:38:03,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 10: [2023-05-25 13:38:03,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:03,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 6: [2023-05-25 13:38:03,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 13: [2023-05-25 13:38:03,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 
13: [2023-05-25 13:38:03,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 13: [2023-05-25 13:38:03,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 13: [2023-05-25 13:38:03,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_01-model_states.pt. 11: [2023-05-25 13:38:03,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 19: [2023-05-25 13:38:03,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 11: [2023-05-25 13:38:03,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 11: [2023-05-25 13:38:03,053] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 11: [2023-05-25 13:38:03,054] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 3: [2023-05-25 13:38:03,054] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 8: [2023-05-25 13:38:03,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
3: [2023-05-25 13:38:03,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 19: [2023-05-25 13:38:03,055] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 11: [2023-05-25 13:38:03,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 20: [2023-05-25 13:38:03,057] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 11: [2023-05-25 13:38:03,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 11: [2023-05-25 13:38:03,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 11: [2023-05-25 13:38:03,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 3: [2023-05-25 13:38:03,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 12: [2023-05-25 13:38:03,058] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
12: [2023-05-25 13:38:03,058] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 20: [2023-05-25 13:38:03,059] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 13: [2023-05-25 13:38:03,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 21: [2023-05-25 13:38:03,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 17: [2023-05-25 13:38:03,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 13: [2023-05-25 13:38:03,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 16: [2023-05-25 13:38:03,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 17: [2023-05-25 13:38:03,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 18: [2023-05-25 13:38:03,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 16: [2023-05-25 13:38:03,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 
21: [2023-05-25 13:38:03,061] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 18: [2023-05-25 13:38:03,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 2: [2023-05-25 13:38:03,062] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 17: [2023-05-25 13:38:03,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 17: [2023-05-25 13:38:03,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 6: [2023-05-25 13:38:03,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 8: [2023-05-25 13:38:03,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:03,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 22: [2023-05-25 13:38:03,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 2: [2023-05-25 13:38:03,063] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 
22: [2023-05-25 13:38:03,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 14: [2023-05-25 13:38:03,063] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 5: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 20: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_08_optim_states.pt... 20: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_08_optim_states.pt... 10: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_04_optim_states.pt... 10: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_04_optim_states.pt... 0: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 8: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
6: [2023-05-25 13:38:03,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 17: [2023-05-25 13:38:03,064] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 17: [2023-05-25 13:38:03,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 12: [2023-05-25 13:38:03,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 0: [2023-05-25 13:38:03,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 14: [2023-05-25 13:38:03,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 5: [2023-05-25 13:38:03,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 8: [2023-05-25 13:38:03,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 1: [2023-05-25 13:38:03,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 1: [2023-05-25 13:38:03,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_10-model_02-model_states.pt. 
23: [2023-05-25 13:38:03,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 8: [2023-05-25 13:38:03,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 8: [2023-05-25 13:38:03,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 14: [2023-05-25 13:38:03,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_04_optim_states.pt... 15: [2023-05-25 13:38:03,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 15: [2023-05-25 13:38:03,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:03,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_04_optim_states.pt... 19: [2023-05-25 13:38:03,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 19: [2023-05-25 13:38:03,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 12: [2023-05-25 13:38:03,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 
23: [2023-05-25 13:38:03,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 13: [2023-05-25 13:38:03,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 13: [2023-05-25 13:38:03,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 15: [2023-05-25 13:38:03,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 15: [2023-05-25 13:38:03,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 15: [2023-05-25 13:38:03,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 9: [2023-05-25 13:38:03,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 9: [2023-05-25 13:38:03,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 9: [2023-05-25 13:38:03,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 23: [2023-05-25 13:38:03,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
9: [2023-05-25 13:38:03,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 13: [2023-05-25 13:38:03,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 15: [2023-05-25 13:38:03,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 14: [2023-05-25 13:38:03,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 16: [2023-05-25 13:38:03,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 12: [2023-05-25 13:38:03,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 16: [2023-05-25 13:38:03,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 9: [2023-05-25 13:38:03,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 21: [2023-05-25 13:38:03,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 23: [2023-05-25 13:38:03,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 
4: [2023-05-25 13:38:03,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 10: [2023-05-25 13:38:03,074] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 10: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 9: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 12: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 13: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 9: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 9: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 14: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 21: [2023-05-25 13:38:03,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
12: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 17: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 23: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 1: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 15: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 17: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 5: [2023-05-25 13:38:03,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 19: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_08_optim_states.pt... 19: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_08_optim_states.pt... 
16: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 4: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 1: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 23: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 10: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 16: [2023-05-25 13:38:03,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 4: [2023-05-25 13:38:03,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 12: [2023-05-25 13:38:03,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 
15: [2023-05-25 13:38:03,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 13: [2023-05-25 13:38:03,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 5: [2023-05-25 13:38:03,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 10: [2023-05-25 13:38:03,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 4: [2023-05-25 13:38:03,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 12: [2023-05-25 13:38:03,080] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:03,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 8: [2023-05-25 13:38:03,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 13: [2023-05-25 13:38:03,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 12: [2023-05-25 13:38:03,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 
1: [2023-05-25 13:38:03,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 22: [2023-05-25 13:38:03,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 6: [2023-05-25 13:38:03,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 7: [2023-05-25 13:38:03,084] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 19: [2023-05-25 13:38:03,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 19: [2023-05-25 13:38:03,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 6: [2023-05-25 13:38:03,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 1: [2023-05-25 13:38:03,086] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt... 15: [2023-05-25 13:38:03,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 
15: [2023-05-25 13:38:03,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 11: [2023-05-25 13:38:03,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 19: [2023-05-25 13:38:03,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 5: [2023-05-25 13:38:03,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 10: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 11: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_04_optim_states.pt... 11: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_04_optim_states.pt... 12: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 12: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 14: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 
14: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 2: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 5: [2023-05-25 13:38:03,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 18: [2023-05-25 13:38:03,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_08_optim_states.pt... 18: [2023-05-25 13:38:03,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_08_optim_states.pt... 29: [2023-05-25 13:38:03,089] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_12_optim_states.pt. 29: [2023-05-25 13:38:03,089] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 236 10: [2023-05-25 13:38:03,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 12: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_04_optim_states.pt... 12: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_04_optim_states.pt... 
19: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 10: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 11: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 11: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 9: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 2: [2023-05-25 13:38:03,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 2: [2023-05-25 13:38:03,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 9: [2023-05-25 13:38:03,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 8: [2023-05-25 13:38:03,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_04_optim_states.pt... 8: [2023-05-25 13:38:03,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_04_optim_states.pt... 
11: [2023-05-25 13:38:03,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 2: [2023-05-25 13:38:03,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 10: [2023-05-25 13:38:03,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 6: [2023-05-25 13:38:03,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 1: [2023-05-25 13:38:03,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 1: [2023-05-25 13:38:03,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 0: [2023-05-25 13:38:03,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 6: [2023-05-25 13:38:03,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 3: [2023-05-25 13:38:03,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 3: [2023-05-25 13:38:03,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 
23: [2023-05-25 13:38:03,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_08_optim_states.pt... 23: [2023-05-25 13:38:03,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_08_optim_states.pt... 0: [2023-05-25 13:38:03,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 11: [2023-05-25 13:38:03,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 11: [2023-05-25 13:38:03,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_03-model_states.pt. 7: [2023-05-25 13:38:03,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 19: [2023-05-25 13:38:03,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 16: [2023-05-25 13:38:03,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_08_optim_states.pt... 16: [2023-05-25 13:38:03,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_08_optim_states.pt... 
17: [2023-05-25 13:38:03,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 12: [2023-05-25 13:38:03,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 5: [2023-05-25 13:38:03,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 22: [2023-05-25 13:38:03,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_08_optim_states.pt... 22: [2023-05-25 13:38:03,101] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_08_optim_states.pt... 5: [2023-05-25 13:38:03,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 14: [2023-05-25 13:38:03,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 0: [2023-05-25 13:38:03,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 15: [2023-05-25 13:38:03,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 
17: [2023-05-25 13:38:03,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 15: [2023-05-25 13:38:03,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 14: [2023-05-25 13:38:03,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 16: [2023-05-25 13:38:03,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 9: [2023-05-25 13:38:03,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 19: [2023-05-25 13:38:03,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 17: [2023-05-25 13:38:03,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 17: [2023-05-25 13:38:03,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 12: [2023-05-25 13:38:03,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 9: [2023-05-25 13:38:03,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_23-model_02-model_states.pt. 
17: [2023-05-25 13:38:03,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 17: [2023-05-25 13:38:03,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 13: [2023-05-25 13:38:03,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 13: [2023-05-25 13:38:03,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 16: [2023-05-25 13:38:03,107] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 13: [2023-05-25 13:38:03,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 16: [2023-05-25 13:38:03,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 0: [2023-05-25 13:38:03,109] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 5: [2023-05-25 13:38:03,109] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 
9: [2023-05-25 13:38:03,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 16: [2023-05-25 13:38:03,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 3: [2023-05-25 13:38:03,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 9: [2023-05-25 13:38:03,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 0: [2023-05-25 13:38:03,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 8: [2023-05-25 13:38:03,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 8: [2023-05-25 13:38:03,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 22: [2023-05-25 13:38:03,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 3: [2023-05-25 13:38:03,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 22: [2023-05-25 13:38:03,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
13: [2023-05-25 13:38:03,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 17: [2023-05-25 13:38:03,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 13: [2023-05-25 13:38:03,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 13: [2023-05-25 13:38:03,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 11: [2023-05-25 13:38:03,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 0: [2023-05-25 13:38:03,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 0: [2023-05-25 13:38:03,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 5: [2023-05-25 13:38:03,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 22: [2023-05-25 13:38:03,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 5: [2023-05-25 13:38:03,114] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 
1: [2023-05-25 13:38:03,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 11: [2023-05-25 13:38:03,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 13: [2023-05-25 13:38:03,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 13: [2023-05-25 13:38:03,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 13: [2023-05-25 13:38:03,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 13: [2023-05-25 13:38:03,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 5: [2023-05-25 13:38:03,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 17: [2023-05-25 13:38:03,116] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 9: [2023-05-25 13:38:03,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 9: [2023-05-25 13:38:03,117] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
1: [2023-05-25 13:38:03,117] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 8: [2023-05-25 13:38:03,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 8: [2023-05-25 13:38:03,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 9: [2023-05-25 13:38:03,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt... 10: [2023-05-25 13:38:03,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 10: [2023-05-25 13:38:03,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 27: [2023-05-25 13:38:03,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_12_optim_states.pt. 1: [2023-05-25 13:38:03,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_00-model_states.pt. 27: [2023-05-25 13:38:03,120] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 220 17: [2023-05-25 13:38:03,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
17: [2023-05-25 13:38:03,121] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 12: [2023-05-25 13:38:03,121] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 19: [2023-05-25 13:38:03,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 0: [2023-05-25 13:38:03,122] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 14: [2023-05-25 13:38:03,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 1: [2023-05-25 13:38:03,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt... 31: [2023-05-25 13:38:03,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_12_optim_states.pt. 31: [2023-05-25 13:38:03,123] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 248 19: [2023-05-25 13:38:03,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 28: [2023-05-25 13:38:03,124] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_12_optim_states.pt. 
28: [2023-05-25 13:38:03,124] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 224 5: [2023-05-25 13:38:03,124] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 13: [2023-05-25 13:38:03,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 10: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 19: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 10: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 0: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 
5: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 5: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 12: [2023-05-25 13:38:03,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 9: [2023-05-25 13:38:03,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 13: [2023-05-25 13:38:03,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt... 6: [2023-05-25 13:38:03,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 6: [2023-05-25 13:38:03,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 19: [2023-05-25 13:38:03,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 0: [2023-05-25 13:38:03,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 4: [2023-05-25 13:38:03,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
19: [2023-05-25 13:38:03,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 4: [2023-05-25 13:38:03,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 21: [2023-05-25 13:38:03,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 21: [2023-05-25 13:38:03,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 31: [2023-05-25 13:38:03,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_12_optim_states.pt. 31: [2023-05-25 13:38:03,130] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 252 15: [2023-05-25 13:38:03,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 14: [2023-05-25 13:38:03,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 4: [2023-05-25 13:38:03,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 4: [2023-05-25 13:38:03,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 
19: [2023-05-25 13:38:03,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 4: [2023-05-25 13:38:03,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 4: [2023-05-25 13:38:03,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 15: [2023-05-25 13:38:03,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 15: [2023-05-25 13:38:03,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 17: [2023-05-25 13:38:03,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_08_optim_states.pt... 17: [2023-05-25 13:38:03,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_08_optim_states.pt... 4: [2023-05-25 13:38:03,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 4: [2023-05-25 13:38:03,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 7: [2023-05-25 13:38:03,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
7: [2023-05-25 13:38:03,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 15: [2023-05-25 13:38:03,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 14: [2023-05-25 13:38:03,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 14: [2023-05-25 13:38:03,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 7: [2023-05-25 13:38:03,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 19: [2023-05-25 13:38:03,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:03,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
3: [2023-05-25 13:38:03,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 19: [2023-05-25 13:38:03,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 7: [2023-05-25 13:38:03,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 5: [2023-05-25 13:38:03,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 9: [2023-05-25 13:38:03,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 12: [2023-05-25 13:38:03,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 7: [2023-05-25 13:38:03,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 7: [2023-05-25 13:38:03,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 9: [2023-05-25 13:38:03,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 
3: [2023-05-25 13:38:03,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 3: [2023-05-25 13:38:03,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 6: [2023-05-25 13:38:03,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 6: [2023-05-25 13:38:03,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 3: [2023-05-25 13:38:03,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 5: [2023-05-25 13:38:03,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 5: [2023-05-25 13:38:03,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 12: [2023-05-25 13:38:03,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 11: [2023-05-25 13:38:03,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 21: [2023-05-25 13:38:03,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
21: [2023-05-25 13:38:03,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 11: [2023-05-25 13:38:03,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 0: [2023-05-25 13:38:03,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 9: [2023-05-25 13:38:03,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_04_optim_states.pt... 9: [2023-05-25 13:38:03,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_04_optim_states.pt... 0: [2023-05-25 13:38:03,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 11: [2023-05-25 13:38:03,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 4: [2023-05-25 13:38:03,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 4: [2023-05-25 13:38:03,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 11: [2023-05-25 13:38:03,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 
3: [2023-05-25 13:38:03,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 9: [2023-05-25 13:38:03,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 26: [2023-05-25 13:38:03,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_12_optim_states.pt. 26: [2023-05-25 13:38:03,149] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 212 3: [2023-05-25 13:38:03,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 9: [2023-05-25 13:38:03,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 4: [2023-05-25 13:38:03,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 4: [2023-05-25 13:38:03,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 17: [2023-05-25 13:38:03,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 17: [2023-05-25 13:38:03,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
17: [2023-05-25 13:38:03,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 17: [2023-05-25 13:38:03,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 9: [2023-05-25 13:38:03,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 5: [2023-05-25 13:38:03,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 29: [2023-05-25 13:38:03,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_12_optim_states.pt. 29: [2023-05-25 13:38:03,157] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 232 9: [2023-05-25 13:38:03,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt... 24: [2023-05-25 13:38:03,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_12_optim_states.pt. 24: [2023-05-25 13:38:03,160] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 192 9: [2023-05-25 13:38:03,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_00-model_states.pt. 
5: [2023-05-25 13:38:03,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 23: [2023-05-25 13:38:03,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 7: [2023-05-25 13:38:03,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 23: [2023-05-25 13:38:03,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 21: [2023-05-25 13:38:03,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 21: [2023-05-25 13:38:03,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 21: [2023-05-25 13:38:03,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 21: [2023-05-25 13:38:03,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 16: [2023-05-25 13:38:03,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 
0: [2023-05-25 13:38:03,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 9: [2023-05-25 13:38:03,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt... 0: [2023-05-25 13:38:03,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 6: [2023-05-25 13:38:03,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 6: [2023-05-25 13:38:03,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 22: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 22: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 21: [2023-05-25 13:38:03,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 21: [2023-05-25 13:38:03,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
22: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 22: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 0: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 21: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 21: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 20: [2023-05-25 13:38:03,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 20: [2023-05-25 13:38:03,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 6: [2023-05-25 13:38:03,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 16: [2023-05-25 13:38:03,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 6: [2023-05-25 13:38:03,165] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 
0: [2023-05-25 13:38:03,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 1: [2023-05-25 13:38:03,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 1: [2023-05-25 13:38:03,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 18: [2023-05-25 13:38:03,169] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 5: [2023-05-25 13:38:03,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 3: [2023-05-25 13:38:03,169] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 18: [2023-05-25 13:38:03,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 23: [2023-05-25 13:38:03,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 23: [2023-05-25 13:38:03,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 
5: [2023-05-25 13:38:03,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 4: [2023-05-25 13:38:03,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 23: [2023-05-25 13:38:03,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 0: [2023-05-25 13:38:03,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 21: [2023-05-25 13:38:03,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 21: [2023-05-25 13:38:03,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
23: [2023-05-25 13:38:03,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 16: [2023-05-25 13:38:03,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 7: [2023-05-25 13:38:03,177] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 4: [2023-05-25 13:38:03,177] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 4: [2023-05-25 13:38:03,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 21: [2023-05-25 13:38:03,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 21: [2023-05-25 13:38:03,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 20: [2023-05-25 13:38:03,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 0: [2023-05-25 13:38:03,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 20: [2023-05-25 13:38:03,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 
16: [2023-05-25 13:38:03,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 4: [2023-05-25 13:38:03,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 22: [2023-05-25 13:38:03,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 6: [2023-05-25 13:38:03,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 6: [2023-05-25 13:38:03,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 22: [2023-05-25 13:38:03,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 22: [2023-05-25 13:38:03,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 18: [2023-05-25 13:38:03,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 22: [2023-05-25 13:38:03,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 6: [2023-05-25 13:38:03,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 
6: [2023-05-25 13:38:03,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 0: [2023-05-25 13:38:03,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 23: [2023-05-25 13:38:03,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 1: [2023-05-25 13:38:03,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 3: [2023-05-25 13:38:03,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 18: [2023-05-25 13:38:03,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 1: [2023-05-25 13:38:03,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 
2: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 1: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 1: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 1: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 1: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 2: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_03-model_states.pt. 23: [2023-05-25 13:38:03,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 24: [2023-05-25 13:38:03,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_12_optim_states.pt. 4: [2023-05-25 13:38:03,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
24: [2023-05-25 13:38:03,188] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 196 4: [2023-05-25 13:38:03,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 28: [2023-05-25 13:38:03,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_12_optim_states.pt. 2: [2023-05-25 13:38:03,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 1: [2023-05-25 13:38:03,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 28: [2023-05-25 13:38:03,189] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 228 2: [2023-05-25 13:38:03,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 1: [2023-05-25 13:38:03,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 4: [2023-05-25 13:38:03,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 
1: [2023-05-25 13:38:03,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 1: [2023-05-25 13:38:03,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 21: [2023-05-25 13:38:03,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 6: [2023-05-25 13:38:03,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 21: [2023-05-25 13:38:03,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 7: [2023-05-25 13:38:03,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 21: [2023-05-25 13:38:03,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 21: [2023-05-25 13:38:03,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_02-model_states.pt. 
6: [2023-05-25 13:38:03,196] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 19: [2023-05-25 13:38:03,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 5: [2023-05-25 13:38:03,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 2: [2023-05-25 13:38:03,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 19: [2023-05-25 13:38:03,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 0: [2023-05-25 13:38:03,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 2: [2023-05-25 13:38:03,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 18: [2023-05-25 13:38:03,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 0: [2023-05-25 13:38:03,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 
2: [2023-05-25 13:38:03,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 18: [2023-05-25 13:38:03,205] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 18: [2023-05-25 13:38:03,205] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_03-model_states.pt. 18: [2023-05-25 13:38:03,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 0: [2023-05-25 13:38:03,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 16: [2023-05-25 13:38:03,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 0: [2023-05-25 13:38:03,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 2: [2023-05-25 13:38:03,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:03,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 20: [2023-05-25 13:38:03,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
20: [2023-05-25 13:38:03,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 6: [2023-05-25 13:38:03,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 16: [2023-05-25 13:38:03,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 7: [2023-05-25 13:38:03,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 20: [2023-05-25 13:38:03,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 20: [2023-05-25 13:38:03,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 21: [2023-05-25 13:38:03,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 16: [2023-05-25 13:38:03,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 
21: [2023-05-25 13:38:03,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 6: [2023-05-25 13:38:03,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 16: [2023-05-25 13:38:03,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 22: [2023-05-25 13:38:03,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 19: [2023-05-25 13:38:03,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 5: [2023-05-25 13:38:03,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 22: [2023-05-25 13:38:03,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 16: [2023-05-25 13:38:03,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 
26: [2023-05-25 13:38:03,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_12_optim_states.pt. 16: [2023-05-25 13:38:03,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 26: [2023-05-25 13:38:03,215] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 208 5: [2023-05-25 13:38:03,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 23: [2023-05-25 13:38:03,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 22: [2023-05-25 13:38:03,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 19: [2023-05-25 13:38:03,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 22: [2023-05-25 13:38:03,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 1: [2023-05-25 13:38:03,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 1: [2023-05-25 13:38:03,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
20: [2023-05-25 13:38:03,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 23: [2023-05-25 13:38:03,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 18: [2023-05-25 13:38:03,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 20: [2023-05-25 13:38:03,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 23: [2023-05-25 13:38:03,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 22: [2023-05-25 13:38:03,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 23: [2023-05-25 13:38:03,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 18: [2023-05-25 13:38:03,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 1: [2023-05-25 13:38:03,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 
1: [2023-05-25 13:38:03,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 1: [2023-05-25 13:38:03,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 22: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 21: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_08_optim_states.pt... 21: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_08_optim_states.pt... 3: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 18: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 23: [2023-05-25 13:38:03,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 1: [2023-05-25 13:38:03,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
18: [2023-05-25 13:38:03,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 6: [2023-05-25 13:38:03,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 6: [2023-05-25 13:38:03,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_01-model_states.pt. 2: [2023-05-25 13:38:03,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 2: [2023-05-25 13:38:03,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 16: [2023-05-25 13:38:03,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 2: [2023-05-25 13:38:03,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:03,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:03,230] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 2: [2023-05-25 13:38:03,230] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
20: [2023-05-25 13:38:03,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 20: [2023-05-25 13:38:03,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 16: [2023-05-25 13:38:03,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 2: [2023-05-25 13:38:03,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 2: [2023-05-25 13:38:03,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 18: [2023-05-25 13:38:03,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 18: [2023-05-25 13:38:03,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 25: [2023-05-25 13:38:03,234] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_12_optim_states.pt. 25: [2023-05-25 13:38:03,234] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 204 21: [2023-05-25 13:38:03,236] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
3: [2023-05-25 13:38:03,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:03,237] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 21: [2023-05-25 13:38:03,238] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 2: [2023-05-25 13:38:03,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 7: [2023-05-25 13:38:03,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 1: [2023-05-25 13:38:03,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 21: [2023-05-25 13:38:03,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
5: [2023-05-25 13:38:03,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 6: [2023-05-25 13:38:03,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:03,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt... 21: [2023-05-25 13:38:03,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt... 2: [2023-05-25 13:38:03,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 6: [2023-05-25 13:38:03,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 1: [2023-05-25 13:38:03,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 19: [2023-05-25 13:38:03,241] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 7: [2023-05-25 13:38:03,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 2: [2023-05-25 13:38:03,243] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 
5: [2023-05-25 13:38:03,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 19: [2023-05-25 13:38:03,246] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 19: [2023-05-25 13:38:03,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 20: [2023-05-25 13:38:03,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 5: [2023-05-25 13:38:03,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 19: [2023-05-25 13:38:03,249] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 17: [2023-05-25 13:38:03,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 17: [2023-05-25 13:38:03,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_34-model_01-model_states.pt. 20: [2023-05-25 13:38:03,254] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 18: [2023-05-25 13:38:03,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
18: [2023-05-25 13:38:03,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt... 20: [2023-05-25 13:38:03,260] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 16: [2023-05-25 13:38:03,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 2: [2023-05-25 13:38:03,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 2: [2023-05-25 13:38:03,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 20: [2023-05-25 13:38:03,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 17: [2023-05-25 13:38:03,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 17: [2023-05-25 13:38:03,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt... 16: [2023-05-25 13:38:03,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 3: [2023-05-25 13:38:03,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
3: [2023-05-25 13:38:03,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,269] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 16: [2023-05-25 13:38:03,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 6: [2023-05-25 13:38:03,271] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 6: [2023-05-25 13:38:03,271] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 16: [2023-05-25 13:38:03,272] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 6: [2023-05-25 13:38:03,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 6: [2023-05-25 13:38:03,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt... 12: [2023-05-25 13:38:03,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 
12: [2023-05-25 13:38:03,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 2: [2023-05-25 13:38:03,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 2: [2023-05-25 13:38:03,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 17: [2023-05-25 13:38:03,289] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 17: [2023-05-25 13:38:03,291] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 30: [2023-05-25 13:38:03,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_12_optim_states.pt. 30: [2023-05-25 13:38:03,294] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 240 18: [2023-05-25 13:38:03,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 18: [2023-05-25 13:38:03,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 17: [2023-05-25 13:38:03,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_00-model_states.pt. 
17: [2023-05-25 13:38:03,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt... 12: [2023-05-25 13:38:03,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_05_optim_states.pt... 12: [2023-05-25 13:38:03,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_05_optim_states.pt... 4: [2023-05-25 13:38:03,301] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 4: [2023-05-25 13:38:03,301] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 25: [2023-05-25 13:38:03,302] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_12_optim_states.pt. 25: [2023-05-25 13:38:03,302] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 200 8: [2023-05-25 13:38:03,305] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 8: [2023-05-25 13:38:03,305] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 4: [2023-05-25 13:38:03,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 
4: [2023-05-25 13:38:03,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 14: [2023-05-25 13:38:03,317] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 14: [2023-05-25 13:38:03,318] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 20: [2023-05-25 13:38:03,325] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 20: [2023-05-25 13:38:03,325] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 9: [2023-05-25 13:38:03,325] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 9: [2023-05-25 13:38:03,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 3: [2023-05-25 13:38:03,327] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 3: [2023-05-25 13:38:03,327] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 15: [2023-05-25 13:38:03,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 
15: [2023-05-25 13:38:03,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 6: [2023-05-25 13:38:03,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 7: [2023-05-25 13:38:03,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 7: [2023-05-25 13:38:03,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 15: [2023-05-25 13:38:03,338] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 6: [2023-05-25 13:38:03,339] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 8: [2023-05-25 13:38:03,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_05_optim_states.pt... 8: [2023-05-25 13:38:03,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_05_optim_states.pt... 3: [2023-05-25 13:38:03,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 15: [2023-05-25 13:38:03,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 
14: [2023-05-25 13:38:03,341] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_05_optim_states.pt... 14: [2023-05-25 13:38:03,341] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_05_optim_states.pt... 3: [2023-05-25 13:38:03,342] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 4: [2023-05-25 13:38:03,345] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 18: [2023-05-25 13:38:03,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_10_optim_states.pt... 18: [2023-05-25 13:38:03,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_10_optim_states.pt... 10: [2023-05-25 13:38:03,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 4: [2023-05-25 13:38:03,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 10: [2023-05-25 13:38:03,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 
4: [2023-05-25 13:38:03,348] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 4: [2023-05-25 13:38:03,349] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 7: [2023-05-25 13:38:03,351] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 7: [2023-05-25 13:38:03,354] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 6: [2023-05-25 13:38:03,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 6: [2023-05-25 13:38:03,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 20: [2023-05-25 13:38:03,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_10_optim_states.pt... 20: [2023-05-25 13:38:03,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_10_optim_states.pt... 11: [2023-05-25 13:38:03,357] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 
9: [2023-05-25 13:38:03,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_05_optim_states.pt... 9: [2023-05-25 13:38:03,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_05_optim_states.pt... 3: [2023-05-25 13:38:03,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 11: [2023-05-25 13:38:03,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 3: [2023-05-25 13:38:03,360] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 7: [2023-05-25 13:38:03,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 7: [2023-05-25 13:38:03,364] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 3: [2023-05-25 13:38:03,364] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 17: [2023-05-25 13:38:03,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 
17: [2023-05-25 13:38:03,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 3: [2023-05-25 13:38:03,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 0: [2023-05-25 13:38:03,372] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 10: [2023-05-25 13:38:03,373] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_05_optim_states.pt... 10: [2023-05-25 13:38:03,373] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_05_optim_states.pt... 0: [2023-05-25 13:38:03,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 5: [2023-05-25 13:38:03,374] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 3: [2023-05-25 13:38:03,374] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 5: [2023-05-25 13:38:03,375] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 11: [2023-05-25 13:38:03,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 
11: [2023-05-25 13:38:03,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 11: [2023-05-25 13:38:03,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_05_optim_states.pt... 7: [2023-05-25 13:38:03,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 3: [2023-05-25 13:38:03,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 7: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 15: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_06_optim_states.pt... 15: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_06_optim_states.pt... 15: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_05_optim_states.pt... 
15: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_05_optim_states.pt... 10: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 10: [2023-05-25 13:38:03,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 7: [2023-05-25 13:38:03,379] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 3: [2023-05-25 13:38:03,379] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 11: [2023-05-25 13:38:03,380] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_05_optim_states.pt... 23: [2023-05-25 13:38:03,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 6: [2023-05-25 13:38:03,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 19: [2023-05-25 13:38:03,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 19: [2023-05-25 13:38:03,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 
5: [2023-05-25 13:38:03,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 0: [2023-05-25 13:38:03,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 5: [2023-05-25 13:38:03,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 6: [2023-05-25 13:38:03,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 7: [2023-05-25 13:38:03,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,389] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:03,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 2: [2023-05-25 13:38:03,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 7: [2023-05-25 13:38:03,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 17: [2023-05-25 13:38:03,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_10_optim_states.pt... 
17: [2023-05-25 13:38:03,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_10_optim_states.pt... 6: [2023-05-25 13:38:03,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 6: [2023-05-25 13:38:03,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 16: [2023-05-25 13:38:03,393] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 23: [2023-05-25 13:38:03,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 16: [2023-05-25 13:38:03,395] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 13: [2023-05-25 13:38:03,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 4: [2023-05-25 13:38:03,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 13: [2023-05-25 13:38:03,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_01-model_states.pt. 17: [2023-05-25 13:38:03,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 
4: [2023-05-25 13:38:03,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 17: [2023-05-25 13:38:03,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 2: [2023-05-25 13:38:03,403] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:03,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 13: [2023-05-25 13:38:03,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 13: [2023-05-25 13:38:03,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 12: [2023-05-25 13:38:03,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 1: [2023-05-25 13:38:03,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 1: [2023-05-25 13:38:03,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 11: [2023-05-25 13:38:03,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_06_optim_states.pt... 
11: [2023-05-25 13:38:03,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_06_optim_states.pt... 12: [2023-05-25 13:38:03,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 19: [2023-05-25 13:38:03,410] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 4: [2023-05-25 13:38:03,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 0: [2023-05-25 13:38:03,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 4: [2023-05-25 13:38:03,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 19: [2023-05-25 13:38:03,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 0: [2023-05-25 13:38:03,414] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 16: [2023-05-25 13:38:03,416] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_10_optim_states.pt... 
16: [2023-05-25 13:38:03,416] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_10_optim_states.pt... 22: [2023-05-25 13:38:03,416] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 22: [2023-05-25 13:38:03,416] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 10: [2023-05-25 13:38:03,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_06_optim_states.pt... 10: [2023-05-25 13:38:03,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_06_optim_states.pt... 5: [2023-05-25 13:38:03,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 5: [2023-05-25 13:38:03,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_11-model_02-model_states.pt. 1: [2023-05-25 13:38:03,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 16: [2023-05-25 13:38:03,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 21: [2023-05-25 13:38:03,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 
21: [2023-05-25 13:38:03,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 21: [2023-05-25 13:38:03,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 21: [2023-05-25 13:38:03,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_02-model_states.pt. 16: [2023-05-25 13:38:03,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 1: [2023-05-25 13:38:03,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 0: [2023-05-25 13:38:03,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 0: [2023-05-25 13:38:03,424] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 0: [2023-05-25 13:38:03,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 0: [2023-05-25 13:38:03,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 6: [2023-05-25 13:38:03,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 
6: [2023-05-25 13:38:03,429] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 17: [2023-05-25 13:38:03,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_11_optim_states.pt... 17: [2023-05-25 13:38:03,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_11_optim_states.pt... 13: [2023-05-25 13:38:03,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_05_optim_states.pt... 13: [2023-05-25 13:38:03,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_05_optim_states.pt... 23: [2023-05-25 13:38:03,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_10_optim_states.pt... 23: [2023-05-25 13:38:03,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_10_optim_states.pt... 2: [2023-05-25 13:38:03,432] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 
5: [2023-05-25 13:38:03,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt... 2: [2023-05-25 13:38:03,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 14: [2023-05-25 13:38:03,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 13: [2023-05-25 13:38:03,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_06_optim_states.pt... 13: [2023-05-25 13:38:03,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_06_optim_states.pt... 14: [2023-05-25 13:38:03,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 4: [2023-05-25 13:38:03,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 4: [2023-05-25 13:38:03,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 0: [2023-05-25 13:38:03,438] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 0: [2023-05-25 13:38:03,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
2: [2023-05-25 13:38:03,439] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 19: [2023-05-25 13:38:03,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_11_optim_states.pt... 19: [2023-05-25 13:38:03,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_11_optim_states.pt... 2: [2023-05-25 13:38:03,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 6: [2023-05-25 13:38:03,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 6: [2023-05-25 13:38:03,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 19: [2023-05-25 13:38:03,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_10_optim_states.pt... 19: [2023-05-25 13:38:03,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_10_optim_states.pt... 1: [2023-05-25 13:38:03,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 
8: [2023-05-25 13:38:03,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 8: [2023-05-25 13:38:03,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 1: [2023-05-25 13:38:03,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 4: [2023-05-25 13:38:03,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 7: [2023-05-25 13:38:03,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 1: [2023-05-25 13:38:03,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 4: [2023-05-25 13:38:03,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 1: [2023-05-25 13:38:03,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 21: [2023-05-25 13:38:03,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_10_optim_states.pt... 
21: [2023-05-25 13:38:03,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_10_optim_states.pt... 21: [2023-05-25 13:38:03,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_09_optim_states.pt... 21: [2023-05-25 13:38:03,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_09_optim_states.pt... 12: [2023-05-25 13:38:03,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_06_optim_states.pt... 12: [2023-05-25 13:38:03,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_06_optim_states.pt... 0: [2023-05-25 13:38:03,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 0: [2023-05-25 13:38:03,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 1: [2023-05-25 13:38:03,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 1: [2023-05-25 13:38:03,461] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 
5: [2023-05-25 13:38:03,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 5: [2023-05-25 13:38:03,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 5: [2023-05-25 13:38:03,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 18: [2023-05-25 13:38:03,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 5: [2023-05-25 13:38:03,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 12: [2023-05-25 13:38:03,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 25: [2023-05-25 13:38:03,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_14_optim_states.pt. 12: [2023-05-25 13:38:03,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 30: [2023-05-25 13:38:03,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_14_optim_states.pt. 
25: [2023-05-25 13:38:03,464] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 202 30: [2023-05-25 13:38:03,464] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 242 5: [2023-05-25 13:38:03,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_00-model_states.pt. 5: [2023-05-25 13:38:03,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 5: [2023-05-25 13:38:03,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 5: [2023-05-25 13:38:03,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 5: [2023-05-25 13:38:03,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 5: [2023-05-25 13:38:03,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 16: [2023-05-25 13:38:03,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_11_optim_states.pt... 16: [2023-05-25 13:38:03,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_11_optim_states.pt... 
18: [2023-05-25 13:38:03,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 5: [2023-05-25 13:38:03,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt... 7: [2023-05-25 13:38:03,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 23: [2023-05-25 13:38:03,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 13: [2023-05-25 13:38:03,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 13: [2023-05-25 13:38:03,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 23: [2023-05-25 13:38:03,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 7: [2023-05-25 13:38:03,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 22: [2023-05-25 13:38:03,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_10_optim_states.pt... 22: [2023-05-25 13:38:03,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_10_optim_states.pt... 
9: [2023-05-25 13:38:03,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 9: [2023-05-25 13:38:03,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_02-model_states.pt. 0: [2023-05-25 13:38:03,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 21: [2023-05-25 13:38:03,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 21: [2023-05-25 13:38:03,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 0: [2023-05-25 13:38:03,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 8: [2023-05-25 13:38:03,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 1: [2023-05-25 13:38:03,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 1: [2023-05-25 13:38:03,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 8: [2023-05-25 13:38:03,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 
14: [2023-05-25 13:38:03,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_06_optim_states.pt... 14: [2023-05-25 13:38:03,479] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_06_optim_states.pt... 6: [2023-05-25 13:38:03,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,480] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 22: [2023-05-25 13:38:03,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 22: [2023-05-25 13:38:03,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 6: [2023-05-25 13:38:03,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 6: [2023-05-25 13:38:03,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
14: [2023-05-25 13:38:03,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 14: [2023-05-25 13:38:03,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 6: [2023-05-25 13:38:03,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 6: [2023-05-25 13:38:03,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 12: [2023-05-25 13:38:03,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_07_optim_states.pt... 12: [2023-05-25 13:38:03,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_07_optim_states.pt... 25: [2023-05-25 13:38:03,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_15_optim_states.pt. 30: [2023-05-25 13:38:03,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_12_optim_states.pt. 30: [2023-05-25 13:38:03,486] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 244 5: [2023-05-25 13:38:03,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
9: [2023-05-25 13:38:03,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 9: [2023-05-25 13:38:03,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 23: [2023-05-25 13:38:03,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 23: [2023-05-25 13:38:03,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 25: [2023-05-25 13:38:03,485] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 207 9: [2023-05-25 13:38:03,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_06_optim_states.pt... 9: [2023-05-25 13:38:03,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_06_optim_states.pt... 8: [2023-05-25 13:38:03,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_06_optim_states.pt... 8: [2023-05-25 13:38:03,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_06_optim_states.pt... 5: [2023-05-25 13:38:03,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
23: [2023-05-25 13:38:03,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_09_optim_states.pt... 23: [2023-05-25 13:38:03,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_09_optim_states.pt... 1: [2023-05-25 13:38:03,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 2: [2023-05-25 13:38:03,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 2: [2023-05-25 13:38:03,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 1: [2023-05-25 13:38:03,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 10: [2023-05-25 13:38:03,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 10: [2023-05-25 13:38:03,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 20: [2023-05-25 13:38:03,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 20: [2023-05-25 13:38:03,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 
8: [2023-05-25 13:38:03,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_07_optim_states.pt... 8: [2023-05-25 13:38:03,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_07_optim_states.pt... 2: [2023-05-25 13:38:03,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 2: [2023-05-25 13:38:03,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_03-model_states.pt. 5: [2023-05-25 13:38:03,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 5: [2023-05-25 13:38:03,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 21: [2023-05-25 13:38:03,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_11_optim_states.pt... 21: [2023-05-25 13:38:03,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_11_optim_states.pt... 28: [2023-05-25 13:38:03,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_13_optim_states.pt. 
28: [2023-05-25 13:38:03,505] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 229 26: [2023-05-25 13:38:03,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_13_optim_states.pt. 26: [2023-05-25 13:38:03,506] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 209 31: [2023-05-25 13:38:03,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_15_optim_states.pt. 31: [2023-05-25 13:38:03,507] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 255 18: [2023-05-25 13:38:03,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 6: [2023-05-25 13:38:03,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 6: [2023-05-25 13:38:03,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 5: [2023-05-25 13:38:03,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 5: [2023-05-25 13:38:03,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 
18: [2023-05-25 13:38:03,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 9: [2023-05-25 13:38:03,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_07_optim_states.pt... 6: [2023-05-25 13:38:03,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 15: [2023-05-25 13:38:03,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 15: [2023-05-25 13:38:03,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 6: [2023-05-25 13:38:03,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 9: [2023-05-25 13:38:03,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_07_optim_states.pt... 2: [2023-05-25 13:38:03,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 14: [2023-05-25 13:38:03,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_07_optim_states.pt... 
14: [2023-05-25 13:38:03,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_07_optim_states.pt... 2: [2023-05-25 13:38:03,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 1: [2023-05-25 13:38:03,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 2: [2023-05-25 13:38:03,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 29: [2023-05-25 13:38:03,517] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 236 2: [2023-05-25 13:38:03,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 22: [2023-05-25 13:38:03,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_09_optim_states.pt... 22: [2023-05-25 13:38:03,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_09_optim_states.pt... 5: [2023-05-25 13:38:03,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 5: [2023-05-25 13:38:03,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
3: [2023-05-25 13:38:03,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 3: [2023-05-25 13:38:03,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_01-model_states.pt. 26: [2023-05-25 13:38:03,522] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_13_optim_states.pt. 26: [2023-05-25 13:38:03,522] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 213 0: [2023-05-25 13:38:03,522] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 0: [2023-05-25 13:38:03,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 22: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 22: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_03-model_states.pt. 0: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 0: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
0: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 0: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 0: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 10: [2023-05-25 13:38:03,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_07_optim_states.pt... 10: [2023-05-25 13:38:03,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_07_optim_states.pt... 6: [2023-05-25 13:38:03,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 29: [2023-05-25 13:38:03,524] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 232 20: [2023-05-25 13:38:03,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_11_optim_states.pt... 20: [2023-05-25 13:38:03,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_11_optim_states.pt... 
18: [2023-05-25 13:38:03,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_09_optim_states.pt... 18: [2023-05-25 13:38:03,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_09_optim_states.pt... 0: [2023-05-25 13:38:03,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 0: [2023-05-25 13:38:03,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 0: [2023-05-25 13:38:03,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 0: [2023-05-25 13:38:03,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 0: [2023-05-25 13:38:03,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 18: [2023-05-25 13:38:03,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_11_optim_states.pt... 18: [2023-05-25 13:38:03,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_11_optim_states.pt... 
6: [2023-05-25 13:38:03,534] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 6: [2023-05-25 13:38:03,534] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 3: [2023-05-25 13:38:03,534] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 3: [2023-05-25 13:38:03,534] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 30: [2023-05-25 13:38:03,541] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 240 0: [2023-05-25 13:38:03,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 22: [2023-05-25 13:38:03,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_11_optim_states.pt... 22: [2023-05-25 13:38:03,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_11_optim_states.pt... 30: [2023-05-25 13:38:03,544] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 244 4: [2023-05-25 13:38:03,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
4: [2023-05-25 13:38:03,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 5: [2023-05-25 13:38:03,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 5: [2023-05-25 13:38:03,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 4: [2023-05-25 13:38:03,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
4: [2023-05-25 13:38:03,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 4: [2023-05-25 13:38:03,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 5: [2023-05-25 13:38:03,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 5: [2023-05-25 13:38:03,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 15: [2023-05-25 13:38:03,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_07_optim_states.pt... 15: [2023-05-25 13:38:03,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_07_optim_states.pt... 23: [2023-05-25 13:38:03,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_11_optim_states.pt... 23: [2023-05-25 13:38:03,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_11_optim_states.pt... 
29: [2023-05-25 13:38:03,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_15_optim_states.pt. 29: [2023-05-25 13:38:03,557] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 239 6: [2023-05-25 13:38:03,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 30: [2023-05-25 13:38:03,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_13_optim_states.pt. 30: [2023-05-25 13:38:03,560] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 241 20: [2023-05-25 13:38:03,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 20: [2023-05-25 13:38:03,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 6: [2023-05-25 13:38:03,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 6: [2023-05-25 13:38:03,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 
29: [2023-05-25 13:38:03,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_13_optim_states.pt. 29: [2023-05-25 13:38:03,561] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 233 0: [2023-05-25 13:38:03,564] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 0: [2023-05-25 13:38:03,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 25: [2023-05-25 13:38:03,566] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_14_optim_states.pt. 25: [2023-05-25 13:38:03,566] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 206 4: [2023-05-25 13:38:03,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 4: [2023-05-25 13:38:03,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 24: [2023-05-25 13:38:03,570] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 196 28: [2023-05-25 13:38:03,578] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_15_optim_states.pt. 
28: [2023-05-25 13:38:03,578] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 231 0: > overriding learning rate value to 0.0002 0: > overriding minimum learning rate value to 2e-05 0: > overriding warmup iterations value to 0 0: > overriding total number of iterations value to 1 0: > overriding decay style value to cosine 0: [2023-05-25 13:38:03,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 29: [2023-05-25 13:38:03,580] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_14_optim_states.pt. 29: [2023-05-25 13:38:03,580] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 238 4: [2023-05-25 13:38:03,581] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,581] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 4: [2023-05-25 13:38:03,583] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 4: [2023-05-25 13:38:03,583] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 13: [2023-05-25 13:38:03,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_07_optim_states.pt... 
13: [2023-05-25 13:38:03,585] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_07_optim_states.pt... 19: [2023-05-25 13:38:03,587] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 11: [2023-05-25 13:38:03,588] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 19: [2023-05-25 13:38:03,590] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 11: [2023-05-25 13:38:03,591] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_24-model_03-model_states.pt. 0: [2023-05-25 13:38:03,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 0: [2023-05-25 13:38:03,594] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 24: [2023-05-25 13:38:03,594] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_14_optim_states.pt. 
24: [2023-05-25 13:38:03,594] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 194 24: [2023-05-25 13:38:03,594] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 192 24: [2023-05-25 13:38:03,598] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_15_optim_states.pt. 24: [2023-05-25 13:38:03,598] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 195 31: [2023-05-25 13:38:03,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_13_optim_states.pt. 31: [2023-05-25 13:38:03,600] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 249 4: [2023-05-25 13:38:03,602] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 16: [2023-05-25 13:38:03,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 16: [2023-05-25 13:38:03,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 4: [2023-05-25 13:38:03,605] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 25: [2023-05-25 13:38:03,605] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_13_optim_states.pt. 
25: [2023-05-25 13:38:03,606] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 201 29: [2023-05-25 13:38:03,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_14_optim_states.pt. 29: [2023-05-25 13:38:03,608] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 234 0: [2023-05-25 13:38:03,609] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,609] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 0: [2023-05-25 13:38:03,609] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,609] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 27: [2023-05-25 13:38:03,610] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_15_optim_states.pt. 27: [2023-05-25 13:38:03,610] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 219 28: [2023-05-25 13:38:03,610] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_14_optim_states.pt. 
28: [2023-05-25 13:38:03,610] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 226 11: [2023-05-25 13:38:03,611] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_07_optim_states.pt... 11: [2023-05-25 13:38:03,611] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_07_optim_states.pt... 27: [2023-05-25 13:38:03,612] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_14_optim_states.pt. 27: [2023-05-25 13:38:03,612] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 222 30: [2023-05-25 13:38:03,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_14_optim_states.pt. 30: [2023-05-25 13:38:03,613] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 246 4: [2023-05-25 13:38:03,613] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 4: [2023-05-25 13:38:03,613] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 4: [2023-05-25 13:38:03,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
28: [2023-05-25 13:38:03,614] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 224 28: [2023-05-25 13:38:03,615] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 228 4: [2023-05-25 13:38:03,615] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 31: [2023-05-25 13:38:03,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_14_optim_states.pt. 31: [2023-05-25 13:38:03,616] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 250 3: [2023-05-25 13:38:03,618] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 30: [2023-05-25 13:38:03,619] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_15_optim_states.pt. 30: [2023-05-25 13:38:03,619] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 243 1: [2023-05-25 13:38:03,619] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 1: [2023-05-25 13:38:03,619] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 3: [2023-05-25 13:38:03,620] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 
7: [2023-05-25 13:38:03,622] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 29: [2023-05-25 13:38:03,624] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_13_optim_states.pt. 29: [2023-05-25 13:38:03,624] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 237 19: [2023-05-25 13:38:03,627] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_09_optim_states.pt... 19: [2023-05-25 13:38:03,627] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_09_optim_states.pt... 28: [2023-05-25 13:38:03,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_14_optim_states.pt. 28: [2023-05-25 13:38:03,629] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 230 24: [2023-05-25 13:38:03,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_13_optim_states.pt. 
24: [2023-05-25 13:38:03,629] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 193 26: [2023-05-25 13:38:03,632] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_15_optim_states.pt. 26: [2023-05-25 13:38:03,632] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 215 1: [2023-05-25 13:38:03,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 2: [2023-05-25 13:38:03,634] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 2: [2023-05-25 13:38:03,634] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 1: [2023-05-25 13:38:03,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 3: [2023-05-25 13:38:03,635] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 3: [2023-05-25 13:38:03,635] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 0: [2023-05-25 13:38:03,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
0: [2023-05-25 13:38:03,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 25: [2023-05-25 13:38:03,646] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_15_optim_states.pt. 25: [2023-05-25 13:38:03,647] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 203 2: [2023-05-25 13:38:03,647] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 2: [2023-05-25 13:38:03,647] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 24: [2023-05-25 13:38:03,649] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_14_optim_states.pt. 24: [2023-05-25 13:38:03,649] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 198 6: [2023-05-25 13:38:03,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 6: [2023-05-25 13:38:03,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 26: [2023-05-25 13:38:03,652] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_14_optim_states.pt. 
26: [2023-05-25 13:38:03,652] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 210 20: [2023-05-25 13:38:03,658] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_09_optim_states.pt... 20: [2023-05-25 13:38:03,658] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_09_optim_states.pt... 0: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 0: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 7: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
3: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,661] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,662] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 6: [2023-05-25 13:38:03,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 2: [2023-05-25 13:38:03,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 7: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 7: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 
3: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 2: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 3: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 2: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 1: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 3: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 3: [2023-05-25 13:38:03,664] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 3: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 7: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 7: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 
7: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 7: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 2: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 2: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 28: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_15_optim_states.pt. 1: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 28: [2023-05-25 13:38:03,665] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 227 1: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 6: [2023-05-25 13:38:03,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 1: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 
2: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt... 1: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 1: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt... 1: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 1: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 3: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,666] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 31: [2023-05-25 13:38:03,668] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_15_optim_states.pt. 31: [2023-05-25 13:38:03,668] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 251 2: [2023-05-25 13:38:03,669] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
3: [2023-05-25 13:38:03,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 3: [2023-05-25 13:38:03,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 2: [2023-05-25 13:38:03,670] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 16: [2023-05-25 13:38:03,674] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_09_optim_states.pt... 16: [2023-05-25 13:38:03,674] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_09_optim_states.pt... 2: [2023-05-25 13:38:03,677] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 2: [2023-05-25 13:38:03,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 27: [2023-05-25 13:38:03,683] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_15_optim_states.pt. 27: [2023-05-25 13:38:03,683] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 223 24: [2023-05-25 13:38:03,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_13_optim_states.pt. 
24: [2023-05-25 13:38:03,687] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 197 6: [2023-05-25 13:38:03,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 3: [2023-05-25 13:38:03,691] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,694] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 2: [2023-05-25 13:38:03,694] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,695] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 7: [2023-05-25 13:38:03,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 27: [2023-05-25 13:38:03,696] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_13_optim_states.pt. 
27: [2023-05-25 13:38:03,696] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 217 1: [2023-05-25 13:38:03,697] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 1: [2023-05-25 13:38:03,697] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 6: [2023-05-25 13:38:03,698] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 2: [2023-05-25 13:38:03,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 27: [2023-05-25 13:38:03,700] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_14_optim_states.pt. 27: [2023-05-25 13:38:03,700] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 218 3: [2023-05-25 13:38:03,701] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 17: [2023-05-25 13:38:03,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 17: [2023-05-25 13:38:03,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_35-model_01-model_states.pt. 
25: [2023-05-25 13:38:03,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_13_optim_states.pt. 25: [2023-05-25 13:38:03,709] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 205 5: [2023-05-25 13:38:03,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 5: [2023-05-25 13:38:03,720] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_12-model_02-model_states.pt. 27: [2023-05-25 13:38:03,723] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 220 6: [2023-05-25 13:38:03,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 26: [2023-05-25 13:38:03,726] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 208 6: [2023-05-25 13:38:03,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 25: [2023-05-25 13:38:03,727] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 204 31: [2023-05-25 13:38:03,728] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 248 25: [2023-05-25 13:38:03,729] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 200 27: [2023-05-25 13:38:03,731] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_13_optim_states.pt. 
27: [2023-05-25 13:38:03,731] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 221 27: [2023-05-25 13:38:03,731] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 216 26: [2023-05-25 13:38:03,732] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 212 31: [2023-05-25 13:38:03,732] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 252 3: [2023-05-25 13:38:03,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 3: [2023-05-25 13:38:03,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 2: [2023-05-25 13:38:03,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 2: [2023-05-25 13:38:03,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 1: [2023-05-25 13:38:03,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 1: [2023-05-25 13:38:03,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 5: [2023-05-25 13:38:03,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 
5: [2023-05-25 13:38:03,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt... 28: [2023-05-25 13:38:03,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_13_optim_states.pt. 28: [2023-05-25 13:38:03,749] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 225 7: [2023-05-25 13:38:03,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 7: [2023-05-25 13:38:03,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 24: [2023-05-25 13:38:03,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_15_optim_states.pt. 24: [2023-05-25 13:38:03,754] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 199 30: [2023-05-25 13:38:03,756] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 243 5: [2023-05-25 13:38:03,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 5: [2023-05-25 13:38:03,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_00-model_states.pt. 
5: [2023-05-25 13:38:03,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 5: [2023-05-25 13:38:03,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt... 5: [2023-05-25 13:38:03,779] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 5: [2023-05-25 13:38:03,780] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 26: [2023-05-25 13:38:03,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_14_optim_states.pt. 26: [2023-05-25 13:38:03,784] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 214 6: [2023-05-25 13:38:03,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_03_optim_states.pt... 6: [2023-05-25 13:38:03,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_03_optim_states.pt... 26: [2023-05-25 13:38:03,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_15_optim_states.pt. 
26: [2023-05-25 13:38:03,794] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 211 30: [2023-05-25 13:38:03,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_15_optim_states.pt. 30: [2023-05-25 13:38:03,794] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 247 31: [2023-05-25 13:38:03,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_13_optim_states.pt. 31: [2023-05-25 13:38:03,793] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 253 31: [2023-05-25 13:38:03,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_14_optim_states.pt. 29: [2023-05-25 13:38:03,799] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_15_optim_states.pt. 29: [2023-05-25 13:38:03,800] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 235 31: [2023-05-25 13:38:03,797] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 254 17: [2023-05-25 13:38:03,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_09_optim_states.pt... 17: [2023-05-25 13:38:03,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_09_optim_states.pt... 
30: [2023-05-25 13:38:03,808] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 247 20: [2023-05-25 13:38:03,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_08_optim_states.pt. 20: [2023-05-25 13:38:03,809] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 164 5: [2023-05-25 13:38:03,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_03_optim_states.pt... 5: [2023-05-25 13:38:03,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_03_optim_states.pt... 4: [2023-05-25 13:38:03,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 4: [2023-05-25 13:38:03,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 29: [2023-05-25 13:38:03,812] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 235 29: [2023-05-25 13:38:03,814] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 239 0: [2023-05-25 13:38:03,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 0: [2023-05-25 13:38:03,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 
20: [2023-05-25 13:38:03,822] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 164 0: [2023-05-25 13:38:03,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 0: [2023-05-25 13:38:03,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 4: [2023-05-25 13:38:03,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 4: [2023-05-25 13:38:03,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 4: [2023-05-25 13:38:03,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_01_optim_states.pt... 4: [2023-05-25 13:38:03,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_01_optim_states.pt... 28: [2023-05-25 13:38:03,841] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 225 28: [2023-05-25 13:38:03,842] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 229 24: [2023-05-25 13:38:03,851] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 195 0: [2023-05-25 13:38:03,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_03_optim_states.pt... 
0: [2023-05-25 13:38:03,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_03_optim_states.pt... 7: [2023-05-25 13:38:03,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 7: [2023-05-25 13:38:03,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 24: [2023-05-25 13:38:03,858] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 199 28: [2023-05-25 13:38:03,860] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 227 28: [2023-05-25 13:38:03,860] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 231 24: [2023-05-25 13:38:03,863] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 193 24: [2023-05-25 13:38:03,864] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 197 0: [2023-05-25 13:38:03,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_01_optim_states.pt... 0: [2023-05-25 13:38:03,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_01_optim_states.pt... 2: [2023-05-25 13:38:03,884] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 
2: [2023-05-25 13:38:03,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 31: [2023-05-25 13:38:03,887] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 251 19: [2023-05-25 13:38:03,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_08_optim_states.pt. 19: [2023-05-25 13:38:03,888] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 152 31: [2023-05-25 13:38:03,888] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 255 30: [2023-05-25 13:38:03,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_13_optim_states.pt. 30: [2023-05-25 13:38:03,889] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 245 1: [2023-05-25 13:38:03,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 1: [2023-05-25 13:38:03,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 19: [2023-05-25 13:38:03,905] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 152 6: [2023-05-25 13:38:03,906] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 
6: [2023-05-25 13:38:03,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 18: [2023-05-25 13:38:03,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_08_optim_states.pt. 18: [2023-05-25 13:38:03,909] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 144 26: [2023-05-25 13:38:03,915] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 215 26: [2023-05-25 13:38:03,918] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 211 18: [2023-05-25 13:38:03,921] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 144 5: [2023-05-25 13:38:03,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 5: [2023-05-25 13:38:03,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 19: [2023-05-25 13:38:03,923] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_08_optim_states.pt. 19: [2023-05-25 13:38:03,923] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 156 25: [2023-05-25 13:38:03,930] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 203 1: [2023-05-25 13:38:03,930] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 
1: [2023-05-25 13:38:03,930] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 7: [2023-05-25 13:38:03,937] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_03_optim_states.pt... 7: [2023-05-25 13:38:03,937] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_03_optim_states.pt... 25: [2023-05-25 13:38:03,937] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 207 30: [2023-05-25 13:38:03,938] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 241 1: [2023-05-25 13:38:03,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_03_optim_states.pt... 1: [2023-05-25 13:38:03,939] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_03_optim_states.pt... 19: [2023-05-25 13:38:03,935] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 156 2: [2023-05-25 13:38:03,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 2: [2023-05-25 13:38:03,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 
2: [2023-05-25 13:38:03,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_03_optim_states.pt... 2: [2023-05-25 13:38:03,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_03_optim_states.pt... 30: [2023-05-25 13:38:03,940] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 245 14: [2023-05-25 13:38:03,943] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_04_optim_states.pt. 14: [2023-05-25 13:38:03,943] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 112 1: [2023-05-25 13:38:03,947] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 1: [2023-05-25 13:38:03,947] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 3: [2023-05-25 13:38:03,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 6: [2023-05-25 13:38:03,949] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 3: [2023-05-25 13:38:03,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_03-model_states.pt. 
0: [2023-05-25 13:38:03,955] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 0: [2023-05-25 13:38:03,955] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 6: [2023-05-25 13:38:03,955] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 14: [2023-05-25 13:38:03,956] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 112 7: [2023-05-25 13:38:03,957] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 7: [2023-05-25 13:38:03,957] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 29: [2023-05-25 13:38:03,958] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 233 29: [2023-05-25 13:38:03,961] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 237 26: [2023-05-25 13:38:03,961] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 214 26: [2023-05-25 13:38:03,962] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 210 25: [2023-05-25 13:38:03,962] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 202 3: [2023-05-25 13:38:03,962] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 
3: [2023-05-25 13:38:03,963] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 2: [2023-05-25 13:38:03,963] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_02_optim_states.pt... 2: [2023-05-25 13:38:03,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_02_optim_states.pt... 25: [2023-05-25 13:38:03,965] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 206 3: [2023-05-25 13:38:03,969] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 3: [2023-05-25 13:38:03,970] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 18: [2023-05-25 13:38:03,971] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_08_optim_states.pt. 18: [2023-05-25 13:38:03,971] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 148 1: [2023-05-25 13:38:03,973] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_02_optim_states.pt... 1: [2023-05-25 13:38:03,973] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_02_optim_states.pt... 
2: [2023-05-25 13:38:03,977] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 2: [2023-05-25 13:38:03,977] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_01-model_states.pt. 27: [2023-05-25 13:38:03,978] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 223 27: [2023-05-25 13:38:03,981] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 219 25: [2023-05-25 13:38:03,981] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 201 30: [2023-05-25 13:38:03,982] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 242 25: [2023-05-25 13:38:03,982] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 205 18: [2023-05-25 13:38:03,983] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 148 24: [2023-05-25 13:38:03,985] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 198 12: [2023-05-25 13:38:03,985] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_04_optim_states.pt. 30: [2023-05-25 13:38:03,986] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 246 12: [2023-05-25 13:38:03,986] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 96 6: [2023-05-25 13:38:03,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_02_optim_states.pt... 
6: [2023-05-25 13:38:03,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_02_optim_states.pt... 31: [2023-05-25 13:38:03,989] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 249 31: [2023-05-25 13:38:03,989] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 253 3: [2023-05-25 13:38:03,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_02_optim_states.pt... 3: [2023-05-25 13:38:03,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_02_optim_states.pt... 24: [2023-05-25 13:38:03,990] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 194 14: [2023-05-25 13:38:03,992] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_04_optim_states.pt. 14: [2023-05-25 13:38:03,992] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 116 3: [2023-05-25 13:38:03,994] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_03_optim_states.pt... 4: [2023-05-25 13:38:03,994] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_03_optim_states.pt... 3: [2023-05-25 13:38:03,994] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_03_optim_states.pt... 
4: [2023-05-25 13:38:03,994] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_03_optim_states.pt... 4: [2023-05-25 13:38:03,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 4: [2023-05-25 13:38:03,995] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 27: [2023-05-25 13:38:03,997] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 222 6: [2023-05-25 13:38:03,997] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_01_optim_states.pt... 6: [2023-05-25 13:38:03,997] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_01_optim_states.pt... 12: [2023-05-25 13:38:03,999] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 96 22: [2023-05-25 13:38:04,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_08_optim_states.pt. 
22: [2023-05-25 13:38:04,001] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 176 27: [2023-05-25 13:38:04,004] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 218 27: [2023-05-25 13:38:04,005] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 217 14: [2023-05-25 13:38:04,005] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 116 5: [2023-05-25 13:38:04,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_01_optim_states.pt... 5: [2023-05-25 13:38:04,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_01_optim_states.pt... 26: [2023-05-25 13:38:04,008] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 213 27: [2023-05-25 13:38:04,008] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 221 26: [2023-05-25 13:38:04,008] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 209 3: [2023-05-25 13:38:04,009] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_01_optim_states.pt... 3: [2023-05-25 13:38:04,009] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_01_optim_states.pt... 
22: [2023-05-25 13:38:04,013] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 176 8: [2023-05-25 13:38:04,016] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_04_optim_states.pt. 8: [2023-05-25 13:38:04,016] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 68 16: [2023-05-25 13:38:04,025] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_08_optim_states.pt. 16: [2023-05-25 13:38:04,025] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 128 28: [2023-05-25 13:38:04,026] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 230 28: [2023-05-25 13:38:04,027] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 226 7: [2023-05-25 13:38:04,028] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 7: [2023-05-25 13:38:04,029] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 8: [2023-05-25 13:38:04,030] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 68 7: [2023-05-25 13:38:04,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_01_optim_states.pt... 
7: [2023-05-25 13:38:04,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_01_optim_states.pt... 8: [2023-05-25 13:38:04,039] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_04_optim_states.pt. 8: [2023-05-25 13:38:04,039] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 64 16: [2023-05-25 13:38:04,042] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 128 0: [2023-05-25 13:38:04,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_02_optim_states.pt... 0: [2023-05-25 13:38:04,045] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_02_optim_states.pt... 8: [2023-05-25 13:38:04,051] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 64 4: [2023-05-25 13:38:04,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_02_optim_states.pt... 4: [2023-05-25 13:38:04,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_02_optim_states.pt... 11: [2023-05-25 13:38:04,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_04_optim_states.pt. 
7: [2023-05-25 13:38:04,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_02_optim_states.pt... 11: [2023-05-25 13:38:04,052] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 92 7: [2023-05-25 13:38:04,052] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_02_optim_states.pt... 31: [2023-05-25 13:38:04,057] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 250 31: [2023-05-25 13:38:04,061] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 254 11: [2023-05-25 13:38:04,065] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 92 16: [2023-05-25 13:38:04,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_08_optim_states.pt. 16: [2023-05-25 13:38:04,073] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 132 20: [2023-05-25 13:38:04,079] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_08_optim_states.pt. 
20: [2023-05-25 13:38:04,079] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 160 16: [2023-05-25 13:38:04,085] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 132 20: [2023-05-25 13:38:04,094] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 160 10: [2023-05-25 13:38:04,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_04_optim_states.pt. 10: [2023-05-25 13:38:04,094] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 84 22: [2023-05-25 13:38:04,099] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_08_optim_states.pt. 22: [2023-05-25 13:38:04,099] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 180 13: [2023-05-25 13:38:04,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_04_optim_states.pt. 13: [2023-05-25 13:38:04,102] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 108 17: [2023-05-25 13:38:04,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_08_optim_states.pt. 
17: [2023-05-25 13:38:04,106] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 136 10: [2023-05-25 13:38:04,106] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 84 2: [2023-05-25 13:38:04,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_01_optim_states.pt... 2: [2023-05-25 13:38:04,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_01_optim_states.pt... 22: [2023-05-25 13:38:04,112] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 180 13: [2023-05-25 13:38:04,118] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 108 17: [2023-05-25 13:38:04,118] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 136 1: [2023-05-25 13:38:04,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_01_optim_states.pt... 1: [2023-05-25 13:38:04,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_01_optim_states.pt... 5: [2023-05-25 13:38:04,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 5: [2023-05-25 13:38:04,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/layer_13-model_02-model_states.pt. 
15: [2023-05-25 13:38:04,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_04_optim_states.pt. 15: [2023-05-25 13:38:04,143] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 120 15: [2023-05-25 13:38:04,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_04_optim_states.pt. 15: [2023-05-25 13:38:04,144] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 124 23: [2023-05-25 13:38:04,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_08_optim_states.pt. 23: [2023-05-25 13:38:04,147] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 188 13: [2023-05-25 13:38:04,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_04_optim_states.pt. 13: [2023-05-25 13:38:04,153] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 104 15: [2023-05-25 13:38:04,157] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 124 15: [2023-05-25 13:38:04,159] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 120 23: [2023-05-25 13:38:04,161] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 188 10: [2023-05-25 13:38:04,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_04_optim_states.pt. 
10: [2023-05-25 13:38:04,165] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 80 13: [2023-05-25 13:38:04,165] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 104 5: [2023-05-25 13:38:04,172] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_02_optim_states.pt... 5: [2023-05-25 13:38:04,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_02_optim_states.pt... 10: [2023-05-25 13:38:04,176] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 80 29: [2023-05-25 13:38:04,176] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 234 29: [2023-05-25 13:38:04,177] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 238 17: [2023-05-25 13:38:04,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_08_optim_states.pt. 17: [2023-05-25 13:38:04,214] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 140 12: [2023-05-25 13:38:04,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_04_optim_states.pt. 12: [2023-05-25 13:38:04,224] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 100 11: [2023-05-25 13:38:04,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_04_optim_states.pt. 
11: [2023-05-25 13:38:04,227] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 88 17: [2023-05-25 13:38:04,227] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 140 11: [2023-05-25 13:38:04,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_05_optim_states.pt. 11: [2023-05-25 13:38:04,229] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 93 23: [2023-05-25 13:38:04,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_08_optim_states.pt. 23: [2023-05-25 13:38:04,235] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 184 23: [2023-05-25 13:38:04,236] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_10_optim_states.pt. 23: [2023-05-25 13:38:04,236] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 190 15: [2023-05-25 13:38:04,236] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_05_optim_states.pt. 
15: [2023-05-25 13:38:04,236] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 121 12: [2023-05-25 13:38:04,237] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 100 15: [2023-05-25 13:38:04,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_06_optim_states.pt. 15: [2023-05-25 13:38:04,240] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 126 11: [2023-05-25 13:38:04,246] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 88 11: [2023-05-25 13:38:04,246] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 93 23: [2023-05-25 13:38:04,248] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 184 23: [2023-05-25 13:38:04,250] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 190 15: [2023-05-25 13:38:04,251] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 121 16: [2023-05-25 13:38:04,254] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_10_optim_states.pt. 
16: [2023-05-25 13:38:04,254] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 134 15: [2023-05-25 13:38:04,255] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 126 16: [2023-05-25 13:38:04,267] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 134 10: [2023-05-25 13:38:04,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_06_optim_states.pt. 10: [2023-05-25 13:38:04,268] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 82 21: [2023-05-25 13:38:04,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_10_optim_states.pt. 21: [2023-05-25 13:38:04,274] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 174 10: [2023-05-25 13:38:04,279] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 82 22: [2023-05-25 13:38:04,281] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_09_optim_states.pt. 
22: [2023-05-25 13:38:04,282] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 177 21: [2023-05-25 13:38:04,293] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 174 22: [2023-05-25 13:38:04,294] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 177 12: [2023-05-25 13:38:04,297] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_07_optim_states.pt. 12: [2023-05-25 13:38:04,297] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 103 11: [2023-05-25 13:38:04,306] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_05_optim_states.pt. 11: [2023-05-25 13:38:04,306] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 89 12: [2023-05-25 13:38:04,311] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 103 16: [2023-05-25 13:38:04,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_11_optim_states.pt. 16: [2023-05-25 13:38:04,311] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 131 13: [2023-05-25 13:38:04,317] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_06_optim_states.pt. 
13: [2023-05-25 13:38:04,318] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 106 11: [2023-05-25 13:38:04,320] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 89 16: [2023-05-25 13:38:04,326] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 131 21: [2023-05-25 13:38:04,326] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_10_optim_states.pt. 21: [2023-05-25 13:38:04,327] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 170 20: [2023-05-25 13:38:04,329] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_11_optim_states.pt. 20: [2023-05-25 13:38:04,330] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 163 13: [2023-05-25 13:38:04,334] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 106 12: [2023-05-25 13:38:04,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_06_optim_states.pt. 12: [2023-05-25 13:38:04,336] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 98 19: [2023-05-25 13:38:04,339] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_11_optim_states.pt. 
19: [2023-05-25 13:38:04,339] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 159 21: [2023-05-25 13:38:04,339] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 170 21: [2023-05-25 13:38:04,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_08_optim_states.pt. 21: [2023-05-25 13:38:04,340] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 168 20: [2023-05-25 13:38:04,342] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 163 14: [2023-05-25 13:38:04,343] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_05_optim_states.pt. 14: [2023-05-25 13:38:04,343] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 117 15: [2023-05-25 13:38:04,345] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_05_optim_states.pt. 15: [2023-05-25 13:38:04,345] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 125 12: [2023-05-25 13:38:04,350] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 98 19: [2023-05-25 13:38:04,351] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 159 19: [2023-05-25 13:38:04,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_10_optim_states.pt. 
19: [2023-05-25 13:38:04,353] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 154 21: [2023-05-25 13:38:04,353] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 168 14: [2023-05-25 13:38:04,356] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 117 15: [2023-05-25 13:38:04,360] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 125 14: [2023-05-25 13:38:04,364] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_06_optim_states.pt. 14: [2023-05-25 13:38:04,364] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 118 19: [2023-05-25 13:38:04,365] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 154 17: [2023-05-25 13:38:04,367] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_10_optim_states.pt. 17: [2023-05-25 13:38:04,368] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 138 13: [2023-05-25 13:38:04,372] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_05_optim_states.pt. 13: [2023-05-25 13:38:04,372] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 105 17: [2023-05-25 13:38:04,373] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_11_optim_states.pt. 
17: [2023-05-25 13:38:04,373] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 139 10: [2023-05-25 13:38:04,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_05_optim_states.pt. 10: [2023-05-25 13:38:04,376] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 85 17: [2023-05-25 13:38:04,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_10_optim_states.pt. 17: [2023-05-25 13:38:04,377] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 142 14: [2023-05-25 13:38:04,380] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 118 10: [2023-05-25 13:38:04,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_05_optim_states.pt. 
10: [2023-05-25 13:38:04,380] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 81 17: [2023-05-25 13:38:04,382] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 138 17: [2023-05-25 13:38:04,386] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 139 13: [2023-05-25 13:38:04,388] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 105 10: [2023-05-25 13:38:04,389] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 85 9: [2023-05-25 13:38:04,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_05_optim_states.pt. 9: [2023-05-25 13:38:04,389] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 77 23: [2023-05-25 13:38:04,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_09_optim_states.pt. 23: [2023-05-25 13:38:04,390] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 185 17: [2023-05-25 13:38:04,392] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 142 14: [2023-05-25 13:38:04,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_05_optim_states.pt. 
14: [2023-05-25 13:38:04,394] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 113 10: [2023-05-25 13:38:04,395] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 81 6: [2023-05-25 13:38:04,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 6: [2023-05-25 13:38:04,396] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 52 21: [2023-05-25 13:38:04,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_09_optim_states.pt. 21: [2023-05-25 13:38:04,396] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 169 9: [2023-05-25 13:38:04,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_07_optim_states.pt. 9: [2023-05-25 13:38:04,399] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 75 23: [2023-05-25 13:38:04,403] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 185 9: [2023-05-25 13:38:04,404] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 77 8: [2023-05-25 13:38:04,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_05_optim_states.pt. 
8: [2023-05-25 13:38:04,404] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 69 14: [2023-05-25 13:38:04,406] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 113 21: [2023-05-25 13:38:04,410] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 169 19: [2023-05-25 13:38:04,412] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_11_optim_states.pt. 19: [2023-05-25 13:38:04,412] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 155 6: [2023-05-25 13:38:04,412] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 52 9: [2023-05-25 13:38:04,414] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 75 8: [2023-05-25 13:38:04,417] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 69 19: [2023-05-25 13:38:04,424] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 155 20: [2023-05-25 13:38:04,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_10_optim_states.pt. 20: [2023-05-25 13:38:04,430] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 162 20: [2023-05-25 13:38:04,442] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 162 21: [2023-05-25 13:38:04,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_08_optim_states.pt. 
21: [2023-05-25 13:38:04,449] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 172 10: [2023-05-25 13:38:04,449] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_07_optim_states.pt. 10: [2023-05-25 13:38:04,450] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 87 12: [2023-05-25 13:38:04,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_05_optim_states.pt. 12: [2023-05-25 13:38:04,453] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 101 10: [2023-05-25 13:38:04,463] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 87 21: [2023-05-25 13:38:04,464] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 172 11: [2023-05-25 13:38:04,464] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_06_optim_states.pt. 11: [2023-05-25 13:38:04,464] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 90 12: [2023-05-25 13:38:04,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_07_optim_states.pt. 12: [2023-05-25 13:38:04,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_06_optim_states.pt. 
12: [2023-05-25 13:38:04,468] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 99 12: [2023-05-25 13:38:04,468] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 102 12: [2023-05-25 13:38:04,469] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 101 18: [2023-05-25 13:38:04,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_10_optim_states.pt. 18: [2023-05-25 13:38:04,471] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 146 19: [2023-05-25 13:38:04,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_10_optim_states.pt. 19: [2023-05-25 13:38:04,471] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 158 9: [2023-05-25 13:38:04,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_04_optim_states.pt. 9: [2023-05-25 13:38:04,473] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 76 9: [2023-05-25 13:38:04,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_07_optim_states.pt. 
9: [2023-05-25 13:38:04,474] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 79 11: [2023-05-25 13:38:04,478] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 90 12: [2023-05-25 13:38:04,483] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 99 12: [2023-05-25 13:38:04,483] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 102 18: [2023-05-25 13:38:04,483] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 146 8: [2023-05-25 13:38:04,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_07_optim_states.pt. 8: [2023-05-25 13:38:04,482] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 71 8: [2023-05-25 13:38:04,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_06_optim_states.pt. 8: [2023-05-25 13:38:04,482] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 70 19: [2023-05-25 13:38:04,486] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 158 9: [2023-05-25 13:38:04,488] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 76 20: [2023-05-25 13:38:04,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_11_optim_states.pt. 
20: [2023-05-25 13:38:04,489] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 167 9: [2023-05-25 13:38:04,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_06_optim_states.pt. 9: [2023-05-25 13:38:04,489] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 74 9: [2023-05-25 13:38:04,491] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 79 7: [2023-05-25 13:38:04,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 7: [2023-05-25 13:38:04,492] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 56 8: [2023-05-25 13:38:04,495] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 71 8: [2023-05-25 13:38:04,496] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 70 18: [2023-05-25 13:38:04,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_09_optim_states.pt. 18: [2023-05-25 13:38:04,499] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 145 8: [2023-05-25 13:38:04,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_05_optim_states.pt. 
8: [2023-05-25 13:38:04,502] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 65 9: [2023-05-25 13:38:04,502] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 74 20: [2023-05-25 13:38:04,502] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 167 23: [2023-05-25 13:38:04,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_09_optim_states.pt. 23: [2023-05-25 13:38:04,505] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 189 7: [2023-05-25 13:38:04,506] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 56 20: [2023-05-25 13:38:04,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_10_optim_states.pt. 20: [2023-05-25 13:38:04,506] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 166 11: [2023-05-25 13:38:04,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_07_optim_states.pt. 
11: [2023-05-25 13:38:04,509] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 95 18: [2023-05-25 13:38:04,511] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 145 8: [2023-05-25 13:38:04,516] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 65 15: [2023-05-25 13:38:04,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_06_optim_states.pt. 15: [2023-05-25 13:38:04,518] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 122 20: [2023-05-25 13:38:04,519] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 166 23: [2023-05-25 13:38:04,520] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 189 11: [2023-05-25 13:38:04,522] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 95 17: [2023-05-25 13:38:04,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_11_optim_states.pt. 17: [2023-05-25 13:38:04,525] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 143 20: [2023-05-25 13:38:04,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_09_optim_states.pt. 
20: [2023-05-25 13:38:04,529] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 161 21: [2023-05-25 13:38:04,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_11_optim_states.pt. 21: [2023-05-25 13:38:04,530] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 171 15: [2023-05-25 13:38:04,533] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 122 18: [2023-05-25 13:38:04,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_09_optim_states.pt. 18: [2023-05-25 13:38:04,539] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 149 17: [2023-05-25 13:38:04,539] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 143 20: [2023-05-25 13:38:04,542] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 161 18: [2023-05-25 13:38:04,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_10_optim_states.pt. 18: [2023-05-25 13:38:04,543] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 150 21: [2023-05-25 13:38:04,544] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 171 16: [2023-05-25 13:38:04,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_10_optim_states.pt. 
16: [2023-05-25 13:38:04,545] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 130 11: [2023-05-25 13:38:04,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_06_optim_states.pt. 11: [2023-05-25 13:38:04,549] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 94 18: [2023-05-25 13:38:04,552] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 149 9: [2023-05-25 13:38:04,552] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_04_optim_states.pt. 9: [2023-05-25 13:38:04,552] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 72 14: [2023-05-25 13:38:04,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_06_optim_states.pt. 14: [2023-05-25 13:38:04,554] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 114 18: [2023-05-25 13:38:04,558] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 150 22: [2023-05-25 13:38:04,559] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_10_optim_states.pt. 
22: [2023-05-25 13:38:04,560] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 182 16: [2023-05-25 13:38:04,560] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 130 11: [2023-05-25 13:38:04,563] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 94 14: [2023-05-25 13:38:04,567] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 114 9: [2023-05-25 13:38:04,567] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 72 16: [2023-05-25 13:38:04,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_09_optim_states.pt. 16: [2023-05-25 13:38:04,567] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 133 19: [2023-05-25 13:38:04,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_09_optim_states.pt. 19: [2023-05-25 13:38:04,569] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 157 14: [2023-05-25 13:38:04,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_07_optim_states.pt. 14: [2023-05-25 13:38:04,569] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 119 12: [2023-05-25 13:38:04,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_05_optim_states.pt. 
12: [2023-05-25 13:38:04,570] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 97 22: [2023-05-25 13:38:04,572] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 182 16: [2023-05-25 13:38:04,579] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 133 10: [2023-05-25 13:38:04,583] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_06_optim_states.pt. 10: [2023-05-25 13:38:04,583] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 86 19: [2023-05-25 13:38:04,583] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 157 12: [2023-05-25 13:38:04,584] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 97 14: [2023-05-25 13:38:04,585] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 119 23: [2023-05-25 13:38:04,586] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_11_optim_states.pt. 23: [2023-05-25 13:38:04,586] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 187 20: [2023-05-25 13:38:04,589] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_09_optim_states.pt. 
20: [2023-05-25 13:38:04,590] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 165 18: [2023-05-25 13:38:04,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_11_optim_states.pt. 18: [2023-05-25 13:38:04,597] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 151 10: [2023-05-25 13:38:04,598] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 86 23: [2023-05-25 13:38:04,602] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 187 20: [2023-05-25 13:38:04,604] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 165 18: [2023-05-25 13:38:04,611] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 151 19: [2023-05-25 13:38:04,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_09_optim_states.pt. 19: [2023-05-25 13:38:04,613] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 153 22: [2023-05-25 13:38:04,619] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_09_optim_states.pt. 22: [2023-05-25 13:38:04,619] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 181 22: [2023-05-25 13:38:04,620] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_11_optim_states.pt. 
22: [2023-05-25 13:38:04,620] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 183 15: [2023-05-25 13:38:04,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_07_optim_states.pt. 15: [2023-05-25 13:38:04,627] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 127 19: [2023-05-25 13:38:04,628] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 153 22: [2023-05-25 13:38:04,633] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 181 16: [2023-05-25 13:38:04,635] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_11_optim_states.pt. 16: [2023-05-25 13:38:04,635] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 135 22: [2023-05-25 13:38:04,636] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 183 11: [2023-05-25 13:38:04,636] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_07_optim_states.pt. 13: [2023-05-25 13:38:04,636] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_07_optim_states.pt. 
11: [2023-05-25 13:38:04,636] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 91 13: [2023-05-25 13:38:04,636] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 111 8: [2023-05-25 13:38:04,638] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_07_optim_states.pt. 8: [2023-05-25 13:38:04,638] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 67 10: [2023-05-25 13:38:04,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_07_optim_states.pt. 10: [2023-05-25 13:38:04,640] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 83 15: [2023-05-25 13:38:04,640] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 127 9: [2023-05-25 13:38:04,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_05_optim_states.pt. 9: [2023-05-25 13:38:04,640] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 73 17: [2023-05-25 13:38:04,645] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_09_optim_states.pt. 
17: [2023-05-25 13:38:04,645] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 141 13: [2023-05-25 13:38:04,646] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_05_optim_states.pt. 13: [2023-05-25 13:38:04,646] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 109 11: [2023-05-25 13:38:04,650] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 91 0: [2023-05-25 13:38:04,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 0: [2023-05-25 13:38:04,650] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 0 16: [2023-05-25 13:38:04,650] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 135 13: [2023-05-25 13:38:04,652] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 111 8: [2023-05-25 13:38:04,653] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 67 9: [2023-05-25 13:38:04,654] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 73 10: [2023-05-25 13:38:04,654] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 83 17: [2023-05-25 13:38:04,657] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 141 13: [2023-05-25 13:38:04,660] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 109 0: [2023-05-25 13:38:04,664] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 0 21: [2023-05-25 
13:38:04,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_09_optim_states.pt. 21: [2023-05-25 13:38:04,667] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 173 8: [2023-05-25 13:38:04,675] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_06_optim_states.pt. 8: [2023-05-25 13:38:04,675] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 66 0: could not find arguments in the checkpoint ... 0: checkpoint version 3.0 21: [2023-05-25 13:38:04,683] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 173 15: [2023-05-25 13:38:04,684] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_07_optim_states.pt. 15: [2023-05-25 13:38:04,685] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 123 14: [2023-05-25 13:38:04,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_07_optim_states.pt. 
14: [2023-05-25 13:38:04,687] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 115 8: [2023-05-25 13:38:04,688] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 66 14: [2023-05-25 13:38:04,700] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 115 15: [2023-05-25 13:38:04,701] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 123 0: [2023-05-25 13:38:04,707] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_03_optim_states.pt. 0: [2023-05-25 13:38:04,707] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 3 9: [2023-05-25 13:38:04,709] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_06_optim_states.pt. 9: [2023-05-25 13:38:04,709] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 78 0: [2023-05-25 13:38:04,722] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 3 16: [2023-05-25 13:38:04,722] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_09_optim_states.pt. 
16: [2023-05-25 13:38:04,722] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 129 9: [2023-05-25 13:38:04,726] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 78 23: [2023-05-25 13:38:04,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_10_optim_states.pt. 23: [2023-05-25 13:38:04,727] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 186 18: [2023-05-25 13:38:04,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_11_optim_states.pt. 18: [2023-05-25 13:38:04,727] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 147 16: [2023-05-25 13:38:04,738] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 129 18: [2023-05-25 13:38:04,740] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 147 23: [2023-05-25 13:38:04,743] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 186 4: [2023-05-25 13:38:04,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 4: [2023-05-25 13:38:04,750] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 32 22: [2023-05-25 13:38:04,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_11_optim_states.pt. 
22: [2023-05-25 13:38:04,754] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 179 5: [2023-05-25 13:38:04,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 5: [2023-05-25 13:38:04,758] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 40 22: [2023-05-25 13:38:04,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_10_optim_states.pt. 22: [2023-05-25 13:38:04,758] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 178 4: [2023-05-25 13:38:04,764] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 32 13: [2023-05-25 13:38:04,766] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_07_optim_states.pt. 13: [2023-05-25 13:38:04,766] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 107 22: [2023-05-25 13:38:04,770] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 179 22: [2023-05-25 13:38:04,771] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 178 23: [2023-05-25 13:38:04,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_11_optim_states.pt. 
23: [2023-05-25 13:38:04,772] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 191 5: [2023-05-25 13:38:04,773] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 40 0: [2023-05-25 13:38:04,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2023-05-25 13:38:04,777] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 4 5: [2023-05-25 13:38:04,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_03_optim_states.pt. 5: [2023-05-25 13:38:04,779] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 43 13: [2023-05-25 13:38:04,780] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 107 23: [2023-05-25 13:38:04,783] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 191 0: [2023-05-25 13:38:04,790] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 4 5: [2023-05-25 13:38:04,796] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 43 13: [2023-05-25 13:38:04,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_06_optim_states.pt. 
13: [2023-05-25 13:38:04,852] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 110 3: [2023-05-25 13:38:04,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 3: [2023-05-25 13:38:04,861] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 24 13: [2023-05-25 13:38:04,867] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 110 17: [2023-05-25 13:38:04,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_09_optim_states.pt. 17: [2023-05-25 13:38:04,872] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 137 3: [2023-05-25 13:38:04,878] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 24 17: [2023-05-25 13:38:04,884] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 137 3: [2023-05-25 13:38:04,891] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_03_optim_states.pt. 3: [2023-05-25 13:38:04,891] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 27 1: [2023-05-25 13:38:04,893] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 
1: [2023-05-25 13:38:04,893] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 12 3: [2023-05-25 13:38:04,907] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 27 1: [2023-05-25 13:38:04,908] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 12 5: [2023-05-25 13:38:04,918] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_01_optim_states.pt. 5: [2023-05-25 13:38:04,918] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 41 0: [2023-05-25 13:38:04,924] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_03_optim_states.pt. 0: [2023-05-25 13:38:04,924] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 7 6: [2023-05-25 13:38:04,924] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_01_optim_states.pt. 6: [2023-05-25 13:38:04,924] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 53 5: [2023-05-25 13:38:04,936] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 41 7: [2023-05-25 13:38:04,937] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_01_optim_states.pt. 
7: [2023-05-25 13:38:04,938] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 57 6: [2023-05-25 13:38:04,939] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 53 2: [2023-05-25 13:38:04,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 2: [2023-05-25 13:38:04,939] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 16 7: [2023-05-25 13:38:04,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_02_optim_states.pt. 5: [2023-05-25 13:38:04,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 7: [2023-05-25 13:38:04,939] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 62 5: [2023-05-25 13:38:04,939] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 44 0: [2023-05-25 13:38:04,939] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 7 2: [2023-05-25 13:38:04,941] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_02_optim_states.pt. 2: [2023-05-25 13:38:04,942] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 18 4: [2023-05-25 13:38:04,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_02_optim_states.pt. 
4: [2023-05-25 13:38:04,950] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 38 4: [2023-05-25 13:38:04,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_03_optim_states.pt. 4: [2023-05-25 13:38:04,950] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 39 4: [2023-05-25 13:38:04,953] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 4: [2023-05-25 13:38:04,953] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 36 7: [2023-05-25 13:38:04,954] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 57 5: [2023-05-25 13:38:04,954] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 44 2: [2023-05-25 13:38:04,954] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 16 7: [2023-05-25 13:38:04,954] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 62 21: [2023-05-25 13:38:04,957] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_11_optim_states.pt. 
21: [2023-05-25 13:38:04,957] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 175 2: [2023-05-25 13:38:04,957] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 18 4: [2023-05-25 13:38:04,967] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 39 4: [2023-05-25 13:38:04,967] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 38 21: [2023-05-25 13:38:04,970] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 175 4: [2023-05-25 13:38:04,970] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 36 6: [2023-05-25 13:38:04,970] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_01_optim_states.pt. 6: [2023-05-25 13:38:04,971] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 49 6: [2023-05-25 13:38:04,989] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 49 6: [2023-05-25 13:38:04,991] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 6: [2023-05-25 13:38:04,991] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 48 3: [2023-05-25 13:38:04,999] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 
3: [2023-05-25 13:38:04,999] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 28 0: [2023-05-25 13:38:05,005] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_01_optim_states.pt. 0: [2023-05-25 13:38:05,006] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 5 6: [2023-05-25 13:38:05,006] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 48 3: [2023-05-25 13:38:05,014] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 28 0: [2023-05-25 13:38:05,021] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 5 0: [2023-05-25 13:38:05,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_01_optim_states.pt. 0: [2023-05-25 13:38:05,026] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 1 1: [2023-05-25 13:38:05,028] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 
1: [2023-05-25 13:38:05,029] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 8 0: [2023-05-25 13:38:05,042] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 1 1: [2023-05-25 13:38:05,043] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 8 6: [2023-05-25 13:38:05,043] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_03_optim_states.pt. 6: [2023-05-25 13:38:05,044] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 51 6: [2023-05-25 13:38:05,059] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 51 5: [2023-05-25 13:38:05,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_03_optim_states.pt. 5: [2023-05-25 13:38:05,064] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 47 7: [2023-05-25 13:38:05,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_01_optim_states.pt. 7: [2023-05-25 13:38:05,065] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 61 4: [2023-05-25 13:38:05,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_9_mp_rank_01_optim_states.pt. 
4: [2023-05-25 13:38:05,072] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 37 7: [2023-05-25 13:38:05,081] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 61 5: [2023-05-25 13:38:05,081] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 47 3: [2023-05-25 13:38:05,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_01_optim_states.pt. 3: [2023-05-25 13:38:05,086] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 29 4: [2023-05-25 13:38:05,087] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 37 4: [2023-05-25 13:38:05,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_03_optim_states.pt. 4: [2023-05-25 13:38:05,096] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 35 3: [2023-05-25 13:38:05,101] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 29 4: [2023-05-25 13:38:05,110] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 35 2: [2023-05-25 13:38:05,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 
2: [2023-05-25 13:38:05,111] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 20 6: [2023-05-25 13:38:05,113] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_02_optim_states.pt. 6: [2023-05-25 13:38:05,113] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 54 1: [2023-05-25 13:38:05,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_02_optim_states.pt. 1: [2023-05-25 13:38:05,115] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 10 0: [2023-05-25 13:38:05,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_1_mp_rank_02_optim_states.pt. 0: [2023-05-25 13:38:05,123] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 6 6: [2023-05-25 13:38:05,129] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 54 2: [2023-05-25 13:38:05,130] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 20 1: [2023-05-25 13:38:05,130] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 10 2: [2023-05-25 13:38:05,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_02_optim_states.pt. 
2: [2023-05-25 13:38:05,132] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 22 4: [2023-05-25 13:38:05,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_01_optim_states.pt. 4: [2023-05-25 13:38:05,134] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 33 0: [2023-05-25 13:38:05,139] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 6 2: [2023-05-25 13:38:05,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_03_optim_states.pt. 2: [2023-05-25 13:38:05,144] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 19 2: [2023-05-25 13:38:05,146] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 22 4: [2023-05-25 13:38:05,149] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 33 5: [2023-05-25 13:38:05,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_10_mp_rank_02_optim_states.pt. 5: [2023-05-25 13:38:05,155] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 42 1: [2023-05-25 13:38:05,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_01_optim_states.pt. 
1: [2023-05-25 13:38:05,156] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 13 2: [2023-05-25 13:38:05,160] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 19 1: [2023-05-25 13:38:05,170] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 13 5: [2023-05-25 13:38:05,172] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 42 5: [2023-05-25 13:38:05,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_01_optim_states.pt. 5: [2023-05-25 13:38:05,172] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 45 1: [2023-05-25 13:38:05,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_03_optim_states.pt. 1: [2023-05-25 13:38:05,175] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 11 1: [2023-05-25 13:38:05,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_02_optim_states.pt. 1: [2023-05-25 13:38:05,179] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 14 7: [2023-05-25 13:38:05,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 
7: [2023-05-25 13:38:05,183] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 60 6: [2023-05-25 13:38:05,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_12_mp_rank_02_optim_states.pt. 6: [2023-05-25 13:38:05,186] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 50 6: [2023-05-25 13:38:05,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_13_mp_rank_03_optim_states.pt. 6: [2023-05-25 13:38:05,187] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 55 5: [2023-05-25 13:38:05,188] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 45 1: [2023-05-25 13:38:05,189] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 11 2: [2023-05-25 13:38:05,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_03_optim_states.pt. 2: [2023-05-25 13:38:05,190] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 23 1: [2023-05-25 13:38:05,194] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 14 2: [2023-05-25 13:38:05,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_4_mp_rank_01_optim_states.pt. 
2: [2023-05-25 13:38:05,195] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 17 7: [2023-05-25 13:38:05,198] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 60 6: [2023-05-25 13:38:05,202] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 50 6: [2023-05-25 13:38:05,205] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 55 2: [2023-05-25 13:38:05,208] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 23 2: [2023-05-25 13:38:05,211] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 17 5: [2023-05-25 13:38:05,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_11_mp_rank_02_optim_states.pt. 5: [2023-05-25 13:38:05,219] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 46 5: [2023-05-25 13:38:05,235] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 46 3: [2023-05-25 13:38:05,242] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_02_optim_states.pt. 3: [2023-05-25 13:38:05,242] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 30 4: [2023-05-25 13:38:05,254] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_8_mp_rank_02_optim_states.pt. 
4: [2023-05-25 13:38:05,254] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 34 3: [2023-05-25 13:38:05,256] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 30 4: [2023-05-25 13:38:05,271] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 34 0: [2023-05-25 13:38:05,291] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_0_mp_rank_02_optim_states.pt. 0: [2023-05-25 13:38:05,291] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 2 3: [2023-05-25 13:38:05,298] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_02_optim_states.pt. 3: [2023-05-25 13:38:05,298] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 26 1: [2023-05-25 13:38:05,298] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_3_mp_rank_03_optim_states.pt. 1: [2023-05-25 13:38:05,298] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 15 7: [2023-05-25 13:38:05,300] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_15_mp_rank_03_optim_states.pt. 
7: [2023-05-25 13:38:05,300] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 63 0: [2023-05-25 13:38:05,309] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 2 3: [2023-05-25 13:38:05,313] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 26 1: [2023-05-25 13:38:05,315] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 15 3: [2023-05-25 13:38:05,316] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_6_mp_rank_01_optim_states.pt. 3: [2023-05-25 13:38:05,316] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 25 7: [2023-05-25 13:38:05,316] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 63 3: [2023-05-25 13:38:05,331] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 25 3: [2023-05-25 13:38:05,337] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_7_mp_rank_03_optim_states.pt. 3: [2023-05-25 13:38:05,338] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 31 7: [2023-05-25 13:38:05,350] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_02_optim_states.pt. 
7: [2023-05-25 13:38:05,350] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 58 3: [2023-05-25 13:38:05,351] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 31 7: [2023-05-25 13:38:05,352] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_14_mp_rank_03_optim_states.pt. 7: [2023-05-25 13:38:05,352] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 59 2: [2023-05-25 13:38:05,364] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_5_mp_rank_01_optim_states.pt. 2: [2023-05-25 13:38:05,364] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 21 7: [2023-05-25 13:38:05,365] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 58 7: [2023-05-25 13:38:05,368] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 59 2: [2023-05-25 13:38:05,381] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 21 1: [2023-05-25 13:38:05,385] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b/global_step84877/bf16_zero_pp_rank_2_mp_rank_01_optim_states.pt. 
1: [2023-05-25 13:38:05,386] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 16 ZeRO state_dicts for rank 9 1: [2023-05-25 13:38:05,402] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 16 zero partition checkpoints for rank 9 0: successfully loaded checkpoint from lm1-8b7-178b-c4-repetitions/8b7178b4b at iteration 0 31: time (ms) | load-checkpoint: 6084.62 0: estimated model parameters: 9.828646912 0: estimated model parameters without embeddings: 8.863956992 0: [after model, optimizer, and learning rate scheduler are built] datetime: 2023-05-25 13:38:06 0: > building train, validation, and test datasets ... 0: > datasets target sizes (minimum size): 0: train: 1 0: validation: 51200 0: test: 51200 0: > building train, validation, and test datasets for GPT ... 0: > building dataset index ... 0: reading sizes... 0: reading pointers... 0: reading document index... 0: creating numpy buffer of mmap... 0: creating memory view of numpy buffer... 0: > finished creating indexed dataset in 0.066274 seconds 0: number of documents: 835726 0: > dataset split: 0: train: 0: document indices in [0, 835726) total of 835726 documents 0: > loading doc-idx mapping from /scratch/project_462000119/data/c4_subsampled/gpt2tok_c4_en_400M_text_document_train_indexmap_1ns_2048sl_1234s_doc_idx.npy 0: > loading sample-idx mapping from /scratch/project_462000119/data/c4_subsampled/gpt2tok_c4_en_400M_text_document_train_indexmap_1ns_2048sl_1234s_sample_idx.npy 0: > loading shuffle-idx mapping from /scratch/project_462000119/data/c4_subsampled/gpt2tok_c4_en_400M_text_document_train_indexmap_1ns_2048sl_1234s_shuffle_idx.npy 0: loaded indexed file in 0.057 seconds 0: total number of samples: 195101 0: total number of epochs: 1 0: > building dataset index ... 0: reading sizes... 0: reading pointers... 0: reading document index... 0: creating numpy buffer of mmap... 0: creating memory view of numpy buffer... 
0: > finished creating indexed dataset in 0.033319 seconds 0: number of documents: 364608 0: > dataset split: 0: validation: 0: document indices in [0, 364608) total of 364608 documents 0: > loading doc-idx mapping from /scratch/project_462000119/data/c4_validation/gpt2tok_c4validation_rerun_text_document_validation_indexmap_51200ns_2048sl_1234s_doc_idx.npy 0: > loading sample-idx mapping from /scratch/project_462000119/data/c4_validation/gpt2tok_c4validation_rerun_text_document_validation_indexmap_51200ns_2048sl_1234s_sample_idx.npy 0: > loading shuffle-idx mapping from /scratch/project_462000119/data/c4_validation/gpt2tok_c4validation_rerun_text_document_validation_indexmap_51200ns_2048sl_1234s_shuffle_idx.npy 0: loaded indexed file in 0.090 seconds 0: total number of samples: 84978 0: total number of epochs: 1 0: > finished creating GPT datasets ... 0: [after dataloaders are built] datetime: 2023-05-25 13:38:11 0: done with setup ... 0: training ... 31: time (ms) | model-and-optimizer-setup: 14533.34 | train/valid/test-data-iterators-setup: 1635.27 0: [after training is done] datetime: 2023-05-25 13:38:11 31: ----------------------------------------------------------------------------------------------------------------- 31: validation loss at the end of training for val data | lm loss value: 3.131161E+00 | lm loss PPL: 2.290056E+01 | 31: ----------------------------------------------------------------------------------------------------------------- END 3583607: Thu 25 May 2023 01:41:39 PM EEST