[21:24:05] - INFO - absl - A polynomial schedule was set with a non-positive `transition_steps` value; this results in a constant schedule with value `init_value`.
/home/dat/pino/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py:3132: UserWarning: Explicitly requested dtype <class 'jax._src.numpy.lax_numpy.int64'> requested in zeros is not available, and will be truncated to dtype int32. To enable more dtypes, set the jax_enable_x64 configuration option or the JAX_ENABLE_X64 shell environment variable. See https://github.com/google/jax#current-gotchas for more.
lax._check_user_dtype_supported(dtype, "zeros")
/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
warnings.warn(
/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
warnings.warn(
Epoch ... (1/5): 0%| | 0/5 [00:00<?, ?it/s]
[21:24:06] - INFO - __main__ - Skipping to epoch 0 step 0
Training...: 20%|██████████████████████████ | 253/1250 [04:26<1:03:55, 3.85s/it]
Training...: 40%|████████████████████████████████████████████████████ | 500/1250 [07:26<09:01, 1.39it/s]
Evaluating ...: 0%| | 0/31 [00:00<?, ?it/s]
Evaluating ...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:21<00:00, 9.98it/s]
[21:32:05] - INFO - huggingface_hub.repository - git version 2.25.1
git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
[21:32:05] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo
[21:32:35] - INFO - huggingface_hub.repository - Uploading LFS objects: 100% (2/2), 510 MB | 31 MB/s, done.
tcmalloc: large alloc 1354776576 bytes == 0x304b28000 @ 0x7fd74488d680 0x7fd7448adbdd 0x7fd478ac320d 0x7fd478ad1340 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478accbd3 0x7fd478acd1fe 0x504d56 0x56acb6 0x568d9a 0x5f5b33 0x56bc9b 0x5f5956 0x56aadf 0x5f5956 0x56aadf 0x568d9a 0x5f5b33 0x56bc9b 0x568d9a 0x68cdc7 0x67e161
tcmalloc: large alloc 2715181056 bytes == 0x35572c000 @ 0x7fd74488d680 0x7fd7448adbdd 0x7fd478ac320d 0x7fd478ad1340 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478ad0e87 0x7fd478accbd3 0x7fd478acd1fe 0x504d56 0x56acb6 0x568d9a 0x5f5b33 0x56bc9b 0x5f5956 0x56aadf 0x5f5956 0x56aadf 0x568d9a 0x5f5b33 0x56bc9b 0x568d9a 0x68cdc7 0x67e161 0x67e1df
tcmalloc: large alloc 1530273792 bytes == 0x2ae462000 @ 0x7fd74488d680 0x7fd7448ae824 0x5f7b11 0x7fd478accc6f 0x7fd478acd1fe 0x504d56 0x56acb6 0x568d9a 0x5f5b33 0x56bc9b 0x5f5956 0x56aadf 0x5f5956 0x56aadf 0x568d9a 0x5f5b33 0x56bc9b 0x568d9a 0x68cdc7 0x67e161 0x67e1df 0x67e281 0x67e627 0x6b6e62 0x6b71ed 0x7fd7446a20b3 0x5f96de
[21:32:57] - INFO - __main__ - checkpoint saved
Training...: 40%|████████████████████████████████████████████████████ | 500/1250 [08:46<13:09, 1.05s/it]
Step... (500 | Loss: 10.108721733093262, Acc: 0.043713752180337906): 0%| | 0/5 [08:51<?, ?it/s]
Traceback (most recent call last):
  File "./run_mlm_flax.py", line 853, in <module>
    rotate_checkpoints(training_args.output_dir, training_args.save_total_limit)
NameError: name 'rotate_checkpoints' is not defined
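
The run stops right after the step-500 checkpoint is saved, because line 853 of run_mlm_flax.py calls rotate_checkpoints(output_dir, save_total_limit) and no function of that name is defined or imported in the script. A minimal sketch of such a helper is given here. It assumes checkpoints are written as directories named checkpoint-<step> directly under output_dir. That naming scheme, the prefix parameter, and the sorted_checkpoints helper are hypothetical and are not taken from the script.

```python
import re
import shutil
from pathlib import Path
from typing import List, Optional


def sorted_checkpoints(output_dir: str, prefix: str = "checkpoint") -> List[Path]:
    """Return checkpoint directories under output_dir, lowest step first.

    Assumption: each checkpoint is a directory named "<prefix>-<step>".
    """
    root = Path(output_dir)
    if not root.is_dir():
        return []
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)$")
    found = []
    for path in root.iterdir():
        match = pattern.match(path.name)
        if match and path.is_dir():
            found.append((int(match.group(1)), path))
    # Sort on the integer step so that checkpoint-500 precedes checkpoint-1000.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]


def rotate_checkpoints(
    output_dir: str,
    save_total_limit: Optional[int] = None,
    prefix: str = "checkpoint",
) -> List[Path]:
    """Delete the oldest checkpoints so that at most save_total_limit remain.

    Returns the list of removed directories. A limit of None or <= 0
    disables rotation and removes nothing.
    """
    if save_total_limit is None or save_total_limit <= 0:
        return []
    checkpoints = sorted_checkpoints(output_dir, prefix)
    excess = max(0, len(checkpoints) - save_total_limit)
    removed = checkpoints[:excess]
    for path in removed:
        shutil.rmtree(path)
    return removed
```

With a definition like this placed above the training loop, the call shown in the traceback would resolve. If the script saves checkpoints to a different layout, the pattern in sorted_checkpoints would need to change to match it.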