2021-07-26 00:12:35.575266: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-07-26 00:12:35.575304: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
[00:12:36] - INFO - filelock - Lock 139656499698272 acquired on /home/versae/.cache/huggingface/transformers/27b7e968d2908b27f8c1df265c2dc08aef61be0f25bdc735df4df552829968fd.04a8293889c44bb7f31a5ee6212b8aa0b690121444e9c7ce1616fbe2a461ebba.lock
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 250M/250M [00:06<00:00, 35.8MB/s]
[00:12:43] - INFO - filelock - Lock 139656499698272 released on /home/versae/.cache/huggingface/transformers/27b7e968d2908b27f8c1df265c2dc08aef61be0f25bdc735df4df552829968fd.04a8293889c44bb7f31a5ee6212b8aa0b690121444e9c7ce1616fbe2a461ebba.lock
/var/hf/venv/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
  warnings.warn(
/var/hf/venv/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
Training...: 2%|█▊ | 1000/50000 [22:19<17:30:45, 1.29s/it]
Step... (500 | Loss: 1.8920137882232666, Learning Rate: 0.0006000000284984708)
Training...: 2%|█▊ | 1000/50000 [22:21<17:30:45, 1.29s/it]
[02:30:54] - INFO - __main__ - Saving checkpoint at 1000 steps██████████████████████████████████████████████████████| 130/130 [00:31<00:00, 4.59it/s]
/var/hf/transformers-orig/src/transformers/modeling_flax_pytorch_utils.py:201: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:180.)
  pt_model_dict[flax_key] = torch.from_numpy(flax_tensor)
All Flax model weights were used when initializing RobertaForMaskedLM.
Some weights of RobertaForMaskedLM were not initialized from the Flax model and are newly initialized: ['lm_head.decoder.weight', 'roberta.embeddings.position_ids', 'lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.