Training in progress, epoch 5
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will then be set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
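The FutureWarning above can be silenced by setting `clean_up_tokenization_spaces` explicitly when the tokenizer is loaded. The tokenizer-loading call does not appear in this log, so the sketch below is an assumption; only the checkpoint name is taken from the log:

from transformers import AutoTokenizer

# Passing the flag explicitly avoids the FutureWarning logged above.
# True matches the pre-v4.45 default behavior.
tokenizer = AutoTokenizer.from_pretrained(
    "distilbert/distilbert-base-uncased",
    clean_up_tokenization_spaces=True,
)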
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert/distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
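The warning means the sequence-classification head (`pre_classifier` and `classifier`) is randomly initialized on top of the pretrained encoder, so the model is useless for prediction until it is fine-tuned, which is exactly what this log records. A minimal sketch of the load step implied by the warning; `num_labels=2` is an assumption, since the actual label count is not in the log:

from transformers import AutoModelForSequenceClassification

# Encoder weights come from the checkpoint; the classifier and
# pre_classifier layers are freshly initialized, as the warning reports.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased",
    num_labels=2,  # assumption: binary classification
)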
wandb: WARNING The `run_name` is currently set to the same value as `TrainingArguments.output_dir`. If this was not intended, please specify a different run name by setting the `TrainingArguments.run_name` parameter.
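This warning fires because `TrainingArguments.run_name` defaults to `output_dir` when not set. Setting it explicitly keeps the W&B run name independent of the output path; a sketch, with both values illustrative rather than taken from the log:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned",  # illustrative path
    run_name="distilbert-run-1",        # hypothetical name, distinct from output_dir
)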
{'loss': 0.3343, 'grad_norm': 13.296141624450684, 'learning_rate': 1.872040946896993e-05, 'epoch': 0.32}
{'loss': 0.2518, 'grad_norm': 9.50092887878418, 'learning_rate': 1.744081893793986e-05, 'epoch': 0.64}
{'loss': 0.2252, 'grad_norm': 15.53085994720459, 'learning_rate': 1.616122840690979e-05, 'epoch': 0.96}
{'eval_loss': 0.23595260083675385, 'eval_accuracy': 0.90616, 'eval_runtime': 870.3291, 'eval_samples_per_second': 28.725, 'eval_steps_per_second': 1.796, 'epoch': 1.0}
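The `eval_accuracy` entries above and below are produced by a `compute_metrics` callback passed to the Trainer. That callback is not shown in this log, so the sketch below is an assumption based on the standard `evaluate`-based pattern:

import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels); argmax recovers predicted class ids.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)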
{'loss': 0.1689, 'grad_norm': 7.930511474609375, 'learning_rate': 1.488163787587972e-05, 'epoch': 1.28}
{'loss': 0.1504, 'grad_norm': 36.72976303100586, 'learning_rate': 1.3602047344849649e-05, 'epoch': 1.6}
{'loss': 0.1526, 'grad_norm': 0.9081774353981018, 'learning_rate': 1.2322456813819578e-05, 'epoch': 1.92}
{'eval_loss': 0.2298162430524826, 'eval_accuracy': 0.92924, 'eval_runtime': 898.8856, 'eval_samples_per_second': 27.812, 'eval_steps_per_second': 1.739, 'epoch': 2.0}
{'loss': 0.1017, 'grad_norm': 0.0514792837202549, 'learning_rate': 1.1042866282789508e-05, 'epoch': 2.24}
{'loss': 0.08, 'grad_norm': 0.046891484409570694, 'learning_rate': 9.763275751759437e-06, 'epoch': 2.56}
{'loss': 0.0893, 'grad_norm': 0.06083718314766884, 'learning_rate': 8.483685220729368e-06, 'epoch': 2.88}
wandb: ERROR Error while calling W&B API: context deadline exceeded (<Response [500]>)
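The W&B errors here and below are transient server-side failures (HTTP 500 on the sync call); training continues unaffected. If connectivity is flaky, one workaround is to log offline and sync later with `wandb sync`; a sketch, assuming the variable is set before wandb initializes:

import os

# Log locally instead of blocking on the W&B API; upload later with `wandb sync`.
os.environ["WANDB_MODE"] = "offline"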
{'eval_loss': 0.2804093658924103, 'eval_accuracy': 0.93296, 'eval_runtime': 1235.9411, 'eval_samples_per_second': 20.228, 'eval_steps_per_second': 1.265, 'epoch': 3.0}
{'loss': 0.0667, 'grad_norm': 0.16060642898082733, 'learning_rate': 7.204094689699297e-06, 'epoch': 3.2}
{'loss': 0.0535, 'grad_norm': 0.03821321576833725, 'learning_rate': 5.924504158669226e-06, 'epoch': 3.52}
{'loss': 0.0489, 'grad_norm': 0.34995952248573303, 'learning_rate': 4.644913627639156e-06, 'epoch': 3.84}
wandb: ERROR Error while calling W&B API: context deadline exceeded (<Response [500]>)
{'eval_loss': 0.34566619992256165, 'eval_accuracy': 0.93168, 'eval_runtime': 847.1597, 'eval_samples_per_second': 29.51, 'eval_steps_per_second': 1.845, 'epoch': 4.0}
{'loss': 0.0418, 'grad_norm': 0.021322300657629967, 'learning_rate': 3.3653230966090854e-06, 'epoch': 4.16}
{'loss': 0.0241, 'grad_norm': 2.5630695819854736, 'learning_rate': 2.085732565579015e-06, 'epoch': 4.48}
{'loss': 0.0337, 'grad_norm': 0.019543960690498352, 'learning_rate': 8.061420345489445e-07, 'epoch': 4.8}
{'eval_loss': 0.35836175084114075, 'eval_accuracy': 0.9328, 'eval_runtime': 868.737, 'eval_samples_per_second': 28.777, 'eval_steps_per_second': 1.799, 'epoch': 5.0}
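The logged learning rates are consistent with a linear decay from 2e-5 over 7,815 total steps (5 epochs of ~1,563 steps, i.e. 25,000 training examples at batch size 16, logged every 500 steps): 2e-5 x (7815 - 500) / 7815 ≈ 1.87204e-5, matching the first entry, and the eval throughput (870.3 s x 28.725 samples/s ≈ 25,000) fits the same batch size. A sketch of TrainingArguments consistent with these numbers; every value is inferred from the log, not confirmed by it:

from transformers import TrainingArguments

# Inferred, not confirmed: all values below are reverse-engineered
# from the logged learning rates, epochs, and eval throughput.
training_args = TrainingArguments(
    output_dir="distilbert-finetuned",  # illustrative
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    logging_steps=500,
    eval_strategy="epoch",
    report_to="wandb",
)

Note that eval_loss climbs from 0.2298 at epoch 2 to 0.3584 at epoch 5 while eval_accuracy plateaus near 0.93 and training loss keeps falling, a typical overfitting pattern; adding `load_best_model_at_end=True` with `metric_for_best_model="eval_loss"` would recover the epoch-2 checkpoint.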