aneespatka committed (verified)
Commit 02cc9fb · 1 Parent(s): 48c1db8

End of training

Files changed (1)
1. README.md +6 −4
README.md CHANGED
@@ -42,16 +42,18 @@ The following hyperparameters were used during training:
 - train_batch_size: 32
 - eval_batch_size: 16
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 2
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | F1     |
-|:-------------:|:-----:|:----:|:---------------:|:------:|
-| 0.0           | 1.0   | 479  | nan             | 0.2648 |
-| 0.0           | 2.0   | 958  | nan             | 0.2648 |
+| Training Loss | Epoch  | Step | Validation Loss | F1     |
+|:-------------:|:------:|:----:|:---------------:|:------:|
+| 591.3904      | 0.9937 | 119  | nan             | 0.2648 |
+| 0.0           | 1.9937 | 238  | nan             | 0.2648 |
 
 
 ### Framework versions
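
As a reference point, below is a minimal sketch of how the hyperparameters listed in this diff could be expressed with `transformers.TrainingArguments`. It is not the author's training script: `output_dir`, `learning_rate`, and the model/dataset setup are outside this hunk and are placeholders only.

```python
from transformers import TrainingArguments

# Sketch of the configuration described in the README hunk above.
training_args = TrainingArguments(
    output_dir="./results",             # placeholder; not shown in this diff
    learning_rate=5e-5,                 # assumed; the actual value sits above this hunk
    per_device_train_batch_size=32,     # train_batch_size: 32
    per_device_eval_batch_size=16,      # eval_batch_size: 16
    seed=42,                            # seed: 42
    gradient_accumulation_steps=4,      # added in this commit
    optim="adamw_torch_fused",          # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,                     # betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                  # epsilon=1e-08
    lr_scheduler_type="linear",         # lr_scheduler_type: linear
    num_train_epochs=2,                 # num_epochs: 2
)
```

With `gradient_accumulation_steps=4`, the effective batch size is 32 × 4 = 128, matching `total_train_batch_size`. The drop from 479 to 119 optimizer steps per epoch and the fractional epoch values (0.9937 ≈ 119 / 119.75) are consistent with the same data being consumed in roughly a quarter as many optimizer updates.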