adan details
README.md CHANGED
```diff
@@ -207,6 +207,7 @@ TODO
 
 #### Epochs 5 & 6
 The following hyperparameters were used during training:
+
 - learning_rate: 6e-05
 - train_batch_size: 4
 - eval_batch_size: 1
@@ -214,8 +215,9 @@ The following hyperparameters were used during training:
 - distributed_type: multi-GPU
 - gradient_accumulation_steps: 32
 - total_train_batch_size: 128
-- optimizer:
+- optimizer: _ADAN_ using lucidrains' `adan-pytorch` with default betas
 - lr_scheduler_type: constant_with_warmup
+- data type: TF32
 - num_epochs: 2
 
 ### Framework versions
```
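For reference, a minimal sketch of how the two settings added in this commit (the _ADAN_ optimizer and TF32) might be wired up in plain PyTorch. It assumes lucidrains' `adan-pytorch` package and a placeholder model; it is not taken from this repo's training code, and "default betas" here simply means no `betas` argument is passed.

```python
import torch
from torch import nn
from adan_pytorch import Adan  # lucidrains' Adan implementation

# "data type: TF32": allow TensorFloat-32 matmuls/convolutions on Ampere+ GPUs.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# Placeholder model, purely for illustration.
model = nn.Linear(16, 16)

# Adan with the learning rate from the card; omitting `betas` keeps the package defaults.
optimizer = Adan(model.parameters(), lr=6e-5)
```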