SUMMARY MODEL:
Model Params Size: 222882048
Model Params Size Formatted: 222.88 M
Model Disk Size: 891648255
Model Disk Size Formatted: 891.65 MB
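The formatted figures follow directly from the raw counts above. A minimal sketch of the conversion (the metric thresholds and two-decimal rounding are assumptions inferred from the output, not taken from the logging code):

```python
def format_count(n: int, suffix: str = "") -> str:
    """Format a raw count with a metric prefix and two decimal places."""
    for unit, scale in (("G", 1e9), ("M", 1e6), ("K", 1e3)):
        if n >= scale:
            return f"{n / scale:.2f} {unit}{suffix}"
    return f"{n}{suffix}"

print(format_count(222882048))       # parameter count -> 222.88 M
print(format_count(891648255, "B"))  # disk size in bytes -> 891.65 MB
```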
TRAINING AND VALIDATION RESULTS:
Training batch size: 64
Validation batch size: 128
Total expected epochs: 40
Total expected training steps: 94040
Total trained epochs: 40.0
Total trained steps: 94040
Elapsed time: 54841.370337963104 seconds
Elapsed time (formatted): 15:14:01
Total flos: 3.6650496018087936e+18
Total flos (formatted): 3.665050e+18
Best epoch val_loss: 0.20909027755260468
Best model checkpoint: /root/pretrain_utg4java_02/checkpoint-91689
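The step count is consistent with the 150523 training rows and batch size 64 above, assuming the final partial batch of each epoch is dropped (drop_last behaviour is an assumption; with a kept partial batch the total would be 94080). The elapsed-time string is a plain H:MM:SS conversion:

```python
train_rows = 150523
batch_size = 64
epochs = 40

steps_per_epoch = train_rows // batch_size  # 2351, partial final batch dropped
total_steps = steps_per_epoch * epochs
print(total_steps)  # 94040

elapsed = 54841.370337963104
h, rem = divmod(int(elapsed), 3600)
m, s = divmod(rem, 60)
print(f"{h}:{m:02d}:{s:02d}")  # 15:14:01
```

Under the same assumption, the best checkpoint at step 91689 falls exactly at the end of epoch 39 (39 × 2351 = 91689).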
SUMMARY DATASETS:
Loaded Dataset:
DatasetDict({
    train: Dataset({
        features: ['text'],
        num_rows: 150523
    })
    valid: Dataset({
        features: ['text'],
        num_rows: 18816
    })
    test: Dataset({
        features: ['text'],
        num_rows: 18815
    })
})
Tokenized Dataset:
DatasetDict({
    train: Dataset({
        features: ['input_ids'],
        num_rows: 150523
    })
    valid: Dataset({
        features: ['input_ids'],
        num_rows: 18816
    })
    test: Dataset({
        features: ['input_ids'],
        num_rows: 18815
    })
})