---
license: apache-2.0
---

**Hyperparameters:**
- learning rate: 2e-5
- weight decay: 0.01
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- gradient_accumulation_steps: 1
- eval steps: 6000
- max_length: 512
- num_epochs: 2

**Dataset version:**
- “craffel/tasky_or_not”, “10xp3_10xc4”, “15f88c8”

**Checkpoint:**
- 48000 steps

**Results on Validation set:**

| Step  | Training Loss | Validation Loss | Accuracy | Precision | Recall   | F1       |
|-------|---------------|-----------------|----------|-----------|----------|----------|
| 6000  | 0.031900      | 0.163412        | 0.982194 | 0.999211  | 0.980462 | 0.989748 |
| 12000 | 0.014700      | 0.106132        | 0.976666 | 0.999639  | 0.973733 | 0.986516 |
| 18000 | 0.010700      | 0.043012        | 0.995743 | 0.999223  | 0.995918 | 0.997568 |
| 24000 | 0.007400      | 0.095047        | 0.984724 | 0.999857  | 0.982714 | 0.991211 |
| 30000 | 0.004100      | 0.087274        | 0.990400 | 0.999829  | 0.989217 | 0.994495 |
| 36000 | 0.003100      | 0.162909        | 0.981972 | 1.000000  | 0.979434 | 0.989610 |
| 42000 | 0.002200      | 0.148721        | 0.980454 | 0.999986  | 0.977717 | 0.988726 |
| 48000 | 0.001000      | 0.094455        | 0.990437 | 0.999943  | 0.989147 | 0.994516 |
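
The F1 column in the results table can be cross-checked against the Precision and Recall columns. The snippet below is a minimal sketch, assuming F1 is the standard harmonic mean of precision and recall (the card does not say how the metric was computed). The names `ROWS`, `f1_score`, and `max_f1_deviation` are illustrative and not part of the original training code.

```python
# Rows copied from the validation table above:
# (step, precision, recall, reported F1).
ROWS = [
    (6000, 0.999211, 0.980462, 0.989748),
    (12000, 0.999639, 0.973733, 0.986516),
    (18000, 0.999223, 0.995918, 0.997568),
    (24000, 0.999857, 0.982714, 0.991211),
    (30000, 0.999829, 0.989217, 0.994495),
    (36000, 1.000000, 0.979434, 0.989610),
    (42000, 0.999986, 0.977717, 0.988726),
    (48000, 0.999943, 0.989147, 0.994516),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_deviation(rows) -> float:
    """Largest absolute gap between the recomputed and the reported F1."""
    return max(abs(f1_score(p, r) - reported) for _, p, r, reported in rows)


if __name__ == "__main__":
    for step, p, r, reported in ROWS:
        print(f"step {step}: recomputed F1 = {f1_score(p, r):.6f}, reported = {reported:.6f}")
    print(f"max deviation: {max_f1_deviation(ROWS):.2e}")
```

Under that assumption, the recomputed values should agree with the reported ones up to the rounding of the six-decimal table entries.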