aneespatka/modernbert-llm-sentiment

Text Classification

Generated from Trainer


aneespatka committed 16 days ago

Commit 02cc9fb · verified · 1 parent: 48c1db8

End of training

Files changed (1)

README.md +6 -4

@@ -42,16 +42,18 @@ The following hyperparameters were used during training:
 - train_batch_size: 32
 - eval_batch_size: 16
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 2
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | F1     |
-|:-------------:|:-----:|:----:|:---------------:|:------:|
-| 0.0           | 1.0   | 479  | nan             | 0.2648 |
-| 0.0           | 2.0   | 958  | nan             | 0.2648 |
+| Training Loss | Epoch  | Step | Validation Loss | F1     |
+|:-------------:|:------:|:----:|:---------------:|:------:|
+| 591.3904      | 0.9937 | 119  | nan             | 0.2648 |
+| 0.0           | 1.9937 | 238  | nan             | 0.2648 |
 ### Framework versions
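
The two added hyperparameters and the changed Step and Epoch columns are linked by simple arithmetic. The sketch below is illustrative Python, not the Trainer's own code. The function names are hypothetical, a single device is assumed (the diff does not state the device count), and the floor-division rule for a trailing partial accumulation group is an assumption chosen because it reproduces the logged values.

```python
def total_train_batch_size(per_device_batch_size, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch_size * accumulation_steps * num_devices


def optimizer_steps_per_epoch(micro_batches_per_epoch, accumulation_steps):
    """Optimizer steps per epoch, assuming a trailing partial group is not counted."""
    return max(micro_batches_per_epoch // accumulation_steps, 1)


def logged_epoch(epoch_index, micro_batches_per_epoch, accumulation_steps):
    """Fractional epoch reported after the last optimizer step of a 0-indexed epoch."""
    steps = optimizer_steps_per_epoch(micro_batches_per_epoch, accumulation_steps)
    consumed = steps * accumulation_steps  # micro-batches covered by full groups
    return round(epoch_index + consumed / micro_batches_per_epoch, 4)


# Values taken from the diff: 479 micro-batches per epoch before the change,
# train_batch_size 32, gradient_accumulation_steps 4, num_epochs 2.
MICRO_BATCHES = 479
ACCUMULATION = 4

print(total_train_batch_size(32, ACCUMULATION))               # 128
print(optimizer_steps_per_epoch(MICRO_BATCHES, ACCUMULATION))  # 119
print(logged_epoch(0, MICRO_BATCHES, ACCUMULATION))            # 0.9937
print(logged_epoch(1, MICRO_BATCHES, ACCUMULATION))            # 1.9937
```

Under these assumptions, 479 micro-batches grouped in fours give 119 optimizer steps per epoch (238 over two epochs), and the 3 leftover micro-batches account for the epoch being logged as 0.9937 rather than 1.0. The arithmetic does not explain the `nan` validation loss or the unchanged F1 of 0.2648, which are identical on both sides of the diff.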