ary-modernbert-focal-lr-3e-5

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unknown dataset. Its results on the evaluation set (validation loss, F1-micro, F1-macro) are reported per epoch in the table at the end of this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 32
num_epochs: 4
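
To make the scheduler settings above concrete, here is a minimal sketch of a linear schedule with warmup in plain Python. It assumes the schedule ran for 324 optimizer steps in total (the final step shown in the results table) and that the rate rises linearly from 0 to the peak over the 32 warmup steps before decaying linearly to 0. The function name `linear_schedule_lr` is hypothetical and is not part of any library.

```python
def linear_schedule_lr(step, peak_lr=3e-5, warmup_steps=32, total_steps=324):
    """Learning rate at a given optimizer step under linear warmup and linear decay.

    total_steps=324 is an assumption taken from the last row of the results table.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, end of warmup, each epoch boundary, and the end.
    for step in (0, 16, 32, 81, 162, 243, 324):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 32, well inside the first epoch, so nearly all of training happens on the decaying part of the schedule.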

Training Loss	Epoch	Step	Validation Loss	F1-micro	F1-macro
36.2902	1.0	81	12.0539	0.3013	0.2669
33.9076	2.0	162	11.7354	0.3050	0.2713
31.6342	3.0	243	11.7385	0.3035	0.2980
28.9004	4.0	324	11.8729	0.2942	0.3061
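
The two F1 columns can move in opposite directions, as they do between epochs 3 and 4, because they aggregate differently: micro-F1 pools true positives, false positives, and false negatives over all classes, while macro-F1 averages the per-class F1 scores with equal weight. The sketch below illustrates the difference in plain Python. It assumes single-label classification, which this card does not confirm, and the helper name `f1_micro_macro` is hypothetical; it is not the evaluation code used for this model.

```python
from collections import Counter


def f1_micro_macro(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions (an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


if __name__ == "__main__":
    micro, macro = f1_micro_macro([0, 0, 0, 1], [0, 0, 1, 1])
    print(f"micro={micro:.4f} macro={macro:.4f}")
```

Because macro-F1 weights every class equally, an improvement on rare classes can raise it even while micro-F1, which is dominated by frequent classes, falls slightly.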