
baseline_seed-42_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0426
  • Accuracy: 0.4215
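Assuming the evaluation loss above is a mean cross-entropy in nats (the usual convention for language-model evaluation), it can be converted to a perplexity by exponentiating it; a quick sketch:

```python
import math

# Assumption: the reported evaluation loss is a mean cross-entropy in nats,
# in which case perplexity is simply its exponential.
eval_loss = 3.0426
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 20.96
```

So the final checkpoint sits at a perplexity of roughly 21 on the evaluation set.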

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
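The list above maps directly onto `transformers.TrainingArguments`. A minimal sketch follows; the output directory is a placeholder and anything not in the list above is left at its default (the stated Adam betas and epsilon are the library defaults):

```python
from transformers import TrainingArguments

# Sketch only: "output_dir" is a placeholder path, not the actual training
# location. All other values mirror the hyperparameter list above.
# The effective batch size of 256 is per_device_train_batch_size (32)
# multiplied by gradient_accumulation_steps (8).
args = TrainingArguments(
    output_dir="baseline_seed-42_1e-3",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    gradient_accumulation_steps=8,
    lr_scheduler_type="linear",
    warmup_steps=32000,
    num_train_epochs=20.0,
    fp16=True,  # "Native AMP" mixed precision
)
```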

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 6.4165        | 0.9996  | 1977  | 4.6002          | 0.2628   |
| 4.4252        | 1.9996  | 3955  | 3.9054          | 0.3265   |
| 3.8303        | 2.9997  | 5933  | 3.5867          | 0.3593   |
| 3.552         | 3.9997  | 7911  | 3.4361          | 0.3746   |
| 3.4003        | 4.9998  | 9889  | 3.3533          | 0.3826   |
| 3.3048        | 5.9999  | 11867 | 3.2995          | 0.3884   |
| 3.2405        | 6.9999  | 13845 | 3.2618          | 0.3921   |
| 3.1938        | 8.0     | 15823 | 3.2395          | 0.3949   |
| 3.1584        | 8.9996  | 17800 | 3.2179          | 0.3974   |
| 3.1331        | 9.9996  | 19778 | 3.1996          | 0.3994   |
| 3.1128        | 10.9997 | 21756 | 3.1903          | 0.4005   |
| 3.0941        | 11.9997 | 23734 | 3.1839          | 0.4014   |
| 3.0833        | 12.9998 | 25712 | 3.1728          | 0.4024   |
| 3.0736        | 13.9999 | 27690 | 3.1701          | 0.4029   |
| 3.0665        | 14.9999 | 29668 | 3.1649          | 0.4034   |
| 3.0616        | 16.0    | 31646 | 3.1627          | 0.4037   |
| 3.0446        | 16.9996 | 33623 | 3.1264          | 0.4081   |
| 2.9699        | 17.9996 | 35601 | 3.0900          | 0.4133   |
| 2.8822        | 18.9997 | 37579 | 3.0569          | 0.4182   |
| 2.7774        | 19.9912 | 39540 | 3.0426          | 0.4215   |
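A few back-of-the-envelope quantities implied by the table and the hyperparameters, assuming each optimizer step processes one full effective batch of 256 sequences:

```python
# Derived figures only; the step counts come from the results table above.
effective_batch = 32 * 8            # train_batch_size * gradient_accumulation_steps
steps_per_epoch = 1977              # optimizer steps at epoch ~1.0
sequences_per_epoch = steps_per_epoch * effective_batch
print(effective_batch)              # 256
print(sequences_per_epoch)          # 506112 sequences seen per epoch

# Warmup spans most of the run: the LR peaks at step 32000 of ~39540 total,
# i.e. roughly 81% of training. This coincides with the noticeably faster
# loss drop in the final epochs, once the linear decay begins.
warmup_fraction = 32000 / 39540
print(f"{warmup_fraction:.2f}")     # 0.81
```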

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.20.0
Model size

  • 97.8M params
  • Tensor type: F32 (safetensors)