# solar_10.7_darulm_unigram_proj_init_8node_darulm_part1_v3_1.0_512_12_02_24
This model is a fine-tuned version of ../solar_darulm_unigram_proj_init_17_01_24 on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.3397
- Accuracy: 0.5164
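Since the validation loss is a mean next-token cross-entropy in nats, it converts directly to perplexity via exp(loss). A quick sketch of that conversion, using the value reported above:

```python
import math

# Validation loss reported above (mean cross-entropy, nats).
eval_loss = 2.3397
print(f"perplexity = {math.exp(eval_loss):.2f}")  # ~10.38
```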
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: linear
- num_epochs: 1.0
- mixed_precision_training: Native AMP
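For reference, a minimal sketch of how these settings map onto Hugging Face `TrainingArguments`. The output directory is a placeholder, and the 16-device layout comes from the distributed launcher (the run name suggests 8 nodes, presumably 2 GPUs each), not from the arguments themselves:

```python
from transformers import TrainingArguments

# Sketch only; output_dir is hypothetical. The 16 GPUs are provided by the
# distributed launcher (e.g. torchrun across 8 nodes), not by these arguments.
args = TrainingArguments(
    output_dir="./solar_darulm_finetune",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,  # 1 per device x 16 devices x 8 steps = 128 total
    num_train_epochs=1.0,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-5,
    fp16=True,  # "Native AMP"; the card does not say whether fp16 or bf16 was used
)
```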
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.6722 | 0.01 | 500 | 2.4811 | 0.4951 |
| 2.6243 | 0.02 | 1000 | 2.4459 | 0.4999 |
| 2.6051 | 0.04 | 1500 | 2.4295 | 0.5025 |
| 2.5901 | 0.05 | 2000 | 2.4194 | 0.5037 |
| 2.5852 | 0.06 | 2500 | 2.4124 | 0.5049 |
| 2.5818 | 0.07 | 3000 | 2.4072 | 0.5054 |
| 2.5801 | 0.09 | 3500 | 2.4024 | 0.5059 |
| 2.5626 | 0.1 | 4000 | 2.3988 | 0.5070 |
| 2.5697 | 0.11 | 4500 | 2.3958 | 0.5073 |
| 2.5532 | 0.12 | 5000 | 2.3928 | 0.5079 |
| 2.5505 | 0.13 | 5500 | 2.3904 | 0.5080 |
| 2.5497 | 0.15 | 6000 | 2.3872 | 0.5086 |
| 2.5636 | 0.16 | 6500 | 2.3857 | 0.5089 |
| 2.5483 | 0.17 | 7000 | 2.3835 | 0.5092 |
| 2.5505 | 0.18 | 7500 | 2.3813 | 0.5097 |
| 2.5419 | 0.2 | 8000 | 2.3796 | 0.5096 |
| 2.5467 | 0.21 | 8500 | 2.3786 | 0.5099 |
| 2.5419 | 0.22 | 9000 | 2.3769 | 0.5102 |
| 2.5269 | 0.23 | 9500 | 2.3754 | 0.5105 |
| 2.5315 | 0.24 | 10000 | 2.3740 | 0.5106 |
| 2.5442 | 0.26 | 10500 | 2.3728 | 0.5108 |
| 2.5318 | 0.27 | 11000 | 2.3713 | 0.5112 |
| 2.5242 | 0.28 | 11500 | 2.3702 | 0.5113 |
| 2.5178 | 0.29 | 12000 | 2.3698 | 0.5112 |
| 2.5345 | 0.31 | 12500 | 2.3687 | 0.5114 |
| 2.531 | 0.32 | 13000 | 2.3675 | 0.5115 |
| 2.5304 | 0.33 | 13500 | 2.3661 | 0.5118 |
| 2.5264 | 0.34 | 14000 | 2.3653 | 0.5121 |
| 2.5281 | 0.35 | 14500 | 2.3647 | 0.5123 |
| 2.5259 | 0.37 | 15000 | 2.3636 | 0.5123 |
| 2.5075 | 0.38 | 15500 | 2.3629 | 0.5122 |
| 2.5147 | 0.39 | 16000 | 2.3621 | 0.5127 |
| 2.5137 | 0.4 | 16500 | 2.3611 | 0.5128 |
| 2.5206 | 0.42 | 17000 | 2.3603 | 0.5129 |
| 2.5153 | 0.43 | 17500 | 2.3597 | 0.5128 |
| 2.5184 | 0.44 | 18000 | 2.3590 | 0.5130 |
| 2.5104 | 0.45 | 18500 | 2.3581 | 0.5132 |
| 2.5085 | 0.46 | 19000 | 2.3577 | 0.5134 |
| 2.509 | 0.48 | 19500 | 2.3572 | 0.5135 |
| 2.5143 | 0.49 | 20000 | 2.3564 | 0.5135 |
| 2.5124 | 0.5 | 20500 | 2.3555 | 0.5137 |
| 2.5107 | 0.51 | 21000 | 2.3546 | 0.5139 |
| 2.5034 | 0.53 | 21500 | 2.3543 | 0.5140 |
| 2.4922 | 0.54 | 22000 | 2.3538 | 0.5139 |
| 2.514 | 0.55 | 22500 | 2.3532 | 0.5140 |
| 2.5199 | 0.56 | 23000 | 2.3527 | 0.5141 |
| 2.4926 | 0.57 | 23500 | 2.3521 | 0.5142 |
| 2.5104 | 0.59 | 24000 | 2.3517 | 0.5142 |
| 2.5067 | 0.6 | 24500 | 2.3511 | 0.5144 |
| 2.5055 | 0.61 | 25000 | 2.3508 | 0.5142 |
| 2.5011 | 0.62 | 25500 | 2.3502 | 0.5146 |
| 2.4931 | 0.64 | 26000 | 2.3496 | 0.5147 |
| 2.4965 | 0.65 | 26500 | 2.3491 | 0.5147 |
| 2.495 | 0.66 | 27000 | 2.3488 | 0.5146 |
| 2.5051 | 0.67 | 27500 | 2.3481 | 0.5150 |
| 2.51 | 0.68 | 28000 | 2.3478 | 0.5150 |
| 2.4883 | 0.7 | 28500 | 2.3474 | 0.5152 |
| 2.4973 | 0.71 | 29000 | 2.3470 | 0.5151 |
| 2.4939 | 0.72 | 29500 | 2.3464 | 0.5153 |
| 2.4952 | 0.73 | 30000 | 2.3461 | 0.5153 |
| 2.5028 | 0.75 | 30500 | 2.3459 | 0.5154 |
| 2.4979 | 0.76 | 31000 | 2.3454 | 0.5154 |
| 2.4928 | 0.77 | 31500 | 2.3450 | 0.5155 |
| 2.501 | 0.78 | 32000 | 2.3446 | 0.5156 |
| 2.5 | 0.79 | 32500 | 2.3443 | 0.5156 |
| 2.4865 | 0.81 | 33000 | 2.3438 | 0.5156 |
| 2.4898 | 0.82 | 33500 | 2.3434 | 0.5157 |
| 2.4977 | 0.83 | 34000 | 2.3430 | 0.5160 |
| 2.4904 | 0.84 | 34500 | 2.3427 | 0.5157 |
| 2.4779 | 0.86 | 35000 | 2.3424 | 0.5159 |
| 2.4792 | 0.87 | 35500 | 2.3420 | 0.5159 |
| 2.4931 | 0.88 | 36000 | 2.3419 | 0.5160 |
| 2.4997 | 0.89 | 36500 | 2.3416 | 0.5160 |
| 2.4986 | 0.9 | 37000 | 2.3414 | 0.5161 |
| 2.4965 | 0.92 | 37500 | 2.3411 | 0.5162 |
| 2.4743 | 0.93 | 38000 | 2.3409 | 0.5162 |
| 2.497 | 0.94 | 38500 | 2.3406 | 0.5163 |
| 2.4942 | 0.95 | 39000 | 2.3404 | 0.5162 |
| 2.4907 | 0.97 | 39500 | 2.3402 | 0.5163 |
| 2.4821 | 0.98 | 40000 | 2.3400 | 0.5163 |
| 2.4857 | 0.99 | 40500 | 2.3398 | 0.5163 |
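The accuracy column is presumably token-level next-token accuracy, as computed by the stock causal-LM evaluation scripts. A minimal, self-contained sketch of how such a loss/accuracy pair is typically computed from model logits (the function and shapes are illustrative, not the card author's code):

```python
import torch
import torch.nn.functional as F

def eval_metrics(logits: torch.Tensor, labels: torch.Tensor):
    """Shifted next-token cross-entropy and top-1 accuracy.

    logits: (batch, seq_len, vocab_size), labels: (batch, seq_len).
    """
    shift_logits = logits[:, :-1, :]  # position t predicts token t+1
    shift_labels = labels[:, 1:]
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
    preds = shift_logits.argmax(dim=-1)
    accuracy = (preds == shift_labels).float().mean()
    return loss.item(), accuracy.item()
```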
### Framework versions
- Transformers 4.37.2
- Pytorch 2.1.2
- Datasets 2.16.1
- Tokenizers 0.15.2