# llama3.0-8B_finetune_QA_EDU_18k_samples_r64
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B. The training dataset is not documented here, though the model name suggests an 18k-sample educational question-answering set, which is consistent with the ~4,500 training steps per epoch at batch size 4 seen in the results below. It achieves the following results on the evaluation set:
- Loss: 0.4429
## Model description
More information needed
## Intended uses & limitations
More information needed
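In the absence of documented usage, here is a minimal, untested loading sketch. It assumes this repository hosts a PEFT (LoRA) adapter for the base model, which the `r64` suffix and the PEFT version under Framework versions suggest; the repo ids are taken from this card, and everything else is illustrative.

```python
# Minimal sketch (untested): load the adapter on top of meta-llama/Meta-Llama-3-8B.
# Assumes this repo contains a PEFT/LoRA adapter rather than fully merged weights.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "strongpear/llama3.0-8B_finetune_QA_EDU_18k_samples_r64"

# If the adapter repo lacks tokenizer files, use "meta-llama/Meta-Llama-3-8B" here.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference; fp16/fp32 also work
    device_map="auto",
)

prompt = "Question: What is photosynthesis?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```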
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3.6e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
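
For reproducibility, the settings above map onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch under stated assumptions: only the values listed in this card are set, `output_dir` is hypothetical, and unlisted options (warmup, weight decay, gradient accumulation) are left at their defaults.

```python
# Sketch: the card's hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-8b-qa-edu-r64",   # hypothetical output path
    learning_rate=3.6e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,            # "Native AMP"; the card does not say whether fp16 or bf16 was used
    optim="adamw_torch",  # betas=(0.9, 0.999) and eps=1e-8 are the Adam defaults
)
```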
### Training results

Note that although `num_epochs` was set to 5, the log below ends at epoch ~2.54 (step 11500), whose validation loss matches the final reported 0.4429; training appears to have been stopped before completing all 5 epochs.
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.3387 | 0.0110 | 50 | 0.6839 |
0.3452 | 0.0221 | 100 | 0.6427 |
0.7728 | 0.0331 | 150 | 0.6094 |
0.3223 | 0.0442 | 200 | 0.5932 |
1.3626 | 0.0552 | 250 | 0.5825 |
0.4702 | 0.0663 | 300 | 0.5639 |
0.9801 | 0.0773 | 350 | 0.5492 |
0.7758 | 0.0883 | 400 | 0.5495 |
0.391 | 0.0994 | 450 | 0.5496 |
0.3294 | 0.1104 | 500 | 0.5482 |
0.5366 | 0.1215 | 550 | 0.5362 |
0.6959 | 0.1325 | 600 | 0.5327 |
0.8941 | 0.1436 | 650 | 0.5248 |
0.6213 | 0.1546 | 700 | 0.5152 |
0.3407 | 0.1656 | 750 | 0.5224 |
0.1588 | 0.1767 | 800 | 0.5152 |
0.43 | 0.1877 | 850 | 0.5142 |
0.5168 | 0.1988 | 900 | 0.5092 |
0.1723 | 0.2098 | 950 | 0.5120 |
0.7291 | 0.2208 | 1000 | 0.5050 |
1.0851 | 0.2319 | 1050 | 0.5066 |
0.4143 | 0.2429 | 1100 | 0.5023 |
0.6451 | 0.2540 | 1150 | 0.4926 |
0.355 | 0.2650 | 1200 | 0.4970 |
0.3478 | 0.2761 | 1250 | 0.4920 |
0.4542 | 0.2871 | 1300 | 0.4904 |
0.3247 | 0.2981 | 1350 | 0.4914 |
0.9451 | 0.3092 | 1400 | 0.4917 |
0.2979 | 0.3202 | 1450 | 0.4817 |
0.4421 | 0.3313 | 1500 | 0.4768 |
0.3082 | 0.3423 | 1550 | 0.4777 |
0.2085 | 0.3534 | 1600 | 0.4842 |
0.8352 | 0.3644 | 1650 | 0.4768 |
0.7308 | 0.3754 | 1700 | 0.4766 |
0.204 | 0.3865 | 1750 | 0.4757 |
0.0932 | 0.3975 | 1800 | 0.4734 |
0.7036 | 0.4086 | 1850 | 0.4762 |
0.3586 | 0.4196 | 1900 | 0.4742 |
0.2608 | 0.4307 | 1950 | 0.4657 |
0.2743 | 0.4417 | 2000 | 0.4679 |
0.6592 | 0.4527 | 2050 | 0.4639 |
0.8109 | 0.4638 | 2100 | 0.4586 |
0.5858 | 0.4748 | 2150 | 0.4617 |
0.3109 | 0.4859 | 2200 | 0.4593 |
0.9452 | 0.4969 | 2250 | 0.4553 |
0.3315 | 0.5080 | 2300 | 0.4594 |
0.6462 | 0.5190 | 2350 | 0.4593 |
0.8237 | 0.5300 | 2400 | 0.4574 |
0.2423 | 0.5411 | 2450 | 0.4571 |
0.7438 | 0.5521 | 2500 | 0.4552 |
0.3187 | 0.5632 | 2550 | 0.4577 |
0.1727 | 0.5742 | 2600 | 0.4539 |
1.0822 | 0.5852 | 2650 | 0.4516 |
0.7895 | 0.5963 | 2700 | 0.4528 |
0.5207 | 0.6073 | 2750 | 0.4512 |
0.6933 | 0.6184 | 2800 | 0.4478 |
0.2839 | 0.6294 | 2850 | 0.4466 |
0.2103 | 0.6405 | 2900 | 0.4489 |
0.3859 | 0.6515 | 2950 | 0.4488 |
0.635 | 0.6625 | 3000 | 0.4444 |
0.383 | 0.6736 | 3050 | 0.4471 |
0.1978 | 0.6846 | 3100 | 0.4461 |
0.3309 | 0.6957 | 3150 | 0.4430 |
0.1848 | 0.7067 | 3200 | 0.4552 |
0.5733 | 0.7178 | 3250 | 0.4459 |
0.1587 | 0.7288 | 3300 | 0.4455 |
0.9247 | 0.7398 | 3350 | 0.4440 |
0.5319 | 0.7509 | 3400 | 0.4495 |
0.5374 | 0.7619 | 3450 | 0.4489 |
0.306 | 0.7730 | 3500 | 0.4438 |
0.3354 | 0.7840 | 3550 | 0.4408 |
1.0731 | 0.7951 | 3600 | 0.4426 |
0.1792 | 0.8061 | 3650 | 0.4395 |
0.8926 | 0.8171 | 3700 | 0.4405 |
0.5666 | 0.8282 | 3750 | 0.4442 |
0.3252 | 0.8392 | 3800 | 0.4414 |
0.2968 | 0.8503 | 3850 | 0.4402 |
0.4431 | 0.8613 | 3900 | 0.4353 |
0.3688 | 0.8723 | 3950 | 0.4388 |
0.7693 | 0.8834 | 4000 | 0.4386 |
0.3554 | 0.8944 | 4050 | 0.4334 |
0.3929 | 0.9055 | 4100 | 0.4370 |
0.3479 | 0.9165 | 4150 | 0.4296 |
0.3202 | 0.9276 | 4200 | 0.4304 |
0.215 | 0.9386 | 4250 | 0.4347 |
0.4625 | 0.9496 | 4300 | 0.4285 |
0.3518 | 0.9607 | 4350 | 0.4333 |
0.6876 | 0.9717 | 4400 | 0.4317 |
0.5002 | 0.9828 | 4450 | 0.4316 |
0.5361 | 0.9938 | 4500 | 0.4330 |
0.2125 | 1.0049 | 4550 | 0.4329 |
0.1812 | 1.0159 | 4600 | 0.4346 |
0.2158 | 1.0269 | 4650 | 0.4370 |
0.4102 | 1.0380 | 4700 | 0.4318 |
0.206 | 1.0490 | 4750 | 0.4391 |
0.2351 | 1.0601 | 4800 | 0.4359 |
0.6547 | 1.0711 | 4850 | 0.4354 |
0.425 | 1.0822 | 4900 | 0.4382 |
0.2033 | 1.0932 | 4950 | 0.4372 |
0.2499 | 1.1042 | 5000 | 0.4366 |
0.4414 | 1.1153 | 5050 | 0.4413 |
0.7572 | 1.1263 | 5100 | 0.4348 |
0.3694 | 1.1374 | 5150 | 0.4360 |
0.3498 | 1.1484 | 5200 | 0.4371 |
0.2947 | 1.1595 | 5250 | 0.4393 |
0.2726 | 1.1705 | 5300 | 0.4339 |
0.5417 | 1.1815 | 5350 | 0.4353 |
0.2784 | 1.1926 | 5400 | 0.4375 |
0.6162 | 1.2036 | 5450 | 0.4353 |
0.1968 | 1.2147 | 5500 | 0.4368 |
0.1835 | 1.2257 | 5550 | 0.4356 |
0.3171 | 1.2367 | 5600 | 0.4349 |
0.5474 | 1.2478 | 5650 | 0.4328 |
0.448 | 1.2588 | 5700 | 0.4375 |
1.0185 | 1.2699 | 5750 | 0.4389 |
0.9601 | 1.2809 | 5800 | 0.4373 |
0.2133 | 1.2920 | 5850 | 0.4354 |
0.3183 | 1.3030 | 5900 | 0.4332 |
0.2862 | 1.3140 | 5950 | 0.4381 |
0.3393 | 1.3251 | 6000 | 0.4412 |
0.6158 | 1.3361 | 6050 | 0.4339 |
0.5195 | 1.3472 | 6100 | 0.4356 |
0.181 | 1.3582 | 6150 | 0.4355 |
0.5839 | 1.3693 | 6200 | 0.4303 |
0.1585 | 1.3803 | 6250 | 0.4284 |
0.2724 | 1.3913 | 6300 | 0.4315 |
0.1604 | 1.4024 | 6350 | 0.4341 |
0.4486 | 1.4134 | 6400 | 0.4315 |
0.3017 | 1.4245 | 6450 | 0.4326 |
0.1418 | 1.4355 | 6500 | 0.4282 |
0.2565 | 1.4466 | 6550 | 0.4356 |
0.7795 | 1.4576 | 6600 | 0.4318 |
0.3656 | 1.4686 | 6650 | 0.4317 |
0.3542 | 1.4797 | 6700 | 0.4312 |
0.2464 | 1.4907 | 6750 | 0.4323 |
0.3649 | 1.5018 | 6800 | 0.4343 |
0.4028 | 1.5128 | 6850 | 0.4305 |
0.1466 | 1.5239 | 6900 | 0.4300 |
0.42 | 1.5349 | 6950 | 0.4268 |
0.1989 | 1.5459 | 7000 | 0.4240 |
0.8268 | 1.5570 | 7050 | 0.4279 |
0.1798 | 1.5680 | 7100 | 0.4288 |
0.6061 | 1.5791 | 7150 | 0.4293 |
0.3439 | 1.5901 | 7200 | 0.4269 |
0.3005 | 1.6011 | 7250 | 0.4286 |
0.2483 | 1.6122 | 7300 | 0.4261 |
0.2318 | 1.6232 | 7350 | 0.4295 |
0.5939 | 1.6343 | 7400 | 0.4286 |
0.4636 | 1.6453 | 7450 | 0.4270 |
0.2645 | 1.6564 | 7500 | 0.4312 |
0.2303 | 1.6674 | 7550 | 0.4295 |
0.1253 | 1.6784 | 7600 | 0.4257 |
0.2056 | 1.6895 | 7650 | 0.4256 |
0.1562 | 1.7005 | 7700 | 0.4267 |
0.0983 | 1.7116 | 7750 | 0.4270 |
0.2293 | 1.7226 | 7800 | 0.4238 |
0.2084 | 1.7337 | 7850 | 0.4247 |
0.2603 | 1.7447 | 7900 | 0.4282 |
0.2476 | 1.7557 | 7950 | 0.4297 |
0.1617 | 1.7668 | 8000 | 0.4304 |
0.627 | 1.7778 | 8050 | 0.4291 |
0.1572 | 1.7889 | 8100 | 0.4324 |
0.6089 | 1.7999 | 8150 | 0.4287 |
0.2519 | 1.8110 | 8200 | 0.4307 |
0.156 | 1.8220 | 8250 | 0.4361 |
0.2344 | 1.8330 | 8300 | 0.4343 |
0.6206 | 1.8441 | 8350 | 0.4311 |
0.1689 | 1.8551 | 8400 | 0.4343 |
0.4103 | 1.8662 | 8450 | 0.4282 |
0.1125 | 1.8772 | 8500 | 0.4290 |
0.2976 | 1.8883 | 8550 | 0.4289 |
0.1273 | 1.8993 | 8600 | 0.4352 |
0.2673 | 1.9103 | 8650 | 0.4298 |
0.6082 | 1.9214 | 8700 | 0.4244 |
0.3434 | 1.9324 | 8750 | 0.4235 |
0.2326 | 1.9435 | 8800 | 0.4245 |
0.1648 | 1.9545 | 8850 | 0.4236 |
0.1453 | 1.9655 | 8900 | 0.4256 |
0.1861 | 1.9766 | 8950 | 0.4218 |
0.2101 | 1.9876 | 9000 | 0.4199 |
0.4006 | 1.9987 | 9050 | 0.4235 |
0.1447 | 2.0097 | 9100 | 0.4393 |
0.1431 | 2.0208 | 9150 | 0.4469 |
0.3006 | 2.0318 | 9200 | 0.4494 |
0.18 | 2.0428 | 9250 | 0.4486 |
0.2461 | 2.0539 | 9300 | 0.4494 |
0.0889 | 2.0649 | 9350 | 0.4462 |
0.1561 | 2.0760 | 9400 | 0.4481 |
0.1231 | 2.0870 | 9450 | 0.4454 |
0.1086 | 2.0981 | 9500 | 0.4455 |
0.158 | 2.1091 | 9550 | 0.4508 |
0.2087 | 2.1201 | 9600 | 0.4419 |
0.1167 | 2.1312 | 9650 | 0.4440 |
0.081 | 2.1422 | 9700 | 0.4444 |
0.1423 | 2.1533 | 9750 | 0.4506 |
0.2236 | 2.1643 | 9800 | 0.4499 |
0.1666 | 2.1754 | 9850 | 0.4477 |
0.1584 | 2.1864 | 9900 | 0.4510 |
0.1239 | 2.1974 | 9950 | 0.4469 |
0.2762 | 2.2085 | 10000 | 0.4445 |
0.0763 | 2.2195 | 10050 | 0.4481 |
0.1761 | 2.2306 | 10100 | 0.4471 |
0.3125 | 2.2416 | 10150 | 0.4466 |
0.2341 | 2.2527 | 10200 | 0.4448 |
0.3233 | 2.2637 | 10250 | 0.4467 |
0.1329 | 2.2747 | 10300 | 0.4501 |
0.2718 | 2.2858 | 10350 | 0.4494 |
0.3885 | 2.2968 | 10400 | 0.4461 |
0.2436 | 2.3079 | 10450 | 0.4521 |
0.2303 | 2.3189 | 10500 | 0.4419 |
0.3943 | 2.3299 | 10550 | 0.4450 |
0.2931 | 2.3410 | 10600 | 0.4459 |
0.2592 | 2.3520 | 10650 | 0.4414 |
0.1219 | 2.3631 | 10700 | 0.4457 |
0.1899 | 2.3741 | 10750 | 0.4444 |
0.1499 | 2.3852 | 10800 | 0.4432 |
0.1858 | 2.3962 | 10850 | 0.4444 |
0.1313 | 2.4072 | 10900 | 0.4441 |
0.1774 | 2.4183 | 10950 | 0.4434 |
0.1376 | 2.4293 | 11000 | 0.4445 |
0.2892 | 2.4404 | 11050 | 0.4429 |
0.0961 | 2.4514 | 11100 | 0.4429 |
0.3296 | 2.4625 | 11150 | 0.4424 |
0.1569 | 2.4735 | 11200 | 0.4436 |
0.128 | 2.4845 | 11250 | 0.4458 |
0.3241 | 2.4956 | 11300 | 0.4504 |
0.1969 | 2.5066 | 11350 | 0.4455 |
0.0935 | 2.5177 | 11400 | 0.4434 |
0.2211 | 2.5287 | 11450 | 0.4456 |
0.476 | 2.5398 | 11500 | 0.4429 |
### Framework versions
- PEFT 0.12.0
- Transformers 4.45.2
- PyTorch 2.4.0+cu121
- Datasets 3.0.0
- Tokenizers 0.20.1
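
To check that a local environment matches the versions listed above, a quick sketch:

```python
# Sketch: confirm installed versions against those listed in this card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    "peft": "0.12.0",
    "transformers": "4.45.2",
    "torch": "2.4.0+cu121",
    "datasets": "3.0.0",
    "tokenizers": "0.20.1",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else f"mismatch (have {installed[name]})"
    print(f"{name}=={want}: {status}")
```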