llama3.0-8B_finetune_QA_EDU_18k_samples_r64

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4429
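
Since the Framework versions below include PEFT, this repository appears to contain a LoRA-style adapter for meta-llama/Meta-Llama-3-8B rather than full model weights. The snippet below is a minimal, unverified sketch of loading and querying the adapter with Transformers and PEFT; the adapter id is taken from this card's title, and the QA-style prompt is an illustrative assumption, not the template used during training.

```python
# Hedged sketch: load the base model, attach this LoRA adapter, and generate.
# The adapter repo id is assumed from the card title; adjust if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "strongpear/llama3.0-8B_finetune_QA_EDU_18k_samples_r64"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative QA-style prompt; the actual training prompt format is unknown.
prompt = "Question: What is photosynthesis?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```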

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3.6e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
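
The settings above roughly correspond to the Trainer configuration sketched below. This is a hedged reconstruction, not the original training script: the output path and the use of fp16 for Native AMP are assumptions, the 50-step evaluation cadence is inferred from the results table, and the dataset and LoRA/PEFT setup are omitted.

```python
# Hedged sketch of a TrainingArguments configuration mirroring the listed values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-8b-qa-edu-r64",  # placeholder output path (assumption)
    learning_rate=3.6e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                 # assumed to realize "Native AMP" mixed precision
    optim="adamw_torch",       # Adam-style optimizer; betas=(0.9, 0.999), eps=1e-8 are the defaults
    eval_strategy="steps",
    eval_steps=50,             # matches the 50-step cadence of the validation-loss log below
    logging_steps=50,
)
```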

Training results

Training Loss | Epoch | Step | Validation Loss
0.3387 0.0110 50 0.6839
0.3452 0.0221 100 0.6427
0.7728 0.0331 150 0.6094
0.3223 0.0442 200 0.5932
1.3626 0.0552 250 0.5825
0.4702 0.0663 300 0.5639
0.9801 0.0773 350 0.5492
0.7758 0.0883 400 0.5495
0.391 0.0994 450 0.5496
0.3294 0.1104 500 0.5482
0.5366 0.1215 550 0.5362
0.6959 0.1325 600 0.5327
0.8941 0.1436 650 0.5248
0.6213 0.1546 700 0.5152
0.3407 0.1656 750 0.5224
0.1588 0.1767 800 0.5152
0.43 0.1877 850 0.5142
0.5168 0.1988 900 0.5092
0.1723 0.2098 950 0.5120
0.7291 0.2208 1000 0.5050
1.0851 0.2319 1050 0.5066
0.4143 0.2429 1100 0.5023
0.6451 0.2540 1150 0.4926
0.355 0.2650 1200 0.4970
0.3478 0.2761 1250 0.4920
0.4542 0.2871 1300 0.4904
0.3247 0.2981 1350 0.4914
0.9451 0.3092 1400 0.4917
0.2979 0.3202 1450 0.4817
0.4421 0.3313 1500 0.4768
0.3082 0.3423 1550 0.4777
0.2085 0.3534 1600 0.4842
0.8352 0.3644 1650 0.4768
0.7308 0.3754 1700 0.4766
0.204 0.3865 1750 0.4757
0.0932 0.3975 1800 0.4734
0.7036 0.4086 1850 0.4762
0.3586 0.4196 1900 0.4742
0.2608 0.4307 1950 0.4657
0.2743 0.4417 2000 0.4679
0.6592 0.4527 2050 0.4639
0.8109 0.4638 2100 0.4586
0.5858 0.4748 2150 0.4617
0.3109 0.4859 2200 0.4593
0.9452 0.4969 2250 0.4553
0.3315 0.5080 2300 0.4594
0.6462 0.5190 2350 0.4593
0.8237 0.5300 2400 0.4574
0.2423 0.5411 2450 0.4571
0.7438 0.5521 2500 0.4552
0.3187 0.5632 2550 0.4577
0.1727 0.5742 2600 0.4539
1.0822 0.5852 2650 0.4516
0.7895 0.5963 2700 0.4528
0.5207 0.6073 2750 0.4512
0.6933 0.6184 2800 0.4478
0.2839 0.6294 2850 0.4466
0.2103 0.6405 2900 0.4489
0.3859 0.6515 2950 0.4488
0.635 0.6625 3000 0.4444
0.383 0.6736 3050 0.4471
0.1978 0.6846 3100 0.4461
0.3309 0.6957 3150 0.4430
0.1848 0.7067 3200 0.4552
0.5733 0.7178 3250 0.4459
0.1587 0.7288 3300 0.4455
0.9247 0.7398 3350 0.4440
0.5319 0.7509 3400 0.4495
0.5374 0.7619 3450 0.4489
0.306 0.7730 3500 0.4438
0.3354 0.7840 3550 0.4408
1.0731 0.7951 3600 0.4426
0.1792 0.8061 3650 0.4395
0.8926 0.8171 3700 0.4405
0.5666 0.8282 3750 0.4442
0.3252 0.8392 3800 0.4414
0.2968 0.8503 3850 0.4402
0.4431 0.8613 3900 0.4353
0.3688 0.8723 3950 0.4388
0.7693 0.8834 4000 0.4386
0.3554 0.8944 4050 0.4334
0.3929 0.9055 4100 0.4370
0.3479 0.9165 4150 0.4296
0.3202 0.9276 4200 0.4304
0.215 0.9386 4250 0.4347
0.4625 0.9496 4300 0.4285
0.3518 0.9607 4350 0.4333
0.6876 0.9717 4400 0.4317
0.5002 0.9828 4450 0.4316
0.5361 0.9938 4500 0.4330
0.2125 1.0049 4550 0.4329
0.1812 1.0159 4600 0.4346
0.2158 1.0269 4650 0.4370
0.4102 1.0380 4700 0.4318
0.206 1.0490 4750 0.4391
0.2351 1.0601 4800 0.4359
0.6547 1.0711 4850 0.4354
0.425 1.0822 4900 0.4382
0.2033 1.0932 4950 0.4372
0.2499 1.1042 5000 0.4366
0.4414 1.1153 5050 0.4413
0.7572 1.1263 5100 0.4348
0.3694 1.1374 5150 0.4360
0.3498 1.1484 5200 0.4371
0.2947 1.1595 5250 0.4393
0.2726 1.1705 5300 0.4339
0.5417 1.1815 5350 0.4353
0.2784 1.1926 5400 0.4375
0.6162 1.2036 5450 0.4353
0.1968 1.2147 5500 0.4368
0.1835 1.2257 5550 0.4356
0.3171 1.2367 5600 0.4349
0.5474 1.2478 5650 0.4328
0.448 1.2588 5700 0.4375
1.0185 1.2699 5750 0.4389
0.9601 1.2809 5800 0.4373
0.2133 1.2920 5850 0.4354
0.3183 1.3030 5900 0.4332
0.2862 1.3140 5950 0.4381
0.3393 1.3251 6000 0.4412
0.6158 1.3361 6050 0.4339
0.5195 1.3472 6100 0.4356
0.181 1.3582 6150 0.4355
0.5839 1.3693 6200 0.4303
0.1585 1.3803 6250 0.4284
0.2724 1.3913 6300 0.4315
0.1604 1.4024 6350 0.4341
0.4486 1.4134 6400 0.4315
0.3017 1.4245 6450 0.4326
0.1418 1.4355 6500 0.4282
0.2565 1.4466 6550 0.4356
0.7795 1.4576 6600 0.4318
0.3656 1.4686 6650 0.4317
0.3542 1.4797 6700 0.4312
0.2464 1.4907 6750 0.4323
0.3649 1.5018 6800 0.4343
0.4028 1.5128 6850 0.4305
0.1466 1.5239 6900 0.4300
0.42 1.5349 6950 0.4268
0.1989 1.5459 7000 0.4240
0.8268 1.5570 7050 0.4279
0.1798 1.5680 7100 0.4288
0.6061 1.5791 7150 0.4293
0.3439 1.5901 7200 0.4269
0.3005 1.6011 7250 0.4286
0.2483 1.6122 7300 0.4261
0.2318 1.6232 7350 0.4295
0.5939 1.6343 7400 0.4286
0.4636 1.6453 7450 0.4270
0.2645 1.6564 7500 0.4312
0.2303 1.6674 7550 0.4295
0.1253 1.6784 7600 0.4257
0.2056 1.6895 7650 0.4256
0.1562 1.7005 7700 0.4267
0.0983 1.7116 7750 0.4270
0.2293 1.7226 7800 0.4238
0.2084 1.7337 7850 0.4247
0.2603 1.7447 7900 0.4282
0.2476 1.7557 7950 0.4297
0.1617 1.7668 8000 0.4304
0.627 1.7778 8050 0.4291
0.1572 1.7889 8100 0.4324
0.6089 1.7999 8150 0.4287
0.2519 1.8110 8200 0.4307
0.156 1.8220 8250 0.4361
0.2344 1.8330 8300 0.4343
0.6206 1.8441 8350 0.4311
0.1689 1.8551 8400 0.4343
0.4103 1.8662 8450 0.4282
0.1125 1.8772 8500 0.4290
0.2976 1.8883 8550 0.4289
0.1273 1.8993 8600 0.4352
0.2673 1.9103 8650 0.4298
0.6082 1.9214 8700 0.4244
0.3434 1.9324 8750 0.4235
0.2326 1.9435 8800 0.4245
0.1648 1.9545 8850 0.4236
0.1453 1.9655 8900 0.4256
0.1861 1.9766 8950 0.4218
0.2101 1.9876 9000 0.4199
0.4006 1.9987 9050 0.4235
0.1447 2.0097 9100 0.4393
0.1431 2.0208 9150 0.4469
0.3006 2.0318 9200 0.4494
0.18 2.0428 9250 0.4486
0.2461 2.0539 9300 0.4494
0.0889 2.0649 9350 0.4462
0.1561 2.0760 9400 0.4481
0.1231 2.0870 9450 0.4454
0.1086 2.0981 9500 0.4455
0.158 2.1091 9550 0.4508
0.2087 2.1201 9600 0.4419
0.1167 2.1312 9650 0.4440
0.081 2.1422 9700 0.4444
0.1423 2.1533 9750 0.4506
0.2236 2.1643 9800 0.4499
0.1666 2.1754 9850 0.4477
0.1584 2.1864 9900 0.4510
0.1239 2.1974 9950 0.4469
0.2762 2.2085 10000 0.4445
0.0763 2.2195 10050 0.4481
0.1761 2.2306 10100 0.4471
0.3125 2.2416 10150 0.4466
0.2341 2.2527 10200 0.4448
0.3233 2.2637 10250 0.4467
0.1329 2.2747 10300 0.4501
0.2718 2.2858 10350 0.4494
0.3885 2.2968 10400 0.4461
0.2436 2.3079 10450 0.4521
0.2303 2.3189 10500 0.4419
0.3943 2.3299 10550 0.4450
0.2931 2.3410 10600 0.4459
0.2592 2.3520 10650 0.4414
0.1219 2.3631 10700 0.4457
0.1899 2.3741 10750 0.4444
0.1499 2.3852 10800 0.4432
0.1858 2.3962 10850 0.4444
0.1313 2.4072 10900 0.4441
0.1774 2.4183 10950 0.4434
0.1376 2.4293 11000 0.4445
0.2892 2.4404 11050 0.4429
0.0961 2.4514 11100 0.4429
0.3296 2.4625 11150 0.4424
0.1569 2.4735 11200 0.4436
0.128 2.4845 11250 0.4458
0.3241 2.4956 11300 0.4504
0.1969 2.5066 11350 0.4455
0.0935 2.5177 11400 0.4434
0.2211 2.5287 11450 0.4456
0.476 2.5398 11500 0.4429

Framework versions

  • PEFT 0.12.0
  • Transformers 4.45.2
  • PyTorch 2.4.0+cu121
  • Datasets 3.0.0
  • Tokenizers 0.20.1
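
To approximate this environment, the versions above can be pinned as in the requirements-style sketch below (package names are assumed from the list; the CUDA-specific PyTorch build may require a matching wheel index):

```
peft==0.12.0
transformers==4.45.2
torch==2.4.0
datasets==3.0.0
tokenizers==0.20.1
```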