
qwen2_MetaMathQA_40K_ortho

This model is a PEFT adapter fine-tuned from Qwen/Qwen2-7B. The training dataset is not recorded in this card. It achieves the following result on the evaluation set:

  • Loss: 0.1523

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.02
  • num_epochs: 1
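The relationship between the batch-size entries, and the shape of a cosine schedule, can be sketched as follows. This is a minimal illustration, not the training script: it assumes a single device, and it assumes the warmup value of 0.02 is a ratio of total steps rather than a step count. The total of 617 steps is an estimate derived from the results table (step 611 at epoch 0.9900). The function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 16

# Assumptions, not stated in the card: about 617 optimizer steps per epoch,
# and 0.02 read as a warmup ratio instead of a literal step count.
ASSUMED_TOTAL_STEPS = 617
ASSUMED_WARMUP_RATIO = 0.02


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def cosine_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    warmup = round(ASSUMED_WARMUP_RATIO * ASSUMED_TOTAL_STEPS)
    for step in (0, warmup, ASSUMED_TOTAL_STEPS // 2, ASSUMED_TOTAL_STEPS):
        lr = cosine_lr(step, ASSUMED_TOTAL_STEPS, LEARNING_RATE, warmup)
        print(f"step {step:>3}: lr = {lr:.2e}")
```

With a per-device batch of 4 and 16 accumulation steps, the product matches the listed total_train_batch_size of 64, which is consistent with single-device training.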

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.1742        | 0.0211 | 13   | 0.1537          |
| 0.1454        | 0.0421 | 26   | 0.1536          |
| 0.1508        | 0.0632 | 39   | 0.1538          |
| 0.1453        | 0.0843 | 52   | 0.1541          |
| 0.1475        | 0.1053 | 65   | 0.1551          |
| 0.1506        | 0.1264 | 78   | 0.1554          |
| 0.1562        | 0.1474 | 91   | 0.1569          |
| 0.1516        | 0.1685 | 104  | 0.1574          |
| 0.1558        | 0.1896 | 117  | 0.1583          |
| 0.1565        | 0.2106 | 130  | 0.1594          |
| 0.1577        | 0.2317 | 143  | 0.1601          |
| 0.1498        | 0.2528 | 156  | 0.1598          |
| 0.1578        | 0.2738 | 169  | 0.1602          |
| 0.1532        | 0.2949 | 182  | 0.1604          |
| 0.1593        | 0.3159 | 195  | 0.1605          |
| 0.151         | 0.3370 | 208  | 0.1598          |
| 0.1532        | 0.3581 | 221  | 0.1596          |
| 0.1519        | 0.3791 | 234  | 0.1593          |
| 0.1531        | 0.4002 | 247  | 0.1594          |
| 0.1545        | 0.4213 | 260  | 0.1590          |
| 0.1619        | 0.4423 | 273  | 0.1595          |
| 0.1561        | 0.4634 | 286  | 0.1589          |
| 0.1605        | 0.4845 | 299  | 0.1578          |
| 0.1495        | 0.5055 | 312  | 0.1584          |
| 0.1473        | 0.5266 | 325  | 0.1579          |
| 0.1505        | 0.5476 | 338  | 0.1568          |
| 0.1525        | 0.5687 | 351  | 0.1564          |
| 0.1565        | 0.5898 | 364  | 0.1560          |
| 0.1514        | 0.6108 | 377  | 0.1557          |
| 0.1459        | 0.6319 | 390  | 0.1550          |
| 0.1537        | 0.6530 | 403  | 0.1544          |
| 0.1539        | 0.6740 | 416  | 0.1545          |
| 0.1512        | 0.6951 | 429  | 0.1546          |
| 0.1536        | 0.7162 | 442  | 0.1540          |
| 0.1468        | 0.7372 | 455  | 0.1532          |
| 0.1504        | 0.7583 | 468  | 0.1532          |
| 0.1509        | 0.7793 | 481  | 0.1532          |
| 0.1506        | 0.8004 | 494  | 0.1530          |
| 0.147         | 0.8215 | 507  | 0.1527          |
| 0.1473        | 0.8425 | 520  | 0.1526          |
| 0.1505        | 0.8636 | 533  | 0.1526          |
| 0.1503        | 0.8847 | 546  | 0.1525          |
| 0.1474        | 0.9057 | 559  | 0.1524          |
| 0.1461        | 0.9268 | 572  | 0.1525          |
| 0.1459        | 0.9478 | 585  | 0.1523          |
| 0.1458        | 0.9689 | 598  | 0.1524          |
| 0.1523        | 0.9900 | 611  | 0.1523          |
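The logged values allow a rough back-of-the-envelope check of the run size. The sketch below uses a handful of rows copied from the results above and the total train batch size of 64 from the hyperparameters. The derived figures are estimates only, since the epoch column is rounded to four decimals, and the helper names are hypothetical.

```python
# A few (epoch, step, validation_loss) rows copied from the training results.
ROWS = [
    (0.0211, 13, 0.1537),
    (0.3159, 195, 0.1605),
    (0.9478, 585, 0.1523),
    (0.9900, 611, 0.1523),
]
TOTAL_TRAIN_BATCH_SIZE = 64  # from the hyperparameter list


def steps_per_epoch(epoch: float, step: int) -> int:
    """Estimate optimizer steps in one full epoch from a single logged row."""
    return round(step / epoch)


def lowest_validation_row(rows):
    """Return the row with the lowest validation loss (earliest row wins ties)."""
    return min(rows, key=lambda row: row[2])


last_epoch, last_step, _ = ROWS[-1]
est_steps = steps_per_epoch(last_epoch, last_step)
est_examples = est_steps * TOTAL_TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("estimated steps per epoch:", est_steps)
    print("estimated training examples:", est_examples)
    print("lowest validation loss row:", lowest_validation_row(ROWS))
```

The estimate comes out near 40,000 training examples, which is at least consistent with the "40K" in the model name, though the card itself does not confirm the dataset or its size.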

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
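
Because the card lists PEFT among the framework versions and names Qwen/Qwen2-7B as the base model, loading would plausibly follow the usual PEFT pattern of attaching the adapter to the base model. The sketch below is an assumption, not a procedure documented by the author: the adapter repository id imdatta0/qwen2_MetaMathQA_40K_ortho, the helper name, and the use of the base model's tokenizer are all assumed, and whether the "ortho" adapter loads with the stock PEFT classes has not been verified.

```python
BASE_MODEL_ID = "Qwen/Qwen2-7B"
ADAPTER_ID = "imdatta0/qwen2_MetaMathQA_40K_ortho"  # assumed repository id


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights (hypothetical helper)."""
    # Imported lazily so this file can be read or imported without the
    # heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter reuses the base model's tokenizer unchanged.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_adapter_model()
    prompt = "What is 12 * 7?"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Running this downloads the full 7B base model, so it needs a machine with enough memory; matching the library versions listed above is the safest starting point.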

Model tree for imdatta0/qwen2_MetaMathQA_40K_ortho

  • Base model: Qwen/Qwen2-7B
  • This model: adapter