btqa-base / README.md
metadata
license: apache-2.0
base_model: cerebras/btlm-3b-8k-base
tags:
  - generated_from_trainer
model-index:
  - name: app/bt_qa-out
    results: []
datasets:
  - iarfmoose/question_generator
  - Defalt-404/Bittensor_validator
  - multi_news
  - cnn_dailymail
library_name: peft

Built with Axolotl

app/bt_qa-out

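This model is a fine-tuned version of cerebras/btlm-3b-8k-base. The auto-generated card does not name a training dataset (it records "None"); the metadata lists iarfmoose/question_generator, Defalt-404/Bittensor_validator, multi_news, and cnn_dailymail. It achieves the following results on the evaluation set: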

  • Loss: 1.7451
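
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. The sketch below assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language-model training but is not stated on this card; the function name is illustrative.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


# Final evaluation loss reported above.
final_eval_loss = 1.7451
print(f"perplexity ~ {perplexity_from_loss(final_eval_loss):.2f}")
```

Under that assumption the evaluation loss corresponds to a perplexity of roughly 5.7.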

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8.5e-05
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 32
  • num_epochs: 1
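
The scheduler settings above imply a short linear warmup followed by a cosine decay toward zero. The sketch below reproduces that shape in plain Python, using the standard half-cycle cosine formula. The total step count of 171,500 is an assumption taken from the last logged step in the training results table, not a value stated in the hyperparameters.

```python
import math

PEAK_LR = 8.5e-05      # learning_rate from the list above
WARMUP_STEPS = 32      # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 171_500  # ASSUMPTION: last logged step in the results table


def lr_at(step: int, peak_lr: float = PEAK_LR,
          warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 16, 32, 85_766, 150_000, 171_500):
    print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate has fallen below 5% of its peak by step 150,000, which is consistent with the validation loss barely moving over the final rows of the results table.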

Training results

Training Loss Epoch Step Validation Loss
2.7377 0.0 500 2.5700
2.5689 0.01 1000 2.4172
2.6676 0.01 1500 2.3622
2.4234 0.01 2000 2.3293
2.2221 0.01 2500 2.3090
2.3873 0.02 3000 2.2940
2.2688 0.02 3500 2.2644
2.4194 0.02 4000 2.2725
2.4086 0.03 4500 2.2561
2.4476 0.03 5000 2.2477
2.1512 0.03 5500 2.2330
2.1428 0.03 6000 2.2235
2.2834 0.04 6500 2.2141
2.2918 0.04 7000 2.2124
2.4352 0.04 7500 2.2074
1.7196 0.05 8000 2.2038
2.2394 0.05 8500 2.1973
2.1632 0.05 9000 2.1856
2.4313 0.06 9500 2.1820
2.4584 0.06 10000 2.1764
2.3359 0.06 10500 2.1780
2.2105 0.06 11000 2.1671
2.3152 0.07 11500 2.1603
2.3012 0.07 12000 2.1572
2.4636 0.07 12500 2.1553
2.0974 0.08 13000 2.1511
2.298 0.08 13500 2.1481
2.3312 0.08 14000 2.1445
2.5315 0.08 14500 2.1381
2.1854 0.09 15000 2.1364
2.3069 0.09 15500 2.1355
2.0756 0.09 16000 2.1331
2.0094 0.1 16500 2.1306
2.2674 0.1 17000 2.1230
1.8427 0.1 17500 2.1176
2.2277 0.1 18000 2.1168
2.1398 0.11 18500 2.1152
1.9927 0.11 19000 2.1088
2.0119 0.11 19500 2.1105
2.5796 0.12 20000 2.1040
1.3256 0.12 20500 2.0993
2.2051 0.12 21000 2.0992
1.628 0.13 21500 2.0944
2.1926 0.13 22000 2.0927
1.6482 0.13 22500 2.0873
2.1122 0.13 23000 2.0830
1.7405 0.14 23500 2.0828
2.2685 0.14 24000 2.0784
2.1062 0.14 24500 2.0766
2.1308 0.15 25000 2.0714
1.9122 0.15 25500 2.0719
2.3549 0.15 26000 2.0643
2.2159 0.15 26500 2.0655
1.493 0.16 27000 2.0598
1.893 0.16 27500 2.0557
2.1902 0.16 28000 2.0533
2.2353 0.17 28500 2.0524
1.8736 0.17 29000 2.0519
2.0511 0.17 29500 2.0449
1.2872 0.17 30000 2.0453
1.6353 0.18 30500 2.0377
1.992 0.18 31000 2.0419
2.3586 0.18 31500 2.0353
1.9453 0.19 32000 2.0330
2.1322 0.19 32500 2.0305
2.2887 0.19 33000 2.0253
2.0268 0.2 33500 2.0267
1.8397 0.2 34000 2.0207
2.5165 0.2 34500 2.0202
1.9142 0.2 35000 2.0139
1.5993 0.21 35500 2.0179
2.1691 0.21 36000 2.0102
2.4948 0.21 36500 2.0089
1.5422 0.22 37000 2.0039
1.4566 0.22 37500 2.0014
1.852 0.22 38000 2.0043
2.199 0.22 38500 1.9987
1.4852 0.23 39000 1.9976
1.3 0.23 39500 1.9936
2.1237 0.23 40000 1.9917
1.691 0.24 40500 1.9887
2.2169 0.24 41000 1.9870
2.1991 0.24 41500 1.9851
1.9517 0.24 42000 1.9806
1.6369 0.25 42500 1.9762
2.2759 0.25 43000 1.9753
2.2923 0.25 43500 1.9748
2.2552 0.26 44000 1.9702
2.066 0.26 44500 1.9683
2.2703 0.26 45000 1.9686
2.3544 0.27 45500 1.9648
2.255 0.27 46000 1.9635
1.8732 0.27 46500 1.9639
2.1203 0.27 47000 1.9590
2.1314 0.28 47500 1.9573
1.8511 0.28 48000 1.9533
2.1471 0.28 48500 1.9514
1.8417 0.29 49000 1.9509
2.4485 0.29 49500 1.9502
2.0708 0.29 50000 1.9455
1.8272 0.29 50500 1.9416
1.6232 0.3 51000 1.9380
1.6785 0.3 51500 1.9358
1.5734 0.3 52000 1.9313
1.9737 0.31 52500 1.9301
1.8393 0.31 53000 1.9295
1.4789 0.31 53500 1.9281
2.2062 0.31 54000 1.9273
2.3501 0.32 54500 1.9236
2.2756 0.32 55000 1.9218
2.1001 0.32 55500 1.9215
2.0342 0.33 56000 1.9179
1.8066 0.33 56500 1.9143
1.8322 0.33 57000 1.9137
2.0926 0.34 57500 1.9106
2.2106 0.34 58000 1.9083
2.0666 0.34 58500 1.9055
2.2082 0.34 59000 1.9026
2.1768 0.35 59500 1.9007
1.7091 0.35 60000 1.8967
1.7585 0.35 60500 1.8946
1.8968 0.36 61000 1.8936
2.107 0.36 61500 1.8906
1.5162 0.36 62000 1.8870
2.0642 0.36 62500 1.8836
2.0399 0.37 63000 1.8813
2.3971 0.37 63500 1.8785
1.7433 0.37 64000 1.8797
2.0971 0.38 64500 1.8743
1.8212 0.38 65000 1.8726
2.1023 0.38 65500 1.8695
1.9735 0.38 66000 1.8674
1.3196 0.39 66500 1.8657
1.9825 0.39 67000 1.8629
2.0356 0.39 67500 1.8604
1.8522 0.4 68000 1.8581
2.2666 0.4 68500 1.8568
2.3575 0.4 69000 1.8538
2.0086 0.41 69500 1.8537
1.9811 0.41 70000 1.8512
2.0702 0.41 70500 1.8485
1.8554 0.41 71000 1.8456
0.5356 0.42 71500 1.8437
1.4742 0.42 72000 1.8413
2.1901 0.42 72500 1.8420
1.7868 0.43 73000 1.8383
1.3144 0.43 73500 1.8371
2.1158 0.43 74000 1.8347
2.0779 0.43 74500 1.8331
1.9756 0.44 75000 1.8323
2.3395 0.44 75500 1.8309
1.895 0.44 76000 1.8283
2.0369 0.45 76500 1.8274
1.8068 0.45 77000 1.8251
2.2153 0.45 77500 1.8227
2.1389 0.45 78000 1.8212
1.9166 0.46 78500 1.8197
1.711 0.46 79000 1.8187
1.9102 0.46 79500 1.8165
0.8358 0.47 80000 1.8163
1.7278 0.47 80500 1.8148
1.601 0.47 81000 1.8126
1.9794 0.48 81500 1.8107
1.7323 0.48 82000 1.8095
2.2911 0.48 82500 1.8090
1.8962 0.48 83000 1.8065
2.3055 0.49 83500 1.8052
1.6899 0.49 84000 1.8037
1.6409 0.49 84500 1.8031
1.9116 0.5 85000 1.8011
0.6875 0.5 85500 1.8003
2.0829 0.5 86000 1.7983
1.5716 0.5 86500 1.7981
2.4537 0.51 87000 1.7961
1.8236 0.51 87500 1.7942
1.641 0.51 88000 1.7931
1.5533 0.52 88500 1.7916
1.679 0.52 89000 1.7902
2.1463 0.52 89500 1.7893
1.5477 0.52 90000 1.7884
1.2346 0.53 90500 1.7873
1.3352 0.53 91000 1.7859
2.1039 0.53 91500 1.7850
2.0818 0.54 92000 1.7834
1.3987 0.54 92500 1.7830
1.4544 0.54 93000 1.7827
0.4043 0.55 93500 1.7811
2.0149 0.55 94000 1.7794
1.9845 0.55 94500 1.7789
2.1053 0.55 95000 1.7775
2.1572 0.56 95500 1.7768
2.0754 0.56 96000 1.7761
1.7675 0.56 96500 1.7754
2.0023 0.57 97000 1.7743
1.2653 0.57 97500 1.7736
1.5566 0.57 98000 1.7728
1.9408 0.57 98500 1.7724
2.0936 0.58 99000 1.7713
0.5687 0.58 99500 1.7706
2.2833 0.58 100000 1.7702
1.6689 0.59 100500 1.7690
1.5198 0.59 101000 1.7684
1.6968 0.59 101500 1.7679
2.2034 0.59 102000 1.7674
1.7902 0.6 102500 1.7665
2.0557 0.6 103000 1.7658
1.8617 0.6 103500 1.7650
1.8749 0.61 104000 1.7637
1.7674 0.61 104500 1.7632
1.4269 0.61 105000 1.7627
1.989 0.62 105500 1.7621
2.1026 0.62 106000 1.7615
2.0304 0.62 106500 1.7609
1.6286 0.62 107000 1.7603
0.9544 0.63 107500 1.7599
1.6421 0.63 108000 1.7588
1.9841 0.63 108500 1.7586
1.7453 0.64 109000 1.7581
1.2119 0.64 109500 1.7575
2.1092 0.64 110000 1.7568
2.0849 0.64 110500 1.7564
1.9162 0.65 111000 1.7562
1.01 0.65 111500 1.7560
1.301 0.65 112000 1.7556
0.315 0.66 112500 1.7552
1.9964 0.66 113000 1.7548
2.4035 0.66 113500 1.7544
1.3559 0.66 114000 1.7542
2.1874 0.67 114500 1.7538
1.4373 0.67 115000 1.7534
0.0639 0.67 115500 1.7529
1.7667 0.68 116000 1.7526
1.6204 0.68 116500 1.7524
1.9859 0.68 117000 1.7521
0.9717 0.69 117500 1.7516
1.8844 0.69 118000 1.7514
1.3336 0.69 118500 1.7509
1.5781 0.69 119000 1.7506
1.8449 0.7 119500 1.7505
1.5305 0.7 120000 1.7503
2.1904 0.7 120500 1.7500
2.2285 0.71 121000 1.7496
1.8097 0.71 121500 1.7494
2.3631 0.71 122000 1.7493
2.0893 0.71 122500 1.7491
2.1201 0.72 123000 1.7489
1.8334 0.72 123500 1.7488
2.0222 0.72 124000 1.7486
1.6339 0.73 124500 1.7484
1.6754 0.73 125000 1.7482
1.3973 0.73 125500 1.7480
2.0594 0.73 126000 1.7479
1.8674 0.74 126500 1.7478
2.1948 0.74 127000 1.7476
1.4148 0.74 127500 1.7475
1.6734 0.75 128000 1.7473
2.2787 0.75 128500 1.7472
1.8999 0.75 129000 1.7471
1.6945 0.76 129500 1.7470
2.0165 0.76 130000 1.7469
2.2232 0.76 130500 1.7468
1.6201 0.76 131000 1.7466
2.4878 0.77 131500 1.7465
1.5317 0.77 132000 1.7465
1.9361 0.77 132500 1.7464
1.7127 0.78 133000 1.7463
1.7045 0.78 133500 1.7462
2.1827 0.78 134000 1.7461
2.0534 0.78 134500 1.7461
2.0808 0.79 135000 1.7460
1.9572 0.79 135500 1.7459
1.8762 0.79 136000 1.7459
1.4686 0.8 136500 1.7458
1.6241 0.8 137000 1.7458
1.4219 0.8 137500 1.7457
2.1605 0.8 138000 1.7457
2.1298 0.81 138500 1.7456
1.414 0.81 139000 1.7456
1.0115 0.81 139500 1.7455
1.9471 0.82 140000 1.7455
1.8873 0.82 140500 1.7455
1.8286 0.82 141000 1.7454
2.1418 0.83 141500 1.7454
1.9755 0.83 142000 1.7454
1.6908 0.83 142500 1.7454
2.3842 0.83 143000 1.7453
1.7665 0.84 143500 1.7453
1.8266 0.84 144000 1.7453
0.8768 0.84 144500 1.7453
1.2274 0.85 145000 1.7453
1.6647 0.85 145500 1.7453
1.4071 0.85 146000 1.7452
1.6073 0.85 146500 1.7452
2.201 0.86 147000 1.7452
1.5504 0.86 147500 1.7452
1.4377 0.86 148000 1.7452
1.4453 0.87 148500 1.7452
1.6929 0.87 149000 1.7451
1.7631 0.87 149500 1.7451
2.0868 0.87 150000 1.7451
0.6434 0.88 150500 1.7451
1.4851 0.88 151000 1.7451
1.5365 0.88 151500 1.7451
1.8129 0.89 152000 1.7451
1.1623 0.89 152500 1.7451
2.0714 0.89 153000 1.7451
1.9363 0.9 153500 1.7451
1.6408 0.9 154000 1.7451
0.618 0.9 154500 1.7451
1.7957 0.9 155000 1.7451
2.0056 0.91 155500 1.7451
1.3893 0.91 156000 1.7451
2.1426 0.91 156500 1.7451
1.6766 0.92 157000 1.7451
1.4206 0.92 157500 1.7451
1.7285 0.92 158000 1.7451
1.5779 0.92 158500 1.7451
1.8675 0.93 159000 1.7451
2.0217 0.93 159500 1.7451
0.9516 0.93 160000 1.7451
2.219 0.94 160500 1.7450
1.6214 0.94 161000 1.7451
1.7134 0.94 161500 1.7451
1.6128 0.94 162000 1.7451
2.0817 0.95 162500 1.7450
1.8055 0.95 163000 1.7451
1.909 0.95 163500 1.7451
1.7844 0.96 164000 1.7451
2.0719 0.96 164500 1.7451
1.8698 0.96 165000 1.7451
1.6926 0.96 165500 1.7451
2.2161 0.97 166000 1.7451
2.1111 0.97 166500 1.7451
1.8004 0.97 167000 1.7451
2.2364 0.98 167500 1.7451
1.6716 0.98 168000 1.7451
2.1804 0.98 168500 1.7451
1.2691 0.99 169000 1.7451
1.8306 0.99 169500 1.7451
0.5662 0.99 170000 1.7451
1.6516 0.99 170500 1.7451
2.0576 1.0 171000 1.7451
1.3638 1.0 171500 1.7451
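
To make the trend in the table easier to see, the sketch below takes five rows copied from it and reports the change in validation loss over each interval. It is only a reading aid over a sample of the logged values; the variable and function names are illustrative.

```python
# (step, validation loss) pairs copied from the table above; a sample, not every row.
SAMPLED_ROWS = [
    (500, 2.5700),
    (50_000, 1.9455),
    (100_000, 1.7702),
    (150_000, 1.7451),
    (171_500, 1.7451),
]


def relative_reduction(first: float, last: float) -> float:
    """Fractional drop from `first` to `last`."""
    return (first - last) / first


def interval_drops(rows):
    """Absolute validation-loss drop between consecutive sampled rows."""
    return [
        (start_step, end_step, start_loss - end_loss)
        for (start_step, start_loss), (end_step, end_loss) in zip(rows, rows[1:])
    ]


for start_step, end_step, drop in interval_drops(SAMPLED_ROWS):
    print(f"steps {start_step:>7} to {end_step:>7}: validation loss fell by {drop:.4f}")

total = relative_reduction(SAMPLED_ROWS[0][1], SAMPLED_ROWS[-1][1])
print(f"overall reduction since step 500: {total:.1%}")
```

Most of the improvement in these sampled rows happens in the first 50,000 steps, and the last interval shows no change at the four decimal places logged.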

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.6
  • Tokenizers 0.14.1
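
Because the metadata declares library_name: peft, the published weights are presumably an adapter that is applied on top of the base model. A minimal loading sketch follows, under several assumptions not confirmed by this card: the adapter repository id is a placeholder to be replaced, the base model needs trust_remote_code=True because it ships custom modeling code, and a PEFT release compatible with the versions listed above is installed.

```python
BASE_MODEL_ID = "cerebras/btlm-3b-8k-base"  # base_model from the card metadata
ADAPTER_ID = "your-namespace/btqa-base"     # HYPOTHETICAL placeholder; replace with the real repo id


def load_model(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model, then attach the PEFT adapter on top of it."""
    # Imported lazily so that importing this module does not pull in the heavy libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        trust_remote_code=True,  # ASSUMPTION: BTLM relies on custom modeling code
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling load_model() downloads the base weights and the adapter, so it is best run once and the returned tokenizer and model reused.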