Terjman-Large-v2.2

This model is a fine-tuned version of atlasia/Terjman-Large-v1.2, trained on BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K. It achieves the following results on the atlasia/TerjamaBench evaluation set (a sketch for reproducing these metrics follows the list):

  • Loss: 2.8068
  • BLEU: 20.1692
  • chrF: 41.5829
  • TER: 82.2755
  • Gen Len: 29.0212
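
The BLEU, chrF, and TER values are corpus-level scores, and Gen Len is the average length of the generated outputs. Below is a minimal sketch, not the exact evaluation script behind this card, of computing the same three metrics with the Hugging Face `evaluate` library; the prediction/reference pair is a made-up placeholder:

```python
# Sketch: corpus-level BLEU, chrF, and TER via the `evaluate` library.
# The prediction/reference pair below is a made-up placeholder.
import evaluate

predictions = ["salam, labas?"]     # hypothetical model outputs
references = [["salam, kidayr?"]]   # one list of reference translations per prediction

bleu = evaluate.load("sacrebleu").compute(predictions=predictions, references=references)
chrf = evaluate.load("chrf").compute(predictions=predictions, references=references)
ter = evaluate.load("ter").compute(predictions=predictions, references=references)

print(f"BLEU={bleu['score']:.4f}  chrF={chrf['score']:.4f}  TER={ter['score']:.4f}")
```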

Model description

Terjman-Large-v2.2 is a 239M-parameter sequence-to-sequence machine translation model (stored in BF16) that translates English into Moroccan Darija. It is the v2.2 release of the Terjman-Large line, fine-tuned from atlasia/Terjman-Large-v1.2.

Intended uses & limitations

The model is intended for translating English text into Moroccan Darija. Its limitations have not yet been documented.
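
A minimal usage sketch, assuming the model follows the standard Transformers seq2seq translation interface (the input sentence is only an example):

```python
# Sketch: translate English to Moroccan Darija with the Transformers pipeline.
from transformers import pipeline

translator = pipeline("translation", model="BounharAbdelaziz/Terjman-Large-v2.2")

result = translator("Hello, how are you today?", max_length=512)
print(result[0]["translation_text"])
```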

Training and evaluation data

The model was fine-tuned on BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K, an English-Darija parallel corpus of roughly 350K pairs (per the dataset name), and evaluated on the atlasia/TerjamaBench benchmark.
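
A short sketch for inspecting both datasets with the `datasets` library; split and column names are not documented in this card, so verify them against the dataset cards:

```python
# Sketch: load the fine-tuning corpus and the evaluation benchmark.
# Split and column names are assumptions; check the dataset cards.
from datasets import load_dataset

train_ds = load_dataset("BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K")
bench_ds = load_dataset("atlasia/TerjamaBench")

print(train_ds)  # prints available splits, columns, and row counts
print(bench_ds)
```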

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a rough Seq2SeqTrainingArguments equivalent is sketched after the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: AdamW (ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
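
As a sketch only, the hyperparameters above map onto Transformers `Seq2SeqTrainingArguments` roughly as follows; `output_dir`, `bf16`, and `predict_with_generate` are assumptions not stated in this card:

```python
# Rough Seq2SeqTrainingArguments equivalent of the hyperparameter list above.
# output_dir, bf16, and predict_with_generate are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-large-v2.2",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=8,    # 16 * 8 = total train batch size of 128
    optim="adamw_torch",              # AdamW; the betas/epsilon above are its defaults
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5,
    bf16=True,                        # assumption, matching the BF16 checkpoint
    predict_with_generate=True,       # assumption, needed for BLEU/chrF/TER during eval
)
```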

Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU | chrF | TER | Gen Len |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 25.1947 | 0.0361 | 100 | 4.3934 | 15.9485 | 35.999 | 89.4884 | 10.3094 |
| 23.9289 | 0.0723 | 200 | 4.3239 | 16.0039 | 36.1914 | 89.2378 | 10.3871 |
| 22.4636 | 0.1084 | 300 | 4.1613 | 16.4537 | 36.6901 | 88.6224 | 10.3341 |
| 20.7804 | 0.1446 | 400 | 3.8688 | 16.6385 | 36.9491 | 88.2061 | 10.2976 |
| 19.3631 | 0.1807 | 500 | 3.6065 | 16.8079 | 37.4258 | 87.9841 | 10.24 |
| 18.5934 | 0.2169 | 600 | 3.4287 | 17.1599 | 37.9096 | 87.8884 | 10.2471 |
| 17.5375 | 0.2530 | 700 | 3.3204 | 17.3512 | 38.3927 | 87.8847 | 10.5671 |
| 17.2196 | 0.2892 | 800 | 3.2327 | 17.8504 | 38.8577 | 88.2548 | 11.4647 |
| 16.648 | 0.3253 | 900 | 3.1727 | 18.0511 | 39.2475 | 87.3063 | 11.8929 |
| 15.9364 | 0.3615 | 1000 | 3.1162 | 18.5393 | 39.5155 | 86.2896 | 11.5035 |
| 15.4445 | 0.3976 | 1100 | 3.0761 | 18.8806 | 40.0084 | 86.0285 | 10.6118 |
| 15.1001 | 0.4338 | 1200 | 3.0376 | 19.0202 | 40.0596 | 85.4085 | 10.9141 |
| 14.4878 | 0.4699 | 1300 | 3.0135 | 19.1544 | 40.1196 | 85.0821 | 11.0424 |
| 14.0768 | 0.5061 | 1400 | 2.9858 | 18.8328 | 40.13 | 85.0403 | 11.6153 |
| 13.9057 | 0.5422 | 1500 | 2.9611 | 19.0325 | 40.3066 | 84.6503 | 12.9035 |
| 13.582 | 0.5783 | 1600 | 2.9354 | 19.2564 | 40.6244 | 84.4523 | 14.2729 |
| 13.2156 | 0.6145 | 1700 | 2.9283 | 19.2737 | 40.6857 | 84.1659 | 13.6271 |
| 13.216 | 0.6506 | 1800 | 2.9195 | 19.1842 | 40.5645 | 84.1892 | 15.3176 |
| 12.8209 | 0.6868 | 1900 | 2.9140 | 18.9829 | 40.2408 | 84.6937 | 15.2318 |
| 12.6086 | 0.7229 | 2000 | 2.9068 | 19.1247 | 40.4541 | 84.4727 | 15.5129 |
| 12.4518 | 0.7591 | 2100 | 2.8839 | 19.4021 | 40.8301 | 84.419 | 14.9365 |
| 12.4289 | 0.7952 | 2200 | 2.8866 | 19.7503 | 41.0128 | 83.2866 | 16.9776 |
| 12.2287 | 0.8314 | 2300 | 2.8734 | 19.8382 | 40.9191 | 83.0623 | 17.1 |
| 12.23 | 0.8675 | 2400 | 2.8700 | 19.6373 | 40.807 | 83.2168 | 19.0612 |
| 12.0073 | 0.9037 | 2500 | 2.8616 | 19.6527 | 40.7406 | 83.1873 | 18.9859 |
| 11.9013 | 0.9398 | 2600 | 2.8486 | 20.3566 | 41.2676 | 82.5944 | 20.0953 |
| 11.724 | 0.9760 | 2700 | 2.8548 | 19.9016 | 40.9086 | 82.4162 | 20.6 |
| 11.5015 | 1.0119 | 2800 | 2.8643 | 19.7775 | 41.0928 | 82.6536 | 19.4812 |
| 11.4529 | 1.0481 | 2900 | 2.8572 | 19.9022 | 41.1604 | 82.5963 | 20.5929 |
| 11.3655 | 1.0842 | 3000 | 2.8501 | 20.2911 | 41.3615 | 82.0111 | 20.8847 |
| 11.44 | 1.1204 | 3100 | 2.8478 | 20.2161 | 41.2086 | 82.363 | 24.5824 |
| 11.3663 | 1.1565 | 3200 | 2.8399 | 20.2121 | 41.4432 | 81.9885 | 23.4976 |
| 11.2401 | 1.1927 | 3300 | 2.8444 | 20.3898 | 41.3422 | 81.8858 | 25.3129 |
| 11.2055 | 1.2288 | 3400 | 2.8404 | 20.463 | 41.5269 | 81.9694 | 25.6447 |
| 11.0982 | 1.2650 | 3500 | 2.8417 | 20.5416 | 41.8609 | 81.852 | 24.7647 |
| 11.0639 | 1.3011 | 3600 | 2.8364 | 20.3427 | 41.7149 | 81.642 | 25.38 |
| 11.1082 | 1.3372 | 3700 | 2.8382 | 20.1427 | 41.525 | 82.3573 | 27.2624 |
| 11.0479 | 1.3734 | 3800 | 2.8349 | 20.092 | 41.4815 | 82.2276 | 26.3412 |
| 11.1308 | 1.4095 | 3900 | 2.8270 | 19.8146 | 41.2064 | 82.8144 | 26.9353 |
| 10.9374 | 1.4457 | 4000 | 2.8278 | 20.334 | 41.6593 | 81.7711 | 27.2165 |
| 11.0379 | 1.4818 | 4100 | 2.8208 | 20.3952 | 41.6785 | 81.7776 | 27.8235 |
| 11.0583 | 1.5180 | 4200 | 2.8225 | 20.3378 | 41.6073 | 82.1717 | 27.2506 |
| 10.7442 | 1.5541 | 4300 | 2.8193 | 20.326 | 41.5546 | 82.0312 | 27.2482 |
| 10.9269 | 1.5903 | 4400 | 2.8220 | 20.3777 | 41.6625 | 81.8084 | 27.2341 |
| 10.7753 | 1.6264 | 4500 | 2.8237 | 20.3371 | 41.63 | 81.788 | 26.0553 |
| 10.7841 | 1.6626 | 4600 | 2.8224 | 20.2222 | 41.4059 | 82.0374 | 26.6306 |
| 10.8673 | 1.6987 | 4700 | 2.8190 | 20.0474 | 41.3502 | 81.9854 | 27.2165 |
| 10.7534 | 1.7349 | 4800 | 2.8230 | 20.1404 | 41.3887 | 82.1984 | 27.2447 |
| 10.7428 | 1.7710 | 4900 | 2.8164 | 20.1693 | 41.4398 | 82.2848 | 26.64 |
| 10.7501 | 1.8072 | 5000 | 2.8152 | 20.2346 | 41.3471 | 81.9833 | 26.6235 |
| 10.5935 | 1.8433 | 5100 | 2.8179 | 20.0303 | 41.3017 | 82.2892 | 26.6341 |
| 10.8916 | 1.8795 | 5200 | 2.8142 | 20.1371 | 41.2921 | 82.3067 | 26.6388 |
| 10.4615 | 1.9156 | 5300 | 2.8147 | 20.1184 | 41.4823 | 82.4293 | 26.0271 |
| 10.6519 | 1.9517 | 5400 | 2.8145 | 19.8947 | 41.2411 | 82.4574 | 27.1906 |
| 10.7264 | 1.9879 | 5500 | 2.8182 | 19.9463 | 41.3386 | 82.2442 | 27.2424 |
| 10.3405 | 2.0239 | 5600 | 2.8159 | 19.9448 | 41.2743 | 82.3859 | 27.8 |
| 10.3744 | 2.0600 | 5700 | 2.8108 | 20.1041 | 41.5535 | 82.2007 | 26.04 |
| 10.4719 | 2.0962 | 5800 | 2.8152 | 19.9187 | 41.3583 | 82.2967 | 27.2235 |
| 10.5306 | 2.1323 | 5900 | 2.8127 | 19.944 | 41.3826 | 82.3566 | 27.8153 |
| 10.4882 | 2.1684 | 6000 | 2.8126 | 20.0051 | 41.4424 | 82.3244 | 27.2376 |
| 10.6208 | 2.2046 | 6100 | 2.8109 | 20.2372 | 41.6177 | 81.9337 | 26.6412 |
| 10.458 | 2.2407 | 6200 | 2.8103 | 20.259 | 41.6903 | 82.0083 | 26.6188 |
| 10.5497 | 2.2769 | 6300 | 2.8176 | 20.1553 | 41.6126 | 82.3499 | 26.6318 |
| 10.4733 | 2.3130 | 6400 | 2.8158 | 20.2925 | 41.6992 | 82.0426 | 26.6329 |
| 10.3814 | 2.3492 | 6500 | 2.8161 | 20.3436 | 41.6806 | 82.0206 | 26.6318 |
| 10.5096 | 2.3853 | 6600 | 2.8104 | 20.0678 | 41.5413 | 82.3613 | 26.6353 |
| 10.345 | 2.4215 | 6700 | 2.8162 | 20.0445 | 41.5661 | 82.3574 | 26.6424 |
| 10.3022 | 2.4576 | 6800 | 2.8150 | 20.1996 | 41.6086 | 82.1407 | 27.2247 |
| 10.511 | 2.4938 | 6900 | 2.8124 | 20.2234 | 41.6513 | 82.0956 | 26.0294 |
| 10.5411 | 2.5299 | 7000 | 2.8110 | 20.0885 | 41.6026 | 82.3022 | 27.1976 |
| 10.3113 | 2.5661 | 7100 | 2.8125 | 20.2328 | 41.6496 | 82.2802 | 27.8376 |
| 10.2469 | 2.6022 | 7200 | 2.8140 | 20.1713 | 41.6658 | 82.3415 | 26.6341 |
| 10.6242 | 2.6384 | 7300 | 2.8082 | 20.397 | 41.7355 | 81.8477 | 27.8071 |
| 10.3379 | 2.6745 | 7400 | 2.8141 | 20.2633 | 41.6424 | 82.2042 | 27.2282 |
| 10.2108 | 2.7106 | 7500 | 2.8112 | 20.1278 | 41.5304 | 82.2911 | 27.2106 |
| 10.3156 | 2.7468 | 7600 | 2.8113 | 20.2513 | 41.4369 | 82.1582 | 27.6129 |
| 10.5468 | 2.7829 | 7700 | 2.8092 | 20.2669 | 41.6353 | 82.0547 | 26.6212 |
| 10.3466 | 2.8191 | 7800 | 2.8095 | 20.1406 | 41.5998 | 82.2192 | 27.2294 |
| 10.1576 | 2.8552 | 7900 | 2.8102 | 20.4323 | 41.717 | 82.0407 | 27.2282 |
| 10.2929 | 2.8914 | 8000 | 2.8081 | 20.2843 | 41.6445 | 82.1534 | 27.2294 |
| 10.1551 | 2.9275 | 8100 | 2.8124 | 20.042 | 41.5029 | 82.3521 | 27.6212 |
| 10.3701 | 2.9637 | 8200 | 2.8074 | 20.149 | 41.6323 | 82.1568 | 27.7918 |
| 10.3841 | 2.9998 | 8300 | 2.8071 | 20.2359 | 41.7569 | 82.0191 | 27.2 |
| 10.2077 | 3.0358 | 8400 | 2.8088 | 20.1197 | 41.6256 | 82.2721 | 27.2141 |
| 10.2587 | 3.0719 | 8500 | 2.8089 | 20.1437 | 41.6294 | 82.2269 | 27.2082 |
| 10.4181 | 3.1081 | 8600 | 2.8087 | 20.3797 | 41.6715 | 81.84 | 27.2141 |
| 10.2875 | 3.1442 | 8700 | 2.8113 | 20.3151 | 41.7896 | 82.1183 | 27.2376 |
| 10.3484 | 3.1804 | 8800 | 2.8118 | 20.2499 | 41.5947 | 82.1579 | 29.0153 |
| 10.2257 | 3.2165 | 8900 | 2.8107 | 20.119 | 41.5849 | 82.1647 | 27.8012 |
| 10.2536 | 3.2527 | 9000 | 2.8098 | 20.208 | 41.5198 | 82.1859 | 28.4047 |
| 10.226 | 3.2888 | 9100 | 2.8083 | 20.0872 | 41.5215 | 82.2154 | 27.7941 |
| 10.2555 | 3.3250 | 9200 | 2.8066 | 20.0697 | 41.5266 | 82.2524 | 28.3953 |
| 10.1948 | 3.3611 | 9300 | 2.8077 | 19.9162 | 41.4577 | 82.5675 | 28.4 |
| 10.346 | 3.3973 | 9400 | 2.8087 | 20.0162 | 41.3612 | 82.5519 | 28.9906 |
| 10.2357 | 3.4334 | 9500 | 2.8088 | 20.1612 | 41.6107 | 82.364 | 27.8271 |
| 10.4022 | 3.4695 | 9600 | 2.8049 | 20.0821 | 41.5959 | 82.3291 | 26.6165 |
| 10.2421 | 3.5057 | 9700 | 2.8070 | 20.0572 | 41.4946 | 82.3907 | 26.6259 |
| 10.1731 | 3.5418 | 9800 | 2.8069 | 20.0779 | 41.4285 | 82.3776 | 27.7812 |
| 10.275 | 3.5780 | 9900 | 2.8061 | 20.1649 | 41.5337 | 82.2389 | 27.2106 |
| 10.1281 | 3.6141 | 10000 | 2.8081 | 20.2355 | 41.5411 | 82.2055 | 27.2212 |
| 10.178 | 3.6503 | 10100 | 2.8069 | 20.1021 | 41.4715 | 82.5423 | 27.2153 |
| 10.1036 | 3.6864 | 10200 | 2.8087 | 20.0886 | 41.532 | 82.4482 | 26.6365 |
| 10.2586 | 3.7226 | 10300 | 2.8066 | 20.2039 | 41.4906 | 82.184 | 27.7965 |
| 10.1633 | 3.7587 | 10400 | 2.8072 | 20.2001 | 41.6133 | 82.4142 | 27.2271 |
| 10.3653 | 3.7949 | 10500 | 2.8051 | 20.2178 | 41.7168 | 82.1759 | 26.6306 |
| 10.3772 | 3.8310 | 10600 | 2.8061 | 20.1352 | 41.4552 | 82.2656 | 28.9882 |
| 10.265 | 3.8672 | 10700 | 2.8063 | 20.3686 | 41.6936 | 82.0663 | 27.2247 |
| 10.1565 | 3.9033 | 10800 | 2.8066 | 20.3704 | 41.5896 | 82.1135 | 28.4235 |
| 10.1053 | 3.9395 | 10900 | 2.8064 | 20.232 | 41.638 | 82.1641 | 27.8224 |
| 10.2919 | 3.9756 | 11000 | 2.8058 | 20.2259 | 41.4303 | 82.3619 | 28.4035 |
| 10.2109 | 4.0116 | 11100 | 2.8063 | 20.1277 | 41.5148 | 82.4291 | 29.0071 |
| 10.2317 | 4.0477 | 11200 | 2.8065 | 20.1976 | 41.5571 | 82.3222 | 27.8082 |
| 10.2665 | 4.0839 | 11300 | 2.8045 | 20.1174 | 41.4949 | 82.3993 | 28.4118 |
| 10.2305 | 4.1200 | 11400 | 2.8054 | 20.1751 | 41.6274 | 82.3186 | 27.8212 |
| 10.1426 | 4.1562 | 11500 | 2.8066 | 20.2274 | 41.4655 | 82.3561 | 29.0047 |
| 10.2428 | 4.1923 | 11600 | 2.8060 | 20.1688 | 41.5318 | 82.415 | 27.8059 |
| 10.3369 | 4.2284 | 11700 | 2.8055 | 20.2572 | 41.6457 | 82.1688 | 27.8153 |
| 10.3457 | 4.2646 | 11800 | 2.8057 | 20.3668 | 41.6965 | 82.0632 | 27.2224 |
| 10.3538 | 4.3007 | 11900 | 2.8064 | 20.1635 | 41.5695 | 82.238 | 28.4165 |
| 10.1697 | 4.3369 | 12000 | 2.8059 | 20.3072 | 41.5665 | 82.1484 | 27.8094 |
| 10.2341 | 4.3730 | 12100 | 2.8064 | 20.1402 | 41.4892 | 82.2549 | 28.4106 |
| 10.1779 | 4.4092 | 12200 | 2.8064 | 20.2026 | 41.6443 | 82.2105 | 27.8412 |
| 10.287 | 4.4453 | 12300 | 2.8069 | 20.1142 | 41.4856 | 82.2969 | 28.4094 |
| 10.2197 | 4.4815 | 12400 | 2.8064 | 20.2797 | 41.5558 | 82.198 | 27.8294 |
| 10.3417 | 4.5176 | 12500 | 2.8065 | 20.2762 | 41.5022 | 82.129 | 28.4059 |
| 10.2043 | 4.5538 | 12600 | 2.8064 | 20.3172 | 41.5538 | 82.0916 | 27.8141 |
| 10.316 | 4.5899 | 12700 | 2.8067 | 20.1842 | 41.564 | 82.247 | 28.3953 |
| 10.1878 | 4.6261 | 12800 | 2.8061 | 20.1348 | 41.5324 | 82.3159 | 28.9941 |
| 10.3951 | 4.6622 | 12900 | 2.8063 | 20.1038 | 41.4362 | 82.3589 | 29.0212 |
| 10.2292 | 4.6984 | 13000 | 2.8061 | 19.9689 | 41.3646 | 82.5469 | 29.5953 |
| 10.1952 | 4.7345 | 13100 | 2.8068 | 20.2237 | 41.4602 | 82.2747 | 28.4012 |
| 10.2828 | 4.7706 | 13200 | 2.8066 | 20.2114 | 41.5359 | 82.2337 | 27.8165 |
| 10.3883 | 4.8068 | 13300 | 2.8063 | 20.1173 | 41.4379 | 82.2924 | 28.4024 |
| 10.2585 | 4.8429 | 13400 | 2.8065 | 20.1329 | 41.5104 | 82.3775 | 28.4224 |
| 10.1554 | 4.8791 | 13500 | 2.8064 | 20.2585 | 41.5165 | 82.1683 | 27.8224 |
| 10.2092 | 4.9152 | 13600 | 2.8063 | 20.0996 | 41.4114 | 82.3806 | 29.0118 |
| 10.2896 | 4.9514 | 13700 | 2.8067 | 20.0253 | 41.3706 | 82.4652 | 27.8082 |
| 10.2847 | 4.9875 | 13800 | 2.8068 | 20.1692 | 41.5829 | 82.2755 | 29.0212 |

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0