---
license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
model-index:
  - name: PhysicalScienceBARTPrincipal
    results: []
---

# PhysicalScienceBARTPrincipal

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 4.5862
- Rouge1: 49.7214
- Rouge2: 15.9205
- RougeL: 34.8099
- RougeLsum: 45.9442
- BERTScore Precision: 81.8626
- BERTScore Recall: 83.3072
- BERTScore F1: 82.5744
- Bleu: 0.1065
- Gen Len: 196.3779
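For quick inference, the checkpoint can be loaded with the 🤗 Transformers `pipeline` API. A minimal sketch follows; the repo id `MarPla/PhysicalScienceBARTPrincipal` is inferred from the card's author and model name, and the generation lengths are assumptions that mirror the evaluation `Gen Len` (~196 tokens), not settings confirmed by this card.

```python
from transformers import pipeline

# Repo id is an assumption (card author + model name).
summarizer = pipeline(
    "summarization",
    model="MarPla/PhysicalScienceBARTPrincipal",
)

text = (
    "Superconductivity is a phenomenon in which certain materials conduct "
    "electric current with zero resistance below a critical temperature."
)

# max_length ~196 mirrors the evaluation generation length; adjust as needed.
summary = summarizer(text, max_length=196, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```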

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 1
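These values map directly onto `Seq2SeqTrainingArguments`. The sketch below is an assumption about how the run was configured, not the author's actual script; `output_dir` is a placeholder and `predict_with_generate` is inferred from the fact that ROUGE/BLEU were computed at evaluation time.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="PhysicalScienceBARTPrincipal",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,   # effective train batch size of 16
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=1,
    seed=42,
    predict_with_generate=True,       # assumption: needed for ROUGE/BLEU eval
    # The default optimizer (AdamW, betas=(0.9, 0.999), eps=1e-8) matches
    # the optimizer settings reported in the card.
)
```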

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | BERTScore Precision | BERTScore Recall | BERTScore F1 | Bleu   | Gen Len  |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------------------:|:----------------:|:------------:|:------:|:--------:|
| 6.4881        | 0.0620 | 100  | 6.2790          | 38.9402 | 10.9737 | 28.0473 | 36.4124   | 78.6712             | 80.7927          | 79.7123      | 0.0702 | 196.3779 |
| 5.9838        | 0.1239 | 200  | 5.8574          | 39.6094 | 11.61   | 28.6653 | 36.6426   | 78.5563             | 81.2374          | 79.8672      | 0.0773 | 196.3779 |
| 5.5757        | 0.1859 | 300  | 5.5425          | 43.235  | 12.5595 | 30.3069 | 40.1431   | 79.7016             | 81.7103          | 80.6878      | 0.0826 | 196.3779 |
| 5.4752        | 0.2478 | 400  | 5.3518          | 45.0647 | 13.1878 | 31.0925 | 41.4826   | 79.7122             | 82.0455          | 80.8554      | 0.0880 | 196.3779 |
| 5.3711        | 0.3098 | 500  | 5.2193          | 47.1793 | 13.5223 | 31.7989 | 43.5774   | 80.6424             | 82.3476          | 81.4813      | 0.0892 | 196.3779 |
| 5.1653        | 0.3717 | 600  | 5.0858          | 45.2081 | 13.4909 | 31.8919 | 41.7813   | 80.7104             | 82.4561          | 81.5689      | 0.0897 | 196.3779 |
| 5.0684        | 0.4337 | 700  | 4.9837          | 46.4035 | 14.2034 | 32.654  | 42.8883   | 80.4628             | 82.4529          | 81.4399      | 0.0941 | 196.3779 |
| 4.9625        | 0.4957 | 800  | 4.9084          | 48.2088 | 14.8904 | 33.2025 | 44.5397   | 81.1668             | 82.8469          | 81.9935      | 0.0986 | 196.3779 |
| 4.8858        | 0.5576 | 900  | 4.8370          | 48.5919 | 14.7721 | 33.5041 | 44.7923   | 81.2656             | 82.8635          | 82.0522      | 0.0974 | 196.3779 |
| 4.8251        | 0.6196 | 1000 | 4.7813          | 49.2512 | 15.4584 | 34.0164 | 45.5215   | 81.4958             | 83.0067          | 82.2398      | 0.1030 | 196.3779 |
| 4.8581        | 0.6815 | 1100 | 4.7307          | 48.7203 | 15.379  | 34.0451 | 45.0395   | 81.7154             | 83.106           | 82.4008      | 0.1027 | 196.3779 |
| 4.7934        | 0.7435 | 1200 | 4.6861          | 49.5987 | 15.6207 | 34.3261 | 45.8512   | 81.7656             | 83.1546          | 82.4502      | 0.1042 | 196.3779 |
| 4.7163        | 0.8055 | 1300 | 4.6518          | 48.9818 | 15.5333 | 34.3788 | 45.3444   | 81.6763             | 83.1451          | 82.3998      | 0.1039 | 196.3779 |
| 4.6855        | 0.8674 | 1400 | 4.6199          | 49.1462 | 15.5914 | 34.5149 | 45.5788   | 81.7027             | 83.1199          | 82.401       | 0.1037 | 196.3779 |
| 4.615         | 0.9294 | 1500 | 4.5987          | 49.6903 | 15.8973 | 34.7628 | 45.9111   | 81.8545             | 83.302           | 82.5678      | 0.1064 | 196.3779 |
| 4.5964        | 0.9913 | 1600 | 4.5862          | 49.7214 | 15.9205 | 34.8099 | 45.9442   | 81.8626             | 83.3072          | 82.5744      | 0.1065 | 196.3779 |
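Comparable ROUGE, BLEU, and BERTScore numbers can be computed with the 🤗 `evaluate` library. The sketch below uses placeholder predictions and references; the exact evaluation script and aggregation used for this card are not documented, so treat this only as an illustration of the metric APIs.

```python
import evaluate

# Placeholders: substitute model outputs and gold reference summaries.
predictions = ["the model generated summary"]
references = ["the reference summary"]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

# ROUGE/BERTScore return values in [0, 1]; the card reports them scaled to 0-100.
print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# BERTScore requires a language or model_type; "en" is an assumption here.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```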

### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1