---
language: en
license: mit
library_name: transformers
tags:
- summarization
- bart
datasets: ccdv/arxiv-summarization
model-index:
- name: BARTxiv
  results:
  - task:
      type: summarization
    dataset:
      name: arxiv-summarization
      type: ccdv/arxiv-summarization
      split: validation
    metrics:
    - type: rouge1
      value: 41.70204016592095
    - type: rouge2
      value: 15.134827404979639
---
# BARTxiv
See the model implementation [here](https://interrsect.web.app).
This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the [arxiv-summarization](https://huggingface.co/datasets/ccdv/arxiv-summarization) dataset.
It achieves the following results on the validation set:
- Loss: 0.86
- Rouge1: 41.70
- Rouge2: 15.13
- RougeL: 22.85
- RougeLsum: 37.77
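
A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and loads with the standard `summarization` pipeline from `transformers`. The repository ID `your-namespace/BARTxiv` is a placeholder (this card does not state the real ID), and the `max_length` default is an illustrative choice rather than a setting taken from training.

```python
def summarize(text: str,
              model_id: str = "your-namespace/BARTxiv",  # placeholder: replace with the real Hub ID
              max_length: int = 256) -> str:             # illustrative cap on summary length
    """Return an abstractive summary of `text` using a summarization pipeline."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True clips inputs longer than the model's context window
    # (1024 tokens for BART-large) instead of raising an error.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(paper_text)` downloads the weights on first use. Full arXiv papers typically exceed the 1024-token window, so with truncation enabled only the beginning of the paper contributes to the summary.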
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-6
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adafactor
- num_epochs: 9
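
The hyperparameters above could be passed to the `transformers` Trainer roughly as follows. This is a sketch under assumptions, not the original training script: it assumes single-device training (so `train_batch_size` maps to `per_device_train_batch_size`), that Adafactor was selected through the `optim` argument, and that `predict_with_generate` was enabled to compute ROUGE during evaluation. The `output_dir` name is hypothetical.

```python
# Hyperparameters copied from the list above, keyed by the argument names
# that Seq2SeqTrainingArguments uses.
HYPERPARAMETERS = {
    "learning_rate": 1e-6,
    "per_device_train_batch_size": 1,  # assumes single-device training
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "optim": "adafactor",  # assumption: Adafactor chosen via the `optim` argument
    "num_train_epochs": 9,
}


def build_training_args(output_dir: str = "bartxiv-checkpoints"):
    """Build Seq2SeqTrainingArguments from HYPERPARAMETERS (output_dir is hypothetical)."""
    # Imported lazily so the hyperparameter table can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: generate summaries at eval time for ROUGE
        **HYPERPARAMETERS,
    )
```

The resulting object would be passed as `args=` to a `Seq2SeqTrainer` together with the model, tokenizer, and tokenized dataset splits.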
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 1.24 | 1.0 | 1073 | 1.24 | 38.32 | 12.80 | 20.55 | 34.50 |
| 1.04 | 2.0 | 2146 | 1.04 | 39.65 | 13.74 | 21.28 | 35.83 |
| 0.979 | 3.0 | 3219 | 0.98 | 40.19 | 14.30 | 21.87 | 36.38 |
| 0.970 | 4.0 | 4292 | 0.97 | 40.87 | 14.44 | 22.14 | 36.89 |
| 0.918 | 5.0 | 5365 | 0.92 | 41.17 | 14.94 | 22.54 | 37.40 |
| 0.901 | 6.0 | 6438 | 0.90 | 41.02 | 14.65 | 22.46 | 37.05 |
| 0.889 | 7.0 | 7511 | 0.89 | 41.32 | 15.09 | 22.64 | 37.42 |
| 0.900 | 8.0 | 8584 | 0.90 | 41.23 | 15.02 | 22.67 | 37.28 |
| 0.869 | 9.0 | 9657 | 0.87 | 41.70 | 15.13 | 22.85 | 37.77 |
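
As a quick consistency check on the table, the short script below re-enters the rows by hand and confirms which epoch scores best on each metric. The values are copied from the table; the variable and function names are illustrative only.

```python
# (epoch, step, validation_loss, rouge1, rouge2, rougeL, rougeLsum), copied from the table.
RESULTS = [
    (1, 1073, 1.24, 38.32, 12.80, 20.55, 34.50),
    (2, 2146, 1.04, 39.65, 13.74, 21.28, 35.83),
    (3, 3219, 0.98, 40.19, 14.30, 21.87, 36.38),
    (4, 4292, 0.97, 40.87, 14.44, 22.14, 36.89),
    (5, 5365, 0.92, 41.17, 14.94, 22.54, 37.40),
    (6, 6438, 0.90, 41.02, 14.65, 22.46, 37.05),
    (7, 7511, 0.89, 41.32, 15.09, 22.64, 37.42),
    (8, 8584, 0.90, 41.23, 15.02, 22.67, 37.28),
    (9, 9657, 0.87, 41.70, 15.13, 22.85, 37.77),
]


def best_epoch(column: int, higher_is_better: bool = True) -> int:
    """Return the epoch whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(RESULTS, key=lambda row: row[column])[0]


# Optimizer steps per epoch, derived from the cumulative step counter.
steps_per_epoch = {row[1] // row[0] for row in RESULTS}

# ROUGE-1 improvement between the first and the last epoch.
rouge1_gain = round(RESULTS[-1][3] - RESULTS[0][3], 2)

print(steps_per_epoch)                          # {1073}
print(best_epoch(2, higher_is_better=False))    # 9 (lowest validation loss)
print([best_epoch(col) for col in range(3, 7)]) # [9, 9, 9, 9]
print(rouge1_gain)                              # 3.38
```

Every metric peaks at the final epoch, which matches the validation scores reported at the top of this card. If no gradient accumulation was used (an assumption; the card does not list it), 1,073 steps per epoch at batch size 1 corresponds to 1,073 training examples per epoch.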
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1