base

This model is a fine-tuned version of google/flan-t5-base on the cnn_dailymail 3.0.0 dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4232
  • Rouge1: 42.1388
  • Rouge2: 19.7696
  • Rougel: 30.1512
  • Rougelsum: 39.3222
  • Gen Len: 71.8562
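A minimal sketch of loading this checkpoint for summarization with the Transformers library. It assumes the repository id `braindao/flan-t5-cnn` under which this card is published. The helper name `summarize` is hypothetical, and the generation settings (`max_new_tokens`, `num_beams`, truncation at 512 input tokens) are illustrative assumptions, not values documented by this card.

```python
MODEL_ID = "braindao/flan-t5-cnn"  # repository id this card is published under


def summarize(article: str, max_new_tokens: int = 128) -> str:
    """Generate a summary for one article (illustrative settings only)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: truncate long articles to 512 tokens; the card does not
    # state the input length used during fine-tuning.
    inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the weights on first use and reloads the model on every call; for repeated use, the tokenizer and model would normally be loaded once and reused.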

Model description

  • Model type: Language model
  • Language(s) (NLP): English, Spanish, Japanese, Persian, Hindi, French, Chinese, Bengali, Gujarati, German, Telugu, Italian, Arabic, Polish, Tamil, Marathi, Malayalam, Oriya, Panjabi, Portuguese, Urdu, Galician, Hebrew, Korean, Catalan, Thai, Dutch, Indonesian, Vietnamese, Bulgarian, Filipino, Central Khmer, Lao, Turkish, Russian, Croatian, Swedish, Yoruba, Kurdish, Burmese, Malay, Czech, Finnish, Somali, Tagalog, Swahili, Sinhala, Kannada, Zhuang, Igbo, Xhosa, Romanian, Haitian, Estonian, Slovak, Lithuanian, Greek, Nepali, Assamese, Norwegian
  • License: Apache 2.0
  • Related Models: All FLAN-T5 Checkpoints
  • Original Checkpoints: All Original FLAN-T5 Checkpoints
  • Resources for more information:

Intended uses & limitations

The information in this section is copied from the model's official model card:

Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application,

Training and evaluation data

  • Loss: 1.4232
  • Rouge1: 42.1388
  • Rouge2: 19.7696
  • Rougel: 30.1512
  • Rougelsum: 39.3222
  • Gen Len: 71.8562

Training procedure

An example notebook that fine-tunes Flan-T5 and pushes the result to the Hub is available at https://github.com/EveripediaNetwork/ai/blob/main/notebooks/Fine-Tuning-Flan-T5_1.ipynb

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: Constant
  • num_epochs: 3.0
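The relationship between the batch-size entries above can be checked with a short sketch. It assumes training ran on a single device, which is inferred from the listed totals and not stated on this card. The dictionary uses the keyword names accepted by `Seq2SeqTrainingArguments` in Transformers; how the author actually configured training is not shown here.

```python
# Hyperparameters as listed on this card.
train_batch_size = 1
gradient_accumulation_steps = 64
num_devices = 1  # assumption: inferred from total_train_batch_size, not stated

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# The same values expressed as keyword arguments for Seq2SeqTrainingArguments.
training_kwargs = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": train_batch_size,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": gradient_accumulation_steps,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 3.0,
}

print(total_train_batch_size)  # 64, matching the value listed above
```

These could be passed as `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`, where `output_dir` is a placeholder the reader must supply.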

Framework versions

  • Transformers 4.27.0.dev0
  • Pytorch 1.13.0+cu117
  • Datasets 2.7.1
  • Tokenizers 0.12.1
