---
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: flant5
    results: []
---

# flant5

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
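
For a quick start, the checkpoint loads with the standard `transformers` seq2seq classes. This is a minimal sketch, assuming the model is published on the Hub as `talhaa/flant5` (the id used in the evaluation below); the prompt is an arbitrary example, not from the original card:

```python
# Minimal usage sketch: load the checkpoint and generate from one prompt.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("talhaa/flant5")
model = AutoModelForSeq2SeqLM.from_pretrained("talhaa/flant5")

# Arbitrary example prompt.
inputs = tokenizer("Translate to German: Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```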

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (restated as a code sketch after this list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
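
The sketch below restates these hyperparameters as Transformers `Seq2SeqTrainingArguments`; the Adam betas and epsilon map to the `adam_beta1`/`adam_beta2`/`adam_epsilon` arguments, and `output_dir` is a placeholder not taken from the original run:

```python
# Sketch only: the hyperparameters listed above, expressed as training arguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flant5-output",      # placeholder, not from the original card
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,              # and epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```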

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2

## Model Recycling

Evaluation on 36 datasets using talhaa/flant5 as a base model yields an average score of 77.86, compared with 68.82 for google/t5-v1_1-base.

The model is ranked 1st among all tested models for the google/t5-v1_1-base architecture as of 10/01/2023.

Results:

| Dataset | Score |
|---|---|
| 20_newsgroup | 87.0685 |
| ag_news | 89.5333 |
| amazon_reviews_multi | 67.14 |
| anli | 52.1875 |
| boolq | 82.844 |
| cb | 78.5714 |
| cola | 80.1534 |
| copa | 70 |
| dbpedia | 77.2667 |
| esnli | 90.6963 |
| financial_phrasebank | 84.9 |
| imdb | 93.512 |
| isear | 72.4902 |
| mnli | 87.4797 |
| mrpc | 86.2745 |
| multirc | 61.8399 |
| poem_sentiment | 87.5 |
| qnli | 93.1173 |
| qqp | 90.7173 |
| rotten_tomatoes | 89.6811 |
| rte | 85.9206 |
| sst2 | 93.8073 |
| sst_5bins | 56.5611 |
| stsb | 89.4438 |
| trec_coarse | 97.4 |
| trec_fine | 91.6 |
| tweet_ev_emoji | 47.054 |
| tweet_ev_emotion | 80.5067 |
| tweet_ev_hate | 52.5926 |
| tweet_ev_irony | 74.8724 |
| tweet_ev_offensive | 84.7674 |
| tweet_ev_sentiment | 71.76 |
| wic | 68.8088 |
| wnli | 56.338 |
| wsc | 55.7692 |
| yahoo_answers | 72.6333 |
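
The headline average can be checked directly against the table (a quick sanity check, not part of the original evaluation code):

```python
# Sanity check: the reported 77.86 is the unweighted mean of the 36 scores above.
scores = [
    87.0685, 89.5333, 67.14, 52.1875, 82.844, 78.5714, 80.1534, 70,
    77.2667, 90.6963, 84.9, 93.512, 72.4902, 87.4797, 86.2745, 61.8399,
    87.5, 93.1173, 90.7173, 89.6811, 85.9206, 93.8073, 56.5611, 89.4438,
    97.4, 91.6, 47.054, 80.5067, 52.5926, 74.8724, 84.7674, 71.76,
    68.8088, 56.338, 55.7692, 72.6333,
]
print(round(sum(scores) / len(scores), 2))  # -> 77.86
```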

For more information, see [Model Recycling](https://ibm.github.io/model-recycling/).