Evaluation results for ibm/ColD-Fusion-itr13-seed2 model as a base model for other tasks (#8)

- Evaluation results for ibm/ColD-Fusion-itr13-seed2 model as a base model for other tasks (3f18a4af9919c4b03f026fb059d0c30e65f2264b)

Files changed (1)

README.md +1 -3

@@ -55,7 +55,7 @@ output = model(encoded_input)
 [Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=2.50&mnli_lp=nan&20_newsgroup=1.08&ag_news=-0.47&amazon_reviews_multi=0.14&anli=2.75&boolq=3.32&cb=21.52&cola=0.07&copa=24.30&dbpedia=0.17&esnli=0.05&financial_phrasebank=2.19&imdb=-0.03&isear=0.67&mnli=0.41&mrpc=-0.12&multirc=2.46&poem_sentiment=4.52&qnli=0.27&qqp=0.37&rotten_tomatoes=3.04&rte=10.99&sst2=1.18&sst_5bins=1.47&stsb=1.72&trec_coarse=-0.11&trec_fine=3.24&tweet_ev_emoji=-1.35&tweet_ev_emotion=1.22&tweet_ev_hate=-0.34&tweet_ev_irony=5.48&tweet_ev_offensive=1.49&tweet_ev_sentiment=-1.25&wic=4.58&wnli=-5.49&wsc=0.19&yahoo_answers=0.16&model_name=ibm%2FColD-Fusion-itr13-seed2&base_name=roberta-base) using ibm/ColD-Fusion-itr13-seed2 as a base model yields average score of 78.72 in comparison to 76.22 by roberta-base.
-The model ranks 1st among all tested models for the roberta-base architecture as of 13/12/2022
 Results:
 |   20_newsgroup |   ag_news |   amazon_reviews_multi |    anli |   boolq |      cb |   cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |   qnli |     qqp |   rotten_tomatoes |     rte |    sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |    wnli |     wsc |   yahoo_answers |
@@ -64,8 +64,6 @@ Results:
 For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
-@article{ColDFusion,
-  author    = {Shachar Don-Yehiya, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, Leshem ChoshenYinhan Liu and},
   title     = {ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning},
   journal   = {CoRR},
   volume    = {abs/2212.01378},

 [Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=2.50&mnli_lp=nan&20_newsgroup=1.08&ag_news=-0.47&amazon_reviews_multi=0.14&anli=2.75&boolq=3.32&cb=21.52&cola=0.07&copa=24.30&dbpedia=0.17&esnli=0.05&financial_phrasebank=2.19&imdb=-0.03&isear=0.67&mnli=0.41&mrpc=-0.12&multirc=2.46&poem_sentiment=4.52&qnli=0.27&qqp=0.37&rotten_tomatoes=3.04&rte=10.99&sst2=1.18&sst_5bins=1.47&stsb=1.72&trec_coarse=-0.11&trec_fine=3.24&tweet_ev_emoji=-1.35&tweet_ev_emotion=1.22&tweet_ev_hate=-0.34&tweet_ev_irony=5.48&tweet_ev_offensive=1.49&tweet_ev_sentiment=-1.25&wic=4.58&wnli=-5.49&wsc=0.19&yahoo_answers=0.16&model_name=ibm%2FColD-Fusion-itr13-seed2&base_name=roberta-base) using ibm/ColD-Fusion-itr13-seed2 as a base model yields average score of 78.72 in comparison to 76.22 by roberta-base.
+The model is ranked 1st among all tested models for the roberta-base architecture as of 13/12/2022