Add evaluation results on the default config of billsum

Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the default config of the [billsum](https://huggingface.co/datasets/billsum) dataset by @pszemraj, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-staging-eval-project-billsum-a6bd4aa5-11965601).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=billsum).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=billsum).
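
For readers unfamiliar with the ROUGE scores this pull request adds to the model card metadata, here is a minimal sketch of ROUGE-1 as a unigram-overlap F1. It is not the evaluator's implementation, which this pull request does not show. The whitespace tokenization, the lowercasing, the 0-100 scaling, the function name `rouge1_f1`, and the example strings are all assumptions made for illustration.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two texts, scaled to 0-100 (assumed scale)."""
    # Assumption: lowercase + whitespace split; real scorers may also stem.
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: a unigram counts only as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair, not taken from billsum.
score = rouge1_f1(
    "the bill amends the tax code",
    "this bill amends the internal revenue code",
)
print(round(score, 2))
```

ROUGE-2 follows the same pattern over bigrams, and ROUGE-L replaces the unigram overlap with the longest common subsequence. A production scorer would typically add stemming and more careful tokenization, so its numbers will differ from this sketch.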

Files changed (1)

README.md +82 -1

@@ -58,7 +58,55 @@ widget:
     \ and parameters 0, and generalization is influenced by the inductive bias of\
     \ this function space (Section 5)."
   example_title: scientific paper
-- text: "Is a else or outside the cob and tree written being of early client rope and you have is for good reasons. On to the ocean in Orange for time. By's the aggregate we can bed it yet. Why this please pick up on a sort is do and also M Getoi's nerocos and do rain become you to let so is his brother is made in use and Mjulia's's the lay major is aging Masastup coin present sea only of Oosii rooms set to you We do er do we easy this private oliiishs lonthen might be okay. Good afternoon everybody. Welcome to this lecture of Computational Statistics. As you can see, I'm not socially my name is Michael Zelinger. I'm one of the task for this class and you might have already seen me in the first lecture where I made a quick appearance. I'm also going to give the tortillas in the last third of this course. So to give you a little bit about me, I'm a old student here with better Bulman and my research centres on casual inference applied to biomedical disasters, so that could be genomics or that could be hospital data. If any of you is interested in writing a bachelor thesis, a semester paper may be mastathesis about this topic feel for reach out to me. you have my name on models and my email address you can find in the directory I'd Be very happy to talk about it. you do not need to be sure about it, we can just have a chat. So with that said, let's get on with the lecture. There's an exciting topic today I'm going to start by sharing some slides with you and later on during the lecture we'll move to the paper. So bear with me for a few seconds. Well, the projector is starting up. Okay, so let's get started. Today's topic is a very important one. It's about a technique which really forms one of the fundamentals of data science, machine learning, and any sort of modern statistics. It's called cross validation. 
I know you really want to understand this topic I Want you to understand this and frankly, nobody's gonna leave Professor Mineshousen's class without understanding cross validation. So to set the stage for this, I Want to introduce you to the validation problem in computational statistics. So the problem is the following: You trained a model on available data. You fitted your model, but you know the training data you got could always have been different and some data from the environment. Maybe it's a random process. You do not really know what it is, but you know that somebody else who gets a different batch of data from the same environment they would get slightly different training data and you do not care that your method performs as well. On this training data. you want to to perform well on other data that you have not seen other data from the same environment. So in other words, the validation problem is you want to quantify the performance of your model on data that you have not seen. So how is this even possible? How could you possibly measure the performance on data that you do not know The solution to? This is the following realization is that given that you have a bunch of data, you were in charge. You get to control how much that your model sees. It works in the following way: You can hide data firms model. Let's say you have a training data set which is a bunch of doubtless so X eyes are the features those are typically hide and national vector. It's got more than one dimension for sure. And the why why eyes. Those are the labels for supervised learning. As you've seen before, it's the same set up as we have in regression. And so you have this training data and now you choose that you only use some of those data to fit your model. You're not going to use everything, you only use some of it the other part you hide from your model. And then you can use this hidden data to do validation from the point of you of your model. 
This hidden data is complete by unseen. In other words, we solve our problem of validation."
   example_title: transcribed audio - lecture
 - text: "Transformer-based models have shown to be very useful for many NLP tasks.\
     \ However, a major limitation of transformers-based models is its O(n^2)O(n 2)\
@@ -267,6 +315,39 @@ model-index:
       type: gen_len
       value: 82.2177
       verified: true
 ---
 # long-t5-tglobal-base-16384 + BookSum

     \ and parameters 0, and generalization is influenced by the inductive bias of\
     \ this function space (Section 5)."
   example_title: scientific paper
+- text: 'Is a else or outside the cob and tree written being of early client rope
+    and you have is for good reasons. On to the ocean in Orange for time. By''s the
+    aggregate we can bed it yet. Why this please pick up on a sort is do and also
+    M Getoi''s nerocos and do rain become you to let so is his brother is made in
+    use and Mjulia''s''s the lay major is aging Masastup coin present sea only of
+    Oosii rooms set to you We do er do we easy this private oliiishs lonthen might
+    be okay. Good afternoon everybody. Welcome to this lecture of Computational Statistics.
+    As you can see, I''m not socially my name is Michael Zelinger. I''m one of the
+    task for this class and you might have already seen me in the first lecture where
+    I made a quick appearance. I''m also going to give the tortillas in the last third
+    of this course. So to give you a little bit about me, I''m a old student here
+    with better Bulman and my research centres on casual inference applied to biomedical
+    disasters, so that could be genomics or that could be hospital data. If any of
+    you is interested in writing a bachelor thesis, a semester paper may be mastathesis
+    about this topic feel for reach out to me. you have my name on models and my email
+    address you can find in the directory I''d Be very happy to talk about it. you
+    do not need to be sure about it, we can just have a chat. So with that said, let''s
+    get on with the lecture. There''s an exciting topic today I''m going to start
+    by sharing some slides with you and later on during the lecture we''ll move to
+    the paper. So bear with me for a few seconds. Well, the projector is starting
+    up. Okay, so let''s get started. Today''s topic is a very important one. It''s
+    about a technique which really forms one of the fundamentals of data science,
+    machine learning, and any sort of modern statistics. It''s called cross validation.
+    I know you really want to understand this topic I Want you to understand this
+    and frankly, nobody''s gonna leave Professor Mineshousen''s class without understanding
+    cross validation. So to set the stage for this, I Want to introduce you to the
+    validation problem in computational statistics. So the problem is the following:
+    You trained a model on available data. You fitted your model, but you know the
+    training data you got could always have been different and some data from the
+    environment. Maybe it''s a random process. You do not really know what it is,
+    but you know that somebody else who gets a different batch of data from the same
+    environment they would get slightly different training data and you do not care
+    that your method performs as well. On this training data. you want to to perform
+    well on other data that you have not seen other data from the same environment.
+    So in other words, the validation problem is you want to quantify the performance
+    of your model on data that you have not seen. So how is this even possible? How
+    could you possibly measure the performance on data that you do not know The solution
+    to? This is the following realization is that given that you have a bunch of data,
+    you were in charge. You get to control how much that your model sees. It works
+    in the following way: You can hide data firms model. Let''s say you have a training
+    data set which is a bunch of doubtless so X eyes are the features those are typically
+    hide and national vector. It''s got more than one dimension for sure. And the
+    why why eyes. Those are the labels for supervised learning. As you''ve seen before,
+    it''s the same set up as we have in regression. And so you have this training
+    data and now you choose that you only use some of those data to fit your model.
+    You''re not going to use everything, you only use some of it the other part you
+    hide from your model. And then you can use this hidden data to do validation from
+    the point of you of your model. This hidden data is complete by unseen. In other
+    words, we solve our problem of validation.'
   example_title: transcribed audio - lecture
 - text: "Transformer-based models have shown to be very useful for many NLP tasks.\
     \ However, a major limitation of transformers-based models is its O(n^2)O(n 2)\
       type: gen_len
       value: 82.2177
       verified: true
+  - task:
+      type: summarization
+      name: Summarization
+    dataset:
+      name: billsum
+      type: billsum
+      config: default
+      split: test
+    metrics:
+    - name: ROUGE-1
+      type: rouge
+      value: 39.6378
+      verified: true
+    - name: ROUGE-2
+      type: rouge
+      value: 13.0017
+      verified: true
+    - name: ROUGE-L
+      type: rouge
+      value: 23.0255
+      verified: true
+    - name: ROUGE-LSUM
+      type: rouge
+      value: 32.9943
+      verified: true
+    - name: loss
+      type: loss
+      value: 1.9428048133850098
+      verified: true
+    - name: gen_len
+      type: gen_len
+      value: 162.3588
+      verified: true
 ---
 # long-t5-tglobal-base-16384 + BookSum