pszemraj committed on
Commit 2bd108e
1 Parent(s): e8d4939

embellish das README

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -64,33 +64,35 @@ inference:
 
 ---
 
-# long-t5-tglobal-base-16384-booksum-V7
+# long-t5-tglobal-base-16384-booksum
 
-- summarize long text and get a sparknotes-esque summary!
+- summarize long text and get a SparkNotes-esque summary of arbitrary topics!
 - generalizes fairly well to academic & narrative text.
 
 ## Cheeky Proof-of-Concept
 
-A summary of the [famous navy seals copypasta](https://knowyourmeme.com/memes/navy-seal-copypasta):
+A summary of the [infamous navy seals copypasta](https://knowyourmeme.com/memes/navy-seal-copypasta):
 
 > The narrator tells the audience that he can kill anyone anywhere in the world with his bare hands, and he has access to all of the United States military's weapons.
 
 
 ## Model description
 
-This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the `kmfoda/booksum` dataset:
+A fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the `kmfoda/booksum` dataset:
+
 - between different checkpoints, about 20 epochs in total
 - all training was done at 16384 token input / 1024 max output
 - early checkpoints of this model were trained on a "smaller" subsection of the dataset as it was filtered for summaries of 1024 **characters**. This was subsequently caught and adjusted to **1024** tokens, and then trained further for at least five epochs.
 
 ## Intended uses & limitations
 
-- At time of writing, the model is not _fully converged_ despite training for 20+ epochs. This checkpoint is servicable enough (see examples).
+- At time of writing, the model is not _fully converged_ despite training for 20+ epochs. This checkpoint is serviceable enough (see examples).
 - I plan to update this page with newer checkpoints and post some metrics over time.
+- Compare performance to [LED-base](https://huggingface.co/pszemraj/led-base-book-summary) trained on the same dataset.
 
 ## Training and evaluation data
 
-More information needed
+`kmfoda/booksum` dataset. Summaries longer than 1024 LongT5 tokens were filtered out to prevent the model from learning to generate "partial" summaries.
 
 ## Training procedure
 
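The committed README describes a summarization model that accepts up to 16384 input tokens and was trained to emit summaries of at most 1024 tokens. A minimal inference sketch with the `transformers` summarization pipeline is below; the repo ID is an assumption inferred from the new README title (the commit does not state it), and the generation settings are illustrative rather than recommended values:

```python
# Minimal sketch: summarize long text with the fine-tuned LongT5 checkpoint.
# NOTE: the repo ID below is a guess based on the README title in this commit;
# substitute the actual model ID if it differs.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-16384-booksum",  # assumed repo ID
)

long_text = "..."  # up to ~16384 tokens of academic or narrative text

result = summarizer(
    long_text,
    max_length=1024,          # matches the 1024-token output cap used in training
    no_repeat_ngram_size=3,   # a common anti-repetition setting, not specified in the commit
)
print(result[0]["summary_text"])
```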
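The new "Training and evaluation data" paragraph says summaries longer than 1024 LongT5 tokens were filtered out. A sketch of how such a filter might look with `datasets` is below; the `summary_text` column name and the exact filtering rule are assumptions, since the commit does not include the preprocessing code:

```python
# Sketch of the described length filter: drop examples whose reference
# summary exceeds 1024 LongT5 *tokens* (not characters).
# The column name "summary_text" is an assumption about the booksum schema.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
dataset = load_dataset("kmfoda/booksum")

def fits_in_decoder(example):
    # Count tokens the same way the model sees them at training time.
    n_tokens = len(tokenizer(example["summary_text"]).input_ids)
    return n_tokens <= 1024

dataset = dataset.filter(fits_in_decoder)
```

Counting in tokens rather than characters mirrors the fix described in the "Model description" bullets, where an early character-based filter was caught and corrected to a token-based one.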