---
language:
- en
tags:
- t5
- analysis
- book
- notes
datasets:
- kmfoda/booksum
metrics:
- rouge
widget:
- text: >-
A large drop of sun lingered on the horizon and then dripped over and was
gone, and the sky was brilliant over the spot where it had gone, and a
torn cloud, like a bloody rag, hung over the spot of its going. And dusk
crept over the sky from the eastern horizon, and darkness crept over the
land from the east.
example_title: grapes of wrath
- text: >-
The year was 2081, and everybody was finally equal. They weren’t only
equal before God and the law. They were equal every which way. Nobody was
smarter than anybody else. Nobody was better looking than anybody else.
Nobody was stronger or quicker than anybody else. All this equality was
due to the 211th, 212th, and 213th Amendments to the Constitution, and to
the unceasing vigilance of agents of the United States Handicapper
General.
example_title: Harrison Bergeron
- text: >-
The ledge, where I placed my candle, had a few mildewed books piled up in
one corner; and it was covered with writing scratched on the paint. This
writing, however, was nothing but a name repeated in all kinds of
characters, large and small—Catherine Earnshaw, here and there varied to
Catherine Heathcliff, and then again to Catherine Linton. In vapid
listlessness I leant my head against the window, and continued spelling
over Catherine Earnshaw—Heathcliff—Linton, till my eyes closed; but they
had not rested five minutes when a glare of white letters started from the
dark, as vivid as spectres—the air swarmed with Catherines; and rousing
myself to dispel the obtrusive name, I discovered my candle wick reclining
on one of the antique volumes, and perfuming the place with an odour of
roasted calf-skin.
example_title: Wuthering Heights
inference:
parameters:
no_repeat_ngram_size: 2
max_length: 32
early_stopping: true
---

# literary analysis with t5-base
- A `t5` model sort-of learning to do literary analysis. It was trained on the booksum dataset with `chapter` (the original text) as the input and `summary_analysis` as the output text, where `summary_analysis` is the SparkNotes/CliffsNotes-style analysis of the chapter.
- It was trained for 8 epochs.
- Testing may need to be completed in Colab, as inference seems to be CPU-intensive. A link to an example notebook is here.
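Because booksum inputs are full chapters and the model's encoder only accepts a limited input length, very long texts may need to be split before analysis. A minimal sketch of whitespace-based chunking (the `chunk_text` helper and the 350-word limit are illustrative assumptions, not part of the original training setup):

```python
def chunk_text(text: str, max_words: int = 350) -> list[str]:
    """Split text into word-count-bounded chunks for separate analysis."""
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]

# Each chunk can then be passed to the pipeline individually.
chunks = chunk_text("word " * 1000)
print(len(chunks))  # 3 chunks of at most 350 words each
```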
## Example
```bash
!pip install -U -q transformers
!pip install -U -q sentencepiece
```

```python
from transformers import pipeline

analyzer = pipeline(
    "text2text-generation",
    "pszemraj/t5_1_1-base-writing-analysis",
)
```
Enter the text to be analyzed and compute:
```python
text = "text to be analyzed goes here"

result = analyzer(
    text,
    max_length=int(len(text) * 1.2),
    no_repeat_ngram_size=2,
    repetition_penalty=2.4,
    num_beams=4,
    early_stopping=True,
)
```
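Note that `max_length` is measured in tokens, while `len(text)` counts characters, so `int(len(text) * 1.2)` is a deliberately generous cap rather than an exact token count. The heuristic in isolation (the `rough_max_length` helper name is an assumption; the 1.2 ratio simply mirrors the call above):

```python
def rough_max_length(text: str, ratio: float = 1.2) -> int:
    # max_length is counted in tokens, but len(text) counts characters,
    # so scaling by 1.2 overshoots on purpose to avoid truncating output.
    return int(len(text) * ratio)

print(rough_max_length("text to be analyzed goes here"))  # 34
```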
## sample results
**Input:**

> Beren now, he never thought he was going to get that silmaril from the iron crown in thangorodrim, and yet he did, and that was a worse place and a blacker danger than ours. but that’s a long tale, of course, and goes on past the happiness and into grief and beyond it – and the silmaril went on and came to eärendil. and why, sir, I never thought of that before! we’ve got – you’ve got some of the light of it in that star glass that the lady gave you! why, to think of it, we’re in the same tale still! it’s going on. don’t the great tales never end?
**Output:**

> The silmaril comes to eärendil. it is a long tale, and it is going on beyond the happy ending of the story. in this way, we see that Berens dream has come true. he is in love with her, but she doesnt know how to do it. she thinks about it as if he had never thought of it before. then again, there is an ironic contrast between his dreams and reality. this is not surprising because they are so far apart from each other. they have been together foreshadowing one another. and yet, even though, they were at the same time. but now, their relationship is more important than any other characters. for example, the two men seem to be separated by them. their friendship is very much more interesting. we can imagine what happens when they get back home. when they meet up in the next chapter, however, then, the darkest danger seems to suggest that they will go on the night. now, after all, everyone else does not want to find outwardly.
* NOTE: as the above examples were not saved in real time, both the input and output had `humanize` string formatting applied to quickly clean them up after being copied and pasted from a Colab notebook.