Different hypothesis format/text
#16
by
marcj
- opened
Just FYI: the hypothesis template used in the manual PyTorch section of the README differs from the one the pipeline uses.
The README says
we could construct a hypothesis of "This text is about politics.".
but the pipeline uses the following template by default
"This example is {}."
which seems to be the correct one, since it yields good results.
So make sure not to use This text is about {}.
or anything else. Using the correct hypothesis format is important for getting consistent, good predictions.
For example, if you change the hypothesis text even slightly (because you think you are smarter), from "This example is {}." to "This example is about {}.", the score drops sharply:
text = "it's quite cheap, but the quality is not compromised – love the colors and the smooth application"
topic = "price"
# this is the correct one, used by default
classifier(text, topic, hypothesis_template="This example is {}.")
# => {..., 'scores': [0.9757905602455139]}
# wrong, from the README
classifier(text, topic, hypothesis_template="This text is about {}.")
# => {..., 'scores': [0.2925598919391632]}
# wrong, too
classifier(text, topic, hypothesis_template="This example is about {}.")
# => {..., 'scores': [0.3993900716304779]}
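To see exactly what changes between these three calls, it can help to print the hypothesis sentence that each template produces. The sketch below is plain Python and does not load any model. `build_pairs` is a hypothetical helper written for this post, not a library function. It assumes the pipeline fills the template with `str.format` and pairs the result with the input text, which is my understanding of how it works.

```python
def build_pairs(text, labels, hypothesis_template):
    """Pair the premise with one formatted hypothesis per candidate label.

    Hypothetical helper: assumes the template is filled via str.format.
    """
    # A template without a "{}" placeholder would ignore the label entirely.
    if hypothesis_template.format(labels[0]) == hypothesis_template:
        raise ValueError("hypothesis_template must contain a {} placeholder")
    return [(text, hypothesis_template.format(label)) for label in labels]


text = "it's quite cheap, but the quality is not compromised – love the colors and the smooth application"
templates = [
    "This example is {}.",        # pipeline default
    "This text is about {}.",     # wording from the README
    "This example is about {}.",  # slight variation
]

for template in templates:
    premise, hypothesis = build_pairs(text, ["price"], template)[0]
    print(repr(template), "->", repr(hypothesis))
```

The premise is identical in all three cases; only the hypothesis sentence handed to the NLI model differs, so the score differences above come from that wording alone.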