"""
This is a file holding task and model agnostic suggestions.
How to add a new suggestion:
1. Add a new constant at the bottom of the file with your suggestion. Please try to follow the same format as the
existing suggestions.
2. Add a new entry to `GENERAL_SUGGESTIONS`, in the format `((problem tags,), suggestion constant)`.
    a. See `app.py` for the existing problem tags.
    b. Make sure the problem tags are a tuple.
"""
SET_MAX_NEW_TOKENS = """
<details><summary>{match_emoji} {count}. Control the maximum output length.</summary>
&nbsp;
🤔 Why? &nbsp;
All text generation calls have a length-related stopping condition. Depending on the model and/or the tool you're
using to generate text, the default value may be too small or too large. I'd recommend ALWAYS setting this option.
&nbsp;
🤗 How? &nbsp;
Our text generation interfaces accept a `max_new_tokens` option. Set it to define the maximum number of tokens
that can be generated. &nbsp;
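For example, here's a minimal sketch using the 🤗 transformers Python API (the model name and prompt are
arbitrary placeholders):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("The best way to learn is", return_tensors="pt")
# Generate at most 50 new tokens -- the prompt length does not count towards this limit.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
&nbsp;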
😱 Caveats &nbsp;
1. Allowing a longer output doesn't necessarily mean that the model will generate longer outputs. By default,
the model will stop generating when it generates a special `eos_token_id` token.
2. You shouldn't set `max_new_tokens` to a value larger than the maximum sequence length of the model. If you need a
longer output, consider using a model with a larger maximum sequence length.
3. The longer the output, the longer it will take to generate.
_________________
</details>
"""
SET_MIN_LENGTH = """
<details><summary>{match_emoji} {count}. Force a minimum output length.</summary>
&nbsp;
🤔 Why? &nbsp;
Text generation stops when the model generates a special `eos_token_id`. If you prevent that token from being
generated too early, the model is forced to continue generating. &nbsp;
🤗 How? &nbsp;
Our text generation interfaces accept a `min_new_tokens` argument. Set it to prevent `eos_token_id` from being
generated until `min_new_tokens` tokens are generated. &nbsp;
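For example, here's a minimal sketch using the 🤗 transformers Python API (the model name and prompt are
arbitrary placeholders):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("Q: What is 2+2? A:", return_tensors="pt")
# eos_token_id is suppressed until at least 30 new tokens have been generated.
outputs = model.generate(**inputs, min_new_tokens=30, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
&nbsp;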
😱 Caveats &nbsp;
1. The quality of the output may suffer if the model is forced to generate beyond its own original expectations.
2. `min_new_tokens` must be smaller than `max_new_tokens` (see the related tip).
_________________
</details>
"""
REMOVE_EOS_TOKEN = """
<details><summary>{match_emoji} {count}. Force the model to generate until it reaches the maximum output length.</summary>
&nbsp;
🤔 Why? &nbsp;
Text generation stops when the model generates a special `eos_token_id`. If there is no `eos_token_id`, the model
cannot stop on its own, and will generate until another stopping condition is met. &nbsp;
🤗 How? &nbsp;
Our text generation interfaces accept an `eos_token_id` argument. Set it to a null value (e.g., in Python,
`eos_token_id=None`) to prevent generation from stopping before it reaches other stopping conditions. &nbsp;
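For example, here's a minimal sketch using the 🤗 transformers Python API (the model name and prompt are
arbitrary placeholders):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("Once upon a time", return_tensors="pt")
# With eos_token_id=None, generation only stops at the length limit.
outputs = model.generate(**inputs, max_new_tokens=60, eos_token_id=None)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
&nbsp;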
😱 Caveats &nbsp;
1. The quality of the output may suffer if the model is forced to generate beyond its own original expectations.
_________________
</details>
"""
LIST_EOS_TOKEN = """
<details><summary>{match_emoji} {count}. Add a stop word.</summary>
&nbsp;
🤔 Why? &nbsp;
Text generation stops when the model generates a special `eos_token_id`. Actually, this attribute can be a list of
tokens, which means you can define arbitrary stop words. &nbsp;
🤗 How? &nbsp;
Our text generation interfaces accept an `eos_token_id` argument. You can pass a list of tokens to make generation
stop in the presence of any of those tokens. &nbsp;
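For example, here's a minimal sketch using the 🤗 transformers Python API (the model name, prompt, and the choice
of a period as the extra stop token are arbitrary placeholders):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("My shopping list:", return_tensors="pt")
# Stop on the model's own eos_token_id OR on a period. Note that each extra
# stop word must map to a single token id for this to work as expected.
period_id = tokenizer.convert_tokens_to_ids(".")
outputs = model.generate(**inputs, max_new_tokens=60, eos_token_id=[tokenizer.eos_token_id, period_id])
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
&nbsp;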
😱 Caveats &nbsp;
1. When passing a list of tokens, don't forget to include the model's default `eos_token_id` in it.
_________________
</details>
"""
TRY_CONTRASTIVE_SEARCH = """
<details><summary>{match_emoji} {count}. Try Contrastive Search.</summary>
&nbsp;
🤔 Why? &nbsp;
Contrastive Search is a greedy decoding strategy that strikes a balance between picking the most likely token and
avoiding repetition in the representation space. Despite being greedy, it can also perform well on tasks that
require creativity (i.e., Sampling territory). In some models, it greatly reduces the problem of repetition. &nbsp;
🤗 How? &nbsp;
Our text generation interfaces accept two arguments: `top_k` and `penalty_alpha`. The authors recommend starting with
`top_k=4` and `penalty_alpha=0.6`. &nbsp;
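For example, here's a minimal sketch using the 🤗 transformers Python API (the model name and prompt are
arbitrary placeholders):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("DeepMind Technologies is", return_tensors="pt")
# Setting top_k and penalty_alpha together activates Contrastive Search.
outputs = model.generate(**inputs, max_new_tokens=60, top_k=4, penalty_alpha=0.6)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
&nbsp;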
😱 Caveats &nbsp;
1. Contrastive Search does not work well with all models -- it depends on how distributed their representation spaces
are. See [this thread](https://huggingface.co/spaces/joaogante/contrastive_search_generation/discussions/1#63764a108623a4a7954a5be5)
for further information.
_________________
</details>
"""
BLOCK_BAD_WORDS = """
<details><summary>{match_emoji} {count}. Prevent certain words from being generated.</summary>
&nbsp;
🤔 Why? &nbsp;
You might want to prevent your model from generating certain tokens, such as swear words. &nbsp;
🤗 How? &nbsp;
Our text generation interfaces accept a `bad_words_ids` argument. There, you can pass a list of lists, where each
inner list contains a forbidden sequence of tokens.
Remember that you can get the token IDs for the words you want to block through
`bad_words_ids = tokenizer(bad_words, add_prefix_space=True, add_special_tokens=False).input_ids` &nbsp;
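For example, here's a minimal sketch using the 🤗 transformers Python API and the recipe above (the model name,
prompt, and the example blocked words are arbitrary placeholders):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("I think you are a", return_tensors="pt")
# Each inner list of bad_words_ids is a token sequence that may not be generated.
bad_words = ["stupid", "idiot"]  # placeholder blocklist
bad_words_ids = tokenizer(bad_words, add_prefix_space=True, add_special_tokens=False).input_ids
outputs = model.generate(**inputs, max_new_tokens=40, bad_words_ids=bad_words_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
&nbsp;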
_________________
</details>
"""
GENERAL_SUGGESTIONS = (
(("length",), SET_MAX_NEW_TOKENS),
(("length",), SET_MIN_LENGTH),
(("length",), REMOVE_EOS_TOKEN),
(("length",), LIST_EOS_TOKEN),
(("quality", "repetitions"), TRY_CONTRASTIVE_SEARCH),
(("quality",), BLOCK_BAD_WORDS),
)
assert all(isinstance(problem_tags, tuple) for problem_tags, _ in GENERAL_SUGGESTIONS)