How does it perform compared to GPT-J? Can you fix the repetition by increasing repeat_penalty?
It might be possible to use no_repeat_ngram_size in generation configs to suppress repetition.
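
For anyone unsure what that setting does, here is a rough pure-Python sketch of the idea behind no-repeat n-gram blocking: before each step, find every token that would complete an n-gram already present in the output and forbid it. The function name `banned_next_tokens` and the toy token ids are made up for illustration; this is not the library's actual implementation, just the general technique as I understand it.

```python
def banned_next_tokens(tokens, n):
    """Return the tokens that would repeat an n-gram already in `tokens`.

    Hypothetical helper for illustration only: `tokens` is the sequence
    generated so far, `n` plays the role of no_repeat_ngram_size.
    """
    if n <= 0 or len(tokens) < n:
        # Not enough history to contain any complete n-gram yet.
        return set()

    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):])

    banned = set()
    # Scan every complete n-gram seen so far; if its first n-1 tokens match
    # the current prefix, its final token would create a repeat.
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


if __name__ == "__main__":
    history = [1, 2, 3, 1, 2]
    # With n=3, the trigram (1, 2, 3) already occurred, so 3 is banned next.
    print(banned_next_tokens(history, 3))
```

In Hugging Face `transformers`, `no_repeat_ngram_size` and `repetition_penalty` can both be passed to `generate()` or set in the generation config. Whether they actually fix the repetition for this model is something you would have to test.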