Gibberish response generated when input is around 3500 tokens

#2
by chenxiangyi10 - opened

I load the model with load_in_8bit=True.

When the input sequence is around 3500 tokens long, the model generates only gibberish.
