About max sequence length

#14
by jorisfu - opened

Was this model trained on sequences up to 32k characters long, and how does it perform on inputs of that length?

"Using this model for inputs longer than 4096 tokens is not recommended."

Is it fair to use the tiktoken library with the cl100k_base encoding as the tokenizer for estimation purposes?
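For context, here is a rough sketch of the kind of estimate I mean. It assumes tiktoken is installed and that cl100k_base is close enough to the model's own tokenizer, which it may not be, so the count is only an approximation. The helper names (`load_cl100k_encoder`, `estimate_tokens`, `fits_limit`) are mine, not part of any library; only `tiktoken.get_encoding` and `.encode` are real tiktoken calls.

```python
from typing import Callable, List

# Limit quoted in the model card ("not recommended" above this many tokens).
MAX_TOKENS = 4096


def load_cl100k_encoder() -> Callable[[str], List[int]]:
    """Return the encode function of tiktoken's cl100k_base encoding.

    tiktoken is imported lazily so the helpers below can also be used
    with any other tokenizer's encode function.
    """
    import tiktoken  # third-party: pip install tiktoken

    return tiktoken.get_encoding("cl100k_base").encode


def estimate_tokens(text: str, encode: Callable[[str], list]) -> int:
    """Count tokens in `text` using the given encode function."""
    return len(encode(text))


def fits_limit(text: str, encode: Callable[[str], list],
               limit: int = MAX_TOKENS) -> bool:
    """True if the estimated token count is within `limit`."""
    return estimate_tokens(text, encode) <= limit
```

Usage would be something like `encode = load_cl100k_encoder()` followed by `fits_limit(my_text, encode)`. Since the estimate comes from a different tokenizer than the model's, it probably makes sense to pass a `limit` somewhat below 4096 as a safety margin.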
