Input validation error: `inputs` tokens + `max_new_tokens` must be <= 2048 on Mixtral-8x7B (32K context)
I have deployed Mixtral-8x7B on SageMaker using the Hugging Face image. I get good responses for small input prompts, but when I send a slightly larger prompt I get the error:

Input validation error: `inputs` tokens + `max_new_tokens` must be <= 2048.

Similar errors have been reported on HuggingChat.
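For reference, the error is raised on requests shaped like the sketch below; the endpoint name and prompt here are placeholders, but the `inputs` text plus `max_new_tokens` is what gets checked against the 2048-token limit:

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Standard TGI request: the prompt goes in "inputs", generation settings in "parameters".
# The container rejects the request when (prompt tokens + max_new_tokens) exceeds its limit.
payload = {
    "inputs": "...a prompt longer than ~1,500 tokens...",   # placeholder long prompt
    "parameters": {"max_new_tokens": 512},
}

response = runtime.invoke_endpoint(
    EndpointName="mixtral-8x7b-endpoint",                   # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))
```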
I was able to solve this issue for the SageMaker endpoint.

You need to set the environment variables MAX_INPUT_LENGTH and MAX_TOTAL_TOKENS. While deploying the LLM with SageMaker, add these environment variables:
```python
import json

hub = {
    'HF_MODEL_ID': 'mistralai/Mixtral-8x7B-Instruct-v0.1',
    'SM_NUM_GPUS': json.dumps(8),
    'MAX_INPUT_LENGTH': '30000',           # set any value up to 32768, as per your requirement
    'MAX_TOTAL_TOKENS': '32768',
    'MAX_BATCH_PREFILL_TOKENS': '32768',
    'MAX_BATCH_TOTAL_TOKENS': '32768',
}
```
This changes the default MAX_INPUT_LENGTH from 2048 to 30000 tokens, so larger prompts are accepted.
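For completeness, here is a minimal sketch of how the `hub` dictionary above can be passed to the endpoint. The TGI image version, instance type, and startup timeout are assumptions, so adjust them to your setup:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# TGI container image; the version is an assumption, use one that supports Mixtral.
llm_image = get_huggingface_llm_image_uri("huggingface", version="1.3.3")

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env=hub,  # the dictionary above, including MAX_INPUT_LENGTH / MAX_TOTAL_TOKENS
)

# Instance type and timeout are assumptions; Mixtral-8x7B needs a multi-GPU instance
# (SM_NUM_GPUS=8 above matches ml.p4d.24xlarge) and takes a while to download and shard.
llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.p4d.24xlarge",
    container_startup_health_check_timeout=1800,
)
```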