SageMaker endpoint
#1 by MahmoudBL - opened
Thanks for the great work. How can I deploy this model to an AWS SageMaker endpoint with flash_attention_v2 disabled?
@MahmoudBL I haven't tried deploying on AWS SageMaker. What problem are you facing?
Hello Mohammed, thanks for your response.
I can deploy it properly, but I cannot find a way to disable Flash Attention.
@MahmoudBL
Flash Attention 2 is actually optional. You can remove it, along with the torch_dtype flag, and things should work fine.
Can you share the error you got?
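A minimal sketch of the suggestion above, assuming the model is loaded with the Hugging Face `transformers` `from_pretrained` call and that Flash Attention 2 is requested through the `attn_implementation` (or older `use_flash_attention_2`) argument. The helper name `strip_flash_attention`, the example arguments, and `MODEL_ID` are hypothetical and not taken from this model's card:

```python
# Keys that can request Flash Attention 2 in `from_pretrained`,
# plus the `torch_dtype` flag that the reply says to drop with it.
FLASH_ATTENTION_KEYS = ("attn_implementation", "use_flash_attention_2", "torch_dtype")


def strip_flash_attention(load_kwargs: dict) -> dict:
    """Return a copy of load_kwargs without the Flash Attention settings."""
    return {k: v for k, v in load_kwargs.items() if k not in FLASH_ATTENTION_KEYS}


# Hypothetical loading arguments, for illustration only.
original_kwargs = {
    "attn_implementation": "flash_attention_2",
    "torch_dtype": "bfloat16",
    "device_map": "auto",
}

safe_kwargs = strip_flash_attention(original_kwargs)
print(safe_kwargs)

# Assumed usage inside the loading code (MODEL_ID is a placeholder):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **safe_kwargs)
```

With those keys removed, `transformers` falls back to its default attention implementation and default dtype, so the `flash-attn` package should no longer be needed on the endpoint.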
MohamedRashad changed discussion status to closed