Asking for help with using flash-attn on Gradio Zero
I'm trying to use flash-attn on Gradio Zero but haven't figured out how.
The methods in https://discuss.huggingface.co/t/how-to-install-flash-attention-on-hf-gradio-space/70698 don't work, as Zero only supports Gradio.
Simply adding flash-attn to requirements.txt doesn't work either.
Hello @Yiwen-ntu, in order to install flash-attn you need to run the following code in your Gradio Space:
import subprocess

# Install flash-attn at runtime, skipping the CUDA build if necessary
subprocess.run(
    "pip install flash-attn --no-build-isolation",
    env={"FLASH_ATTENTION_SKIP_CUDA_BUILD": "TRUE"},
    shell=True,
)
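One caveat worth noting (a general subprocess detail, not specific to this thread): passing env= to subprocess.run replaces the child's entire environment, so variables like PATH are dropped unless you merge in os.environ. A small sketch of the merge pattern, using a harmless echo command as a stand-in for the pip install:

```python
import os
import subprocess

# env= replaces the child's whole environment; merging os.environ keeps
# PATH and friends intact while adding the extra variable.
# Demonstrated with echo instead of the actual pip install.
result = subprocess.run(
    'echo "$FLASH_ATTENTION_SKIP_CUDA_BUILD"',
    env={**os.environ, "FLASH_ATTENTION_SKIP_CUDA_BUILD": "TRUE"},
    shell=True,
    capture_output=True,
    text=True,
    check=True,  # raise immediately if the command fails
)
print(result.stdout.strip())  # TRUE
```

With check=True, a failed install raises right away instead of surfacing later as a confusing "flash_attn not found" error.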
Hi! Thanks for your reply! I have tried this, but it doesn't work: flash_attn is still not found by Transformers.
Could it be because the Gradio environment has already been launched before this installation runs?
Can you please share the error you are getting, or a link to your Space? I don't believe the issue is related to Gradio, but I will check.
Thanks a lot! I found the problem: I should put the installation code at the very beginning of app.py, before any library that looks for flash_attn is imported.
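For anyone hitting the same issue, here is a minimal sketch of why the ordering matters. Python resolves a module when it is first looked up, so an availability check that runs before the install won't see the package. Simulated with a throwaway module, fake_flash_attn, as a stand-in for flash_attn:

```python
import importlib
import importlib.util
import sys
import tempfile

# A writable directory standing in for site-packages
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)

def available(name):
    # The same kind of check Transformers does for optional backends
    return importlib.util.find_spec(name) is not None

print(available("fake_flash_attn"))  # False: checked before "installation"

# "Install" the module (stand-in for the pip install of flash-attn)
with open(f"{workdir}/fake_flash_attn.py", "w") as f:
    f.write("x = 1\n")

importlib.invalidate_caches()
print(available("fake_flash_attn"))  # True: install ran before the check
```

So the subprocess install has to sit above the transformers import at the top of app.py.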
You're welcome! Enjoy.