⚡ ZeroGPU: New version rolled out! (Sept 2024)

#107
by cbensimon HF staff - opened
ZeroGPU Explorers org

Hello everybody,

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.

Major improvements:

  1. GPU cold starts about twice as fast!
  2. RAM usage reduced by two-thirds, allowing more efficient resource usage, which means more GPUs for the community!
  3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))
  4. Improved compatibility and PyTorch integration, so more Spaces are ZeroGPU-compatible without requiring any modifications!

Feel free to reply in this discussion if you have any questions!

🤗 Best regards,
Charles

cbensimon pinned discussion
ZeroGPU Explorers org • edited Sep 15

Hi, Charles!

Is the quota reset time no longer displayed (the "get more quota or retry in 1:35:47" message)?

Hi Charles.

The results from ZeroGPU differ from those on my local machine / Hugging Face's L4 GPU, even with the same code and Python dependencies.
For more information, visit: https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/111

Hi.
I found some very strange behavior. It is hard to track down and would never happen locally. It may be related to the bug above.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/104#66f66a4b693f423f5b6d9b2e

This time, it appears that Gradio's cancel behavior is wrong. In the worst case, it may be a problem with the queue in general.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/113#66fbc59085944df7944ff4aa

Hi @cbensimon !

Is there an example of how to show cold-start time to users as mentioned here?:

ZeroGPU initializations (coldstarts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))

In my ZeroGPU code, I assumed this meant adding it to the spaces.GPU decorator as

@spaces.GPU(duration=40, progress=gr.Progress(track_tqdm=True))

But I'm not seeing any visual indicator! No error is thrown, but there is also no difference with that argument on or off, so I'm not sure what I need to change. Thank you for your help!

(Code here if it helps: https://huggingface.co/spaces/WillHeld/diva-audio-chat/blob/main/app.py#L61)

I have heard that nest-asyncio, which the spaces library has recently started using, has quite a few memory-management problems.
I would like the library authors to find an alternative if possible.

I found that my model inference is much slower (about 5x) on ZeroGPU than on my local GPU (V100, 16 GB). Is there a way to speed it up?

@cbensimon
I have a Pro plan ($9, to test my project) and can easily access models through the Inference API with token authorization. I built a FLUX dev app with Next.js, but why is the API slow at generating images?

This is my app:

https://aidreamgen.vercel.app/
