Seeking feedback
Wanted to get some feedback from your end on this Space. It helps determine the memory requirements of a `DiffusionPipeline`. I believe this is practically quite useful: users can easily gauge a ballpark figure for a pipeline's memory requirements and plan resource allocation accordingly.
If you could provide some feedback on the Space or maybe even directly open PRs with your improvement patches, I would be grateful!
Hi @sayakpaul,
Looks great!
There seems to be a small bug (maybe it's just expected), but other than that, I don't have any suggestions for improvement.
Just fixed. Thank you for spotting!
Very cool, I love it!
- I'd include the total considering all components loaded.
- Maybe repeat this sentence in the results area too: "Generation typically requires an additional 20% to these numbers, as found by EleutherAI". I know it's in the description, but people will focus on the results. Or even add a new line item with this. This will depend on the resolution, if this gets traction we could consider trying to compute a better estimate.
- I noticed that memory is slightly different for `bin` vs `safetensors`, why would that be the case? (I'd maybe remove this option to simplify.)
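The suggested "total plus generation overhead" line item is simple arithmetic; here is a minimal sketch (the function name and component figures are hypothetical, and the 20% figure is the rule of thumb quoted in the description):

```python
def estimate_inference_memory_bytes(component_bytes, overhead=0.20):
    # Sum the per-component checkpoint sizes, then add the rule-of-thumb
    # ~20% generation overhead mentioned in the Space's description.
    total = sum(component_bytes.values())
    return total * (1 + overhead)

# Toy per-component sizes in bytes (made up for illustration).
components = {"unet": 100, "vae": 50}
estimate = estimate_inference_memory_bytes(components)
```

As noted, the real overhead depends on resolution, so this is only a coarse estimate.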
> I noticed that memory is slightly different for `bin` vs `safetensors`, why would that be the case? (I'd maybe remove this option to simplify.)

Ccing @Wauplin for this.
> This will depend on the resolution, if this gets traction we could consider trying to compute a better estimate.
Good point. Will add a note about the resolution.
How is the memory computed?
Not an expert here but can it be due to shared tensors that are not handled the same between the .bin and .safetensors versions?
> How is the memory computed?

- Retrieve the files with metadata.
- Use the `size` attribute.
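The two steps above might look roughly like this; a minimal sketch assuming the sizes come from the `siblings` list that `huggingface_hub.model_info(repo_id, files_metadata=True)` returns (the helper name and the toy file listing are made up for illustration):

```python
def checkpoint_size_bytes(files, extension=".safetensors"):
    # `files` holds (filename, size-in-bytes) pairs, e.g. built from the
    # `rfilename` and `size` attributes of each sibling in the repo's file
    # metadata. Sum only the weight files in the requested format.
    return sum(size for name, size in files if name.endswith(extension))

# Toy file listing with made-up sizes.
files = [
    ("unet/model.safetensors", 100),
    ("unet/model.bin", 90),
    ("vae/model.safetensors", 50),
]
```

Filtering by extension keeps `bin` and `safetensors` totals separate, which is also where the small discrepancy between the two options shows up.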
> Not an expert here but can it be due to shared tensors that are not handled the same between the `.bin` and `.safetensors` versions?
So, it seems like this happens at the serialization step and not at the size computation step per se.
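If it helps, here is a toy illustration of why serialization choices around shared tensors can shift the totals: counting each tensor's bytes per entry vs counting each underlying storage once gives different sums. The data structure is purely hypothetical and not the actual serialization logic of either format:

```python
def naive_size(tensors):
    # Count every entry's bytes, even when two entries share storage
    # (roughly a per-tensor sum over a raw state dict).
    return sum(nbytes for _, nbytes in tensors.values())

def dedup_size(tensors):
    # Count each underlying storage only once, as a serializer that
    # deduplicates shared tensors would.
    seen = {}
    for storage_id, nbytes in tensors.values():
        seen[storage_id] = nbytes
    return sum(seen.values())

# Toy "state dict": name -> (storage id, size in bytes); the embedding
# and LM head share one storage, as with tied weights.
tied = {
    "embed.weight": ("s0", 1000),
    "lm_head.weight": ("s0", 1000),
    "proj.weight": ("s1", 400),
}
```

Whichever convention a format follows at save time determines the on-disk size, which matches the observation that the difference appears at the serialization step rather than in the size computation itself.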