Will there be a distilled model that fits inside 48GB VRAM (2x 3090)?
#29 by gameveloster - opened
And maybe another that fits inside 96GB when using a node of 4x 3090?
Hope someone can help distil this, thanks!
I'd recommend using bloomz-7b1 or mt0-xxl, which should work well for inference given your setup.
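As a rough sanity check on that recommendation, here is a back-of-the-envelope sketch of weight-only memory. The parameter counts are rounded approximations (about 7.1B for bloomz-7b1, about 13B for mt0-xxl, about 176B for the full BLOOM), and the helper names `weight_gb` and `fits` are made up for illustration. It ignores activations, the KV cache and framework overhead, so real usage will be higher.

```python
# Weight-only memory estimate. Parameter counts below are rounded
# approximations (assumptions), not exact figures from the model cards.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

APPROX_PARAMS = {
    "bloomz-7b1": 7.1e9,
    "mt0-xxl": 13e9,
    "bloom-176b": 176e9,
}


def weight_gb(n_params, dtype="fp16"):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def fits(n_params, vram_gb, dtype="fp16"):
    """True if the weights alone are smaller than the given VRAM budget."""
    return weight_gb(n_params, dtype) < vram_gb


if __name__ == "__main__":
    for name, n_params in APPROX_PARAMS.items():
        for vram_gb in (48, 96):  # 2x 3090 and 4x 3090
            verdict = "fits" if fits(n_params, vram_gb) else "does not fit"
            print(f"{name}: ~{weight_gb(n_params):.0f} GB in fp16, "
                  f"{verdict} in {vram_gb} GB")
```

Under these assumptions both suggested models leave headroom on 2x 3090 in fp16, while the full model's weights exceed 96GB even at one byte per parameter.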
@ybelkada also ran distillation experiments on BLOOM. I'm not sure what the verdict was, i.e. whether distillation makes sense for models of this scale.