VRAM requirements
#1, opened by dipjeb
Hi,
Would this run on a single RTX 4090 (24 GB)?
No. On a single 24 GB card you can run the 3.5bpw quant with the full 32k context, or the 3.75bpw quant with ~12k context, both with the fp8 cache enabled.
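If you want to sanity-check numbers like these yourself, here is a rough back-of-the-envelope sketch. It assumes the weights take about `params * bpw / 8` bytes and that the KV cache stores one key and one value vector per layer per token. The function names and every model dimension below are hypothetical placeholders, not the actual figures for this model, and the estimate ignores activation buffers and framework overhead, so real limits will be tighter than what it reports.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights: bits per weight / 8 bytes each."""
    return n_params * bpw / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 1) -> int:
    """Approximate KV cache size: keys + values (factor 2) for every layer and token.

    bytes_per_elem=1 corresponds to an 8-bit (fp8) cache, 2 to fp16.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def estimate_gib(n_params: float, bpw: float, n_layers: int, n_kv_heads: int,
                 head_dim: int, context_len: int, bytes_per_elem: int = 1) -> float:
    """Weights plus KV cache, in GiB. Excludes activations and runtime overhead."""
    total = weight_bytes(n_params, bpw) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem)
    return total / GIB


if __name__ == "__main__":
    # Hypothetical model dimensions, for illustration only.
    model = dict(n_params=40e9, n_layers=32, n_kv_heads=8, head_dim=128)
    for bpw, ctx in [(3.5, 32768), (3.75, 12288)]:
        gib = estimate_gib(bpw=bpw, context_len=ctx, **model)
        print(f"{bpw} bpw, {ctx} ctx: ~{gib:.1f} GiB before overhead")
```

Plug in the real parameter count and attention dimensions from the model's config to get a first-order estimate, then leave a couple of GB of headroom for everything this leaves out.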