VRAM requirements

#1
by dipjeb - opened

Hi,
Would this run on a single 4090 (24 GB)?

No. On a single 24 GB card you can run 3.5bpw with the full 32k context, or 3.75bpw with ~12k context, both with the fp8 cache enabled.
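
If you want to sanity-check other bpw/context combinations yourself, here is a rough back-of-the-envelope sketch. It assumes VRAM is dominated by the quantized weights (parameters × bpw / 8 bytes) plus the KV cache (one byte per element with an fp8 cache), with a flat allowance for everything else. The function names, the overhead figure, and all the model dimensions in the example are placeholders I made up for illustration, not values taken from this repo; plug in the numbers from the model's config.

```python
GIB = 1024 ** 3


def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bpw / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 1) -> float:
    """Approximate KV cache size in GiB.

    The factor 2 covers keys and values; bytes_per_elem=1 models an
    fp8 cache, 2 would model fp16.
    """
    elems = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elems * bytes_per_elem / GIB


def fits(weights_gib: float, cache_gib: float,
         overhead_gib: float, budget_gib: float) -> bool:
    """True if the estimated total stays within the VRAM budget."""
    return weights_gib + cache_gib + overhead_gib <= budget_gib


# Placeholder model dimensions (hypothetical, replace with real ones).
n_params = 40e9
w = weight_gib(n_params, bpw=3.5)
c = kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128, context_len=32768)
# overhead_gib=1.0 is a guess for activations and framework buffers.
print(f"weights ~{w:.1f} GiB, cache ~{c:.1f} GiB, "
      f"fits in 24 GiB: {fits(w, c, overhead_gib=1.0, budget_gib=24.0)}")
```

This only gives a ballpark: real usage also depends on the loader, the desktop's own VRAM use, and fragmentation, so leave some headroom.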
