3.25bpw quant request
#1
by
OrangeApples
- opened
Kindly requesting a 3.25bpw quant since it would be the perfect size for 8k context (Q4 cache) on a 3090.
Edit: Retracting my request. Just tested the 3.5bpw quant and it just barely fit in my 3090 w/ 8k context and Q4 cache. No need for 3.25bpw.
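For anyone doing the same sizing math, here is a rough back-of-the-envelope sketch of how bpw and cache precision translate into VRAM. The model isn't named in this thread, so the parameter count, layer count, KV head count, and head dimension below are hypothetical placeholders, and the function names are made up for illustration. It also treats the Q4 cache as exactly 4 bits per element and ignores activations and loader overhead, so real usage will come out somewhat higher.

```python
GIB = 2**30  # bytes per GiB


def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bpw / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, cache_bits: float = 4) -> float:
    """Approximate KV cache size in GiB (keys + values for every layer)."""
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elements * cache_bits / 8 / GIB


# Hypothetical model shape: substitute the values from the model's config.
N_PARAMS = 47e9
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128
CONTEXT = 8192

for bpw in (3.25, 3.5):
    total = weight_gib(N_PARAMS, bpw) + kv_cache_gib(
        N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, cache_bits=4)
    print(f"{bpw}bpw: ~{total:.1f} GiB before activations/overhead")
```

Compare the result against the 24 GB on a 3090, leaving some headroom for whatever else is using the card.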
OrangeApples
changed discussion status to
closed