Champs
#1 opened by groxaxo
Hi Champs, thanks a lot for your work!
Is there any chance of running this on multiple GPUs? Thank you!
I wouldn't really recommend using bnb 4-bit models for inference. It will most likely work, yes, but you're better off using an int4 quant.
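In case it helps, here's a minimal sketch (not from this thread) of loading a GPTQ/AWQ-style int4 quant across several GPUs with transformers. The repo name is a placeholder, and `device_map="auto"` needs `accelerate` installed:

```python
# Hedged sketch: load an int4 quant (GPTQ/AWQ-style) sharded over multiple GPUs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/some-model-GPTQ-Int4"  # placeholder; substitute a real int4 repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires `accelerate`; shards layers across all visible GPUs
    torch_dtype="auto",  # keep the dtype the quantized checkpoint was saved with
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```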
Can you provide instructions on how to run an int4 quant on a multi-GPU setup? I have 5x RTX 5090s on an Intel QYFS (56 cores / 112 threads) with 512 GB of DDR5-4800 RAM. The system runs Ubuntu 24.04, and I usually run models with llama.cpp and ollama; I also use ik_llama and ktransformers. Is there an easy-to-follow guide for this?
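Not a full guide, but as a hedged sketch of the multi-GPU knobs in llama.cpp: when built with CUDA it can split a GGUF quant (e.g. Q4_K_M, an int4-class quant) across GPUs by layer. Through the llama-cpp-python bindings the relevant parameters are `n_gpu_layers`, `split_mode`, and `tensor_split` (CLI equivalents: `-ngl`, `--split-mode`, `--tensor-split`). The GGUF path below is a placeholder:

```python
# Hedged sketch using the llama-cpp-python bindings; the model path is a placeholder.
import llama_cpp

llm = llama_cpp.Llama(
    model_path="model-Q4_K_M.gguf",               # placeholder; any int4-class GGUF quant
    n_gpu_layers=-1,                              # offload every layer to GPU (CLI: -ngl 99)
    split_mode=llama_cpp.LLAMA_SPLIT_MODE_LAYER,  # split by layer (CLI: --split-mode layer)
    tensor_split=[1, 1, 1, 1, 1],                 # even share over 5 GPUs (CLI: --tensor-split)
)

print(llm("Hello", max_tokens=32)["choices"][0]["text"])
```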