Load into 2 GPUs

#28, opened by sauravm8

I have 2 A10 GPUs. The memory is not enough to load the model on one GPU using cuda:0; is there a way to use both GPUs? When I don't specify a device, inference doesn't work.

Maybe you could try the gpu-split setting on the model config page; my 2 x 22 GB 2080 Ti setup runs smoothly with this setting.

@chraac how do I do this programmatically?

From the README of text-generation-webui: when using the ExLlama loader, there is a parameter called --gpu-split that specifies how much VRAM (in GB) to use on each GPU.
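If you want to do the split in your own script rather than through text-generation-webui, here is a minimal sketch using Hugging Face transformers + accelerate, which shard the model across GPUs with a device map. The model ID and the per-GPU memory caps below are placeholders; adjust them for your model and the roughly 24 GB of each A10.

```python
# Sketch: split a model across two GPUs with transformers + accelerate.
# "your/model-id" and the 20GiB caps are placeholders, not values from this thread.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your/model-id"  # replace with the actual model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",                    # load in the checkpoint's dtype (e.g. fp16)
    device_map="auto",                     # let accelerate place layers on both GPUs
    max_memory={0: "20GiB", 1: "20GiB"},   # rough per-GPU cap, similar in spirit to --gpu-split
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With device_map="auto" the layers are distributed automatically and activations are moved between GPUs for you, so no manual cuda:0/cuda:1 handling is needed.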
