Memory Spikes while Getting Model Logits
#49 · opened by Nyandwi
Hello, thanks for this amazing visual language model.
I am running into memory issues when forwarding inputs to the model. The generate functionality works fine and I can run it multiple times, but when I try to get the logits with `model(**inputs)`, I run out of memory. I have 48 GB of GPU RAM, which should be more than enough according to other discussions about hardware requirements. Is there something I am missing?
```python
import torch
from transformers import FuyuProcessor, FuyuForCausalLM

model_id = "adept/fuyu-8b"
processor = FuyuProcessor.from_pretrained(model_id)
model = FuyuForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

# prompt is a string and sample_im_1 a PIL image, both defined earlier
inputs = processor(text=prompt, images=sample_im_1, return_tensors="pt").to("cuda:0")
outputs = model(**inputs)
```
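One thing I have been wondering about: `generate()` disables gradient tracking internally, whereas a bare `model(**inputs)` call builds the autograd graph for every activation, which could explain the spike. A minimal sketch of a gradient-free forward pass, assuming only the logits are needed and not gradients:

```python
import torch

# Disable gradient tracking so activations are freed as the forward
# pass proceeds, mirroring what generate() does internally.
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits  # shape: (batch, seq_len, vocab_size)
```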
Thanks!