Cannot run it on V100
#2 opened by CrazyAIGC
You also need to put your model on the GPU, i.e. `model.to("cuda")`.
You may not be able to fit it on a 16 GB V100, though.
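Concretely, that step looks like this (a minimal sketch, using one of the smaller checkpoints as an example):

```python
import torch
from transformers import Blip2ForConditionalGeneration

# example with one of the smaller BLIP-2 checkpoints; swap in the variant you downloaded
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
)
model.to("cuda")  # the weights themselves must be on the GPU, not just the inputs
```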
This model requires about 30 GB of GPU RAM if you use 8-bit inference (i.e. pass `load_in_8bit=True` to `from_pretrained`).
So I'd recommend checking out the smaller BLIP-2 variants.
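For reference, 8-bit loading looks roughly like this (a sketch, assuming the blip2-flan-t5-xxl checkpoint this thread seems to be about; it needs bitsandbytes installed, and `device_map="auto"` from accelerate to place the weights):

```python
from transformers import Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-flan-t5-xxl"  # assumed to be the xxl checkpoint in question
# load_in_8bit requires the bitsandbytes package; device_map="auto" (via accelerate)
# dispatches the quantized weights onto the available GPU(s)
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint,
    load_in_8bit=True,
    device_map="auto",
)
```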
I tried 8-bit inference on my 32 GB V100 but it failed.
I've already converted the input to fp16, but this bug still occurred: `AssertionError: The input data type needs to be fp16 but torch.float32 was found!`
Has anyone successfully run the xxl model on the V100?
@zhouqh I've run the opt_6.7b variant on a 24 GB A10 with fp16.
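Roughly what that looks like end to end (a sketch, assuming the Salesforce/blip2-opt-6.7b checkpoint and the standard COCO demo image; the point is that both the weights and the processed pixel values end up in fp16):

```python
import requests
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-opt-6.7b"
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint, torch_dtype=torch.float16
).to("cuda")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# cast the pixel values to fp16 so they match the model weights
inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
generated_ids = model.generate(**inputs)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```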