Run model in Colab using 8-bit
I'm trying to run the model using the 8-bit library:
import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xxl", device_map="auto", torch_dtype=torch.bfloat16, load_in_8bit=True)
The model gets loaded and returns output, but the output is gibberish.
Has anyone had success with the 8-bit library?
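For what it's worth, a rough sketch of the generation step that follows the load above (the prompt is just an illustrative placeholder):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xxl")
inputs = tokenizer("Translate English to German: How old are you?", return_tensors="pt").to(model.device)
# generate and decode; with the int8-loaded model above, this comes back as gibberish
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))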
This is expected, as float16 does not work on this model either. We are investigating this!
Also note that this happens only for the xxl model; for other models, int8 quantization works as expected.
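For comparison, a minimal sketch of the int8 path on a smaller checkpoint (flan-t5-large is an arbitrary choice here, and the prompt is only illustrative), assuming bitsandbytes and accelerate are installed:

from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
# int8 quantization via bitsandbytes; works as expected on the smaller checkpoints
model_large = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-large",
    device_map="auto",
    load_in_8bit=True,
)
inputs = tokenizer("Translate English to German: How old are you?", return_tensors="pt").to(model_large.device)
print(tokenizer.decode(model_large.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))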
Probably related to this https://discuss.huggingface.co/t/mixed-precision-for-bfloat16-pretrained-models/5315
I tested the xl one using float16 and int8, and it does not work as expected (gibberish). However, it works like a charm in fp32.
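As a rough sketch, the fp32 load is simply the default path, with no dtype override and no quantization flags (device_map="auto" may offload to CPU if GPU memory is tight):

from transformers import T5ForConditionalGeneration

# plain fp32 weights: no torch_dtype override, no load_in_8bit
model_fp32 = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-xl",
    device_map="auto",
)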
@mrm8488 can you please post your model config?
It is the config you can find in the repo: https://huggingface.co/google/flan-t5-xl/blob/main/config.json
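If it helps, the config can also be inspected directly without downloading the weights:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/flan-t5-xl")
print(config)  # prints the same values as config.json in the repo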
Anyone here able to run Flan-T5-XL on Colab? I tried 8-bit and got junk results.
Can you try with the recent release of transformers (pip install -U transformers) and use 4-bit instead (just pass load_in_4bit=True)?
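A rough sketch of that suggestion (checkpoint, prompt, and generation length are placeholders), assuming a recent transformers plus accelerate and bitsandbytes, e.g. pip install -U transformers accelerate bitsandbytes:

from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model_4bit = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-xl",
    device_map="auto",
    load_in_4bit=True,  # as suggested above; newer releases prefer a BitsAndBytesConfig via quantization_config
)
inputs = tokenizer("Translate English to German: How old are you?", return_tensors="pt").to(model_4bit.device)
outputs = model_4bit.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))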