fp16 or bf16 version?

#6 opened by xiangli

Hi, is there a float16 or bfloat16 version? The fp32 model takes too much memory, and the code is customized specifically for fp32, so it isn't easy to run inference in fp16 or bf16.

Ai2 org

We have adjusted the code to work with bfloat16, although note that I have seen this change the model's output a bit.

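For anyone who wants a starting point, here is a minimal sketch of loading the weights in bfloat16 with `transformers`. The repo id is a placeholder, and `trust_remote_code=True` assumes the repo ships custom modeling code, as this thread suggests.

```python
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

# Placeholder repo id -- substitute the actual model repo.
repo_id = "allenai/model-name"

processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # load weights in bf16: half the memory of fp32
    device_map="auto",           # spread layers across available GPUs
    trust_remote_code=True,      # needed if the repo ships custom modeling code
)
```
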
What are the VRAM requirements for this model in fp32 as well as bf16? I'm already blown away by the 7B, but I'm curious to interact with the 72B.

The VRAM requirements for this model are similar to those of the Qwen 7B models. I recommend referring to the Model Size Estimator on Hugging Face for detailed information.
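
For a quick back-of-envelope estimate (my own arithmetic, not the estimator's output): the weights alone take roughly parameter count × bytes per parameter, so bf16 halves the fp32 footprint. Activations and the KV cache add overhead on top of this.

```python
# Back-of-envelope VRAM for the weights alone (excludes activations and KV cache).
def weight_vram_gb(n_params_billion: float, bytes_per_param: int) -> float:
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

print(f"7B  fp32: {weight_vram_gb(7, 4):.1f} GB")   # ~26.1 GB
print(f"7B  bf16: {weight_vram_gb(7, 2):.1f} GB")   # ~13.0 GB
print(f"72B bf16: {weight_vram_gb(72, 2):.1f} GB")  # ~134.1 GB
```

So even in bf16, the 72B will not fit on a single consumer GPU and needs to be sharded across several devices.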
