FP16 vs FP32

#127 · opened by Taylor658

What are the memory usage, performance, and accuracy trade-offs between FP16 and FP32 precision for Whisper-large-v3 on a typical GPU like the NVIDIA A100?

You can get a rough idea of the memory needed to run any model using this formula:

Approx. memory usage = no. of parameters × bytes per parameter

In practice, actual memory will be a bit higher (activations scale with sequence length, loading libraries, the CUDA context, etc.).
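
As a quick sketch of that arithmetic in plain Python (the ~1.6B parameter count below is approximate):

```python
# Rough VRAM estimate for the model weights alone; real usage will be
# somewhat higher once activations and the CUDA context are loaded.

def approx_weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    return num_params * bytes_per_param / 1e9

whisper_params = 1.6e9  # Whisper large-v3 has roughly 1.6B parameters
print(f"FP32: {approx_weight_memory_gb(whisper_params, 4):.1f} GB")  # ~6.4 GB
print(f"FP16: {approx_weight_memory_gb(whisper_params, 2):.1f} GB")  # ~3.2 GB
```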

FP16 equates to 2 bytes per parameter, and Whisper large-v3 has ~1.6B parameters.

Therefore the memory for the parameters alone would be just over 3.2 GB in FP16, versus roughly double that (~6.4 GB) in FP32.
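
If you want to check this on an actual GPU, here is a minimal sketch using transformers (assuming a CUDA device such as an A100 is available; FP32 is the default when no dtype is passed):

```python
# Sketch: load Whisper large-v3 in half precision and report the weight
# footprint. Passing torch_dtype=torch.float32 (or omitting it) gives FP32.
import torch
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-large-v3", torch_dtype=torch.float16
).to("cuda")

n_params = model.num_parameters()
print(f"{n_params / 1e9:.2f}B parameters, "
      f"~{n_params * 2 / 1e9:.1f} GB of weights in FP16")
```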

Thanks for the feedback and the formula!

Taylor658 changed discussion status to closed