Cannot run with tensor parallel > 1. Might need padding like on Qwen2.5-72B?

#2
by OwenArli - opened

Getting the same error in vllm as shown in this issue: https://github.com/vllm-project/vllm/issues/17604

Is this the same issue that prevented Qwen2.5-72B from running with tensor parallelism, namely that the weights need to be padded before being quantized to int4? https://qwen.readthedocs.io/zh-cn/latest/quantization/gptq.html#troubleshooting
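For context, the Qwen troubleshooting page describes the failure as a divisibility problem: when a GPTQ int4 checkpoint is sharded across GPUs, each shard of `intermediate_size` must still line up with the quantization group size (typically 128), otherwise vLLM refuses to load it. A minimal sketch of that check, assuming this is the same constraint here (the function name and padding logic are illustrative, not from vLLM):

```python
def shardable(intermediate_size: int, tp_size: int, group_size: int = 128) -> bool:
    """Check whether a GPTQ checkpoint can be split across tp_size GPUs.

    Assumption: each per-GPU shard of intermediate_size must be a whole
    multiple of the quantization group_size.
    """
    if intermediate_size % tp_size != 0:
        return False
    return (intermediate_size // tp_size) % group_size == 0


def padded_size(intermediate_size: int, tp_size: int, group_size: int = 128) -> int:
    """Smallest intermediate_size >= the original that shards cleanly."""
    multiple = tp_size * group_size
    return -(-intermediate_size // multiple) * multiple  # ceil to a multiple


# Qwen2.5-72B has intermediate_size = 29568 (divisible by 128, so TP=1 works)
print(shardable(29568, 1))  # True
print(shardable(29568, 2))  # False: 14784 is not a multiple of 128
print(padded_size(29568, 8))  # 29696, padded before quantization as in the Qwen docs
```

If the same divisibility check fails for this model, the fix the Qwen docs suggest is to zero-pad the MLP weights to the padded size before quantizing, rather than trying to load the existing quantized checkpoint with TP > 1.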
