
Hardware specs for training the 70B model

#6
by cnut1648 - opened

Hello, nice work!
I wonder if you can disclose some of the hardware specs used to train the model. I am currently experimenting with training a 70B model but have had no success on 8x A100 80GB GPUs (I get out-of-memory errors), even with bf16 + LoRA + DeepSpeed ZeRO-3 Offload + FlashAttention.
Thanks!
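
For reference, this is roughly my current setup (a minimal sketch; the model name, dataset, and hyperparameters are placeholders, not a claim about the WizardLM recipe):

```python
# Minimal sketch: bf16 + LoRA + DeepSpeed ZeRO-3 Offload + FlashAttention.
# Launch with: deepspeed --num_gpus 8 train.py
import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "meta-llama/Llama-2-70b-hf"  # placeholder; any 70B llama checkpoint

# Build TrainingArguments *before* from_pretrained so the HF/DeepSpeed
# integration can partition the 70B weights at load time (ZeRO-3).
args = TrainingArguments(
    output_dir="out",
    bf16=True,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    gradient_checkpointing=True,        # big activation-memory savings at 70B
    deepspeed="ds_zero3_offload.json",  # ZeRO-3 + CPU offload config file
    num_train_epochs=1,
    logging_steps=10,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # llama has no pad token by default

model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # needs flash-attn installed
)
model.enable_input_require_grads()  # required for LoRA + gradient checkpointing

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

def tok(batch):
    enc = tokenizer(batch["text"], truncation=True,
                    padding="max_length", max_length=512)
    enc["labels"] = enc["input_ids"].copy()
    return enc

# Tiny dummy dataset so the script runs end to end; swap in real data.
train_ds = Dataset.from_dict({"text": ["Hello world."] * 64}).map(
    tok, batched=True, remove_columns=["text"])

Trainer(model=model, args=args, train_dataset=train_ds).train()
```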

WizardLM Team org

8x A100 80GB GPUs are enough for the 70B training.

Hi @WizardLM, thanks for the reply. Will the training details be released, in a paper or at a high level? I am pretty curious about training a model at the 70B scale. Are you using DeepSpeed ZeRO-3 Offload, or some other acceleration method?
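
For reference, the ZeRO-3 offload config I have been testing (the `ds_zero3_offload.json` from my earlier snippet) looks roughly like this; values are illustrative of my own setup, not a guess at yours:

```python
# Equivalent Python dict for ds_zero3_offload.json; TrainingArguments also
# accepts the dict directly via deepspeed=ds_config.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    # "auto" lets the HF Trainer fill these in from TrainingArguments.
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}
```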

Yeah, agree with @cnut1648! Having the training config would be very helpful!
