Run `accelerate config` and answer the questionnaire. Below is an example yaml for multi-GPU training with 4 GPUs.
```yaml
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: MULTI_GPU
downcast_bf16: 'no'
dynamo_backend: 'NO'
fsdp_config: {}
gpu_ids: all
machine_rank: 0
main_training_function: main
megatron_lm_config: {}
mixed_precision: 'no'
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
use_cpu: false
```
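`accelerate config` typically writes this file to `~/.cache/huggingface/accelerate/default_config.yaml`, which `accelerate launch` picks up by default. If you keep the yaml somewhere else, you can point the launcher at it explicitly with `--config_file`. A minimal sketch, assuming the yaml above was saved as `multi_gpu_config.yaml` and the training script is named `train.py` (both names are placeholders):

```
accelerate launch --config_file multi_gpu_config.yaml train.py
```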
Make sure the training code that uses Accelerate is wrapped in a `main()` function so it can be launched as a script:

```diff
 from accelerate import Accelerator

+def main():
     accelerator = Accelerator()

     model, optimizer, training_dataloader, scheduler = accelerator.prepare(
         model, optimizer, training_dataloader, scheduler
     )

     for batch in training_dataloader:
         optimizer.zero_grad()
         inputs, targets = batch
         outputs = model(inputs)
         loss = loss_function(outputs, targets)
         accelerator.backward(loss)
         optimizer.step()
         scheduler.step()

+if __name__ == "__main__":
+    main()
```

Launching a script using the default accelerate config file looks like the following:

```
accelerate launch {script_name.py} {--arg1} {--arg2} ...
```

Alternatively, you can use `accelerate launch` with the right config params for multi-GPU training as shown below:

```
accelerate launch --multi_gpu --num_processes=4 {script_name.py} {--arg1} {--arg2} ...
```

Using this feature involves no changes to the code apart from the ones shown in the tab `Simplify your code and improve efficiency`.

To learn more, check out the related documentation:
- Launching distributed code
- The Accelerate CLI
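The training loop above assumes that `model`, `optimizer`, `training_dataloader`, `scheduler`, and `loss_function` are already defined. For reference, a fully self-contained version might look like the sketch below; the toy regression model, random data, and hyperparameters are illustrative placeholders, not part of the original example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator


def main():
    accelerator = Accelerator()

    # Toy regression data and model -- purely illustrative placeholders
    features = torch.randn(1024, 16)
    targets = torch.randn(1024, 1)
    training_dataloader = DataLoader(
        TensorDataset(features, targets), batch_size=32, shuffle=True
    )

    model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
    loss_function = nn.MSELoss()

    # prepare() wraps each object so the same script runs on a single GPU
    # or across the 4 processes configured in the yaml above
    model, optimizer, training_dataloader, scheduler = accelerator.prepare(
        model, optimizer, training_dataloader, scheduler
    )

    model.train()
    for batch in training_dataloader:
        optimizer.zero_grad()
        inputs, batch_targets = batch
        outputs = model(inputs)
        loss = loss_function(outputs, batch_targets)
        accelerator.backward(loss)  # replaces loss.backward()
        optimizer.step()
        scheduler.step()

    # accelerator.print only prints on the main process
    accelerator.print(f"final batch loss: {loss.item():.4f}")


if __name__ == "__main__":
    main()
```

Launched with `accelerate launch` and the config above, each of the 4 processes runs this script on its own GPU while Accelerate handles device placement and shards the dataloader across processes.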