##
<pre>
from accelerate import Accelerator
accelerator = Accelerator(
+   gradient_accumulation_steps=2,
)
dataloader, model, optimizer, scheduler = accelerator.prepare(
    dataloader, model, optimizer, scheduler
)

for batch in dataloader:
+   with accelerator.accumulate(model):
        optimizer.zero_grad()
        inputs, targets = batch
        outputs = model(inputs)
        loss = loss_function(outputs, targets)
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()</pre>
##
When performing gradient accumulation in a distributed setup, it is easy to make mistakes that hurt efficiency or correctness, such as synchronizing gradients across processes on every backward pass instead of only at the end of each accumulation cycle. `Accelerator` provides a context manager that takes care of these details for you and ensures the model trains correctly. Simply wrap the body of your training loop in the `Accelerator.accumulate` context manager, passing in the model you are training; during training, gradients will then accumulate and synchronize automatically only when needed.
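
If you want to run the pattern above end to end, here is a minimal, self-contained sketch. The toy model, dataset, optimizer, scheduler, and `loss_function` are placeholders chosen only for illustration; substitute your own training objects.
<pre>
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Accumulate gradients over 2 batches before each optimizer step
accelerator = Accelerator(gradient_accumulation_steps=2)

# Toy regression setup (placeholders for your real model and data)
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
loss_function = torch.nn.MSELoss()
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
dataloader = DataLoader(dataset, batch_size=8)

dataloader, model, optimizer, scheduler = accelerator.prepare(
    dataloader, model, optimizer, scheduler
)

for batch in dataloader:
    # Gradients are only synchronized and applied every 2 batches
    with accelerator.accumulate(model):
        optimizer.zero_grad()
        inputs, targets = batch
        outputs = model(inputs)
        loss = loss_function(outputs, targets)
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()</pre>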
##
To learn more, check out the related documentation:
- [API reference](https://huggingface.co/docs/accelerate/package_reference/accelerator#accelerate.Accelerator.accumulate)
- [Example script](https://github.com/huggingface/accelerate/blob/main/examples/by_feature/gradient_accumulation.py)
- [Performing automatic gradient accumulation](https://github.com/huggingface/accelerate/blob/main/examples/by_feature/automatic_gradient_accumulation.py)