muellerzr committed
Commit 4bd209d
1 Parent(s): dbf5fb1

Finished for now

code_samples/calculating_metrics CHANGED
@@ -1,9 +1,13 @@
+ ##
  <pre>
  import evaluate
  +from accelerate import Accelerator
  +accelerator = Accelerator()
- +dataloader, model, optimizer scheduler = accelerator.prepare(
- +  dataloader, model, optimizer, scheduler
+ +train_dataloader, eval_dataloader, model, optimizer, scheduler = (
+ +  accelerator.prepare(
+ +    train_dataloader, eval_dataloader,
+ +    model, optimizer, scheduler
+ +  )
  +)
  metric = evaluate.load("accuracy")
  for batch in train_dataloader:
@@ -32,4 +36,14 @@ for batch in eval_dataloader:
  predictions = predictions,
  references = references
  )
- print(metric.compute())</pre>
+ print(metric.compute())</pre>
+
+ ##
+ When calculating metrics on a validation set, you can use the `Accelerator.gather_for_metrics`
+ method to gather the predictions and references from all devices and then calculate the metric on the gathered values.
+ This will also *automatically* drop the padded values that were added to the gathered tensors to ensure
+ they all have the same length, so the metric is calculated only on the correct values.
+ ##
+ To learn more, check out the related documentation:
+ - [API reference](https://huggingface.co/docs/accelerate/v0.15.0/package_reference/accelerator#accelerate.Accelerator.gather_for_metrics)
+ - [Example script](https://github.com/huggingface/accelerate/blob/main/examples/by_feature/multi_process_metrics.py)
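For context on how the pieces above fit together, here is a minimal evaluation-loop sketch (not part of this commit), assuming `model`, `eval_dataloader`, and a classification-style output already exist:

<pre>
import evaluate
import torch
from accelerate import Accelerator

accelerator = Accelerator()
# model and eval_dataloader are assumed to be defined earlier in the script
model, eval_dataloader = accelerator.prepare(model, eval_dataloader)
metric = evaluate.load("accuracy")

model.eval()
for batch in eval_dataloader:
    inputs, references = batch
    with torch.no_grad():
        predictions = model(inputs).argmax(dim=-1)
    # Gather from every process and drop the extra samples that were added
    # so the batches divide evenly across devices
    predictions, references = accelerator.gather_for_metrics((predictions, references))
    metric.add_batch(predictions=predictions, references=references)

print(metric.compute())
</pre>

Without `gather_for_metrics`, each process would only score its own shard, and any duplicated samples from the final uneven batch would skew the result.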
code_samples/checkpointing CHANGED
@@ -1,3 +1,4 @@
+ ##
  <pre>
  from accelerate import Accelerator
  accelerator = Accelerator()
@@ -13,4 +14,15 @@ for batch in dataloader:
  accelerator.backward(loss)
  optimizer.step()
  scheduler.step()
- +accelerator.save_state("checkpoint_dir")</pre>
+ +accelerator.save_state("checkpoint_dir")
+ +accelerator.load_state("checkpoint_dir")</pre>
+ ##
+ To save or load a checkpoint, `Accelerator` provides the `save_state` and `load_state` methods.
+ These methods save or load the state of the model, optimizer, and scheduler, as well as the random states and
+ any custom registered objects, from the main process on each device to a passed-in folder.
+ **This API is designed to save and resume training states only from within the same Python script or training setup.**
+ ##
+ To learn more, check out the related documentation:
+ - [`save_state` reference](https://huggingface.co/docs/accelerate/v0.15.0/package_reference/accelerator#accelerate.Accelerator.save_state)
+ - [`load_state` reference](https://huggingface.co/docs/accelerate/v0.15.0/package_reference/accelerator#accelerate.Accelerator.load_state)
+ - [Example script](https://github.com/huggingface/accelerate/blob/main/examples/by_feature/checkpointing.py)
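As an aside (not part of this commit), a minimal sketch of how the two calls pair up in practice; the folder name and training objects are placeholders:

<pre>
from accelerate import Accelerator

accelerator = Accelerator()
# model, optimizer, dataloader, and scheduler are assumed to exist
model, optimizer, dataloader, scheduler = accelerator.prepare(
    model, optimizer, dataloader, scheduler
)

# ... run some training ...

# Saves model/optimizer/scheduler state, RNG states, and anything
# registered via accelerator.register_for_checkpointing()
accelerator.save_state("checkpoint_dir")

# Later, in the same script or training setup, restore everything
accelerator.load_state("checkpoint_dir")
</pre>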
code_samples/experiment_tracking ADDED
@@ -0,0 +1,32 @@
+ ##
+ <pre>
+ from accelerate import Accelerator
+ -accelerator = Accelerator()
+ +accelerator = Accelerator(log_with="wandb")
+ train_dataloader, model, optimizer, scheduler = accelerator.prepare(
+     dataloader, model, optimizer, scheduler
+ )
+ +accelerator.init_trackers()
+ model.train()
+ for batch in train_dataloader:
+     optimizer.zero_grad()
+     inputs, targets = batch
+     outputs = model(inputs)
+     loss = loss_function(outputs, targets)
+ +    accelerator.log({"loss": loss})
+     accelerator.backward(loss)
+     optimizer.step()
+     scheduler.step()
+ +accelerator.end_training()
+ </pre>
+ ##
+ To use experiment trackers with `accelerate`, simply pass the desired tracker to the `log_with` parameter
+ when building the `Accelerator` object. Then initialize the tracker(s) by running `Accelerator.init_trackers()`,
+ passing in any configurations they may need. Afterwards, call `Accelerator.log` to log a particular value to your tracker.
+ At the end of training, call `accelerator.end_training()` so that any finalization functions a tracking library
+ may need are called automatically.
+ ##
+ To learn more, check out the related documentation:
+ - [Basic Tutorial](https://huggingface.co/docs/accelerate/usage_guides/tracking)
+ - [Accelerator API Reference](https://huggingface.co/docs/accelerate/package_reference/accelerator#accelerate.Accelerator.log)
+ - [Tracking API Reference](https://huggingface.co/docs/accelerate/package_reference/tracking)
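For reference, a hedged sketch (not part of this commit) of the tracking flow described above; the project name, config values, and training objects are illustrative:

<pre>
from accelerate import Accelerator

accelerator = Accelerator(log_with="wandb")
# Project name and config are placeholders; train_dataloader, model,
# optimizer, scheduler, and loss_function are assumed to exist
accelerator.init_trackers("my_project", config={"learning_rate": 1e-4})

model.train()
for step, batch in enumerate(train_dataloader):
    optimizer.zero_grad()
    inputs, targets = batch
    outputs = model(inputs)
    loss = loss_function(outputs, targets)
    # Send a scalar for this step to every configured tracker
    accelerator.log({"train_loss": loss.item()}, step=step)
    accelerator.backward(loss)
    optimizer.step()
    scheduler.step()

# Lets each tracking library run its finalization (e.g. closing the run)
accelerator.end_training()
</pre>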
code_samples/gradient_accumulation CHANGED
@@ -1,3 +1,4 @@
+ ##
  <pre>
  from accelerate import Accelerator
  accelerator = Accelerator(
@@ -15,4 +16,17 @@ for batch in dataloader:
  loss = loss_function(outputs, targets)
  accelerator.backward(loss)
  optimizer.step()
- scheduler.step()</pre>
+ scheduler.step()</pre>
+
+ ##
+ When performing gradient accumulation in a distributed setup, there are many opportunities for efficiency mistakes
+ to occur. `Accelerator` provides a context manager that takes care of the details for you and ensures that the
+ model trains correctly. Simply wrap the training loop in the `Accelerator.accumulate` context manager,
+ passing in the model you are training; during training, the gradients will then accumulate and synchronize
+ automatically when needed.
+
+ ##
+ To learn more, check out the related documentation:
+ - [API reference](https://huggingface.co/docs/accelerate/package_reference/accelerator#accelerate.Accelerator.accumulate)
+ - [Example script](https://github.com/huggingface/accelerate/blob/main/examples/by_feature/gradient_accumulation.py)
+ - [Performing automatic gradient accumulation](https://github.com/huggingface/accelerate/blob/main/examples/by_feature/automatic_gradient_accumulation.py)
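For reference, a minimal sketch (not part of this commit) of the `accumulate` context manager in a training loop; `gradient_accumulation_steps=2` and the training objects are placeholders:

<pre>
from accelerate import Accelerator

# gradient_accumulation_steps=2 is an illustrative choice
accelerator = Accelerator(gradient_accumulation_steps=2)
# dataloader, model, optimizer, scheduler, and loss_function are assumed to exist
dataloader, model, optimizer, scheduler = accelerator.prepare(
    dataloader, model, optimizer, scheduler
)

for batch in dataloader:
    # Gradients only synchronize, and the prepared optimizer/scheduler only
    # take a real step, once every gradient_accumulation_steps batches
    with accelerator.accumulate(model):
        inputs, targets = batch
        outputs = model(inputs)
        loss = loss_function(outputs, targets)
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
</pre>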
src/app.py CHANGED
@@ -18,9 +18,9 @@ def change(inp, components=[]):
  if inp == "Basic":
  return (templates["initial"], highlight(code), "## Accelerate Code (Base Integration)", explanation, docs)
  elif inp == "Calculating Metrics":
- return (templates["initial_with_metrics"], highlight(inp), f"## Accelerate Code ({inp})", explanation, docs)
+ return (templates["initial_with_metrics"], highlight(code), f"## Accelerate Code ({inp})", explanation, docs)
  else:
- return (templates["accelerate"], highlight(inp), f"## Accelerate Code ({inp})", explanation, docs)
+ return (templates["accelerate"], highlight(code), f"## Accelerate Code ({inp})", explanation, docs)

  initial_md = gr.Markdown("## Initial Code")
  initial_code = gr.Markdown(templates["initial"])
@@ -30,7 +30,7 @@ with gr.Blocks() as demo:
  Here is a very basic Python training loop.
  Select how you would like to introduce an Accelerate capability to add to it.''')
  inp = gr.Radio(
- ["Basic", "Calculating Metrics", "Checkpointing", "Gradient Accumulation", ],
+ ["Basic", "Calculating Metrics", "Checkpointing", "Experiment Tracking", "Gradient Accumulation"],
  label="Select a feature"
  )
  with gr.Row():