FSDP + QLoRA

Background

Combining FSDP with QLoRA makes it possible to fine-tune larger (70B+ parameter) LLMs on consumer GPUs. For example, you can use FSDP + QLoRA to train a 70B model on two 24GB GPUs[^1].
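
As a rough back-of-the-envelope check on why this can fit, consider the base weights only. The estimate assumes 4-bit storage at 0.5 bytes per parameter and an even split of the weights across the two GPUs. It ignores quantization constants, LoRA adapter weights, optimizer state, and activations:

$$
70 \times 10^{9}\ \text{params} \times 0.5\ \tfrac{\text{bytes}}{\text{param}} = 35\ \text{GB},
\qquad
\frac{35\ \text{GB}}{2\ \text{GPUs}} = 17.5\ \text{GB per GPU} < 24\ \text{GB}
$$

Under these assumptions roughly 6.5 GB per GPU is left for everything the estimate ignores, so the margin is small.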

Below, we describe how to use this feature in Axolotl.

Usage

To enable QLoRA with FSDP, you need to perform the following steps:

> [!TIP]
> See the example config file in addition to reading these instructions.

1. Set `adapter: qlora` in your axolotl config file.
2. Enable FSDP in your axolotl config, as described here.
3. Use one of the supported model types: `llama`, `mistral` or `mixtral`.
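
The three steps above might look roughly like the following in a config file. This is a minimal sketch, not a complete or tuned recipe. The base model name is a placeholder. The `fsdp` and `fsdp_config` keys and values are assumptions based on our understanding of axolotl's FSDP options, so verify them against `examples/llama-2/qlora-fsdp.yml` and the FSDP documentation for your installed version.

```yaml
# Sketch only: every value here is illustrative, not a recommended setting.
base_model: NousResearch/Llama-2-7b-hf   # placeholder; substitute your own model
model_type: LlamaForCausalLM             # step 3: a llama-family model

load_in_4bit: true    # QLoRA keeps the frozen base weights in 4-bit
adapter: qlora        # step 1

# step 2: enable FSDP (keys assumed; check them against the example config)
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: true
  fsdp_cpu_ram_efficient_loading: true
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer  # must match the model family
  fsdp_state_dict_type: FULL_STATE_DICT
```

A real config also needs the usual dataset, LoRA, and optimizer settings, which are omitted here. If you switch to `mistral` or `mixtral`, the layer class to wrap has to change to match the model.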

Example Config

examples/llama-2/qlora-fsdp.yml contains an example of how to enable QLoRA + FSDP in axolotl.

References

PR #1378 enabling QLoRA in FSDP in Axolotl.
Blog Post from the Answer.AI team describing the work that enabled QLoRA in FSDP.
Related HuggingFace PRs enabling FSDP + QLoRA:
- Accelerate PR #2544
- Transformers PR #29587
- TRL PR #1416
- PEFT PR #1550

[^1]: This was enabled by this work from the Answer.AI team.