chansung and winglian committed on
Commit 5d97e65
1 Parent(s): 2147cf6

add dstack section (#1612) [skip ci]


* add dstack section

* chore: lint

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>

Files changed (1)
  1. README.md +37 -0
README.md CHANGED
@@ -34,6 +34,7 @@ Features:
  - [Mac](#mac)
  - [Google Colab](#google-colab)
  - [Launching on public clouds via SkyPilot](#launching-on-public-clouds-via-skypilot)
+ - [Launching on public clouds via dstack](#launching-on-public-clouds-via-dstack)
  - [Dataset](#dataset)
  - [Config](#config)
  - [Train](#train)
@@ -292,6 +293,42 @@ HF_TOKEN=xx sky launch axolotl.yaml --env HF_TOKEN
  HF_TOKEN=xx BUCKET=<unique-name> sky spot launch axolotl-spot.yaml --env HF_TOKEN --env BUCKET
  ```

+ #### Launching on public clouds via dstack
+ To launch on GPU instances (both on-demand and spot) on public clouds (GCP, AWS, Azure, Lambda Labs, TensorDock, Vast.ai, and CUDO), you can use [dstack](https://dstack.ai/).
+
+ Write a job description in YAML as below:
+
+ ```yaml
+ # dstack.yaml
+ type: task
+
+ image: winglian/axolotl-cloud:main-20240429-py3.11-cu121-2.2.1
+
+ env:
+   - HUGGING_FACE_HUB_TOKEN
+   - WANDB_API_KEY
+
+ commands:
+   - accelerate launch -m axolotl.cli.train config.yaml
+
+ ports:
+   - 6006
+
+ resources:
+   gpu:
+     memory: 24GB..
+     count: 2
+ ```
+
+ Then, simply run the job with the `dstack run` command. Append the `--spot` option if you want a spot instance. The `dstack run` command will show you the instance with the cheapest price across multiple cloud services:
+
+ ```bash
+ pip install dstack
+ HUGGING_FACE_HUB_TOKEN=xxx WANDB_API_KEY=xxx dstack run . -f dstack.yaml # --spot
+ ```
+
+ For further and fine-grained use cases, please refer to the official [dstack documentation](https://dstack.ai/docs/) and the detailed description of the [axolotl example](https://github.com/dstackai/dstack/tree/master/examples/fine-tuning/axolotl) in the official repository.
+
  ### Dataset

  Axolotl supports a variety of dataset formats. It is recommended to use a JSONL. The schema of the JSONL depends upon the task and the prompt template you wish to use. Instead of a JSONL, you can also use a HuggingFace dataset with columns for each JSONL field.
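As a usage note on the added section: `dstack run` expects a running dstack server with at least one cloud backend configured. A minimal sketch of that one-time setup, assuming cloud credentials are already available locally and the documented `~/.dstack/server/config.yml` backend config location, might look like this:

```bash
# Sketch of one-time dstack setup (assumption: default local cloud credentials;
# see the dstack docs for backend-specific configuration details).
pip install "dstack[all]"   # CLI plus the open-source server
dstack server               # start the server; backends are read from ~/.dstack/server/config.yml
dstack init                 # run once in the directory containing dstack.yaml
```

With the server running, the `dstack run . -f dstack.yaml` command from the README submits the task and streams its logs; ports listed under `ports:` (6006 here) are typically forwarded to localhost while the run is attached.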