---
license: other
datasets:
- jondurbin/airoboros-gpt4-1.4
---

## Overview

This is a test of qlora fine-tuning of the mpt-30b model, __with 5 epochs__.

qlora compatible model: https://huggingface.co/jondurbin/mpt-30b-qlora-compatible

My fork of qlora with mpt-30b support: https://github.com/jondurbin/qlora

Differences in the qlora scripts:

- requires adding `--mpt True` for mpt-based models
- uses `--num_train_epochs` instead of `--max_steps`
- uses airoboros prompt format (mostly 1:1 with vicuna) rather than alpaca, and expects an input file in JSONL format with "instruction" and "response" fields (see the example below)

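For reference, a minimal input file might look like the sketch below. This is illustrative only: the "instruction"/"response" field names come from the fork's expectations above, but the filename and the example content are made up.

```bash
# Illustrative only: the filename and example content are hypothetical;
# only the "instruction"/"response" field names come from the fork's format.
cat > instructions.jsonl <<'EOF'
{"instruction": "Explain what qlora fine-tuning does in one sentence.", "response": "It trains small low-rank adapter weights on top of a quantized base model."}
EOF
```
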
__I think there's a bug in gradient accumulation, so if you try this, you may want to set gradient accumulation steps to 1.__

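Putting the flags together, a training invocation might look roughly like the sketch below. It assumes the fork keeps the upstream qlora.py entry point and standard flag names; `--model_name_or_path`, `--dataset`, `--gradient_accumulation_steps`, and `--output_dir` are assumptions not spelled out in this card, so check the fork's README for the exact arguments.

```bash
# Rough sketch, not a verified command. --mpt True and --num_train_epochs are
# described above; the remaining flags are assumed from upstream qlora and may
# differ in the fork.
python qlora.py \
    --mpt True \
    --model_name_or_path jondurbin/mpt-30b-qlora-compatible \
    --dataset instructions.jsonl \
    --num_train_epochs 5 \
    --gradient_accumulation_steps 1 \
    --output_dir ./mpt-30b-qlora-output
```
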
See the mpt-30b-qlora-compatible model card for training details.

*Unfortunately, this is not as high quality as the llama-33b versions, but I don't have a great answer as to why. Perhaps there are fewer forward layers that can be tuned?*

### License and usage

This is a real gray area; here's why:

- the dataset was generated with gpt-4, via https://github.com/jondurbin/airoboros
- the ToS for openai API usage has a clause preventing the output from being used to train a model that __competes__ with OpenAI
- what does *compete* actually mean here?
- a 30b parameter model isn't anywhere near the quality of gpt-4, or even gpt-3.5, so I can't imagine this could credibly be considered competing in the first place
- if someone else uses the dataset to do the same thing, they wouldn't necessarily be violating the ToS because they didn't call the API, so I don't know how that works
- the training data used in essentially all large language models includes a significant amount of copyrighted or otherwise non-permissively licensed material in the first place
- other work using the self-instruct method, e.g. the original at https://github.com/yizhongw/self-instruct, released the data and model as apache-2

I am purposely not placing a license on here because I am not a lawyer and won't attempt to interpret all of these terms myself. Your best bet is probably to avoid using this commercially, especially since it didn't perform quite as well as expected using qlora.