Add clarifications/disclaimer
README.md CHANGED
@@ -16,6 +16,16 @@ inference: false
 
 # MPT-30B
 
+This is MPT-30B with added support for finetuning with PEFT (tested with QLoRA). It is not finetuned further; the weights are the same as the original MPT-30B.
+
+I have not traced through the whole Hugging Face stack to confirm this is working correctly, but it does finetune with QLoRA and the outputs are reasonable.
+Inspired by the implementations at https://huggingface.co/cekal/mpt-7b-peft-compatible/commits/main
+and https://huggingface.co/mosaicml/mpt-7b/discussions/42.
+
+The original description from the MosaicML team follows below:
+
+
+
 MPT-30B is a decoder-style transformer pretrained from scratch on 1T tokens of English text and code.
 This model was trained by [MosaicML](https://www.mosaicml.com).
 

@@ -242,4 +252,4 @@ for open-source foundation models},
     note = {Accessed: 2023-06-22},
     urldate = {2023-06-22}
 }
-```
+```
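For context, below is a minimal sketch of the kind of PEFT/QLoRA setup the disclaimer refers to. It is not part of the commit itself; the repo id, LoRA hyperparameters, and `target_modules` names are illustrative assumptions rather than settings taken from this model card.

```python
# Minimal QLoRA-style finetuning setup sketch (assumptions noted inline).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mosaicml/mpt-30b"  # assumption: substitute the PEFT-compatible repo id

# 4-bit quantization config used for QLoRA-style training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,  # MPT ships custom modeling code
    device_map="auto",
)

# Make the quantized model trainable and attach LoRA adapters.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["Wqkv", "out_proj"],  # assumption: MPT attention projection names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From here the wrapped model can be passed to a standard `transformers` Trainer or a custom training loop; whether gradients flow correctly through the custom MPT modeling code is exactly the part the author notes has not been fully verified.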