Add clarifications/disclaimer
README.md CHANGED
@@ -16,6 +16,16 @@ inference: false
 
 # MPT-30B
 
+This is MPT-30B with added support for finetuning with PEFT (tested with QLoRA). It is not finetuned further; the weights are the same as the original MPT-30B.
+
+I have not traced through the whole Hugging Face stack to confirm this is working correctly, but it does finetune with QLoRA and the outputs are reasonable.
+Inspired by the implementations at https://huggingface.co/cekal/mpt-7b-peft-compatible/commits/main
+and https://huggingface.co/mosaicml/mpt-7b/discussions/42.
+
+The original description from the MosaicML team follows below:
+
+
+
 MPT-30B is a decoder-style transformer pretrained from scratch on 1T tokens of English text and code.
 This model was trained by [MosaicML](https://www.mosaicml.com).
 

@@ -242,4 +252,4 @@ for open-source foundation models},
     note = {Accessed: 2023-06-22},
     urldate = {2023-06-22}
 }
-```
+```
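For context, below is a minimal sketch of the kind of PEFT/QLoRA setup the disclaimer refers to. It is not part of the commit itself; the repo id, LoRA hyperparameters, and `target_modules` names are illustrative assumptions rather than settings taken from this model card.

```python
# Minimal QLoRA-style finetuning setup sketch (assumptions noted inline).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mosaicml/mpt-30b"  # assumption: substitute the PEFT-compatible repo id

# 4-bit quantization config used for QLoRA-style training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,  # MPT ships custom modeling code
    device_map="auto",
)

# Make the quantized model trainable and attach LoRA adapters.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["Wqkv", "out_proj"],  # assumption: MPT attention projection names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From here the wrapped model can be passed to a standard `transformers` Trainer or a custom training loop; whether gradients flow correctly through the custom MPT modeling code is exactly the part the author notes has not been fully verified.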