PEFT · flan · opt
crumb committed · Commit 466d268 (1 parent: 8fbb2c7)

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -13,9 +13,9 @@ tags:
 
 OPT was first introduced in [Open Pre-trained Transformer Language Models](https://arxiv.org/abs/2205.01068) and first released in [metaseq's repository](https://github.com/facebookresearch/metaseq) on May 3rd 2022 by Meta AI.
 
-This model is [facebook/opt-2.7b](https://hf.co/facebook/opt-2.7b) finetuned with prefix tuning (https://arxiv.org/abs/2101.00190) on the FLAN datasets (https://arxiv.org/pdf/2210.11416.pdf).
+This model is [facebook/opt-6.7b](https://hf.co/facebook/opt-6.7b) finetuned with prefix tuning (https://arxiv.org/abs/2101.00190) on the FLAN datasets (https://arxiv.org/pdf/2210.11416.pdf).
 
-A 24 token prefix was finetuned over 3.7m new tokens of a FLAN task mixture, with the start of each example cut off if it was too large to fit within a 512 token context.
+A 24 token prefix was finetuned over 1.5m new tokens of a FLAN task mixture, with the start of each example cut off if it was too large to fit within a 256 token context.
 
 The model reaches a train ppl of 6.09 and an eval ppl of 5.91.
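
As background for the change above: prefix tuning freezes the base model and trains only a small set of "virtual token" embeddings that are prepended to the model's attention inputs. A minimal sketch of how a 24-token prefix like the one described in the card could be configured with the PEFT library follows; the base checkpoint name and `num_virtual_tokens=24` come from the README, while the rest (and the omitted training loop and data pipeline) is an illustrative assumption, not the author's actual script.

```python
# Minimal prefix-tuning setup sketch with the PEFT library.
# The base checkpoint and num_virtual_tokens=24 come from the card;
# everything else here is an illustrative assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

base_model_id = "facebook/opt-6.7b"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

# 24 trainable virtual tokens, matching the "24 token prefix" in the README.
peft_config = PrefixTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=24,
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the prefix parameters are trainable
```

Because only the prefix embeddings train, the checkpoint stored in this repo is a small adapter rather than a full copy of the 6.7B-parameter model.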
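A corresponding inference-side sketch is below: PEFT loads the frozen base model first, then attaches the trained prefix. The repo id used here is a hypothetical placeholder, since the commit page does not name this model's Hugging Face repo.

```python
# Inference sketch: load the base model, then attach the trained prefix.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

peft_model_id = "crumb/FLAN-OPT-6.7b-prefix"  # assumed placeholder repo id
config = PeftConfig.from_pretrained(peft_model_id)

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()

prompt = "Q: What does OPT stand for? A:"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```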