Salesforce/codet5p-2b

Text2Text Generation

yuewang-sf committed on May 17, 2023

Commit 2e03b4a (1 parent: b93949f)

Update README.md

Files changed (1)

README.md +2 -3

@@ -45,10 +45,9 @@ Supported languages (9 in total) are as follows:
 ## Training procedure
-This checkpoint is initialized from off-the-shelf LLMs, i.e. its encoder is initialized from CodeGen-350M-mono and its decoder is initialized from CodeGen-16B-mono.
 It is trained on the unimodal code data at the first-stage pretraining, which includes a diverse set of pretraining tasks including _span denoising_ and two variants of _causal language modeling_.
-After that, it is further trained on the Python subset with the causal language modeling objective for another epoch to better adapt for Python code generation.
-Finally, we apply instruction tuning to align it with natural language instructions following [Code Alpaca](https://github.com/sahil280114/codealpaca).
 Please refer to the paper for more details.
 ## Evaluation results

 ## Training procedure
+This checkpoint is initialized from off-the-shelf LLMs, i.e. its encoder is initialized from [CodeGen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono) and its decoder is initialized from [CodeGen-2B-mono](https://huggingface.co/Salesforce/codegen-2B-mono).
 It is trained on the unimodal code data at the first-stage pretraining, which includes a diverse set of pretraining tasks including _span denoising_ and two variants of _causal language modeling_.
+After that, it is further trained on the Python subset with the causal language modeling objective for another epochs to better adapt for Python code generation.
 Please refer to the paper for more details.
 ## Evaluation results
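
The training-procedure text in this diff names _span denoising_ as one of the pretraining tasks. As a rough illustration only, the sketch below shows how a T5-style span-denoising example could be built from a token list. The helper name `span_corrupt`, the `<extra_id_N>` sentinel format, and the hand-picked spans are assumptions made for this example; they are not taken from the model card or the CodeT5+ training code, which may sample spans and format targets differently.

```python
from typing import List, Tuple


def span_corrupt(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Build a (source, target) pair by masking half-open (start, end) spans.

    Hypothetical helper: each masked span is replaced in the source by one
    sentinel token, and the target lists every sentinel followed by the
    tokens it hides. Spans are assumed to be non-overlapping.
    """
    source: List[str] = []
    target: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"  # assumed T5-style sentinel naming
        source.extend(tokens[cursor:start])  # keep the unmasked prefix
        source.append(sentinel)              # one sentinel per masked span
        target.append(sentinel)
        target.extend(tokens[start:end])     # the decoder must recover these
        cursor = end
    source.extend(tokens[cursor:])           # keep whatever follows the last span
    return source, target


tokens = "def add ( a , b ) : return a + b".split()
source, target = span_corrupt(tokens, [(1, 2), (9, 12)])
print(" ".join(source))
print(" ".join(target))
```

In this setup the encoder would see the corrupted `source` and the decoder would be trained to emit `target`. A causal language modeling objective, by contrast, would simply predict each token from the ones before it, with no masking of the input.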