yuewang-sf committed on
Commit
2e03b4a
1 Parent(s): b93949f

Update README.md

Files changed (1)
  1. README.md +2 -3
README.md CHANGED
@@ -45,10 +45,9 @@ Supported languages (9 in total) are as follows:
 
  ## Training procedure
 
- This checkpoint is initialized from off-the-shelf LLMs, i.e. its encoder is initialized from CodeGen-350M-mono and its decoder is initialized from CodeGen-16B-mono.
+ This checkpoint is initialized from off-the-shelf LLMs, i.e. its encoder is initialized from [CodeGen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono) and its decoder is initialized from [CodeGen-2B-mono](https://huggingface.co/Salesforce/codegen-2B-mono).
  It is trained on the unimodal code data at the first-stage pretraining, which includes a diverse set of pretraining tasks including _span denoising_ and two variants of _causal language modeling_.
- After that, it is further trained on the Python subset with the causal language modeling objective for another epoch to better adapt for Python code generation.
- Finally, we apply instruction tuning to align it with natural language instructions following [Code Alpaca](https://github.com/sahil280114/codealpaca).
+ After that, it is further trained on the Python subset with the causal language modeling objective for another epochs to better adapt for Python code generation.
  Please refer to the paper for more details.
 
  ## Evaluation results