HyperionHF committed
Commit c15f1a0
1 Parent(s): 5dfcc50

Add colab link

Files changed (1):
README.md +2 -0
README.md CHANGED
@@ -25,6 +25,8 @@ This model is a fine-tune of [codegen-2B-mono](https://huggingface.co/Salesforce
 
 diff-codegen-2b-v2 is an experimental research artifact and should be treated as such. We are releasing these results and this model in the hopes that it may be useful to the greater research community, especially those interested in LMs for code.
 
+An example Colab notebook with a brief example of prompting the model is [here](https://colab.research.google.com/drive/1ySm6HYvALerDiGmk6g3pDz68V7fAtrQH#scrollTo=thvzNpmahNNx).
+
 ## Training Data
 
 This model is a fine-tune of [codegen-2B-mono](https://huggingface.co/Salesforce/codegen-2B-mono) by Salesforce. This language model was first pre-trained on The Pile, an 800Gb dataset composed of varied web corpora. The datasheet and paper for the Pile can be found [here](https://arxiv.org/abs/2201.07311) and [here](https://arxiv.org/abs/2101.00027) respectively. The model was then fine-tuned on a large corpus of code data in multiple languages, before finally being fine-tuned on a Python code dataset. The Codegen paper with full details of these datasets can be found [here](https://arxiv.org/abs/2203.13474).
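For reference, below is a minimal sketch of how one might prompt diff-codegen-2b-v2 with the Hugging Face transformers library. The repository id (`CarperAI/diff-codegen-2b-v2`) and the `<NME>`/`<BEF>`/`<MSG>`/`<DFF>` prompt layout are assumptions about this model family rather than details taken from this commit; the linked Colab notebook remains the authoritative example.

```python
# Minimal sketch (assumptions: repo id and diff-style prompt sections; see the
# linked Colab for the intended usage). Loads the model and samples a diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CarperAI/diff-codegen-2b-v2"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A diff-style prompt: file name, current file contents, and a commit message,
# after which the model is expected to continue with a unified diff.
prompt = (
    "<NME> fib.py\n"
    "<BEF> def fib(n):\n"
    "    return fib(n - 1) + fib(n - 2)\n"
    "<MSG> Add missing base case\n"
    "<DFF>"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0]))
```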