rooa committed on
Commit 4b7656b
1 Parent(s): d693bb4

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -102,6 +102,12 @@ Please refer to the [blog](https://blog.salesforceairesearch.com/codegen25) for
  As an autoregressive language model, CodeGen2.5 is capable of extracting features from given natural language and programming language texts, and calculating the likelihood of them.
  However, the model is intended for and best at **program synthesis**, that is, generating executable code given English prompts, where the prompts should be in the form of a comment string. The model can complete partially-generated code as well.
 
+ ## Attribution & Other Requirements
+ The pretraining dataset of the model was filtered for permissive licenses only.
+ Nevertheless, the model can generate source code verbatim from the dataset.
+ The code's license might require attribution and/or impose other specific requirements that must be respected.
+ The data provider, BigCode, offers a [search index](https://huggingface.co/spaces/bigcode/starcoder-search) that lets you search the pretraining data to identify where generated code came from and apply the proper attribution to your code.
+
  ## BibTeX entry and citation info
 
  Please cite CodeGen2 paper: