Text Generation
Transformers
PyTorch
code
gpt2
custom_code
Eval Results
text-generation-inference
codelion committed on
Commit b943618
1 Parent(s): 7fc5167

Update README.md

Files changed (1)
  1. README.md +15 -27
README.md CHANGED
@@ -79,22 +79,26 @@ This is an LLM for code that is focused on generating bug fixes using infilling.
  - **Model type:** GPT-2
  - **Finetuned from model:** [bigcode/santacoder](https://huggingface.co/bigcode/santacoder)

- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```python
+ # pip install -q transformers
+ import torch  # used below for device selection
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ checkpoint = "lambdasec/santafixer"
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+
+ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+ model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True).to(device)
+
+ input_text = "<fim-prefix>def print_hello_world():\n    <fim-suffix>\n    print('Hello world!')<fim-middle>"
+ inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
+ outputs = model.generate(inputs)
+ print(tokenizer.decode(outputs[0]))
+ ```

  ## Training Details

@@ -118,20 +122,4 @@ Supervised Fine Tuning (SFT)

  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Data Card if possible. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
- [More Information Needed]
+ The model was evaluated on the [GitHub top 1000 projects vulnerabilities dataset](https://huggingface.co/datasets/lambdasec/gh-top-1000-projects-vulns).
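Assuming the evaluation set loads through the standard `datasets` Hub API (the split and column layout below are unverified assumptions, not from the model card), it can be pulled down and inspected like this:

```python
# pip install -q datasets
from datasets import load_dataset

# Assumption: standard Hub loading; the dataset card defines the actual splits/columns.
ds = load_dataset("lambdasec/gh-top-1000-projects-vulns")
print(ds)                              # show available splits and columns
first_split = next(iter(ds.values()))  # take whichever split comes first
print(first_split[0])                  # peek at one record
```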
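A note on the fill-in-the-middle (FIM) prompt in the getting-started snippet above: the model generates the code that belongs between `<fim-prefix>` and `<fim-suffix>`, which is what makes it usable for bug fixing, since the suspected-buggy region can be left as the hole. The sketch below is not from the model card; it reuses `tokenizer`, `model`, and `device` from the snippet above, the `build_fix_prompt` helper is hypothetical, and `max_new_tokens` is an illustrative choice.

```python
# Hypothetical helper: wrap the code around a suspected bug as a FIM prompt,
# using the <fim-prefix>/<fim-suffix>/<fim-middle> tokens from the example above.
def build_fix_prompt(prefix: str, suffix: str) -> str:
    return f"<fim-prefix>{prefix}<fim-suffix>{suffix}<fim-middle>"

# Ask the model to regenerate a function body, keeping the signature and the
# surrounding code fixed.
prompt = build_fix_prompt(
    prefix="def is_even(n):\n    return ",
    suffix="\n",
)
inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=64)  # max_new_tokens is an assumption
print(tokenizer.decode(outputs[0]))
```

By default `generate` decodes greedily, which is a reasonable starting point for short infills; sampling parameters can be tuned from there.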