---
datasets:
- codeparrot/self-instruct-starcoder
pipeline_tag: text2text-generation
metrics:
- code_eval
library_name: transformers
tags:
- code
model-index:
- name: StarCoder-SelfInstruct
  results:
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: InstructHumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.391
      verified: false
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.346
      verified: false
---

# Model Card for Self-instruct-starcoder

<!-- Provide a quick summary of what the model is/does. -->

This model is an instruction-tuned version of ⭐️ StarCoder. It was fine-tuned on [Self-instruct-starcoder](https://huggingface.co/datasets/codeparrot/self-instruct-starcoder), an instruction dataset built by bootstrapping on StarCoder's own generations.

## Uses

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

The model was fine-tuned with the following template:

```
Question: <instruction>

Answer: <output>
```

If you have your model and tokenizer loaded, you can use the following code to make the model generate an answer to a given instruction:

```python
# `model` and `tokenizer` are assumed to be loaded already,
# e.g. with transformers' AutoTokenizer and AutoModelForCausalLM.
instruction = "Write a function to compute the GCD between two integers a and b"

# Format the instruction with the template used during fine-tuning
prompt = f"Question:{instruction}\n\nAnswer:"

input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
completion = model.generate(input_ids, max_length=200)

# Decode only the tokens generated after the prompt
print(tokenizer.batch_decode(completion[:, input_ids.shape[1]:])[0])
```

## More information

For additional information, check:
- [self-instruct-starcoder](https://huggingface.co/codeparrot/self-instruct-starcoder)
- [starcoder](https://huggingface.co/bigcode/starcoder)
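
The model-index above reports pass@1 on HumanEval and InstructHumanEval, computed with the `code_eval` metric. For reference, the standard unbiased pass@k estimator (n generated samples per problem, c of which pass the unit tests) can be sketched as follows; this is a generic illustration, not the exact evaluation script used for this card:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples with c correct."""
    if n - c < k:
        # Fewer than k incorrect samples: any draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k = 1 the estimator reduces to the fraction of correct samples, c / n.
print(pass_at_k(10, 3, 1))
```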