---
datasets:
- theblackcat102/evol-codealpaca-v1
model-index:
- name: abacaj/starcoderbase-1b-sft
  results:
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 39
      verified: false
  - task:
      type: text-generation
    dataset:
      type: mbpp
      name: MBPP
    metrics:
    - name: pass@1
      type: pass@1
      value: 31.74
      verified: false
language:
- en
---

Dataset credits go to: [theblackcat102](https://huggingface.co/theblackcat102)

How to run inference:
```python
import transformers
import torch


def fmt_prompt(prompt: str) -> str:
    # Wrap the instruction in the prompt template the model expects.
    return f"""[Instructions]:\n{prompt}\n\n[Response]:"""


if __name__ == "__main__":
    model_name = "abacaj/starcoderbase-1b-sft"
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

    # Load the model onto the first GPU and switch to inference mode.
    model = (
        transformers.AutoModelForCausalLM.from_pretrained(
            model_name,
        )
        .to("cuda:0")
        .eval()
    )

    prompt = "Write a python function to sort the following array in ascending order, don't use any built in sorting methods: [9,2,8,1,5]"
    prompt_input = fmt_prompt(prompt)
    inputs = tokenizer(prompt_input, return_tensors="pt").to(model.device)
    # Number of prompt tokens, used below to slice the prompt off the output.
    input_ids_cutoff = inputs.input_ids.size(dim=1)

    with torch.no_grad():
        generated_ids = model.generate(
            **inputs,
            use_cache=True,
            max_new_tokens=512,
            temperature=0.2,
            top_p=0.95,
            do_sample=True,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Decode only the newly generated tokens.
    completion = tokenizer.decode(
        generated_ids[0][input_ids_cutoff:],
        skip_special_tokens=True,
    )

    print(completion)
```
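
The prompt template is plain string formatting, so it can be checked without downloading the model. The sketch below repeats `fmt_prompt` from the snippet above. `fmt_prompts` is a hypothetical helper added here for illustration only and is not part of the original code.

```python
from typing import List


def fmt_prompt(prompt: str) -> str:
    # Same template as in the inference snippet above.
    return f"""[Instructions]:\n{prompt}\n\n[Response]:"""


def fmt_prompts(prompts: List[str]) -> List[str]:
    # Hypothetical helper (an assumption, not from the model card):
    # apply the template to several instructions at once.
    return [fmt_prompt(p) for p in prompts]


if __name__ == "__main__":
    # repr() makes the newlines in the template visible.
    print(repr(fmt_prompt("Reverse a string in Python.")))
    # prints '[Instructions]:\nReverse a string in Python.\n\n[Response]:'
```

The template ends with `[Response]:`, so the model is expected to continue from that marker.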

Evals:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ceeb27e7f6014c0e9d9268/U7L1aOV7UxBEBcLGqOZ2s.png)
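
This card does not say how the pass@1 numbers were computed. For reference, a commonly used definition is the unbiased pass@k estimator, which reduces to the fraction of passing samples when k = 1. The following is a minimal sketch of that estimator under that assumption, not the evaluation code used for this model. The function names and the sample counts in the example are illustrative.

```python
import math
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # 1 - C(n - c, k) / C(n, k); math.comb returns 0 when n - c < k.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def mean_pass_at_k(results: List[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each item is (n_samples, n_passed)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Illustrative counts only, not results for this model.
    example = [(10, 3), (10, 0), (10, 10)]
    print(mean_pass_at_k(example, k=1))
```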

Training charts:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ceeb27e7f6014c0e9d9268/PLkFqE7_34-hJmFW7_opG.png)

Link to charts:
https://api.wandb.ai/links/abacaj1/c4nkcs9r

Code to train model:
https://github.com/abacaj/train-with-fsdp