---
inference: false
datasets:
- bigcode/commitpackft
model-index:
- name: patched-coder-34b
results:
- task:
type: text-generation
dataset:
type: openai_humaneval
name: HumanEval
metrics:
- name: pass@1
type: pass@1
value: 53.567
verified: false
- task:
type: text-generation
dataset:
type: bigcode/humanevalpack
name: HumanEvalFix Python
metrics:
- name: pass@1
type: pass@1
value: 41.341
verified: false
- task:
type: text-generation
dataset:
type: patched-codes/static-analysis-eval
name: Static Analysis Eval
metrics:
- name: pass@1
type: pass@1
value: 51.316
verified: false
---
# Model Card for patched-coder-34b
This is an instruction fine-tuned model focused on the task of patching code. Patching may include fixing bugs, remediating security vulnerabilities,
performing API migrations, and other kinds of code maintenance.
## Model Details
### Model Description
- **Developed by:** [codelion](https://huggingface.co/codelion)
- **Model type:** Code Llama
- **Finetuned from model:** [CodeLlama-34b-Python](https://huggingface.co/codellama/CodeLlama-34b-Python-hf)
## How to Get Started with the Model
Make sure to install Transformers from the main git branch:
```bash
pip install git+https://github.com/huggingface/transformers.git
```
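Once installed, you can load the model and tokenizer as usual. A minimal sketch follows; the repository id used here is an assumption, so replace it with the actual one if it differs:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id for this model card; adjust if needed.
model_id = "codelion/patched-coder-34b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 34B model needs ~68 GB in bf16; quantize for smaller GPUs
    device_map="auto",
)
```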
## How to Prompt the Model
This model accepts the Alpaca instruction format.
For example:
```
### Instruction:
{instruction}
### Input:
{input}
### Response:
...
```
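As an illustration, here is a hypothetical bug-fixing prompt built in this format and run with the model loaded above (the instruction and input are made up for demonstration):
```python
# Fill the Alpaca template with a code-patching task.
prompt = """### Instruction:
Fix the bug in the following Python function.

### Input:
def add(a, b):
    return a - b

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```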
## Bias, Risks, and Limitations
This model has undergone very limited testing. Additional safety testing should be performed before any real-world deployments.
## Training Details
- **GPU:** A100 80 GB
- **Time:** ~8 hrs
### Training Data
The model was fine-tuned on [commitpackft](https://huggingface.co/datasets/bigcode/commitpackft), an open dataset of code commits.
We started with the commits for the `python` language from the dataset and then filtered for the commits related to fixing bugs.
### Training Procedure
Instruction fine-tuning to follow instructions in natural language related to code. We load the base model quantized to 4 bits
and then use QLoRA for Parameter-Efficient Fine-Tuning (PEFT) with Flash Attention, as sketched below. The model was trained for 2 epochs.
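A minimal sketch of the QLoRA setup with `peft`; the LoRA hyperparameters below are illustrative assumptions, as the card does not specify them:
```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Illustrative LoRA hyperparameters (r, alpha, dropout, target modules are assumptions).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Prepare the 4-bit quantized base model for training, then attach the adapters.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)
```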
#### Training Hyperparameters
**Training regime:**
The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
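This corresponds to the following `BitsAndBytesConfig` in `transformers`, reconstructed from the values above:
```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and bfloat16 compute,
# matching the training-time config listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```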
## Evaluation
We evaluate the model on `HumanEval` and `HumanEvalPack` benchmarks using
[Code Generation LM Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness).
We also evaluate the model for vulnerability remediation using the `Static Analysis Eval` benchmark available [here](https://huggingface.co/datasets/patched-codes/static-analysis-eval).
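A typical harness invocation looks like the following; the exact flags may vary across harness versions, and the repository id is an assumption:
```bash
accelerate launch main.py \
  --model codelion/patched-coder-34b \
  --tasks humaneval \
  --n_samples 1 \
  --allow_code_execution
```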
### Results
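The pass@1 scores reported in the model card metadata above:

| Benchmark | pass@1 |
|---|---|
| HumanEval | 53.567 |
| HumanEvalFix Python | 41.341 |
| Static Analysis Eval | 51.316 |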