	---
	inference: false
	datasets:
	- bigcode/commitpackft
	model-index:
	- name: patched-coder-34b
	  results:
	  - task:
	      type: text-generation
	    dataset:
	      type: openai_humaneval
	      name: HumanEval
	    metrics:
	    - name: pass@1
	      type: pass@1
	      value: 53.567
	      verified: false
	  - task:
	      type: text-generation
	    dataset:
	      type: bigcode/humanevalpack
	      name: HumanEvalFix Python
	    metrics:
	    - name: pass@1
	      type: pass@1
	      value: 41.341
	      verified: false
	  - task:
	      type: text-generation
	    dataset:
	      type: patched-codes/static-analysis-eval
	      name: Static Analysis Eval
	    metrics:
	    - name: pass@1
	      type: pass@1
	      value: 51.316
	      verified: false
	---
	# Model Card for patched-coder-34b


	This is an instruction fine-tuned model focused on the task of patching code. Patching may include fixing bugs, remediating security vulnerabilities,
	performing API migrations, and other kinds of code maintenance.

	## Model Details

	### Model Description

	- Developed by: [codelion](https://huggingface.co/codelion)
	- Model type: Code Llama
	- Finetuned from model: [CodeLlama-34b-Python](https://huggingface.co/codellama/CodeLlama-34b-Python-hf)


	## How to Get Started with the Model

	Make sure to install Transformers from the main git branch:

	```bash
	pip install git+https://github.com/huggingface/transformers.git
	```

	## How to Prompt the Model

	This model accepts the Alpaca instruction format.

	For example:

	```
	### Instruction:
	{instruction}

	### Input:
	{input}

	### Response:
	...
	```
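
As a rough illustration of the template above, the following sketch fills in the two placeholders and, optionally, sends the prompt to the model. The helper names `build_prompt` and `generate_patch` are hypothetical, the repository id `patched-codes/patched-coder-34b` is assumed from this model's Hub page, and `device_map="auto"` assumes the `accelerate` package is installed. Treat it as a starting point, not a tested recipe.

```python
def build_prompt(instruction: str, input_text: str) -> str:
    """Fill the Alpaca-style template; the model's answer follows '### Response:'."""
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Input:\n"
        f"{input_text}\n\n"
        "### Response:\n"
    )


def generate_patch(
    prompt: str,
    model_id: str = "patched-codes/patched-coder-34b",  # assumed repository id
    max_new_tokens: int = 256,
) -> str:
    """Hypothetical helper: generate a completion for an already-built prompt."""
    # Imported lazily so the prompt builder works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" relies on the accelerate package (an assumption here).
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(build_prompt("Fix the off-by-one error.", "for i in range(len(xs) + 1): print(xs[i])"))
```

Only `build_prompt` runs without downloading weights; calling `generate_patch` loads the full 34B model and needs correspondingly large GPU memory.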

	## Bias, Risks, and Limitations

	This model has undergone very limited testing. Additional safety testing should be performed before any real-world deployments.

	## Training Details

	- GPU: A100 80 GB
	- Time: ~8 hrs

	### Training Data

	The model was fine-tuned on [commitpackft](https://huggingface.co/datasets/bigcode/commitpackft), an open dataset consisting of commits.
	We started with the commits for the `python` language from the dataset and then filtered for the commits that were related to fixing bugs.

	### Training Procedure

	The model was instruction fine-tuned to follow natural-language instructions related to code. We load the base model quantized to 4 bits
	and then use QLoRA for Parameter-Efficient Fine-Tuning (PEFT) with Flash Attention. The model was trained for 2 epochs.

	#### Training Hyperparameters

	Training regime:

	The following `bitsandbytes` quantization config was used during training:
	- quant_method: bitsandbytes
	- load_in_8bit: False
	- load_in_4bit: True
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: nf4
	- bnb_4bit_use_double_quant: True
	- bnb_4bit_compute_dtype: bfloat16
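
The settings listed above map onto the keyword arguments of `BitsAndBytesConfig` in Transformers. The sketch below shows one way they might be reconstructed. The names `QUANT_KWARGS` and `make_quantization_config` are hypothetical, and `quant_method` is left out because the config class sets it itself; it is not a constructor argument. This is an approximation of the setup, not the original training script.

```python
# Values copied from the list above; quant_method is implied by the config class.
QUANT_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    # Given as a string; BitsAndBytesConfig resolves it to the torch dtype.
    "bnb_4bit_compute_dtype": "bfloat16",
}


def make_quantization_config():
    """Hypothetical helper: build the 4-bit config from the values above."""
    # Imported lazily so the settings can be inspected without transformers.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**QUANT_KWARGS)
```

The resulting object would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained` when loading the base model, `codellama/CodeLlama-34b-Python-hf`.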

	## Evaluation

	We evaluate the model on the `HumanEval` and `HumanEvalPack` benchmarks using the
	[Code Generation LM Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness).

	We also evaluate the model for vulnerability remediation using the `Static Analysis Eval` benchmark available [here](https://huggingface.co/datasets/patched-codes/static-analysis-eval).
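
For readers unfamiliar with the metric, the sketch below computes pass@k with the standard unbiased estimator commonly used for HumanEval-style benchmarks. It assumes that `n` samples are generated per problem and `c` of them pass the tests. This card does not state how many samples were drawn, and the function name `pass_at_k` is illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes.

    n: samples generated for a problem, c: samples that passed, k: budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failing samples, every draw of k contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # For k = 1 the estimator reduces to the fraction of passing samples, c / n.
    print(pass_at_k(n=10, c=3, k=1))
```

A benchmark-level score is the mean of this value over all problems.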

	### Results