---
datasets:
- ML4SE2023-G1-WizardCoder/EvolInstruct-SCoT-1k
language:
- en
tags:
- code
---

# ML4SE23_G1_WizardCoder-SCoT-1B-V1.0

IN4334 ML4SE

Group1 WizardCoder

This model is the result of fine-tuning the WizardCoder-1B-V1.0 model on instructions enhanced with Structured Chain-of-Thought (S-CoT).
S-CoT is used to enhance a sample of about 1200 entries from the Evol-Instruct 80k dataset.
The resulting dataset is then used for fine-tuning.
The original WizardCoder model and the new S-CoT fine-tuned one are compared on the pass@1 metric over both versions of HumanEval and MBPP (with and without S-CoT enhancement).
Enhancing the evaluation datasets with S-CoT makes it possible to study its effect as a pure prompting technique, independently of the S-CoT fine-tuning of the model.
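
For readers unfamiliar with pass@1, the sketch below shows the unbiased pass@k estimator introduced with HumanEval, averaged over problems. It is an illustration rather than the evaluation script used here: this card does not state how many samples were generated per problem, and the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over all problems, as a percentage.

    results holds one (n, c) pair per problem.
    """
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Assumption: one sample per problem, 39 of 164 problems solved.
    single_sample = [(1, 1)] * 39 + [(1, 0)] * 125
    print(round(benchmark_pass_at_k(single_sample, k=1), 2))
```

With a single sample per problem, pass@1 reduces to the fraction of problems solved. For example, 39 solved out of HumanEval's 164 problems gives 23.78%.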

## Fine-tuning Details

| Hyperparameter | [WizardCoder-1B-V1.0](https://huggingface.co/WizardLM/WizardCoder-1B-V1.0) |
|----------------|---------------------|
| Batch size     | 16                  |
| Learning rate  | 2e-5                |
| Epochs         | 3                   |
| Max length     | 2048                |
| Warmup steps   | 30                  |
| LR scheduler   | cosine              |
| Dataset        | [ML4SE23_G1_EvolInstruct-SCoT-1k](https://huggingface.co/datasets/ML4SE2023-G1-WizardCoder/ML4SE23_G1_EvolInstruct-SCoT-1k) |
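
To make the learning-rate settings in the table above concrete, here is a minimal sketch of linear warmup followed by cosine decay. The peak rate (2e-5) and warmup length (30 steps) come from the table. The total step count is an assumption, derived from roughly 1200 examples, a global batch size of 16 and 3 epochs. The exact scheduler implementation used for training is not stated, and `lr_at_step` is a hypothetical helper.

```python
import math


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = 2e-5, warmup_steps: int = 30) -> float:
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to peak_lr, then half-cosine decay to 0.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Assumption: ~1200 examples, global batch size 16, 3 epochs.
    steps_per_epoch = math.ceil(1200 / 16)
    total_steps = steps_per_epoch * 3
    for step in (0, 15, 30, total_steps // 2, total_steps):
        print(step, f"{lr_at_step(step, total_steps):.2e}")
```

Under these assumptions the warmup covers about 13% of training (30 of 225 steps), so most updates happen on the decaying part of the schedule.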

The hardware consisted of a GPU instance rented from [DataCrunch](https://datacrunch.io/) with the following specifications:

| NVIDIA RTX A6000 48GB 1A6000.10V |
|----------------------------------|
| 2 GPUs                           |
| 48GB VRAM per GPU                |
| 60 GB RAM                        |
| 10 CPUs                          |
| 100GB SSD Storage                |
| Ubuntu 20.04                     |
| CUDA 11.6                        |

## Results

Pass@1 (%) on HumanEval and MBPP, and on their S-CoT-enhanced counterparts HumanEval-SCoT and MBPP-SCoT, for WizardCoder-1B, WizardCoder-SCoT-1B and WizardCoder-15B.

| **Dataset**    | **WizardCoder-1B-V1.0** | **WizardCoder-SCoT-1B-V1.0** | **WizardCoder-15B-V1.0** |
|----------------|-------------------------|------------------------------|--------------------------|
| HumanEval      | 23.78                   | **17.68**                    | 57.3                     |
| HumanEval-SCoT | **44.51**               | **27.44**                    | **57.3**                 |
| MBPP           | 23.4                    | **19.4**                     | 51.8                     |
| MBPP-SCoT      | **40**                  | **28**                       | **45.6**                 |
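
One way to read the table is to separate the effect of S-CoT prompting (same model, plain vs. S-CoT benchmark) from the effect of S-CoT fine-tuning (same benchmark, base vs. fine-tuned 1B model). The sketch below only does that arithmetic on the numbers reported above. The helper names `prompting_gain` and `finetuning_delta` are hypothetical.

```python
# pass@1 (%) values copied from the results table.
SCORES = {
    "WizardCoder-1B-V1.0": {
        "HumanEval": 23.78, "HumanEval-SCoT": 44.51,
        "MBPP": 23.4, "MBPP-SCoT": 40.0,
    },
    "WizardCoder-SCoT-1B-V1.0": {
        "HumanEval": 17.68, "HumanEval-SCoT": 27.44,
        "MBPP": 19.4, "MBPP-SCoT": 28.0,
    },
    "WizardCoder-15B-V1.0": {
        "HumanEval": 57.3, "HumanEval-SCoT": 57.3,
        "MBPP": 51.8, "MBPP-SCoT": 45.6,
    },
}


def prompting_gain(model: str, benchmark: str) -> float:
    """Change in pass@1 (points) from S-CoT prompting, for one model."""
    row = SCORES[model]
    return round(row[f"{benchmark}-SCoT"] - row[benchmark], 2)


def finetuning_delta(dataset: str) -> float:
    """Change in pass@1 (points) from S-CoT fine-tuning of the 1B model."""
    tuned = SCORES["WizardCoder-SCoT-1B-V1.0"][dataset]
    base = SCORES["WizardCoder-1B-V1.0"][dataset]
    return round(tuned - base, 2)


if __name__ == "__main__":
    for model in SCORES:
        for benchmark in ("HumanEval", "MBPP"):
            print(model, benchmark, prompting_gain(model, benchmark))
    for dataset in ("HumanEval", "HumanEval-SCoT", "MBPP", "MBPP-SCoT"):
        print("fine-tuning", dataset, finetuning_delta(dataset))
```

On these numbers, S-CoT prompting raises pass@1 for both 1B models, while the S-CoT fine-tuned model scores lower than the base 1B model on all four datasets.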