juyongjiang committed 5e38c5a (1 parent: 58adb92): fix bugs
README.md CHANGED
@@ -9,9 +9,10 @@ tags:
 
 # CodeUp: A Multilingual Code Generation Llama2 Model with Parameter-Efficient Instruction-Tuning on a Single RTX 3090
 
-<p align="center" width="70%">
-<img src="
-</p>
+<!-- <p align="center" width="70%">
+<img src="assets/Logo.jpg" alt="HKUST CodeUp" style="width: 50%; min-width: 250px; display: block; margin: auto;">
+</p> -->
+![HKUST CodeUp](assets/Logo.jpg#pic_center =600x600)
 
 ## Description
 In recent years, large language models (LLMs) have shown exceptional capabilities across a wide range of applications, driven by their remarkable emergent abilities. To align them with human preferences, instruction tuning and reinforcement learning from human feedback (RLHF) have been proposed for chat-based LLMs (e.g., ChatGPT, GPT-4). However, these LLMs (except for Codex) primarily target the general domain and are not specifically designed for the code domain. Although Codex offers an alternative, it is a closed-source model developed by OpenAI. Hence, it is imperative to develop open-source instruction-following LLMs for the code domain.
@@ -40,9 +41,15 @@ Hence, we filter the ambiguous and irrelevant data by rigorous design to obtain
 
 This way, we obtain 19K high-quality instruction samples for code generation. The radar charts below show the number of instructions per programming language (PL) before and after filtering.
 
-| Raw Data (20K + 4K) | Filtered Data (19K) |
+<!-- | Raw Data (20K + 4K) | Filtered Data (19K) |
 | -- | -- |
-| <center><img src="
+| <center><img src="assets/PL_Raw.png" width="100%"></center> | <center><img src="assets/PL_Clean.png" width="92%"></center> | -->
+
+**Raw Data (20K + 4K)**
+![Raw Data (20K + 4K)](assets/PL_Raw.png)
+
+**Filtered Data (19K)**
+![Filtered Data (19K)](assets/PL_Clean.png)
 
 
 ## Training & Inference
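The hunk above says ambiguous and irrelevant samples were filtered out "by rigorous design", but the diff does not show the actual criteria. As a hedged illustration only, here is a minimal sketch of what such instruction-data cleaning typically looks like; the filenames, field names, and keyword rules are assumptions for the example, not the repository's real logic:

```python
import json

# Hypothetical markers of "ambiguous" prompts (e.g., references to
# figures or attachments the model cannot see). Illustrative only.
AMBIGUOUS_MARKERS = ("see the figure", "attached", "as shown above")

def is_clean(sample: dict) -> bool:
    """Keep a sample only if it contains none of the ambiguity markers."""
    text = (sample["instruction"] + " " + (sample.get("input") or "")).lower()
    return not any(marker in text for marker in AMBIGUOUS_MARKERS)

# Assumed input: the 20K + 4K alpaca-style list of
# {"instruction", "input", "output"} records.
with open("code_instructions_raw.json") as f:
    raw = json.load(f)

seen, filtered = set(), []
for sample in raw:
    key = sample["instruction"].strip().lower()
    if key not in seen and is_clean(sample):  # dedupe + drop ambiguous
        seen.add(key)
        filtered.append(sample)

with open("code_instructions_filtered.json", "w") as f:
    json.dump(filtered, f, indent=2)
```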
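The title promises parameter-efficient instruction tuning of Llama 2 on a single RTX 3090, but this diff stops at the `## Training & Inference` heading. For orientation, here is a minimal sketch of the standard recipe for fitting a 7B model on one 24 GB card with the Hugging Face `transformers` + `peft` stack; every hyperparameter below is illustrative, not CodeUp's actual configuration:

```python
# pip install transformers peft bitsandbytes accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    load_in_8bit=True,   # 8-bit weights keep a 7B model within 24 GB VRAM
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train small LoRA adapters on the attention projections instead of
# the full model; ranks and target modules here are common defaults.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically <1% of parameters are trainable
```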