Commit
·
731fe29
1
Parent(s):
63abd40
Update README.md
README.md
CHANGED
|
@@ -3,4 +3,18 @@ license: apache-2.0
|
|
| 3 |
---
|
| 4 |
TestCodeo - GoCodeo's fine-tuned Language Model dedicated to Python unit test generation.
|
| 5 |
|
| 6 |
-
www.gocodeo.com
|
+
www.gocodeo.com
|
| 7 |
+
|
| 8 |
+
Approach
|
| 9 |
+
Our team curated a dataset of 200,000 prompt-completion pairs in Alpaca format, designed specifically for Python unit test generation.
|
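For readers unfamiliar with the Alpaca format, here is a minimal sketch of what one prompt-completion pair could look like and how it is rendered into a training prompt. The template is the standard Alpaca "instruction + input" template; the example record and the `build_prompt` helper are hypothetical illustrations, not taken from GoCodeo's dataset or code.

```python
# Standard Alpaca prompt template for records that carry an "input" field.
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Render one Alpaca-style record into the text the model is prompted with."""
    return ALPACA_WITH_INPUT.format(
        instruction=record["instruction"], input=record["input"]
    )


# Hypothetical record (assumption): the real dataset's wording is not published here.
example = {
    "instruction": "Write pytest unit tests for the following function.",
    "input": "def add(a, b):\n    return a + b",
    "output": "def test_add():\n    assert add(2, 3) == 5",
}

if __name__ == "__main__":
    # The "output" field is the completion the model is trained to produce.
    print(build_prompt(example) + example["output"])
```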
| 10 |
+
|
| 11 |
+
Two-Stage Finetuning Process
|
| 12 |
+
Stage 1: We fine-tuned the base Code Llama 7B Python model on 25k easy and 75k medium instructions.
|
| 13 |
+
|
| 14 |
+
Stage 2: The resulting Test-Codeo-Base was further fine-tuned on the remaining medium-hard instructions to produce TestCodeo.
|
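The two-stage split can be sketched as a simple quota-based partition. This is an illustration only: it assumes each record carries a `difficulty` label (a hypothetical field name), and the `split_into_stages` helper is not GoCodeo's actual pipeline. Stage 1 takes records until the per-difficulty quotas are filled; everything else falls through to stage 2.

```python
from collections import Counter


def split_into_stages(records, stage1_quota):
    """Partition records into (stage1, stage2) using per-difficulty quotas.

    `records` is an iterable of dicts with a "difficulty" key (assumed field).
    `stage1_quota` maps a difficulty label to how many records stage 1 takes.
    """
    taken = Counter()
    stage1, stage2 = [], []
    for record in records:
        level = record["difficulty"]
        if taken[level] < stage1_quota.get(level, 0):
            stage1.append(record)
            taken[level] += 1
        else:
            # Quota full, or a difficulty level that stage 1 does not use.
            stage2.append(record)
    return stage1, stage2


# Quotas matching the figures stated above: 25k easy and 75k medium in stage 1.
STAGE1_QUOTA = {"easy": 25_000, "medium": 75_000}
```

With 200,000 pairs in total and 100,000 consumed by these quotas, the second list would hold the other 100,000 records.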
| 15 |
+
|
| 16 |
+
Evaluation Methodology
|
| 17 |
+
Using OpenAI's HumanEval dataset, we generated test cases for its 164 coding problems and measured code coverage.
|
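The README does not say which coverage tool or metric was used. As a rough illustration of the idea, the sketch below measures line coverage of a single function with the standard library only: it records which lines run while the test inputs are executed and divides by the number of executable lines. The `line_coverage` helper and the `classify` function are hypothetical; a real evaluation would more likely rely on a dedicated tool such as coverage.py.

```python
import dis
import sys


def line_coverage(func, test_inputs):
    """Return the fraction of func's executable lines hit by the given inputs.

    `test_inputs` is a list of argument tuples, one per call.
    """
    code = func.__code__
    executable = {ln for _, ln in dis.findlinestarts(code) if ln is not None}
    # The "def" line triggers a call event, not a line event, so leave it out.
    executable.discard(code.co_firstlineno)
    if not executable:
        return 1.0
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace frames of other functions
        if event == "line":
            hit.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for args in test_inputs:
            func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous tracer
    return len(hit & executable) / len(executable)


# Hypothetical function under test, standing in for a HumanEval solution.
def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"
```

Calling `line_coverage(classify, [(5,)])` exercises only the positive branch, while adding `(-1,)` and `(0,)` reaches every line, which is the kind of difference a coverage score captures.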
| 18 |
+
|
| 19 |
+
Results
|
| 20 |
+
TestCodeo achieved 89% code coverage, surpassing Code Llama's 17% and approaching GPT-3.5-turbo's 93%.
|