---
license: apache-2.0
---
TestCodeo is GoCodeo's fine-tuned language model for Python unit test generation.

www.gocodeo.com

Approach
Our team curated a dataset of 200,000 prompt-completion pairs in Alpaca format, designed specifically for Python unit test generation.
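
For readers unfamiliar with the Alpaca format, here is a minimal sketch of what one prompt-completion pair might look like. The template text is the standard Alpaca prompt for records with an input field; the record itself (`instruction`, `input`, and `output` values) and the `build_prompt` helper are hypothetical illustrations, not samples from the TestCodeo dataset.

```python
# Standard Alpaca prompt template for records that carry an "input" field.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

# Hypothetical record: the real dataset contents are not shown in this card.
record = {
    "instruction": "Write pytest unit tests for the following Python function.",
    "input": "def add(a, b):\n    return a + b",
    "output": "def test_add():\n    assert add(2, 3) == 5\n",
}


def build_prompt(rec):
    """Render the prompt half of a pair; rec['output'] is the completion."""
    return ALPACA_TEMPLATE.format(
        instruction=rec["instruction"], input=rec["input"]
    )


prompt = build_prompt(record)
print(prompt + record["output"])
```

During fine-tuning, the model would typically see the rendered prompt followed by the `output` field as the target completion.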

Two-Stage Finetuning Process
Stage 1: We fine-tuned the base Code Llama 7B Python model on 25k easy and 75k medium instructions.

Stage 2: The resulting Test-Codeo-Base was then fine-tuned on the remaining medium-hard instructions to produce TestCodeo.
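
The two-stage split can be sketched as a simple curriculum partition. This is only an illustration under stated assumptions: the `difficulty` field, its label values, and the `split_curriculum` helper are hypothetical, and the card does not describe how difficulty was actually assigned or how the data was stored.

```python
def split_curriculum(records, n_easy, n_medium):
    """Partition records into a stage-1 set and a stage-2 remainder.

    Stage 1 takes the first n_easy "easy" and n_medium "medium" records;
    stage 2 receives everything that stage 1 did not use.
    The "difficulty" key is an assumed field, not a documented one.
    """
    easy = [r for r in records if r["difficulty"] == "easy"]
    medium = [r for r in records if r["difficulty"] == "medium"]

    stage1 = easy[:n_easy] + medium[:n_medium]
    used = {id(r) for r in stage1}
    stage2 = [r for r in records if id(r) not in used]
    return stage1, stage2


# Toy data scaled down by 1000x from the card's figures (25k easy, 75k medium).
toy = (
    [{"difficulty": "easy", "idx": i} for i in range(25)]
    + [{"difficulty": "medium", "idx": i} for i in range(100)]
    + [{"difficulty": "hard", "idx": i} for i in range(75)]
)
stage1, stage2 = split_curriculum(toy, n_easy=25, n_medium=75)
print(len(stage1), len(stage2))
```

Training on the easier subset first and the harder remainder second is what makes this a curriculum-style schedule.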

Evaluation Methodology
Using OpenAI's HumanEval dataset, we generated test cases for its 164 programming problems and measured the code coverage of the generated tests.
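
To make the metric concrete, here is a minimal sketch of line coverage for a single function, using only the standard library. The card does not say which coverage tool or coverage type (line or branch) was used, so this is an assumption for illustration; `classify`, `partial_tests`, `full_tests`, and `line_coverage` are hypothetical names, and a real evaluation would more likely rely on a dedicated tool such as coverage.py.

```python
import dis
import sys


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def executable_lines(func):
    """Line numbers in func's body that carry bytecode (def line excluded)."""
    code = func.__code__
    lines = {ln for _, ln in dis.findlinestarts(code) if ln is not None}
    lines.discard(code.co_firstlineno)
    return lines


def line_coverage(func, tests):
    """Fraction of func's body lines executed while running tests()."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only trace frames that belong to the function under test.
        if frame.f_code is not code:
            return None
        if event == "line":
            executed.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        tests()
    finally:
        sys.settrace(previous)  # always restore the earlier tracer

    lines = executable_lines(func)
    return len(executed & lines) / len(lines)


def partial_tests():
    assert classify(5) == "positive"


def full_tests():
    assert classify(-1) == "negative"
    assert classify(0) == "zero"
    assert classify(5) == "positive"


print(f"partial: {line_coverage(classify, partial_tests):.0%}")
print(f"full:    {line_coverage(classify, full_tests):.0%}")
```

The single-case test exercises only three of the five body lines, while the three-case test reaches all of them, which is the kind of gap a coverage percentage is meant to expose.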

Results
TestCodeo achieved 89% code coverage, well above Code Llama's 17% and close to GPT-3.5-turbo's 93%.