|
--- |
|
license: apache-2.0 |
|
--- |
|
TestCodeo - GoCodeo's fine-tuned Language Model dedicated to Python unit test generation. |
|
|
|
www.gocodeo.com |
|
|
|
Approach |
|
Our team curated a unique dataset of 200,000 prompt-completion pairs in alpaca format, specifically designed for Python unit test generation. |
|
|
|
Two-Stage Finetuning Process |
|
Stage 1: We fine-tuned the base Codellama 7B Python model with 25k easy and 75k medium instructions. |
|
|
|
Stage 2: The resulting Test-Codeo-Base was further refined with the remaining medium-hard questions to develop TestCodeo. |
|
|
|
Evaluation Methodology |
|
Utilizing OpenAI's human eval dataset, we generated test cases for 164 coding instructions and measured code coverage. |
|
|
|
Results |
|
TestCodeo achieved an impressive 89% code coverage, surpassing Codellama's 17% and approaching GPT-3.5-turbo's 93%. |