TestCodeo - GoCodeo's fine-tuned Language Model dedicated to Python unit test generation.

www.gocodeo.com

Approach

Our team curated a dataset of 200,000 prompt-completion pairs in Alpaca format, designed specifically for Python unit test generation.
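For readers unfamiliar with the Alpaca format, the sketch below shows what one prompt-completion pair might look like. The three field names (`instruction`, `input`, `output`) and the prompt wording follow the public Stanford Alpaca template; the record contents and the helper names `build_prompt` and `build_training_text` are hypothetical, and whether TestCodeo used this exact wording is an assumption.

```python
# Prompt wording from the public Stanford Alpaca template (with-input variant).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

# Hypothetical record; the actual TestCodeo dataset is not shown in this card.
example = {
    "instruction": "Write pytest unit tests for the following Python function.",
    "input": "def add(a, b):\n    return a + b",
    "output": "def test_add():\n    assert add(2, 3) == 5",
}


def build_prompt(record):
    """Render the prompt half of an Alpaca-style record."""
    return ALPACA_TEMPLATE.format(
        instruction=record["instruction"], input=record["input"]
    )


def build_training_text(record):
    """Prompt followed by the completion, as fed to supervised fine-tuning."""
    return build_prompt(record) + record["output"]
```

Calling `print(build_training_text(example))` shows the full text a single training example would contribute under these assumptions.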

Two-Stage Finetuning Process

Stage 1: We fine-tuned the base Code Llama 7B Python model on 25k easy and 75k medium instructions.

Stage 2: We then fine-tuned the resulting Test-Codeo-Base on the remaining medium-hard instructions to produce TestCodeo.
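One way to picture the two-stage split is as a partition of the dataset by difficulty. The sketch below is illustrative only: it assumes each record carries a hypothetical `difficulty` label, and the function name `split_into_stages` and its quota parameters are made up for this example rather than taken from the GoCodeo pipeline.

```python
def split_into_stages(records, easy_quota=25_000, medium_quota=75_000):
    """Partition records into (stage1, stage2) using a hypothetical 'difficulty' label.

    Stage 1 takes up to `easy_quota` easy and `medium_quota` medium records;
    everything else is left for stage 2.
    """
    remaining = {"easy": easy_quota, "medium": medium_quota}
    stage1, stage2 = [], []
    for record in records:
        level = record.get("difficulty")
        if remaining.get(level, 0) > 0:
            remaining[level] -= 1
            stage1.append(record)
        else:
            stage2.append(record)  # records beyond the quotas, plus harder ones
    return stage1, stage2
```

With the default quotas, a 200,000-record dataset that has enough easy and medium examples would yield 100,000 records for each stage.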

Evaluation Methodology

Using OpenAI's HumanEval dataset, we generated test cases for each of its 164 programming problems and measured the code coverage those tests achieve.
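The card does not say which coverage tool or metric was used, so the following is only a minimal sketch of line coverage: the share of a solution's executable lines that run when the generated tests are executed. It relies on the standard library (`sys.settrace` and `dis.findlinestarts`) to stay self-contained; a dedicated tool such as coverage.py would be the more usual choice in practice. The names `executable_lines`, `line_coverage`, `SOLUTION`, `FULL_TESTS`, and `PARTIAL_TESTS` are hypothetical.

```python
import dis
import sys


def executable_lines(code):
    """Collect line numbers that carry bytecode, including nested functions."""
    lines = {lineno for _, lineno in dis.findlinestarts(code) if lineno}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object, e.g. a function body
            lines |= executable_lines(const)
    return lines


def line_coverage(solution_src, test_src, filename="<solution>"):
    """Fraction of the solution's executable lines run by the generated tests."""
    code = compile(solution_src, filename, "exec")
    wanted = executable_lines(code)
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code.co_filename != filename:
            return None  # do not trace lines outside the solution
        if event == "line":
            hit.add(frame.f_lineno)
        return tracer

    namespace = {}
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, namespace)      # define the solution under the tracer
        exec(test_src, namespace)  # run the generated tests against it
    finally:
        sys.settrace(previous)     # restore whatever tracer was active before
    return len(hit & wanted) / len(wanted)


# Hypothetical solution and two candidate test suites.
SOLUTION = """def absolute(x):
    if x < 0:
        return -x
    return x
"""
FULL_TESTS = "assert absolute(-2) == 2\nassert absolute(3) == 3\n"
PARTIAL_TESTS = "assert absolute(3) == 3\n"
```

Here `line_coverage(SOLUTION, FULL_TESTS)` exercises both branches, while `line_coverage(SOLUTION, PARTIAL_TESTS)` never reaches the `return -x` line and therefore scores lower. Averaging such a score over all benchmark problems is one plausible way to arrive at a single coverage percentage.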

Results

TestCodeo achieved 89% code coverage, compared with 17% for Code Llama and 93% for GPT-3.5-turbo.
