HumanEval scores?

#1
by rombodawg - opened

If you make a coding model, then you have to tell us how good it is at coding. Saying it writes "error free" code means nothing. Please upload HumanEval and HumanEval+ scores.

Also, this is clearly a CodeLlama fine-tune; I would add that to the model card so people know.
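For context, HumanEval pass@1 just measures whether generated completions pass each task's unit tests. A minimal single-task sketch, assuming the model loads through standard `transformers` (the repo id and generation settings below are illustrative, not an official evaluation setup):

```python
# Minimal single-problem HumanEval-style check (illustrative, not the official harness).
# Assumes the model is a standard causal LM on the Hub; the repo id below is a guess.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "codemate-ai/CodeMate-v0.1"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

# One HumanEval-style task: a signature + docstring to complete, plus a hidden test.
prompt = (
    "def add(a: int, b: int) -> int:\n"
    '    """Return the sum of a and b."""\n'
)
test = "assert add(2, 3) == 5"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
completion = tokenizer.decode(output[0], skip_special_tokens=True)

# pass@1 for this single task: does the generated function pass the test?
# (Real harnesses also truncate at stop sequences and sandbox the exec.)
namespace = {}
try:
    exec(completion + "\n" + test, namespace)
    print("pass")
except Exception as exc:
    print("fail:", exc)
```

HumanEval+ uses the same problems but adds many more test cases per task, so it catches completions that only pass the original, sparser tests.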

Greetings @rombodawg ,
Thanks for bringing these valid points to our attention.

We'd like to mention that this is merely a test model we have uploaded (v0.1). We're working on our next model, for which we will release all the metrics.

Regards

kstyagi23 changed discussion status to closed

Thanks for the info. One more question: is there a date or release window for the next model? I'm quite interested in this model and I'd like to know how soon it is going to be released. Reopening for this question.

rombodawg changed discussion status to open

Thanks for showing interest in CodeMate-v0.1
For the next version, you can expect it to be out by the second week of February.
However, there may be delays if the model doesn't pass internal testing.

Anything else I can help you out with?

Actually, since you asked: what do you plan to be different between this 0.1 release and the one that will have benchmarks? How is it an upgrade? And last question: do you have any plans to make models at other CodeLlama sizes? 7B, 13B, and the newly released 70B?

With code models, we have noticed that their ability to engage in chat is noticeably weaker compared to general-purpose models like Llama, Mistral, and others.

We're focusing on making the code model as capable of engaging in contextual conversations as those models, without degrading its code generation capabilities.

As for the other sizes, we do have those in our pipeline.

kstyagi23 changed discussion status to closed
