This is an **interactive** blog post giving an overview of open-source language models for code generation. We present their pretraining datasets, model architectures, and evaluation, along with examples and tips for using the 🤗 Hub for this task. At the end of this post, you will find a **demo** to test and compare code generation across these models ✨.

## Introduction

The application of language models to code generation has sparked great interest recently. You have probably heard of [Codex](https://arxiv.org/pdf/2107.03374v2.pdf), the model behind [GitHub Copilot](https://copilot.github.com/), or [AlphaCode](https://arxiv.org/pdf/2203.07814v1.pdf) for competition-level programming. These models aren't open-source, and they are hard to reproduce with a limited budget and incomplete information about their training. Luckily, the ML community has contributed several code models to enable further research. However, it can be easy to get lost among the many available models, so at Hugging Face we aim to democratize ML and centralize all information in the 🤗 ecosystem to make open-source tools easier and more efficient to use. Code models are no exception: you can find all open-source models on the Hub, along with several code datasets and evaluation metrics. In this post we give an overview of these tools and how to use them.
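As a quick illustration of using the Hub for this task, here is a minimal sketch that loads a code generation checkpoint with the `transformers` `pipeline` API. The model id is just one example checkpoint; any open-source code generation model on the Hub can be swapped in the same way.

```python
from transformers import pipeline

# Load an example code generation checkpoint from the 🤗 Hub
# (swap in any other open-source code model id).
generator = pipeline("text-generation", model="codeparrot/codeparrot-small")

# Complete a Python function signature.
completion = generator("def hello_world():", max_new_tokens=20)
print(completion[0]["generated_text"])
```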