Various architectures are used in code generation models, but most follow the auto-regressive, left-to-right setting of GPT. InCoder, however, uses a decoder-only Transformer trained with a Causal Masking objective, which combines next-token prediction with bidirectional context by masking out spans of code and moving them to the end of the sequence; a minimal sketch of this objective is given below the figure. AlphaCode uses an encoder-decoder architecture.

<p align="center">
<img src="https://huggingface.co/datasets/loubnabnl/repo-images/resolve/main/model_size.png" alt="drawing" width="440"/>
</p>
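To make the Causal Masking objective more concrete, here is a minimal sketch of how one such training example could be constructed. The sentinel strings `<mask:0>` and `<eom>` are illustrative placeholders rather than InCoder's actual special tokens, and real training operates on token IDs rather than strings.

```python
import random

def make_causal_masking_example(tokens, mask_token="<mask:0>", eom_token="<eom>"):
    """Build a causal-masking training example: cut out a random span,
    replace it with a sentinel, and append it at the end so a
    left-to-right model learns to infill the gap."""
    # Pick a random contiguous, non-empty span to mask out.
    start = random.randrange(len(tokens))
    end = random.randrange(start + 1, len(tokens) + 1)
    span = tokens[start:end]
    # Left and right context stay in place; the sentinel marks the gap.
    masked = tokens[:start] + [mask_token] + tokens[end:]
    # Training is ordinary next-token prediction over this sequence, so
    # predicting the span conditions on tokens both before and after the gap.
    return masked + [mask_token] + span + [eom_token]

example = make_causal_masking_example("def add ( a , b ) : return a + b".split())
print(" ".join(example))
```

At inference time the same trick turns a left-to-right model into an infilling model: place the sentinel where code should be inserted and let the model generate the missing span after it.

For model-specific information about each architecture, please select a model below: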