Graphcore and Hugging Face are working together to make training of Transformer models on IPUs fast and easy. Learn more about how to take advantage of the power of Graphcore IPUs to train Transformers models at hf.co/hardware/graphcore.
GPT2 Medium model IPU config
This model contains just the IPUConfig files for running the gpt2-medium model on Graphcore IPUs.
This model contains no model weights, only an IPUConfig.
Model description
GPT2 is a large transformer-based language model built from transformer decoder blocks; BERT, by contrast, uses transformer encoder blocks. GPT2 applies layer normalisation to the input of each sub-block, similar to a pre-activation residual network, and adds an additional layer normalisation after the final self-attention block.
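To make the placement of layer normalisation at the input of each sub-block concrete, here is a minimal pure-Python sketch comparing that "pre-LN" ordering with the original Transformer's "post-LN" ordering. It is an illustration only, not the gpt2-medium implementation: `toy_sublayer` is a hypothetical stand-in for self-attention or the feed-forward network, and the layer norm has no learned scale or shift.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalise a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def toy_sublayer(x):
    """Hypothetical stand-in for self-attention or the feed-forward network."""
    return [0.5 * v for v in x]

def pre_ln_block(x, sublayer=toy_sublayer):
    """GPT2-style ordering: normalise the sub-block input, then add the residual."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]

def post_ln_block(x, sublayer=toy_sublayer):
    """Original Transformer ordering: add the residual, then normalise the sum."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])

x = [1.0, 2.0, 3.0, 4.0]
pre = pre_ln_block(x)    # residual path carries x through unnormalised
post = post_ln_block(x)  # output is re-normalised after every sub-block
```

In the pre-LN form the residual stream is never normalised inside the block, which is why a final layer normalisation is applied after the last block.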
Paper link: Language Models are Unsupervised Multitask Learners
Usage
from optimum.graphcore import IPUConfig
ipu_config = IPUConfig.from_pretrained("Graphcore/gpt2-medium-ipu")