
Graphcore and Hugging Face are working together to make training of Transformer models on IPUs fast and easy. Learn more about how to take advantage of the power of Graphcore IPUs to train Transformers models at hf.co/hardware/graphcore.

GPT2 Medium model IPU config

This repository contains only the IPUConfig files needed to run the gpt2-medium model on Graphcore IPUs. It contains no model weights.

Model description

GPT2 is a large transformer-based language model. It is built from transformer decoder blocks, whereas BERT uses transformer encoder blocks. GPT2 applies layer normalisation to the input of each sub-block, similar to a pre-activation residual network, and adds an additional layer normalisation after the final self-attention block.
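
The layer normalisation placement described above can be illustrated with a minimal pure-Python sketch. It is not the GPT2 or Optimum Graphcore implementation: the helper names (layer_norm, pre_ln_sub_block, toy_decoder, scale_half) are hypothetical, the normalisation has no learned scale or shift, and a simple scaling function stands in for the attention and feed-forward sub-layers.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalise a vector to zero mean and (approximately) unit variance.

    Simplification: no learned scale or shift parameters.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_ln_sub_block(x, sublayer):
    """Normalise the input first, apply the sub-layer, then add the residual."""
    y = sublayer(layer_norm(x))
    return [xi + yi for xi, yi in zip(x, y)]


def toy_decoder(x, sublayers):
    """Chain pre-norm sub-blocks, then apply one additional layer norm at the end."""
    for sublayer in sublayers:
        x = pre_ln_sub_block(x, sublayer)
    return layer_norm(x)


def scale_half(h):
    """Hypothetical stand-in for an attention or feed-forward sub-layer."""
    return [0.5 * v for v in h]


hidden = [1.0, 2.0, 3.0, 4.0]
output = toy_decoder(hidden, [scale_half, scale_half])
print(output)
```

Because the residual path adds the un-normalised input back in, the hidden state is only normalised on its way into each sub-layer, which is why a final normalisation is applied to the output.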

Paper link: Language Models are Unsupervised Multitask Learners

Usage

from optimum.graphcore import IPUConfig
ipu_config = IPUConfig.from_pretrained("Graphcore/gpt2-medium-ipu")