metadata

license: apache-2.0
datasets:
  - lambada
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - text-generation-inference
  - causal-lm
  - int8
  - ONNX
  - PostTrainingStatic
  - Intel® Neural Compressor
  - neural-compressor

Model Details: INT8 GPT-J 6B

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

This INT8 ONNX model was generated with neural-compressor. The FP32 ONNX model can be exported with the following command:

python -m transformers.onnx --model=EleutherAI/gpt-j-6B onnx_gptj/ --framework pt --opset 13 --feature=causal-lm-with-past
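For orientation, a graph exported with `--feature=causal-lm-with-past` takes the key/value cache as explicit inputs alongside `input_ids` and `attention_mask`. The sketch below lists the input names and shapes one would expect for a first decoding step with an empty cache. It assumes the `past_key_values.{i}.key` / `.value` naming used by the `transformers.onnx` exporter and GPT-J 6B's published configuration (28 layers, 16 attention heads, hidden size 4096), none of which is stated in this card. The helper name `expected_inputs` is hypothetical, and the actual names should be checked against the exported file.

```python
# Assumed GPT-J 6B dimensions (from the published model config, not from this card).
NUM_LAYERS = 28
NUM_HEADS = 16
HIDDEN_SIZE = 4096
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS  # 256


def expected_inputs(batch_size, seq_len, past_len=0):
    """Map each assumed ONNX input name to the shape it should have."""
    shapes = {
        "input_ids": (batch_size, seq_len),
        # The mask covers the cached positions plus the new tokens.
        "attention_mask": (batch_size, past_len + seq_len),
    }
    for layer in range(NUM_LAYERS):
        cache_shape = (batch_size, NUM_HEADS, past_len, HEAD_DIM)
        shapes[f"past_key_values.{layer}.key"] = cache_shape
        shapes[f"past_key_values.{layer}.value"] = cache_shape
    return shapes


shapes = expected_inputs(batch_size=1, seq_len=8)
print(len(shapes))
print(shapes["past_key_values.0.key"])
```

Under these assumptions, each later decoding step would pass one new token (`seq_len=1`) with `past_len` set to the number of tokens already processed.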

Model Detail	Description
Model Authors - Company	Intel
Date	April 10, 2022
Version	1
Type	Text Generation
Paper or Other Resources	-
License	Apache 2.0
Questions or Comments	Community Tab

Intended Use	Description
Primary intended uses	You can use the raw model for text generation inference
Primary intended users	Anyone doing text generation inference
Out-of-scope uses	In most cases, this model will need to be fine-tuned for your particular task. The model should not be used to intentionally create hostile or alienating environments for people.

How to use

Download the model and script by cloning the repository:

git clone https://huggingface.co/Intel/gpt-j-6B-int8-static

Then run inference with the model using the 'evaluation.ipynb' notebook included in the repository.
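The notebook's code is not reproduced here. As a rough illustration of how a LAMBADA-style accuracy is commonly scored, the sketch below counts an example as correct when the model's most likely next token, given the passage without its final token, matches that final token. The function `last_token_accuracy` and the `predict_next` callable are hypothetical stand-ins, not names from the repository; in practice `predict_next` would wrap a call to the INT8 ONNX model.

```python
from typing import Callable, Sequence


def last_token_accuracy(
    examples: Sequence[Sequence[int]],
    predict_next: Callable[[Sequence[int]], int],
) -> float:
    """Fraction of examples whose final token is predicted from its prefix."""
    if not examples:
        raise ValueError("need at least one example")
    correct = 0
    for token_ids in examples:
        # Hold out the last token as the target; the rest is the context.
        context, target = token_ids[:-1], token_ids[-1]
        if predict_next(context) == target:
            correct += 1
    return correct / len(examples)


# Toy stand-in for the model: always predicts "previous token + 1".
def toy_predict_next(context: Sequence[int]) -> int:
    return context[-1] + 1


toy_examples = [[1, 2, 3], [5, 6, 9], [10, 11, 12], [7, 8, 9]]
accuracy = last_token_accuracy(toy_examples, toy_predict_next)
print(accuracy)
```

The toy predictor only exercises the scoring logic; it says nothing about the real model's behavior.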

Metrics (Model Performance):

Model	Model Size (GB)	LAMBADA Accuracy
FP32	23	0.7954
INT8	6	0.7944
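
To put the table in relative terms, the short calculation below derives the size reduction and the accuracy change from the four reported numbers. Only the table values are used; the variable names are illustrative.

```python
# Values copied from the metrics table above.
fp32_size_gb, int8_size_gb = 23, 6
fp32_acc, int8_acc = 0.7954, 0.7944

size_ratio = fp32_size_gb / int8_size_gb   # how many times smaller INT8 is
abs_drop = fp32_acc - int8_acc             # absolute accuracy difference
rel_drop_pct = 100 * abs_drop / fp32_acc   # drop relative to FP32, in percent

print(f"size reduction: {size_ratio:.2f}x")
print(f"accuracy drop: {abs_drop:.4f} absolute, {rel_drop_pct:.2f}% relative")
```

With these figures, the INT8 model is roughly 3.8 times smaller than the FP32 model while losing about 0.001 in accuracy.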