How to try it out? I provide WIP

by billy-ai - opened Nov 30, 2022

Nov 30, 2022

Hi all, I installed Python 3.10.8 along with the latest versions of torch and transformers, then tried the following code:

from transformers import GPTJModel, GPTJConfig
import torch
configuration = GPTJConfig()

# Initializing a model from the configuration
model = GPTJModel(configuration)

# (First I downloaded the model)
path_loader = torch.load("GPT-JT-6B-v1/pytorch_model.bin")
model.load_state_dict(path_loader)
model.eval()

But I can't actually use the model. I tried calling generate, but got: TypeError: The current model class (GPTJModel) is not compatible with .generate(), as it doesn't have a language model head. Please use one of the following classes instead: {'GPTJForCausalLM'}

Any ideas on how to use the model after loading it? :)

nudelbrot

Nov 30, 2022

Did you try using GPTJForCausalLM and supplying the .bin? See https://huggingface.co/transformers/v4.11.3/model_doc/gptj.html

I haven't found the time to try it out myself yet.

nudelbrot

Dec 4, 2022

You can just call from_pretrained('your-local-model-dir') with the Hugging Face transformers library.
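For anyone who wants to see the local-directory route end to end, here is a minimal sketch. It assumes a tiny, randomly initialised GPT-J config (the sizes below are made up for illustration and are not GPT-JT's real hyperparameters) so that it runs without downloading the 6B checkpoint. The directory is a temporary one standing in for your own download folder.

```python
import tempfile

from transformers import AutoModelForCausalLM, GPTJConfig, GPTJForCausalLM

# Illustrative tiny config (assumed values, NOT the real GPT-JT sizes).
config = GPTJConfig(
    vocab_size=128, n_positions=64, n_embd=32, n_layer=2, n_head=4,
    rotary_dim=8, bos_token_id=0, eos_token_id=1,
)
tiny = GPTJForCausalLM(config)

with tempfile.TemporaryDirectory() as local_dir:
    # save_pretrained writes config.json next to the weights; from_pretrained
    # needs both, which is why a bare .bin file is not enough on its own.
    tiny.save_pretrained(local_dir)
    reloaded = AutoModelForCausalLM.from_pretrained(local_dir).eval()

# The Auto class picks the causal-LM class from the saved config.
reloaded_class = type(reloaded).__name__
print(reloaded_class)
```

With the real model, the directory would be the folder holding the downloaded config.json, tokenizer files, and weights.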

juewang

Together org Jan 29, 2023

Hi, you can simply do

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1").eval().half().to("cuda:0")

Or, if you prefer to download and load the weights manually, use GPTJForCausalLM instead of GPTJModel.
As the error message says, GPTJModel does not support generate() because it has no LM head, only the embeddings and the transformer layers.
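A minimal sketch of the manual route, for illustration only. To keep it runnable without the 6B checkpoint, it assumes a tiny, randomly initialised config (the sizes are invented, not GPT-JT's real ones) and feeds token IDs directly instead of using the tokenizer. The commented-out lines show where the downloaded state dict from the original post would go, assuming its keys match GPTJForCausalLM.

```python
import torch
from transformers import GPTJConfig, GPTJForCausalLM

# Illustrative tiny config (assumed values, NOT the real GPT-JT sizes).
config = GPTJConfig(
    vocab_size=128, n_positions=64, n_embd=32, n_layer=2, n_head=4,
    rotary_dim=8, bos_token_id=0, eos_token_id=1,
)

# GPTJForCausalLM = transformer body + LM head, so generate() is available.
model = GPTJForCausalLM(config).eval()

# With the real checkpoint you would build the model from the checkpoint's own
# config and then load the weights, e.g. (path taken from the original post):
# state_dict = torch.load("GPT-JT-6B-v1/pytorch_model.bin", map_location="cpu")
# model.load_state_dict(state_dict)

input_ids = torch.tensor([[2, 3, 4, 5]])  # stand-in for tokenizer output
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        attention_mask=torch.ones_like(input_ids),
        max_new_tokens=5,
        do_sample=False,  # greedy decoding
        pad_token_id=0,
    )

print(output_ids.shape)
```

Since the weights here are random, the generated IDs are meaningless. The point is only that generate() works once the class includes the LM head.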
