OSError: teknium/Replit-v1-CodeInstruct-3B does not appear to have a tokenizer file

#2
by Sulav - opened

Receiving this error when trying to run this locally:
OSError: teknium/Replit-v1-CodeInstruct-3B does not appear to have a file named replit/replit-code-v1-3b--replit_lm_tokenizer.py. Checkout 'https://huggingface.co/teknium/Replit-v1-CodeInstruct-3B/main' for available files.

I can see that the main branch has a file called replit_lm_tokenizer.py; I'm not sure why replit/replit-code-v1-3b-- is being prepended to it. If I replace model_name with replit/replit-code-v1-3b, it loads properly.
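For context, if I understand transformers' remote-code resolution correctly, the `repo--module.Class` form is its convention for an `auto_map` entry that points at custom code in a *different* repo: the part before the double dash is a repo id, the part after is the module and class to load. A minimal sketch of how the reference from the error message splits apart (the split itself is plain string handling; the resolution behavior is my reading of transformers, not something stated in this thread):

```python
# The auto_map entry copied from the error message above.
ref = "replit/replit-code-v1-3b--replit_lm_tokenizer.ReplitLMTokenizer"

# "--" separates the repo the custom code lives in
# from the "module.Class" path inside that repo.
repo_id, module_class = ref.split("--")
module, class_name = module_class.rsplit(".", 1)

print(repo_id)      # replit/replit-code-v1-3b
print(module)       # replit_lm_tokenizer
print(class_name)   # ReplitLMTokenizer
```

So the prefix isn't random: it is a leftover pointer telling transformers to fetch the tokenizer code from the original replit/replit-code-v1-3b repo instead of this one.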

This is the code snippet I am running for this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# model_name = "replit/replit-code-v1-3b"  # loads properly
model_name = "teknium/Replit-v1-CodeInstruct-3B"  # encounters the error above

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)
model.to('cuda')
```

Yes, thank you, we are fixing it right now ^_^ It has some references to the original Replit model in its config.json.

It's fixed now. You can just update the config.json to fix it.

teknium changed discussion status to closed

The issue still persists with the same error. I looked at the repo, and changing the string in tokenizer_config.json resolves the issue.

Current:
```
"AutoTokenizer": [
      "replit/replit-code-v1-3b--replit_lm_tokenizer.ReplitLMTokenizer",
      null
    ]
  }
```

Fixed:
```
"AutoTokenizer": [
      "replit_lm_tokenizer.ReplitLMTokenizer",
      null
    ]
  }
```
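If editing by hand is inconvenient, a small script can apply the same fix to a local copy of the file. This is only an illustrative sketch (`strip_repo_prefix` is a made-up helper name, and it assumes `auto_map` entries have the string-or-list shape shown above); back up the file before running anything like it:

```python
import json

def strip_repo_prefix(config_path):
    """Strip a 'namespace/repo--' prefix from auto_map entries in a
    tokenizer_config.json so remote code resolves in the current repo.
    Illustrative sketch only; back up the file before editing."""
    with open(config_path) as f:
        cfg = json.load(f)
    auto_map = cfg.get("auto_map", {})
    for key, value in auto_map.items():
        # Entries may be a plain string or a list like [tokenizer_class, null].
        if isinstance(value, str):
            auto_map[key] = value.split("--")[-1]
        elif isinstance(value, list):
            auto_map[key] = [v.split("--")[-1] if isinstance(v, str) else v
                             for v in value]
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
```

Running it on a tokenizer_config.json containing the "Current" entry above should leave the "Fixed" entry in place.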
Sulav changed discussion status to open

There were two sections of the config file to update; did you remove the model name portion?

I see. I thought I had the latest pulled. Thanks!

Sulav changed discussion status to closed
