OSError: teknium/Replit-v1-CodeInstruct-3B does not appear to have a tokenizer file
Receiving this error when trying to run this locally:

```
OSError: teknium/Replit-v1-CodeInstruct-3B does not appear to have a file named replit/replit-code-v1-3b--replit_lm_tokenizer.py. Checkout 'https://huggingface.co/teknium/Replit-v1-CodeInstruct-3B/main' for available files.
```

I can see that the main branch has a file called `replit_lm_tokenizer.py`; I'm not sure why `replit/replit-code-v1-3b--` is being prepended to it. If I change `model_name` to `replit/replit-code-v1-3b`, then it loads properly.
This is the code snippet I am running for this:

```
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# model_name = "replit/replit-code-v1-3b"  # loads properly
model_name = "teknium/Replit-v1-CodeInstruct-3B"  # encounters the error above

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)
model.to('cuda')
```
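For context on where the repo-prefixed path comes from: the `auto_map` entries in a model's config files can take the form `other/repo--module.Class`, which tells transformers to fetch the custom code from a different repository than the model's own. A rough illustration of how such an entry decomposes (the function name and parsing here are my own sketch, not transformers' actual internals):

```python
def parse_auto_map_entry(entry):
    """Split an auto_map entry into (code repo, module, class name).

    Entries like "replit/replit-code-v1-3b--replit_lm_tokenizer.ReplitLMTokenizer"
    point at code hosted in another repository; plain "module.Class" entries
    resolve against the model's own repo (repo is None).
    """
    if "--" in entry:
        repo, module_class = entry.split("--", 1)
    else:
        repo, module_class = None, entry
    module, cls = module_class.rsplit(".", 1)
    return repo, module, cls

print(parse_auto_map_entry(
    "replit/replit-code-v1-3b--replit_lm_tokenizer.ReplitLMTokenizer"
))
```

So the loader here is looking for `replit_lm_tokenizer.py` under the `replit/replit-code-v1-3b` repo reference baked into this repo's config, rather than in the repo itself.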
Yes, thank you, we are fixing it right now ^_^ It has some references to the original Replit model in the config.json.
It's fixed now. You can also just update the config.json yourself to fix it.
The issue still persists with the same error. I looked at the repo and changing the string in the tokenizer_config.json resolves the issue.
Current:
```
"AutoTokenizer": [
    "replit/replit-code-v1-3b--replit_lm_tokenizer.ReplitLMTokenizer",
    null
  ]
}
```
Fixed:
```
"AutoTokenizer": [
    "replit_lm_tokenizer.ReplitLMTokenizer",
    null
  ]
}
```
There were two sections of the config file to update; did you remove the model name portion?
Technically, there were 3 lines to change; see the commit here:
https://huggingface.co/teknium/Replit-v1-CodeInstruct-3B/commit/ad718ea176f4089ff593187fe31a9632ad2a5daf
I see. I thought I had the latest pulled. Thanks!