Tokenizer not available (OSError: Can't load tokenizer)

by jwidmer - opened 9 days ago

9 days ago

Hello
Thanks for your work on the Sinhala Whisper model. I want to use it for a friend, but when loading it with transformers.pipeline(), I'm getting an OSError (Can't load tokenizer).
There is a discussion about this: https://discuss.huggingface.co/t/tokenizer-not-created-when-training-whisper-small-model/61876
Do you think you could add the tokenizer files to the model repo?
Greetings and thanks,
Jonas

RRashmini

Owner 9 days ago

I added the tokenizer.

jwidmer

5 days ago

Thanks a lot!
Tokenizer loading works now.
Now I get: OSError: RRashmini/whisper-small-sinhala-26 does not appear to have a file named preprocessor_config.json.

I'm running the following code:

import torch  # needed for torch.float16 below
from transformers import pipeline

whisper = pipeline(
  "automatic-speech-recognition",
  "RRashmini/whisper-small-sinhala-26",
  torch_dtype=torch.float16,
)
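
Until preprocessor_config.json is added to the repo, a possible workaround is to pass a feature extractor to the pipeline explicitly. This is only a sketch: it assumes the model was fine-tuned from openai/whisper-small and therefore uses the same audio preprocessing settings, which the repo name suggests but which I have not confirmed. The function name build_sinhala_asr is just an illustrative helper.

```python
MODEL_ID = "RRashmini/whisper-small-sinhala-26"
# Assumption: the fine-tune kept the base model's audio preprocessing.
BASE_ID = "openai/whisper-small"


def build_sinhala_asr(model_id=MODEL_ID, base_id=BASE_ID):
    """Build an ASR pipeline, borrowing the feature extractor from the base model."""
    # Imported lazily so the module can be loaded without the heavy dependencies.
    import torch
    from transformers import WhisperFeatureExtractor, pipeline

    # Loads preprocessor_config.json from the base repo instead of the fine-tuned one.
    feature_extractor = WhisperFeatureExtractor.from_pretrained(base_id)

    # The tokenizer is still resolved from model_id, which now has its tokenizer files.
    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        feature_extractor=feature_extractor,
        torch_dtype=torch.float16,
    )
```

Calling whisper = build_sinhala_asr() and then whisper("sample.wav") (sample.wav being a placeholder for a local audio file) should then avoid the missing-file lookup, if the assumption about the base model holds.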
