Suddenly getting an error while executing processor = AutoProcessor.from_pretrained( 'llava-hf/llava-1.5-7b-hf')
Hello, I have been working with LLaVA, and suddenly today I am facing this error:
Exception Traceback (most recent call last)
in <cell line: 1>()
----> 1 processor = LlavaProcessor.from_pretrained(
2 'llava-hf/llava-1.5-7b-hf'
3 )
5 frames
/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py in __init__(self, *args, **kwargs)
109 elif fast_tokenizer_file is not None and not from_slow:
110 # We have a serialization from tokenizers which let us directly build the backend
--> 111 fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
112 elif slow_tokenizer is not None:
113 # We need to convert a slow tokenizer to build the backend
Exception: data did not match any variant of untagged enum ModelWrapper at line 277156 column 3
What is the reason for the sudden occurrence of this error, and how can it be resolved?
I can see a commit to the processor 11 hours ago; could this be the reason? If so, how do I resolve it?
I encountered the same error 🥲.
The generation also seems off :(
This is a way to load the processor from the previous commit, and it works fine:
processor = AutoProcessor.from_pretrained(
'llava-hf/llava-1.5-7b-hf',
revision='a272c74'
)
@Dipto084 can you share your env setup pls? Might be that the new update uploaded fast tokenizer which is the default, but your env can't load it
For the many "image" tokens, that is expected. Each image will have as many placeholders as there are image embeddings after the vision tower, so it is around 500 tokens per image. You can pass skip_special_tokens=True to remove them and decode only the text.
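Concretely, that means decoding with something like processor.batch_decode(output_ids, skip_special_tokens=True). Here is a minimal, self-contained sketch of the effect; the token names and the 576-per-image count are illustrative, not taken from the actual tokenizer:

```python
# Conceptual sketch of what skip_special_tokens=True does during decoding:
# the hundreds of "<image>" placeholder tokens are special tokens, so they
# are dropped before the text is joined. (Token names are illustrative.)
SPECIAL_TOKENS = {"<s>", "</s>", "<image>", "<pad>"}

def decode_skipping_specials(tokens):
    """Join token strings, dropping any special placeholder tokens."""
    return " ".join(t for t in tokens if t not in SPECIAL_TOKENS)

# One image expands to many placeholders (on the order of 500-600 for llava-1.5):
tokens = ["<s>"] + ["<image>"] * 576 + ["A", "cat", "on", "a", "mat", "</s>"]
print(decode_skipping_specials(tokens))  # -> A cat on a mat
```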
@RaushanTurganbay
Not the person you asked for the env setup, but I have Transformers 4.41.1 and experienced the same issues. I manually modified the config files and now it is working.
First, for processor_config.json, 4.41.1 works just by having an empty dict, "{}", as the llava processor in that version does not accept processor_config parameters.
Second, the tokenizer.json accepted by 4.41.1 had a different format for "merges", as you can see from my screenshot of the working one.
By using the older version of tokenizer.json and replacing the content of processor_config.json with '{}', things should work. However, the best solution is to specify the commit id as suggested by @Dipto084.
Yes, the new tokenizer config needs transformers v4.45 or later, where we raised the requirement to tokenizers>=0.20. The same goes for the processors, as we'll stop supporting the old logic for llava models in the next few releases. So this is expected, and we advise using the new version of transformers.
If you want to use an older version for any reason, then yeah, feel free to indicate the commit hash.
This is a way to load the model from the previous commit and does fine,
processor = AutoProcessor.from_pretrained(
'llava-hf/llava-1.5-7b-hf',
revision='a272c74'
)
Thx so much for your solution. I encountered the same issue with "llava-hf/llava-1.5-7b-hf".
Could you tell me where I can find the revision code?
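The revision is a commit hash from the model repo's history: on the model page, open "Files and versions" and click the history icon. You can also list it programmatically; a sketch using huggingface_hub (assuming it is installed and you have network access):

```python
from huggingface_hub import HfApi

# Each commit has a commit_id (the full hash) that can be passed as
# revision= to from_pretrained; a short prefix like 'a272c74' also works.
api = HfApi()
commits = api.list_repo_commits("llava-hf/llava-1.5-7b-hf")
for c in commits[:5]:
    print(c.commit_id[:7], c.created_at, c.title)
```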