It seems some of the checkpoint weights are not used?

#2
by nicolaus-huang - opened

Hi there, great work!
I'm a newcomer to LLMs and ran into an issue when running this:

from transformers import ChameleonProcessor, ChameleonForConditionalGeneration

processor = ChameleonProcessor.from_pretrained("leloy/Anole-7b-v0.1-hf")
model = ChameleonForConditionalGeneration.from_pretrained("leloy/Anole-7b-v0.1-hf")

After running this, I got:

Some kwargs in processor config are unused and will not have any effect: image_seq_length, image_token. 
Loading checkpoint shards: 100%|██████████| 3/3 [00:02<00:00,  1.17it/s]
Some weights of the model checkpoint at /home/h84382219/anole-huggingface/Anole-7b-v0.1-hf were not used when initializing ChameleonForConditionalGeneration: ['model.vqmodel.decoder.conv_in.bias', 'model.vqmodel.decoder.conv_in.weight', 'model.vqmodel.decoder.conv_out.bias', 'model.vqmodel.decoder.conv_out.weight', 'model.vqmodel.decoder.mid.attn_1.k.bias', 'model.vqmodel.decoder.mid.attn_1.k.weight', 'model.vqmodel.decoder.mid.attn_1.norm.bias', 'model.vqmodel.decoder.mid.attn_1.norm.weight', 'model.vqmodel.decoder.mid.attn_1.proj_out.bias', 'model.vqmodel.decoder.mid.attn_1.proj_out.weight', 'model.vqmodel.decoder.mid.attn_1.q.bias', 'model.vqmodel.decoder.mid.attn_1.q.weight', 'model.vqmodel.decoder.mid.attn_1.v.bias', 'model.vqmodel.decoder.mid.attn_1.v.weight', 'model.vqmodel.decoder.mid.block_1.conv1.bias', 'model.vqmodel.decoder.mid.block_1.conv1.weight', 'model.vqmodel.decoder.mid.block_1.conv2.bias', 'model.vqmodel.decoder.mid.block_1.conv2.weight', 'model.vqmodel.decoder.mid.block_1.norm1.bias', 'model.vqmodel.decoder.mid.block_1.norm1.weight', 'model.vqmodel.decoder.mid.block_1.norm2.bias', 'model.vqmodel.decoder.mid.block_1.norm2.weight', 'model.vqmodel.decoder.mid.block_2.conv1.bias', 'model.vqmodel.decoder.mid.block_2.conv1.weight', 'model.vqmodel.decoder.mid.block_2.conv2.bias', 'model.vqmodel.decoder.mid.block_2.conv2.weight', 'model.vqmodel.decoder.mid.block_2.norm1.bias', 'model.vqmodel.decoder.mid.block_2.norm1.weight', 'model.vqmodel.decoder.mid.block_2.norm2.bias', 'model.vqmodel.decoder.mid.block_2.norm2.weight', 'model.vqmodel.decoder.norm_out.bias', 'model.vqmodel.decoder.norm_out.weight', 'model.vqmodel.decoder.up.0.block.0.conv1.bias', 'model.vqmodel.decoder.up.0.block.0.conv1.weight', 'model.vqmodel.decoder.up.0.block.0.conv2.bias', 'model.vqmodel.decoder.up.0.block.0.conv2.weight', 'model.vqmodel.decoder.up.0.block.0.norm1.bias', 'model.vqmodel.decoder.up.0.block.0.norm1.weight', 'model.vqmodel.decoder.up.0.block.0.norm2.bias', 
'model.vqmodel.decoder.up.0.block.0.norm2.weight', 'model.vqmodel.decoder.up.0.block.1.conv1.bias', 'model.vqmodel.decoder.up.0.block.1.conv1.weight', 'model.vqmodel.decoder.up.0.block.1.conv2.bias', 'model.vqmodel.decoder.up.0.block.1.conv2.weight', 'model.vqmodel.decoder.up.0.block.1.norm1.bias', 'model.vqmodel.decoder.up.0.block.1.norm1.weight', 'model.vqmodel.decoder.up.0.block.1.norm2.bias', 'model.vqmodel.decoder.up.0.block.1.norm2.weight', 'model.vqmodel.decoder.up.0.block.2.conv1.bias', 'model.vqmodel.decoder.up.0.block.2.conv1.weight', 'model.vqmodel.decoder.up.0.block.2.conv2.bias', 'model.vqmodel.decoder.up.0.block.2.conv2.weight', 'model.vqmodel.decoder.up.0.block.2.norm1.bias', 'model.vqmodel.decoder.up.0.block.2.norm1.weight', 'model.vqmodel.decoder.up.0.block.2.norm2.bias', 'model.vqmodel.decoder.up.0.block.2.norm2.weight', 'model.vqmodel.decoder.up.1.block.0.conv1.bias', 'model.vqmodel.decoder.up.1.block.0.conv1.weight', 'model.vqmodel.decoder.up.1.block.0.conv2.bias', 'model.vqmodel.decoder.up.1.block.0.conv2.weight', 'model.vqmodel.decoder.up.1.block.0.nin_shortcut.bias', 'model.vqmodel.decoder.up.1.block.0.nin_shortcut.weight', 'model.vqmodel.decoder.up.1.block.0.norm1.bias', 'model.vqmodel.decoder.up.1.block.0.norm1.weight', 'model.vqmodel.decoder.up.1.block.0.norm2.bias', 'model.vqmodel.decoder.up.1.block.0.norm2.weight', 'model.vqmodel.decoder.up.1.block.1.conv1.bias', 'model.vqmodel.decoder.up.1.block.1.conv1.weight', 'model.vqmodel.decoder.up.1.block.1.conv2.bias', 'model.vqmodel.decoder.up.1.block.1.conv2.weight', 'model.vqmodel.decoder.up.1.block.1.norm1.bias', 'model.vqmodel.decoder.up.1.block.1.norm1.weight', 'model.vqmodel.decoder.up.1.block.1.norm2.bias', 'model.vqmodel.decoder.up.1.block.1.norm2.weight', 'model.vqmodel.decoder.up.1.block.2.conv1.bias', 'model.vqmodel.decoder.up.1.block.2.conv1.weight', 'model.vqmodel.decoder.up.1.block.2.conv2.bias', 'model.vqmodel.decoder.up.1.block.2.conv2.weight', 
'model.vqmodel.decoder.up.1.block.2.norm1.bias', 'model.vqmodel.decoder.up.1.block.2.norm1.weight', 'model.vqmodel.decoder.up.1.block.2.norm2.bias', 'model.vqmodel.decoder.up.1.block.2.norm2.weight', 'model.vqmodel.decoder.up.1.upsample.conv.bias', 'model.vqmodel.decoder.up.1.upsample.conv.weight', 'model.vqmodel.decoder.up.2.block.0.conv1.bias', 'model.vqmodel.decoder.up.2.block.0.conv1.weight', 'model.vqmodel.decoder.up.2.block.0.conv2.bias', 'model.vqmodel.decoder.up.2.block.0.conv2.weight', 'model.vqmodel.decoder.up.2.block.0.norm1.bias', 'model.vqmodel.decoder.up.2.block.0.norm1.weight', 'model.vqmodel.decoder.up.2.block.0.norm2.bias', 'model.vqmodel.decoder.up.2.block.0.norm2.weight', 'model.vqmodel.decoder.up.2.block.1.conv1.bias', 'model.vqmodel.decoder.up.2.block.1.conv1.weight', 'model.vqmodel.decoder.up.2.block.1.conv2.bias', 'model.vqmodel.decoder.up.2.block.1.conv2.weight', 'model.vqmodel.decoder.up.2.block.1.norm1.bias', 'model.vqmodel.decoder.up.2.block.1.norm1.weight', 'model.vqmodel.decoder.up.2.block.1.norm2.bias', 'model.vqmodel.decoder.up.2.block.1.norm2.weight', 'model.vqmodel.decoder.up.2.block.2.conv1.bias', 'model.vqmodel.decoder.up.2.block.2.conv1.weight', 'model.vqmodel.decoder.up.2.block.2.conv2.bias', 'model.vqmodel.decoder.up.2.block.2.conv2.weight', 'model.vqmodel.decoder.up.2.block.2.norm1.bias', 'model.vqmodel.decoder.up.2.block.2.norm1.weight', 'model.vqmodel.decoder.up.2.block.2.norm2.bias', 'model.vqmodel.decoder.up.2.block.2.norm2.weight', 'model.vqmodel.decoder.up.2.upsample.conv.bias', 'model.vqmodel.decoder.up.2.upsample.conv.weight', 'model.vqmodel.decoder.up.3.block.0.conv1.bias', 'model.vqmodel.decoder.up.3.block.0.conv1.weight', 'model.vqmodel.decoder.up.3.block.0.conv2.bias', 'model.vqmodel.decoder.up.3.block.0.conv2.weight', 'model.vqmodel.decoder.up.3.block.0.nin_shortcut.bias', 'model.vqmodel.decoder.up.3.block.0.nin_shortcut.weight', 'model.vqmodel.decoder.up.3.block.0.norm1.bias', 
'model.vqmodel.decoder.up.3.block.0.norm1.weight', 'model.vqmodel.decoder.up.3.block.0.norm2.bias', 'model.vqmodel.decoder.up.3.block.0.norm2.weight', 'model.vqmodel.decoder.up.3.block.1.conv1.bias', 'model.vqmodel.decoder.up.3.block.1.conv1.weight', 'model.vqmodel.decoder.up.3.block.1.conv2.bias', 'model.vqmodel.decoder.up.3.block.1.conv2.weight', 'model.vqmodel.decoder.up.3.block.1.norm1.bias', 'model.vqmodel.decoder.up.3.block.1.norm1.weight', 'model.vqmodel.decoder.up.3.block.1.norm2.bias', 'model.vqmodel.decoder.up.3.block.1.norm2.weight', 'model.vqmodel.decoder.up.3.block.2.conv1.bias', 'model.vqmodel.decoder.up.3.block.2.conv1.weight', 'model.vqmodel.decoder.up.3.block.2.conv2.bias', 'model.vqmodel.decoder.up.3.block.2.conv2.weight', 'model.vqmodel.decoder.up.3.block.2.norm1.bias', 'model.vqmodel.decoder.up.3.block.2.norm1.weight', 'model.vqmodel.decoder.up.3.block.2.norm2.bias', 'model.vqmodel.decoder.up.3.block.2.norm2.weight', 'model.vqmodel.decoder.up.3.upsample.conv.bias', 'model.vqmodel.decoder.up.3.upsample.conv.weight', 'model.vqmodel.decoder.up.4.block.0.conv1.bias', 'model.vqmodel.decoder.up.4.block.0.conv1.weight', 'model.vqmodel.decoder.up.4.block.0.conv2.bias', 'model.vqmodel.decoder.up.4.block.0.conv2.weight', 'model.vqmodel.decoder.up.4.block.0.norm1.bias', 'model.vqmodel.decoder.up.4.block.0.norm1.weight', 'model.vqmodel.decoder.up.4.block.0.norm2.bias', 'model.vqmodel.decoder.up.4.block.0.norm2.weight', 'model.vqmodel.decoder.up.4.block.1.conv1.bias', 'model.vqmodel.decoder.up.4.block.1.conv1.weight', 'model.vqmodel.decoder.up.4.block.1.conv2.bias', 'model.vqmodel.decoder.up.4.block.1.conv2.weight', 'model.vqmodel.decoder.up.4.block.1.norm1.bias', 'model.vqmodel.decoder.up.4.block.1.norm1.weight', 'model.vqmodel.decoder.up.4.block.1.norm2.bias', 'model.vqmodel.decoder.up.4.block.1.norm2.weight', 'model.vqmodel.decoder.up.4.block.2.conv1.bias', 'model.vqmodel.decoder.up.4.block.2.conv1.weight', 'model.vqmodel.decoder.up.4.block.2.conv2.bias', 
'model.vqmodel.decoder.up.4.block.2.conv2.weight', 'model.vqmodel.decoder.up.4.block.2.norm1.bias', 'model.vqmodel.decoder.up.4.block.2.norm1.weight', 'model.vqmodel.decoder.up.4.block.2.norm2.bias', 'model.vqmodel.decoder.up.4.block.2.norm2.weight', 'model.vqmodel.decoder.up.4.upsample.conv.bias', 'model.vqmodel.decoder.up.4.upsample.conv.weight']
- This IS expected if you are initializing ChameleonForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ChameleonForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
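
Every key in the warning above starts with `model.vqmodel.decoder`, which suggests that only the decoder half of the VQ model is being dropped. Below is a minimal sketch of how the list could be summarized by prefix to check this. The helper name `summarize_by_prefix` and the shortened `unused_keys` list are just for illustration, not part of transformers; the key names themselves are copied from the warning.

```python
from collections import Counter

# A few key names copied from the warning above (the real list is much longer).
unused_keys = [
    "model.vqmodel.decoder.conv_in.bias",
    "model.vqmodel.decoder.conv_in.weight",
    "model.vqmodel.decoder.mid.attn_1.k.bias",
    "model.vqmodel.decoder.up.0.block.0.conv1.weight",
    "model.vqmodel.decoder.up.4.upsample.conv.weight",
]


def summarize_by_prefix(keys, depth=3):
    """Count keys by their first `depth` dot-separated components.

    Hypothetical helper for reading the warning; not a transformers API.
    """
    return Counter(".".join(key.split(".")[:depth]) for key in keys)


summary = summarize_by_prefix(unused_keys)
for prefix, count in summary.items():
    print(f"{prefix}: {count} unused tensors")
```

If the summary shows a single prefix, the unused weights all belong to one submodule; raising `depth` breaks the count down further (for example into `conv_in`, `mid`, and `up`).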

Also, there doesn't seem to be any example code for decoding the images...

Is everything correct?
