Manual training
Hello, for some reason running:
import torch
import torch.nn.functional as F
from transformers import PaliGemmaForConditionalGeneration

# llm_args, tensor (pixel values), input_ids and nb_tokens_answer are defined elsewhere
vlm = PaliGemmaForConditionalGeneration.from_pretrained(**llm_args)
pred = vlm(pixel_values=tensor, input_ids=input_ids[:, :-1],
           attention_mask=torch.ones_like(input_ids[:, :-1])).logits
pred = pred[:, -nb_tokens_answer:]  # logits that should predict the answer tokens
loss = F.cross_entropy(pred.permute((0, 2, 1)), input_ids[:, -nb_tokens_answer:],
                       reduction='mean')
This gives me a very small loss. I have the feeling that the input and target tokens are getting mixed up. Why is that?
This is driving me crazy. This bugfix was supposed to solve my problem: https://github.com/huggingface/transformers/pull/30967 ... (I'm checking on more data.)
https://github.com/huggingface/transformers/issues/30993 OK, I got help: apparently this model also needs labels and token_type_ids in its inputs, unlike Imp or Moondream...
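For anyone landing here, a minimal sketch of the setup the issue points to, assuming the standard transformers processor workflow for PaliGemma (the checkpoint id, prompt, answer and image variable below are placeholders, so double-check the argument names against your transformers version): as far as I can tell, passing suffix= to the processor builds labels (prefix masked with -100) and token_type_ids for you, and the model then computes the shifted, masked loss itself.

```python
# Sketch based on the linked issue, not my exact code: model_id, the prompt,
# the answer string and `image` (a PIL.Image) are placeholders.
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor

model_id = "google/paligemma-3b-pt-224"  # placeholder checkpoint
processor = PaliGemmaProcessor.from_pretrained(model_id)
vlm = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

# suffix= makes the processor return `labels` (prefix tokens masked with -100)
# and `token_type_ids` (prefix vs. answer tokens) along with the usual inputs.
inputs = processor(text="answer en What is in the image?",
                   images=image,
                   suffix="a cat",
                   return_tensors="pt")

outputs = vlm(**inputs)  # the model applies the shift and masking internally
loss = outputs.loss
```

If I understand the issue correctly, without token_type_ids the answer tokens are treated like the prefix and attend to each other bidirectionally, so the targets leak into the predictions, which would explain the suspiciously small loss.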