Model not generating captions properly
I am using your implementation, but for several of the images I am testing the model is not able to generate a caption.
I am using stricter parameters for generation to avoid too much hallucination, and max_new_tokens=120 because of the use case, but I don't have this issue with the XTuner version, even though I have to downgrade the transformers version to use it.
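For reference, this is roughly the kind of call I mean (a minimal sketch; the model path and the exact parameter values below are placeholders, not my real settings):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "path/to/converted-llava-model"  # placeholder path
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("example.jpg")
prompt = "USER: <image>\nDescribe this image.\nASSISTANT:"
inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, torch.float16
)

# "Stricter" decoding: short output, sampling enabled but kept conservative.
output_ids = model.generate(
    **inputs,
    max_new_tokens=120,   # capped for the use case
    do_sample=True,       # sampling must be on for the knobs below to matter
    temperature=0.2,      # example values, not my exact settings
    top_p=0.7,
    top_k=20,
)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```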
I've also noticed some hallucinations in my case, but I'm not sure whether it's an issue with the Hugging Face format conversion. Have you tried running your model with xtuner's lmdeploy to see if the results differ? Also, I'm using transformers version 4.40.0.
lmdeploy is giving me other issues, so I tried the CLI version, which works fine after downgrading transformers.
With your version I am also using transformers 4.40.
The hallucination usually happens because the temperature is too high, or because top_k and top_p need to be lowered.
Actually, I also haven't had the chance to try out lmdeploy myself, so I'm not completely sure whether the issue stems from the way I modified it. As far as I know, LlavaForConditionalGeneration doesn't apply a temperature parameter by default; it falls back to greedy decoding. Even if you passed a temperature setting, it likely didn't take effect.
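To illustrate what I mean with plain transformers semantics (a sketch, assuming `model` and `inputs` are already prepared for a LLaVA-style generate call):

```python
# Without do_sample=True, generate() uses greedy decoding and sampling
# parameters such as temperature are ignored (recent transformers versions
# print a warning about this).
output_greedy = model.generate(**inputs, max_new_tokens=120, temperature=0.2)

# With do_sample=True, the temperature actually takes effect.
output_sampled = model.generate(
    **inputs, max_new_tokens=120, do_sample=True, temperature=0.2
)
```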
The model generation uses model.generate(), which should accept all the standard transformers generation parameters, but you could be right.
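For example, one way to make the decoding settings explicit and double-check they are picked up (again a sketch, assuming `model` and `inputs` from above):

```python
from transformers import GenerationConfig

# Inspect the defaults the converted checkpoint ships with.
print(model.generation_config)

# Pass an explicit config so nothing silently falls back to defaults.
gen_config = GenerationConfig(
    max_new_tokens=120,
    do_sample=True,
    temperature=0.2,
    top_p=0.7,
    top_k=20,
)
output_ids = model.generate(**inputs, generation_config=gen_config)
```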