Few-shot in-context learning

#18
by vsokolovskii - opened

Is there any possibility to support multiple images or it's against the paradigm of how the model was trained?
Now when I try to do 1-shot learning providing 1 GT input-ouput pair it gives me the error
At most 1 image(s) may be provided in one request.

In my knowledge, this model does not support image as an input. Try with Qwen VL models.

Sign up or log in to comment