Idefics3
Generate text based on an image and prompt
Media understanding
Identify objects in images using text queries
Generate text and segment images using PaliGemma
Annotate and describe images with text prompts
Segment objects in images and videos using text prompts
Analyze images to caption, detect objects, extract text, and ground phrases
Decode images to teacher model outputs
Generate detailed descriptions from images and questions
Generate descriptions for images using text prompts
Chat with an AI that understands text and images
Analyze images to generate captions, detect objects, or perform OCR
Generate text from an image and question
Chat with Pixtral 12B using Mistral Inference
Chat with an AI that understands images and text
State-of-the-art Zero-shot Object Detection
Chat about images by uploading them and typing questions
Generate text responses using images and text prompts
Generate text responses based on images and chat history
PaliGemma 2 detection with Supervision
Generate text responses based on images and input text
Generate images and insights from text and images
A unified multimodal understanding and generation model