mistral-community/pixtral-12b

Image-Text-to-Text

Himetsu committed on Sep 13

Commit e32103d • 1 Parent(s): 4434ab1

Update README.md

Files changed (1)

README.md +47 -1

@@ -4,4 +4,50 @@ tags: []
 ---
 # Model Card for Model ID
-Transformers compatible pixtral checkpoints
+Transformers compatible pixtral checkpoints.
+How to use:
+```python
+import requests
+from PIL import Image
+from transformers import AutoProcessor, LlavaForConditionalGeneration
+
+model_id = "Himetsu/pixtral-12b"
+# load_in_4bit needs the bitsandbytes package and a CUDA GPU
+model = LlavaForConditionalGeneration.from_pretrained(model_id, load_in_4bit=True)
+processor = AutoProcessor.from_pretrained(model_id)
+
+# Despite the name, this list holds loaded PIL images, not URL strings
+IMG_URLS = [
+    Image.open(requests.get("https://picsum.photos/id/237/400/300", stream=True).raw),
+    Image.open(requests.get("https://picsum.photos/id/231/200/300", stream=True).raw),
+    Image.open(requests.get("https://picsum.photos/id/27/500/500", stream=True).raw),
+    Image.open(requests.get("https://picsum.photos/id/17/150/600", stream=True).raw),
+]
+# Four [IMG] placeholders, matching the four images above
+PROMPT = "<s>[INST]Describe the images.\n[IMG][IMG][IMG][IMG][/INST]"
+inputs = processor(text=PROMPT, images=IMG_URLS, return_tensors="pt").to("cuda")
+generate_ids = model.generate(**inputs, max_new_tokens=500)
+output = processor.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
+print(output)
+```
+I got something like this:
+```
+"""
+Describe the images.
+Sure, let's break down each image description:
+1. **Image 1:**
+   - **Description:** A black dog with a glossy coat is sitting on a wooden floor. The dog has a focused expression and is looking directly at the camera.
+   - **Details:** The wooden floor has a rustic appearance with visible wood grain patterns. The dog's eyes are a striking color, possibly brown or amber, which contrasts with its black fur.
+2. **Image 2:**
+   - **Description:** A scenic view of a mountainous landscape with a winding road cutting through it. The road is surrounded by lush green vegetation and leads to a distant valley.
+   - **Details:** The mountains are rugged with steep slopes, and the sky is clear, indicating good weather. The winding road adds a sense of depth and perspective to the image.
+3. **Image 3:**
+   - **Description:** A beach scene with waves crashing against the shore. There are several people in the water and on the beach, enjoying the waves and the sunset.
+   - **Details:** The waves are powerful, creating a dynamic and lively atmosphere. The sky is painted with hues of orange and pink from the setting sun, adding a warm glow to the scene.
+4. **Image 4:**
+   - **Description:** A garden path leading to a large tree with a bench underneath it. The path is bordered by well-maintained grass and flowers.
+   - **Details:** The path is made of small stones or gravel, and the tree provides a shaded area with the bench invitingly placed beneath it. The surrounding area is lush and green, suggesting a well-kept garden.
+Each image captures a different scene, from a close-up of a dog to expansive natural landscapes, showcasing various elements of nature and human interaction with it.
+"""
+```
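
The README example hard-codes four `[IMG]` placeholders for four images. When the number of images varies, the prompt could be built programmatically instead. The following is a minimal sketch, not part of this commit: the helper name `build_pixtral_prompt` is hypothetical, and it assumes the prompt needs exactly one `[IMG]` placeholder per image, as the README example suggests.

```python
def build_pixtral_prompt(instruction: str, num_images: int) -> str:
    """Build a prompt in the README's format with one [IMG] per image.

    Hypothetical helper: assumes one [IMG] placeholder is needed per image.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Repeat the placeholder once for each image passed to the processor
    placeholders = "[IMG]" * num_images
    return f"<s>[INST]{instruction}\n{placeholders}[/INST]"


# Reproduces the PROMPT string used in the README example
prompt = build_pixtral_prompt("Describe the images.", 4)
print(prompt)
```

The resulting string would be passed as `text=` to the processor, with `num_images` set to the length of the image list.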