Commit
•
460ee72
1
Parent(s):
7f5b217
Update README.md
Browse files
README.md
CHANGED
@@ -53,9 +53,10 @@ Each image captures a different scene, from a close-up of a dog to expansive nat
|
|
53 |
"""
|
54 |
```
|
55 |
|
56 |
-
You can also use a chat template to format your chat history for Pixtral.
|
57 |
-
|
58 |
-
|
|
|
59 |
|
60 |
```python
|
61 |
from PIL import Image
|
@@ -105,6 +106,6 @@ If you're asking whether the dog can "live here," referring to the snowy landsca
|
|
105 |
Would you like more information on any specific aspect?
|
106 |
```
|
107 |
|
108 |
-
|
109 |
correctly separated by image tokens. Try decoding with special tokens included to see exactly what the model sees!
|
110 |
|
|
|
53 |
"""
|
54 |
```
|
55 |
|
56 |
+
You can also use a chat template to format your chat history for Pixtral. Make sure that the `images` argument to the `processor` contains the images in the order
|
57 |
+
that they appear in the chat, so that the model understands where each image is supposed to go.
|
58 |
+
|
59 |
+
Here's an example with text and multiple images interleaved in the same message:
|
60 |
|
61 |
```python
|
62 |
from PIL import Image
|
|
|
106 |
Would you like more information on any specific aspect?
|
107 |
```
|
108 |
|
109 |
+
While it may appear that spacing in the input is disrupted, this is caused by us skipping special tokens for display, and actually "Can this animal" and "live here" are
|
110 |
correctly separated by image tokens. Try decoding with special tokens included to see exactly what the model sees!
|
111 |
|