---
license: apache-2.0
library_name: vllm
---

# Pixtral-12B-0910

> [!WARNING]
> We still need to validate official evaluations with the below usage example.

...TODO

## Usage

We recommend using Pixtral with the [vLLM library](https://github.com/vllm-project/vllm).

**Important**: Make sure you have installed vLLM from source - more specifically, make sure you have installed [this commit (TODO)]( ).

Also make sure you have `mistral_common >= 1.4.0` installed:

```
pip install --upgrade mistral_common
```

**_Simple Example_**

```py
from vllm import LLM
from vllm.sampling_params import SamplingParams

model_name = "mistralai/Pixtral-12B-2409"

sampling_params = SamplingParams(max_tokens=8192)

llm = LLM(model=model_name, tokenizer_mode="mistral")

prompt = "Describe this image in one sentence."
image_url = "https://picsum.photos/id/237/200/300"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]

outputs = llm.chat(messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)
```

**_Advanced Example_**

You can also pass multiple images per message and/or multi-turn conversations:

```py
from vllm import LLM
from vllm.sampling_params import SamplingParams

model_name = "mistralai/Pixtral-12B-2409"
max_img_per_msg = 5
max_tokens_per_img = 4096

sampling_params = SamplingParams(max_tokens=8192, temperature=0.7)
llm = LLM(
    model=model_name,
    tokenizer_mode="mistral",
    limit_mm_per_prompt={"image": max_img_per_msg},
    max_num_batched_tokens=max_img_per_msg * max_tokens_per_img,
)

prompt = "Describe the following image."

url_1 = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
url_2 = "https://picsum.photos/seed/picsum/200/300"
url_3 = "https://picsum.photos/id/32/512/512"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": url_1}},
            {"type": "image_url", "image_url": {"url": url_2}},
        ],
    },
    {
        "role": "assistant",
        "content": "The images show nature.",
    },
    {
        "role": "user",
        "content": "More details please, and answer only in French!",
    },
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": url_3}},
        ],
    },
]

outputs = llm.chat(messages=messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)
```

**_Server_**

You can also use Pixtral in a server/client setting.

1. Spin up a server:

```
vllm serve mistralai/Pixtral-12B-2409 --tokenizer_mode mistral --limit_mm_per_prompt 'image=4' --max_num_batched_tokens 16384
```

2. Then query the server from a client:

```
curl --location 'http://<your-server>:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer token' \
--data '{
    "model": "mistralai/Pixtral-12B-2409",
    "messages": [
      {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in detail please."},
            {"type": "image_url", "image_url": {"url": "https://s3.amazonaws.com/cms.ipressroom.com/338/files/201808/5b894ee1a138352221103195_A680%7Ejogging-edit/A680%7Ejogging-edit_hero.jpg"}},
            {"type": "text", "text": "and this one as well. Answer in French."},
            {"type": "image_url", "image_url": {"url": "https://www.wolframcloud.com/obj/resourcesystem/images/a0e/a0ee3983-46c6-4c92-b85d-059044639928/6af8cfb971db031b.png"}}
        ]
      }
    ]
  }'
```
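
Because the vLLM server exposes an OpenAI-compatible API, you can also query it from Python with the official `openai` client instead of `curl`. Below is a minimal sketch, assuming the server from step 1 is reachable on `localhost:8000` and the `openai` package is installed; the message format is the same as in the examples above.

```py
from openai import OpenAI

# Point the client at the local vLLM server; the api_key value is arbitrary
# unless you started the server with --api-key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="token")

response = client.chat.completions.create(
    model="mistralai/Pixtral-12B-2409",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/237/200/300"}},
            ],
        }
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```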