neuralmagic/pixtral-12b-FP8-dynamic

@@ -12,6 +12,8 @@ language:
 - es
 - th
 pipeline_tag: text-generation
 base_model:
 - mistral-community/pixtral-12b
 - mistralai/Pixtral-12B-2409
@@ -20,17 +22,17 @@ base_model:
 # pixtral-12b-FP8-dynamic
 ## Model Overview
-- **Model Architecture:** Llava
   - **Input:** Text/Image
   - **Output:** Text
 - **Model Optimizations:**
   - **Weight quantization:** FP8
   - **Activation quantization:** FP8
-- **Intended Use Cases:** Intended for commercial and research use in multiple languages. Similar to [mistral-community/pixtral-12b](https://huggingface.co/mistral-community/pixtral-12b), this models is intended for assistant-like chat.
 - **Out-of-scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English.
 - **Release Date:** 11/1/2024
 - **Version:** 1.0
-- **License(s):**
 - **Model Developers:** Neural Magic
 Quantized version of [mistral-community/pixtral-12b](https://huggingface.co/mistral-community/pixtral-12b).
@@ -51,39 +53,38 @@ This model can be deployed efficiently using the [vLLM](https://docs.vllm.ai/en/
 ```python
 from vllm import LLM, SamplingParams
-from vllm.assets.image import ImageAsset
 # Initialize the LLM
 model_name = "neuralmagic/pixtral-12b-FP8-dynamic"
-llm = LLM(model=model_name, max_num_seqs=1, enforce_eager=True)
-# Load the image
-image = ImageAsset("cherry_blossom").pil_image.convert("RGB")
 # Create the prompt
-question = "If I had to write a haiku for this one, it would be: "
-prompt = f"<|image|><|begin_of_text|>{question}"
 # Set up sampling parameters
-sampling_params = SamplingParams(temperature=0.2, max_tokens=30)
 # Generate the response
-inputs = {
-    "prompt": prompt,
-    "multi_modal_data": {
-        "image": image
-    },
-}
-outputs = llm.generate(inputs, sampling_params=sampling_params)
 # Print the generated text
-print(outputs[0].outputs[0].text)
 ```
 vLLM also supports OpenAI-compatible serving. See the [documentation](https://docs.vllm.ai/en/latest/) for more details.
 ```
-vllm serve neuralmagic/pixtral-12b-FP8-dynamic --max-num-seqs 16
 ```
 ## Creation

 - es
 - th
 pipeline_tag: text-generation
+license: apache-2.0
+library_name: vllm
 base_model:
 - mistral-community/pixtral-12b
 - mistralai/Pixtral-12B-2409
 # pixtral-12b-FP8-dynamic
 ## Model Overview
+- **Model Architecture:** Pixtral (Llava)
   - **Input:** Text/Image
   - **Output:** Text
 - **Model Optimizations:**
   - **Weight quantization:** FP8
   - **Activation quantization:** FP8
+- **Intended Use Cases:** Intended for commercial and research use in multiple languages. Similar to [mistralai/Pixtral-12B-2409](https://huggingface.co/mistralai/Pixtral-12B-2409), this model is intended for assistant-like chat.
 - **Out-of-scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English.
 - **Release Date:** 11/1/2024
 - **Version:** 1.0
+- **License(s):** [Apache 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
 - **Model Developers:** Neural Magic
 Quantized version of [mistral-community/pixtral-12b](https://huggingface.co/mistral-community/pixtral-12b).
 ```python
 from vllm import LLM, SamplingParams
 # Initialize the LLM
 model_name = "neuralmagic/pixtral-12b-FP8-dynamic"
+llm = LLM(model=model_name, max_model_len=10000)
 # Create the prompt
+image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "text", "text": "Describe the image."},
+            {"type": "image_url", "image_url": {"url": image_url}},
+        ],
+    },
+]
 # Set up sampling parameters
+sampling_params = SamplingParams(temperature=0.2, max_tokens=100)
 # Generate the response
+outputs = llm.chat(messages, sampling_params=sampling_params)
 # Print the generated text
+for output in outputs:
+    print(output.outputs[0].text)
 ```
 vLLM also supports OpenAI-compatible serving. See the [documentation](https://docs.vllm.ai/en/latest/) for more details.
 ```
+vllm serve neuralmagic/pixtral-12b-FP8-dynamic
 ```
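
Once the server is running, it can be queried over HTTP with the same chat message structure used in the offline example. The following is a minimal sketch using only the Python standard library, not an official client. It assumes the server was started with the command above and listens on vLLM's default address `http://localhost:8000` without an API key. The helper names `build_chat_payload` and `query_server` are hypothetical and chosen for illustration.

```python
import json
import urllib.request

MODEL = "neuralmagic/pixtral-12b-FP8-dynamic"


def build_chat_payload(image_url, question, max_tokens=100, temperature=0.2):
    """Build an OpenAI-style chat completion request body with one image."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def query_server(payload, base_url="http://localhost:8000/v1"):
    """POST the payload to the chat completions route and return the reply text.

    base_url is an assumption (vLLM's default host and port); adjust it to
    match how the server was actually launched.
    """
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(query_server(build_chat_payload(image_url, "Describe the image.")))` sends one request and prints the generated description. Keeping payload construction separate from the network call makes the request body easy to inspect or log before it is sent.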
 ## Creation