Qwen3-VL-4B-Instruct

Run Qwen3-VL-4B-Instruct optimized for Qualcomm NPUs with NexaSDK.

Quickstart

  1. Install NexaSDK and create a free account at sdk.nexa.ai

  2. Activate your device with your access token:

    nexa config set license '<access_token>'
    
  3. Run the model on Qualcomm NPU in one line:

    nexa infer NexaAI/Qwen3-VL-4B-Instruct-NPU
    
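The three steps above can be sketched as a small script. This is a sketch, not an official NexaSDK workflow: the `ACCESS_TOKEN` environment variable, the `run` helper, and the `DRY_RUN` switch are assumptions introduced here for illustration; only the two `nexa` commands come from the quickstart.

```shell
# Hypothetical wrapper around the quickstart steps. With DRY_RUN=1 (the
# default here) it only prints the commands it would run, so you can
# inspect them before activating a device.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"          # dry run: show the command
  else
    "$@"                 # real run: execute it
  fi
}

# Step 2: activate the device with your sdk.nexa.ai access token.
run nexa config set license "${ACCESS_TOKEN:-<access_token>}"

# Step 3: run the model on the Qualcomm NPU.
run nexa infer NexaAI/Qwen3-VL-4B-Instruct-NPU
```

Set `DRY_RUN=0` and export `ACCESS_TOKEN` to execute the commands for real.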

Model Description

Qwen3-VL-4B-Instruct is a 4-billion-parameter instruction-tuned multimodal large language model from Alibaba Cloud’s Qwen team.
As part of the Qwen3-VL series, it pairs strong vision-language understanding with conversational fine-tuning, and is optimized for real-world applications such as chat-based reasoning, document analysis, and visual dialogue.

The Instruct variant is tuned for following user prompts naturally and safely — producing concise, relevant, and user-aligned responses across text, image, and video contexts.

Features

  • Instruction-Following: Optimized for dialogue, explanation, and user-friendly task completion.
  • Vision-Language Fusion: Understands and reasons across text, images, and video frames.
  • Multilingual Capability: Handles multiple languages for diverse global use cases.
  • Contextual Coherence: Balances reasoning ability with natural, grounded conversational tone.
  • Lightweight & Deployable: 4B parameters make it efficient for edge and device-level inference.

Use Cases

  • Visual chatbots and assistants
  • Image captioning and scene understanding
  • Chart, document, or screenshot analysis
  • Educational or tutoring systems with visual inputs
  • Multilingual, multimodal question answering

Inputs and Outputs

Input:

  • Text prompts, image(s), or mixed multimodal instructions.

Output:

  • Natural-language responses or visual reasoning explanations.
  • Can return structured text (summaries, captions, answers, etc.) depending on the prompt.
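One common way to express "text plus image(s)" inputs like these is the OpenAI-style chat-message shape, where a user message carries a list of text and image parts. The sketch below only builds that payload shape; the `build_multimodal_message` helper is hypothetical, and whether NexaSDK accepts this exact format is an assumption you should check against its documentation.

```python
import base64
import json

def build_multimodal_message(prompt: str, image_bytes: bytes) -> dict:
    """Pack a text prompt and one image into a single OpenAI-style chat
    message (hypothetical shape; adapt to the API you actually call)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Image embedded inline as a base64 data URL.
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }

# Placeholder bytes stand in for a real PNG file read from disk.
msg = build_multimodal_message("Describe this chart.", b"\x89PNG placeholder")
print(json.dumps(msg, indent=2))
```

The same structure extends to multiple images by appending more `image_url` parts to the `content` list.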

License

Refer to the official Qwen license for terms of use and redistribution.
