This model is Rex-Omni, a 3B-parameter Multimodal Large Language Model (MLLM) presented in the paper "Detect Anything via Next Point Prediction". It is compatible with the Hugging Face transformers library and is licensed under the IDEA License 1.0.

Detect Anything via Next Point Prediction

Rex-Omni is a 3B-parameter Multimodal Large Language Model (MLLM) that redefines object detection and a wide range of other visual perception tasks as a simple next-token prediction problem.
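
To make the idea of detection as token prediction more concrete, the sketch below shows one way a bounding box could be mapped to a short sequence of discrete tokens and back. The 1000-bin quantization, the `<k>` token spelling, and the helper names `box_to_tokens` and `tokens_to_box` are illustrative assumptions, not a description of Rex-Omni's actual tokenizer.

```python
NUM_BINS = 1000  # assumed number of coordinate bins (illustrative only)


def box_to_tokens(box, width, height, num_bins=NUM_BINS):
    """Quantize an absolute-pixel box (x0, y0, x1, y1) into bin-index tokens."""
    scales = (width, height, width, height)
    tokens = []
    for value, scale in zip(box, scales):
        # Map the pixel coordinate to an integer bin in [0, num_bins - 1].
        idx = int(value * num_bins / scale)
        idx = min(max(idx, 0), num_bins - 1)
        tokens.append(f"<{idx}>")
    return tokens


def tokens_to_box(tokens, width, height, num_bins=NUM_BINS):
    """Map bin-index tokens back to approximate pixel coordinates."""
    scales = (width, height, width, height)
    box = []
    for token, scale in zip(tokens, scales):
        idx = int(token.strip("<>"))
        # Use the bin center so the round-trip error stays within one bin.
        box.append((idx + 0.5) / num_bins * scale)
    return tuple(box)


tokens = box_to_tokens((100, 50, 400, 300), width=1000, height=500)
recovered = tokens_to_box(tokens, width=1000, height=500)
```

Under this scheme a language model only has to emit four tokens per box, so localization becomes the same kind of sequence prediction as generating text.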

🚀 Quick Start

1. Installation

conda create -n rexomni python=3.10
conda activate rexomni
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
git clone https://github.com/IDEA-Research/Rex-Omni.git
cd Rex-Omni
pip install -v -e .
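
After installing, a quick sanity check can confirm which package versions ended up in the environment. The following is a minimal sketch using only the Python standard library; the distribution name `rex_omni` is an assumption based on the import name used in this README, and the helper `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages=("torch", "torchvision", "rex_omni")):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If `torch` or `torchvision` is reported as missing or at an unexpected version, the pinned `pip install` command above was probably run outside the `rexomni` environment.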

2. Quick Start: Using Rex-Omni for Detection

from PIL import Image
from rex_omni import RexOmniWrapper, RexOmniVisualize

# Initialize model
model = RexOmniWrapper(
    model_path="IDEA-Research/Rex-Omni",
    backend="transformers"  # or "vllm"
)

# Load image
image = Image.open("your_image.jpg")

# Object Detection
results = model.inference(
    images=image,
    task="detection",
    categories=["person", "car", "dog"]
)

# Take the result for the first image
result = results[0]

# Visualize
vis = RexOmniVisualize(
    image=image,
    predictions=result["extracted_predictions"],
    font_size=20,
    draw_width=5,
    show_labels=True,
)
vis.save("visualize.jpg")
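
Beyond visualization, the predictions can also be post-processed directly. The sketch below assumes that `result["extracted_predictions"]` is a dictionary mapping each category name to a list of entries shaped like `{"type": "box", "coords": [x0, y0, x1, y1]}`; this layout is an assumption and should be checked against the actual output. The helper `summarize_predictions` and the sample data are hypothetical.

```python
def summarize_predictions(predictions):
    """Count boxes per category and flatten them into (category, coords) rows."""
    counts = {}
    rows = []
    for category, items in predictions.items():
        # Keep only box-type entries; other prediction types are skipped.
        boxes = [item for item in items if item.get("type") == "box"]
        counts[category] = len(boxes)
        rows.extend((category, tuple(item["coords"])) for item in boxes)
    return counts, rows


# Hypothetical sample standing in for result["extracted_predictions"].
sample = {
    "person": [
        {"type": "box", "coords": [10, 20, 110, 220]},
        {"type": "box", "coords": [150, 30, 240, 210]},
    ],
    "car": [{"type": "box", "coords": [300, 120, 520, 260]}],
    "dog": [],
}

counts, rows = summarize_predictions(sample)
for category, coords in rows:
    print(category, coords)
```

Flattening into rows like this makes it straightforward to filter by category or export the detections to CSV or JSON.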

3. Tutorials

We provide a series of tutorials to help you get started with Rex-Omni.

📄 License

Rex-Omni is licensed under the IDEA License 1.0, Copyright (c) IDEA. All Rights Reserved. This model is based on Qwen, which is licensed under the Qwen RESEARCH LICENSE AGREEMENT, Copyright (c) Alibaba Cloud. All Rights Reserved.

🔗 Links

📧 Contact

For questions and feedback, please contact us at:

7. Citation

Rex-Omni builds on a series of prior works, which are worth a look if you are interested. The BibTeX entry for the Rex-Omni paper is:

@misc{jiang2025detectpointprediction,
      title={Detect Anything via Next Point Prediction}, 
      author={Qing Jiang and Junan Huo and Xingyu Chen and Yuda Xiong and Zhaoyang Zeng and Yihao Chen and Tianhe Ren and Junzhi Yu and Lei Zhang},
      year={2025},
      eprint={2510.12798},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.12798}, 
}