OpenTrackVLA
Visual Navigation & Following for Everyone.
OpenTrackVLA is a fully open-source Vision-Language-Action (VLA) stack that turns monocular video and natural-language instructions into actionable, short-horizon waypoints.
While we explore massive backbones (8B/30B) internally, this repository is dedicated to democratizing embodied AI. We have intentionally released our highly efficient 0.6B checkpoint along with the full training pipeline.
Why OpenTrackVLA?
- Fully Open Source: We release the model weights, inference code, and the training stack, not just the inference wrapper.
- Accessible: Designed so you can reproduce, fine-tune, and deploy it on affordable compute.
- Multimodal Control: Combines learned priors with visual input to guide real or simulated robots via simple text prompts.
Acknowledgment: OpenTrackVLA builds on the ideas introduced by the original TrackVLA project. Their partially-open release inspired this community-driven effort to keep the ecosystem open so researchers and developers can continue improving the stack together.
Demo In Action
The system processes video history and text instructions to predict future waypoints. Below are examples of the tracker in action:
This directory contains the HuggingFace-friendly export of the OpenTrackVLA planner.
Full project (code, datasets, training pipeline): https://github.com/om-ai-lab/OpenTrackVLA
Downloading from HuggingFace
```python
from transformers import AutoModel

model = AutoModel.from_pretrained("omlab/opentrackvla-qwen06b").eval()
```
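The actual inference entry point is defined by the model code that ships with this export. Purely as an illustration, a call might look like the sketch below; the predict_waypoints method, its arguments, and the output format are assumptions, not the published API:

```python
# Hypothetical usage sketch: predict_waypoints and its signature are assumed
# for illustration; consult the export's model code for the real interface.
import numpy as np
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("omlab/opentrackvla-qwen06b").eval()

# A short history of monocular RGB frames (T, H, W, 3) plus a text instruction.
frames = np.zeros((8, 224, 224, 3), dtype=np.uint8)   # placeholder video history
instruction = "follow the person in the red jacket"

with torch.no_grad():
    # Assumed entry point returning short-horizon waypoints for the robot.
    waypoints = model.predict_waypoints(frames, instruction)

print(waypoints)
```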
Habitat evaluation using this export
- OpenTrackVLA GitHub Repository: https://github.com/om-ai-lab/OpenTrackVLA
- Full Project Documentation
trained_agent.py prefers HuggingFace weights when either env var is set:
- HF_MODEL_DIR=/abs/path/to/open_trackvla_hf (already downloaded)
- HF_MODEL_ID=omlab/opentrackvla-qwen06b (auto-download via huggingface_hub)
Example:
```bash
HF_MODEL_ID=omlab/opentrackvla-qwen06b bash eval.sh
```
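For illustration only, the precedence described above can be sketched roughly as follows; the actual logic lives in trained_agent.py and may differ in its details:

```python
# Rough sketch of the env-var precedence described above; not the actual
# trained_agent.py implementation.
import os
from huggingface_hub import snapshot_download

def resolve_model_dir() -> str:
    local_dir = os.environ.get("HF_MODEL_DIR")
    if local_dir:
        # Already-downloaded export: use the local directory directly.
        return local_dir
    # Otherwise auto-download the repo named by HF_MODEL_ID from the Hub.
    model_id = os.environ.get("HF_MODEL_ID", "omlab/opentrackvla-qwen06b")
    return snapshot_download(repo_id=model_id)

print(resolve_model_dir())
```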