Model Card for HiVis-critic, the critic model of the HiVis (History-aware Visually grounded) test-time intervention framework

Model Description

HiVis (History-aware Visually grounded) is a test-time intervention framework designed to equip computer use agents (CUAs) with history state tracking and visually grounded error analysis. Within this framework, we propose HiVis-critic, a multimodal model that serves as the intervention engine and generates both kinds of critique.

Key Highlights:

  • 📝 HiVis-critic for history state tracking: maintains a macro-action history, a compact record of the interactions so far, by recursively compressing them into multi-step achieved goals. This gives the policy better history-aware planning over long horizons.
  • 🎯 HiVis-critic for visually grounded error analysis: verifies raw execution coordinates against actual visual states. If a proposed action is flawed, the model identifies the error dimension to provide the policy with corrective guidance before execution.
  • Developed by: Jaewoo Lee, Zaid Khan, Archiki Prasad, Justin Chih-Yao Chen, Supriyo Chakraborty, Kartik Balasubramaniam, Sambit Sahu, Elias Stengel-Eskin, Hyunji Lee, Mohit Bansal
  • Model type: Qwen3ForCausalLM, fine-tuned Large Language Model
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: Qwen3-VL-8B-Thinking
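
The recursive history compression described in the highlights can be illustrated with a minimal sketch. It is not the released implementation: `track_history`, the `Summarizer` signature, and the rule-based `toy_summarizer` are hypothetical names introduced here, and in HiVis the summarizing role is played by the HiVis-critic model rather than a string rule. The sketch only shows the recursion itself, where each new history is computed from the previous history and the latest step, so raw steps never accumulate.

```python
from typing import Callable, List

# Hypothetical signature: (previous macro-action history, latest raw step) -> new history.
Summarizer = Callable[[List[str], str], List[str]]


def track_history(steps: List[str], summarize: Summarizer) -> List[str]:
    """Fold raw steps into a compact history, one step at a time."""
    history: List[str] = []
    for step in steps:
        # Recursive update: only the compressed history is carried forward.
        history = summarize(history, step)
    return history


def toy_summarizer(history: List[str], step: str) -> List[str]:
    """Rule-based stand-in for the critic (assumption for illustration only).

    Each toy step is written as "<goal>: <low-level action>", and consecutive
    steps that share a goal collapse into a single achieved-goal entry.
    """
    goal, _, _ = step.partition(": ")
    if history and history[-1] == goal:
        return history
    return history + [goal]


raw_steps = [
    "open settings: click gear icon",
    "open settings: click 'Preferences'",
    "enable dark mode: toggle switch",
]
macro_history = track_history(raw_steps, toy_summarizer)
```

Here three low-level steps collapse into two achieved goals, which is the kind of compact record a policy could condition on instead of the full interaction trace.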

Model Sources

Overview of HiVis

Uses

Test-time intervention

HiVis-critic serves as an intervention engine that aids policies in long-horizon GUI tasks. It generates two kinds of critique: a visually grounded error analysis of each proposed action before execution, and a tracked history state that supports better-informed decisions.
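
As a rough illustration of how such an intervention step might be wired around a policy, the sketch below runs a propose, critique, revise cycle before anything is executed. All names (`Action`, `Critique`, `intervene`, `toy_policy`, `toy_critic`, the `max_revisions` budget, the `"coordinate"` label) are assumptions made for this example and are not the HiVis API. The toy critic is a simple bounding-box check standing in for the multimodal model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    kind: str
    xy: Tuple[int, int]


@dataclass
class Critique:
    ok: bool
    error_dimension: Optional[str] = None  # hypothetical label, e.g. "coordinate"
    guidance: Optional[str] = None


def intervene(
    propose: Callable[[dict, List[str], Optional[str]], Action],
    critique: Callable[[dict, List[str], Action], Critique],
    screenshot: dict,
    history: List[str],
    max_revisions: int = 2,  # assumed budget, not taken from the source
) -> Action:
    """Check each proposed action before execution and request a revision if flawed."""
    action = propose(screenshot, history, None)
    for _ in range(max_revisions):
        verdict = critique(screenshot, history, action)
        if verdict.ok:
            break
        # Feed the corrective guidance back to the policy.
        action = propose(screenshot, history, verdict.guidance)
    # If the budget runs out, the last proposal is returned without a final check.
    return action


def toy_policy(screenshot: dict, history: List[str], feedback: Optional[str]) -> Action:
    # First attempt misses; after feedback, click the centre of the target box.
    if feedback is None:
        return Action("click", (5, 5))
    x0, y0, x1, y1 = screenshot["target_box"]
    return Action("click", ((x0 + x1) // 2, (y0 + y1) // 2))


def toy_critic(screenshot: dict, history: List[str], action: Action) -> Critique:
    # Stand-in for the critic model: a plain hit test against a known box.
    x0, y0, x1, y1 = screenshot["target_box"]
    x, y = action.xy
    if x0 <= x <= x1 and y0 <= y <= y1:
        return Critique(ok=True)
    return Critique(False, "coordinate", "The click lands outside the target element.")


screen = {"target_box": (100, 40, 180, 70)}
final_action = intervene(toy_policy, toy_critic, screen, history=["open settings"])
```

In this toy run the first click at (5, 5) is rejected, the policy revises using the guidance, and the revised click passes the check before it would be executed.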

Citation

If you find this work useful, please consider citing us:

@article{lee2026hisvis,
      title={A History-Aware Visually Grounded Critic for Computer Use Agents},
      author={Jaewoo Lee and Zaid Khan and Archiki Prasad and Justin Chih-Yao Chen and Supriyo Chakraborty and Kartik Balasubramaniam and Sambit Sahu and Elias Stengel-Eskin and Hyunji Lee and Mohit Bansal},
      year={2026},
      journal={arXiv preprint arXiv:tbd},
      url={https://arxiv.org/abs/tbd},
}
Safetensors · Model size: 9B params · Tensor type: BF16