MILR: Improving Multimodal Image Generation via Test-Time Latent Reasoning Paper • 2509.22761 • Published Sep 26, 2025
MickeyMouse2024/qwen3-vl-8b-pointing-natural-synthetic Image-Text-to-Text • 770k • Updated Jan 30
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning Paper • 2504.21561 • Published Apr 30, 2025