Renrui

ZrrSkywalker

https://github.com/ZrrSkywalker

AI & ML interests

Computer Vision

Recent Activity

upvoted a paper 19 days ago

ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both

Paper • 2605.15198 • Published 20 days ago • 19

upvoted a paper about 2 months ago

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

Paper • 2604.08545 • Published Apr 9 • 41

upvoted a paper 6 months ago

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Paper • 2512.10949 • Published Dec 11, 2025 • 47

authored a paper 6 months ago

Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation

Paper • 2511.16671 • Published Nov 20, 2025 • 16

upvoted a paper 6 months ago

Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation

Paper • 2511.16671 • Published Nov 20, 2025 • 16

commented a paper 6 months ago

Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation

Paper • 2511.16671 • Published Nov 20, 2025 • 16

upvoted a paper 7 months ago

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Paper • 2510.26802 • Published Oct 30, 2025 • 34

commented a paper 7 months ago

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Paper • 2510.26802 • Published Oct 30, 2025 • 34

upvoted a paper 8 months ago

Artificial Hippocampus Networks for Efficient Long-Context Modeling

Paper • 2510.07318 • Published Oct 8, 2025 • 32

upvoted 2 papers over 1 year ago

Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4, 2024 • 25

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 24

commented a paper over 1 year ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 24

liked a dataset over 1 year ago

CaraJ/MMSearch

Viewer • Updated Apr 5 • 900 • 611 • 25

upvoted 2 papers over 1 year ago

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines

Paper • 2409.12959 • Published Sep 19, 2024 • 38

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published Sep 19, 2024 • 50

updated a collection almost 2 years ago

SAM2Point

Collection

1 item • Updated Aug 30, 2024

upvoted a paper almost 2 years ago

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

Paper • 2408.16768 • Published Aug 29, 2024 • 28

liked a Space almost 2 years ago

SAM2Point

🌖

Segment Any 3D as Videos

upvoted a paper almost 2 years ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6, 2024 • 61
