hjkim

hojie11

AI & ML interests

Computer Vision, 3D Vision, Anomaly Detection

Recent Activity

upvoted a paper about 23 hours ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

upvoted a paper 6 days ago

GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation

upvoted a paper 6 days ago

AnimateAnything: Consistent and Controllable Animation for Video Generation

Organizations

None yet

hojie11's activity

upvoted a paper about 23 hours ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published 10 days ago • 56

upvoted 3 papers 6 days ago

upvoted 3 papers 12 days ago

GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published 21 days ago • 20

JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

Paper • 2411.07975 • Published 13 days ago • 24

SAMPart3D: Segment Any Part in 3D Objects

Paper • 2411.07184 • Published 14 days ago • 26

upvoted 3 papers 14 days ago

CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM

Paper • 2411.04954 • Published 18 days ago • 8

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published 14 days ago • 60

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published 14 days ago • 28

upvoted a paper about 1 month ago

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Paper • 2410.10139 • Published Oct 14 • 50

upvoted 2 papers 3 months ago

Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2 • 94

TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models

Paper • 2408.11318 • Published Aug 21 • 54

upvoted an article 4 months ago

ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models

Article • Jul 27 • 24

upvoted a paper 5 months ago

Unveiling Encoder-Free Vision-Language Models

Paper • 2406.11832 • Published Jun 17 • 49