Daniel

LighterDarkness

AI & ML interests

• ML • DL • ANN • CNN • RNN • Transformers • GNN • 3DGS • NeRF • CV • World Models • VFM • Gaussian Avatars • NLP • NLU • LLM • MLLM • SLM • RAG • Agentic AI • VLA Models • Latent World Models • Dynamic 3DGS

Organizations

None yet

Recent Activity

upvoted 4 papers 4 days ago

CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation

Paper • 2504.00043 • Published Mar 30, 2025 • 10

SE-DiCoW: Self-Enrolled Diarization-Conditioned Whisper

Paper • 2601.19194 • Published 7 days ago • 3

OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution

Paper • 2601.20380 • Published 5 days ago • 8

Advancing Open-source World Models

Paper • 2601.20540 • Published 5 days ago • 105

upvoted a paper 6 days ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 235

upvoted a paper 8 days ago

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Paper • 2601.16163 • Published 11 days ago • 13

upvoted 3 papers 12 days ago

CamCloneMaster: Enabling Reference-based Camera Control for Video Generation

Paper • 2506.03140 • Published Jun 3, 2025 • 1

Motion Attribution for Video Generation

Paper • 2601.08828 • Published 20 days ago • 70

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published Dec 9, 2025 • 132

upvoted a paper 18 days ago

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 292

upvoted 3 papers 25 days ago

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 130

DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Paper • 2601.01425 • Published 29 days ago • 52

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published 27 days ago • 145

upvoted a collection 12 months ago

Papers

Large Language Model (LLM) and NLP related papers. • 342 items • Updated 4 days ago • 13

upvoted a paper about 1 year ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 300