LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning Paper • 2510.09189 • Published 5 days ago • 3
TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling Paper • 2510.04533 • Published 10 days ago • 44
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published 8 days ago • 48
Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models Paper • 2510.02300 • Published 13 days ago • 5
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training Paper • 2509.26625 • Published 15 days ago • 42
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published 13 days ago • 88
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning Paper • 2509.25760 • Published 16 days ago • 51
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation Paper • 2509.24335 • Published 17 days ago • 6
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts Paper • 2508.07785 • Published Aug 11 • 28
Common Diffusion Noise Schedules and Sample Steps are Flawed Paper • 2305.08891 • Published May 15, 2023 • 12
CAME: Confidence-guided Adaptive Memory Efficient Optimization Paper • 2307.02047 • Published Jul 5, 2023 • 1
SMMF: Square-Matricized Momentum Factorization for Memory-Efficient Optimization Paper • 2412.08894 • Published Dec 12, 2024 • 1
Can Understanding and Generation Truly Benefit Together -- or Just Coexist? Paper • 2509.09666 • Published Sep 11 • 33
Set Block Decoding is a Language Model Inference Accelerator Paper • 2509.04185 • Published Sep 4 • 52
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic Paper • 2509.01363 • Published Sep 1 • 57