Roman Abramov

monsetrum

upvoted 20 papers 4 months ago

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Paper • 2505.02835 • Published May 5 • 28

A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

Paper • 2505.01658 • Published May 3 • 39

Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Paper • 2504.21117 • Published Apr 29 • 26

Real-World Gaps in AI Governance Research

Paper • 2505.00174 • Published Apr 30 • 12

CORG: Generating Answers from Complex, Interrelated Contexts

Paper • 2505.00023 • Published Apr 25 • 9

WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation

Paper • 2505.01490 • Published May 2 • 5

TeLoGraF: Temporal Logic Planning via Graph-encoded Flow Matching

Paper • 2505.00562 • Published May 1 • 4

X-Cross: Dynamic Integration of Language Models for Cross-Domain Sequential Recommendation

Paper • 2504.20859 • Published Apr 29 • 4

Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

Paper • 2505.02156 • Published May 4 • 18

SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations

Paper • 2505.02094 • Published May 4 • 19

Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

Paper • 2505.01441 • Published Apr 28 • 39

SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing

Paper • 2505.02370 • Published May 5 • 14

Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities

Paper • 2505.01043 • Published May 2 • 10

Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction

Paper • 2505.02471 • Published May 5 • 12

MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing

Paper • 2505.02823 • Published May 5 • 5

TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action

Paper • 2505.01583 • Published May 2 • 9

LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis

Paper • 2505.02625 • Published May 5 • 22

Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation

Paper • 2505.01456 • Published May 1 • 2

Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields

Paper • 2505.02005 • Published May 4 • 3

Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers

Paper • 2504.20752 • Published Apr 29 • 93