pythagoras

dingangui

AI & ML interests

None yet

Recent Activity

upvoted a paper 14 days ago

Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention

Paper • 2510.13940 • Published 16 days ago • 6

upvoted a paper about 1 month ago

MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks

Paper • 2509.14638 • Published Sep 18 • 11

upvoted a paper 2 months ago

ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks

Paper • 2508.08240 • Published Aug 11 • 45

upvoted a paper 3 months ago

Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models

Paper • 2508.09138 • Published Aug 12 • 36

upvoted 3 papers 5 months ago

Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

Paper • 2505.21457 • Published May 27 • 15

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Paper • 2505.20256 • Published May 26 • 18

Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models

Paper • 2505.18536 • Published May 24 • 18

upvoted 13 papers 8 months ago

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Paper • 2503.08625 • Published Mar 11 • 27

X-Dancer: Expressive Music to Human Dance Video Generation

Paper • 2502.17414 • Published Feb 24 • 14

MONSTER: Monash Scalable Time Series Evaluation Repository

Paper • 2502.15122 • Published Feb 21 • 4

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers

Paper • 2502.15894 • Published Feb 21 • 20

VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing

Paper • 2502.17258 • Published Feb 24 • 79

Beyond Release: Access Considerations for Generative AI Systems

Paper • 2502.16701 • Published Feb 23 • 16

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published Feb 19 • 69

Forecasting Open-Weight AI Model Growth on Hugging Face

Paper • 2502.15987 • Published Feb 21 • 10

Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration

Paper • 2502.17110 • Published Feb 24 • 13

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published Feb 24 • 32

Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties

Paper • 2502.16922 • Published Feb 24 • 8

Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning

Paper • 2502.17407 • Published Feb 24 • 26

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

Paper • 2502.16033 • Published Feb 22 • 18