Yuanshi PRO

Yuanshi

AI & ML interests

Reinforcement Learning; Large Language Models; Multimodality; AI Infrastructure

Recent Activity

liked a dataset 9 days ago

QingyanBai/Ditto-1M

upvoted a paper 9 days ago

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Paper • 2510.15742 • Published 12 days ago • 49

upvoted a paper 22 days ago

MixReasoning: Switching Modes to Think

Paper • 2510.06052 • Published 22 days ago • 21

upvoted a paper 29 days ago

dParallel: Learnable Parallel Decoding for dLLMs

Paper • 2509.26488 • Published 29 days ago • 19

upvoted 3 papers 4 months ago

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7 • 63

Test3R: Learning to Reconstruct 3D at Test Time

Paper • 2506.13750 • Published Jun 16 • 27

Discrete Diffusion in Large Language and Multimodal Models: A Survey

Paper • 2506.13759 • Published Jun 16 • 43

upvoted 3 papers 5 months ago

Image Editing As Programs with Diffusion Models

Paper • 2506.04158 • Published Jun 4 • 24

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Paper • 2505.17941 • Published May 23 • 25

Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22 • 22

upvoted 3 papers 10 months ago

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 41

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Paper • 2412.16112 • Published Dec 20, 2024 • 23

GUI Agents: A Survey

Paper • 2412.13501 • Published Dec 18, 2024 • 29

upvoted 5 papers 11 months ago

Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

Paper • 2411.17787 • Published Nov 26, 2024 • 12

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 88

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Paper • 2411.15466 • Published Nov 23, 2024 • 39

Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published Nov 22, 2024 • 39

OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published Nov 22, 2024 • 61

upvoted 3 papers about 1 year ago

Attention Prompting on Image for Large Vision-Language Models

Paper • 2409.17143 • Published Sep 25, 2024 • 7

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34

Heavy Labels Out! Dataset Distillation with Label Space Lightening

Paper • 2408.08201 • Published Aug 15, 2024 • 21