Mengzhao Chen

ChenMnZ

https://chenmnz.github.io/

AI & ML interests

model compression

Recent Activity

upvoted a paper 1 day ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

upvoted a paper 7 days ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

upvoted a paper 25 days ago

Seedream 4.0: Toward Next-generation Multimodal Image Generation

Organizations

None yet

upvoted a paper 1 day ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published 3 days ago • 57

upvoted a paper 7 days ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published 7 days ago • 162

upvoted a paper 25 days ago

Seedream 4.0: Toward Next-generation Multimodal Image Generation

Paper • 2509.20427 • Published 26 days ago • 73

upvoted an article 2 months ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Article • Oct 7, 2024 • 54

liked a dataset 2 months ago

togethercomputer/Long-Data-Collections

Viewer • Updated Jan 4 • 4.12M • 469 • 154

upvoted 2 papers 5 months ago

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

Paper • 2505.16933 • Published May 22 • 34

Scaling Diffusion Transformers Efficiently via μP

Paper • 2505.15270 • Published May 21 • 35

authored 3 papers 5 months ago

upvoted a paper 5 months ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20 • 75

commented on a paper 5 months ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20 • 75

upvoted a paper 5 months ago

Emerging Properties in Unified Multimodal Pretraining

Paper • 2505.14683 • Published May 20 • 132

commented on a paper 5 months ago

Model Merging in Pre-training of Large Language Models

Paper • 2505.12082 • Published May 17 • 40

upvoted 2 papers 5 months ago

Model Merging in Pre-training of Large Language Models

Paper • 2505.12082 • Published May 17 • 40

DanceGRPO: Unleashing GRPO on Visual Generation

Paper • 2505.07818 • Published May 12 • 32

authored a paper 5 months ago

DanceGRPO: Unleashing GRPO on Visual Generation

Paper • 2505.07818 • Published May 12 • 32

liked a model 5 months ago

mlfoundations/scaling

Updated Mar 15, 2024 • 4

liked a model 8 months ago

nvidia/DeepSeek-R1-FP4

Text Generation • Updated Jun 6 • 7.32k • 265

upvoted a paper 8 months ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 58
