76

Mahdi Pourmirzaei

Mahdip72

AI & ML interests

None yet

Organizations

None yet

Mahdip72's activity

upvoted 2 papers 18 days ago

The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

Paper • 2411.07279 • Published 23 days ago • 3

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published 21 days ago • 41

upvoted 3 papers about 1 month ago

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Paper • 2410.13848 • Published Oct 17 • 29

Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published Oct 2 • 13

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3 • 36

upvoted 2 papers about 2 months ago

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1 • 144

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2 • 40

upvoted 4 papers 2 months ago

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26 • 51

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20 • 58

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19 • 135

upvoted 6 papers 3 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18 • 137

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models

Paper • 2408.06663 • Published Aug 13 • 15

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published Sep 4 • 28

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27 • 121

Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22 • 25

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22 • 89

upvoted a collection 3 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 630

upvoted 2 papers 4 months ago

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 33

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15 • 44