paper2read - a Chuanming Collection

Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Chuanming 's Collections

paper2read

updated 6 days ago

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 20
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 12
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 14
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 49
Pearl: A Production-ready Reinforcement Learning Agent

Paper • 2312.03814 • Published Dec 6, 2023 • 16
TinySAM: Pushing the Envelope for Efficient Segment Anything Model

Paper • 2312.13789 • Published Dec 21, 2023 • 15
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation

Paper • 2312.17276 • Published Dec 27, 2023 • 16
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Paper • 2204.05862 • Published Apr 12, 2022 • 2
Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 83
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 189
Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 66
A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA

Paper • 2312.03732 • Published Nov 28, 2023 • 10
Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 122
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8, 2024 • 74
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon

Paper • 2401.03462 • Published Jan 7, 2024 • 28
MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation

Paper • 2401.04468 • Published Jan 9, 2024 • 50
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Paper • 2401.04658 • Published Jan 9, 2024 • 28
Masked Audio Generation using a Single Non-Autoregressive Transformer

Paper • 2401.04577 • Published Jan 9, 2024 • 44
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages

Paper • 2401.05811 • Published Jan 11, 2024 • 8
Self-Instruct: Aligning Language Model with Self Generated Instructions

Paper • 2212.10560 • Published Dec 20, 2022 • 9
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference

Paper • 2401.08671 • Published Jan 9, 2024 • 15
Scalable Pre-training of Large Autoregressive Image Models

Paper • 2401.08541 • Published Jan 16, 2024 • 39
DiffusionGPT: LLM-Driven Text-to-Image Generation System

Paper • 2401.10061 • Published Jan 18, 2024 • 32
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 152
Zero Bubble Pipeline Parallelism

Paper • 2401.10241 • Published Nov 30, 2023 • 25
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19, 2024 • 59
Lost in the Middle: How Language Models Use Long Contexts

Paper • 2307.03172 • Published Jul 6, 2023 • 41
AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

Paper • 2401.12963 • Published Jan 23, 2024 • 12
Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23, 2024 • 86
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 36
Proactive Detection of Voice Cloning with Localized Watermarking

Paper • 2401.17264 • Published Jan 30, 2024 • 19
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 23
MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices

Paper • 2311.16567 • Published Nov 28, 2023 • 20
A Long Way to Go: Investigating Length Correlations in RLHF

Paper • 2310.03716 • Published Oct 5, 2023 • 10
Efficient Exploration for LLMs

Paper • 2402.00396 • Published Feb 1, 2024 • 23
Transforming and Combining Rewards for Aligning Large Language Models

Paper • 2402.00742 • Published Feb 1, 2024 • 12
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Paper • 2401.15947 • Published Jan 29, 2024 • 54
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset

Paper • 2402.10176 • Published Feb 15, 2024 • 38
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Paper • 2309.14509 • Published Sep 25, 2023 • 20
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 61
S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 32
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 89
NExT-GPT: Any-to-Any Multimodal LLM

Paper • 2309.05519 • Published Sep 11, 2023 • 78
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Paper • 2401.16158 • Published Jan 29, 2024 • 21
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Paper • 2403.07816 • Published Mar 12, 2024 • 45
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 129
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Paper • 2402.09844 • Published Feb 15, 2024 • 21
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 244
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22, 2024 • 128
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25, 2024 • 80
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 134
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 128
RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 72
LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15, 2024 • 89
Pheme: Efficient and Conversational Speech Generation

Paper • 2401.02839 • Published Jan 5, 2024 • 18
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20, 2024 • 39
Parrot: Enhancing Multi-Turn Chat Models by Learning to Ask Questions

Paper • 2310.07301 • Published Oct 11, 2023 • 1
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

Paper • 2405.11273 • Published May 18, 2024 • 19
SimPO: Simple Preference Optimization with a Reference-Free Reward

Paper • 2405.14734 • Published May 23, 2024 • 11
Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 31
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 78
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2, 2024 • 34
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

Paper • 2406.04314 • Published Jun 6, 2024 • 31
Scalable Diffusion Models with Transformers

Paper • 2212.09748 • Published Dec 19, 2022 • 18
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Paper • 2402.14740 • Published Feb 22, 2024 • 14
RewardBench: Evaluating Reward Models for Language Modeling

Paper • 2403.13787 • Published Mar 20, 2024 • 23
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 94
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

Paper • 2406.02430 • Published Jun 4, 2024 • 39
MobileVLM : A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices

Paper • 2312.16886 • Published Dec 28, 2023 • 21
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6, 2024 • 15
Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 167
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 71
Imagen 3

Paper • 2408.07009 • Published Aug 13, 2024 • 63
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101
mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval

Paper • 2407.19669 • Published Jul 29, 2024 • 25
Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 132
Generative Verifiers: Reward Modeling as Next-Token Prediction

Paper • 2408.15240 • Published Aug 27, 2024 • 13
Language Model Can Listen While Speaking

Paper • 2408.02622 • Published Aug 5, 2024 • 43
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild

Paper • 2409.03753 • Published Sep 5, 2024 • 19
LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Paper • 2409.06666 • Published Sep 10, 2024 • 59
MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis

Paper • 2409.07129 • Published Sep 11, 2024 • 8
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 84
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published Sep 18, 2024 • 78
Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13598 • Published Sep 20, 2024 • 45
Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11, 2024 • 88
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

Paper • 2410.08146 • Published Oct 10, 2024 • 1
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21, 2024 • 45
Baichuan Alignment Technical Report

Paper • 2410.14940 • Published Oct 19, 2024 • 52
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Paper • 2410.13861 • Published Oct 17, 2024 • 57
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8, 2024 • 83
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs

Paper • 2410.18451 • Published Oct 24, 2024 • 20
Continuous Speech Synthesis using per-token Latent Diffusion

Paper • 2410.16048 • Published Oct 21, 2024 • 30
Fast Best-of-N Decoding via Speculative Rejection

Paper • 2410.20290 • Published Oct 26, 2024 • 10
Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Paper • 2410.21845 • Published Oct 29, 2024 • 16
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

Paper • 2410.19609 • Published Oct 25, 2024 • 18
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Paper • 2410.20424 • Published Oct 27, 2024 • 41
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 18
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Paper • 2410.22391 • Published Oct 29, 2024 • 22
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Paper • 2410.23168 • Published Oct 30, 2024 • 24
Stealing User Prompts from Mixture of Experts

Paper • 2410.22884 • Published Oct 30, 2024 • 14
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

Paper • 2410.02884 • Published Oct 3, 2024 • 55
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published Nov 4, 2024 • 26
Watermark Anything with Localized Messages

Paper • 2411.07231 • Published Nov 11, 2024 • 22
LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 128
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 62
Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published Nov 21, 2024 • 47
SpiRit-LM: Interleaved Spoken and Written Language Model

Paper • 2402.05755 • Published Feb 8, 2024 • 15
PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 134
Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 147
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 154
A Survey of Small Language Models

Paper • 2410.20011 • Published Oct 25, 2024 • 45
Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 374
Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1, 2024 • 169
DynaSaur: Large Language Agents Beyond Predefined Actions

Paper • 2411.01747 • Published Nov 4, 2024 • 37
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 64
Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 10
Self-Consistency Improves Chain of Thought Reasoning in Language Models

Paper • 2203.11171 • Published Mar 21, 2022 • 5
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 100
A2C is a special case of PPO

Paper • 2205.09123 • Published May 18, 2022 • 1
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 284
Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 102
ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 85
CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval

Paper • 2411.12644 • Published Nov 19, 2024 • 4
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17 • 51
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 70
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 67
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published Jan 18 • 15
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17 • 22
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Paper • 2407.01906 • Published Jul 2, 2024 • 43
LIMA: Less Is More for Alignment

Paper • 2305.11206 • Published May 18, 2023 • 26
Chain-of-Retrieval Augmented Generation

Paper • 2501.14342 • Published Jan 24 • 60
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published Jan 30 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published Feb 4 • 67
The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 100
Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 88
Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 203
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

Paper • 2405.00233 • Published Apr 30, 2024 • 18
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning

Paper • 2503.05379 • Published Mar 7 • 39
Learning from Failures in Multi-Attempt Reinforcement Learning

Paper • 2503.04808 • Published Mar 4 • 18
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Paper • 2503.06749 • Published Mar 9 • 32
Gemini Embedding: Generalizable Embeddings from Gemini

Paper • 2503.07891 • Published Mar 10 • 43
Charting and Navigating Hugging Face's Model Atlas

Paper • 2503.10633 • Published Mar 13 • 88
Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published Mar 21 • 37
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 302
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

Paper • 2504.02587 • Published Apr 3 • 33
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published Apr 8 • 86
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

Paper • 2504.04842 • Published Apr 7 • 36
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 139
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 61
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning

Paper • 2503.21620 • Published Mar 27 • 63
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 134
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

Paper • 2504.15521 • Published Apr 22 • 64
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

Paper • 2504.16078 • Published Apr 22 • 20
Reasoning Models Can Be Effective Without Thinking

Paper • 2504.09858 • Published Apr 14 • 12
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection

Paper • 2505.07293 • Published May 12 • 27
Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11 • 150
DreamO: A Unified Framework for Image Customization

Paper • 2504.16915 • Published Apr 23 • 25
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

Paper • 2505.07916 • Published May 12 • 132
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21 • 55
Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20 • 76
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Paper • 2505.15277 • Published May 21 • 103
Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20 • 45
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21 • 96
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23 • 60
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 130
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

Paper • 2401.18059 • Published Jan 31, 2024 • 46
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Paper • 2506.05301 • Published Jun 5 • 55
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9 • 76
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published 29 days ago • 122
Step-Audio 2 Technical Report

Paper • 2507.16632 • Published Jul 22 • 67

Collection guide
Browse collections

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs