Collections
Collections including paper arxiv:2412.00154
-
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS
Paper • 2411.18478 • Published • 26
o1-Coder: an o1 Replication for Coding
Paper • 2412.00154 • Published • 29
A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models
Paper • 2411.19477 • Published • 3
Reverse Thinking Makes LLMs Stronger Reasoners
Paper • 2411.19865 • Published • 13
-
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
Paper • 2410.18603 • Published • 30
A Survey of Small Language Models
Paper • 2410.20011 • Published • 39
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
Paper • 2410.21220 • Published • 8
o1-Coder: an o1 Replication for Coding
Paper • 2412.00154 • Published • 29
-
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection
Paper • 2409.08513 • Published • 11
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
Paper • 2409.08264 • Published • 43
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
Paper • 2409.12191 • Published • 74
LLMs + Persona-Plug = Personalized LLMs
Paper • 2409.11901 • Published • 30
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 32
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 25
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 121
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 21
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 55
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 51
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 41
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 51
-
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Paper • 2407.06027 • Published • 8
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Paper • 2407.09025 • Published • 129
Toto: Time Series Optimized Transformer for Observability
Paper • 2407.07874 • Published • 29
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
Paper • 2407.09413 • Published • 9
-
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 66
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 126
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 53
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 86