Collections

Collections including paper arxiv:2412.14689

Perception and abstraction. Each modality is tokenized and embedded into vectors for the model to comprehend.

about 18 hours ago

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 39
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30 • 116
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26 • 47
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28 • 42

Position Papers

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published 4 days ago • 41
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published 4 days ago • 19
The Open Source Advantage in Large Language Models (LLMs)

Paper • 2412.12004 • Published 6 days ago • 8

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published 6 days ago • 9

VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

Paper • 2412.10704 • Published 9 days ago • 14
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37

Data and other things

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published 4 days ago • 49
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37

MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation

Paper • 2412.07147 • Published 13 days ago • 5
Grounding Descriptions in Images informs Zero-Shot Visual Recognition

Paper • 2412.04429 • Published 17 days ago
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models

Paper • 2412.05939 • Published 15 days ago • 12
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

Paper • 2412.08737 • Published 11 days ago • 51

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 18 days ago • 43
Smaller Language Models Are Better Instruction Evolvers

Paper • 2412.11231 • Published 7 days ago • 24
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published Oct 13 • 54
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 4 days ago • 37

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21 • 57
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17 • 51
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 41
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20 • 52
