
Collections

Collections including paper arxiv:2412.06769

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 39
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 20

Perception and abstraction. Each modality is tokenized and embedded into vectors for the model to comprehend.

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 39
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30 • 116
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26 • 47
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28 • 42

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 144
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 12
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 51
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 45

Papers - CoT - Latent Search Tree

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54

Papers - Reasoning - CoT - Tree Search - BFS

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 5 days ago • 50

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 9 days ago • 54
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 44
Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 5
