Collections

Collections including paper arxiv:2412.08635

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21 • 47
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 603
Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23 • 70
Humanoid Locomotion as Next Token Prediction

Paper • 2402.19469 • Published Feb 29 • 26

Interesting new techniques

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 64
Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23 • 86
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU

Paper • 2403.06504 • Published Mar 11 • 53
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

Paper • 2403.20041 • Published Mar 29 • 34

multi-modalities

AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

Paper • 2309.16058 • Published Sep 27, 2023 • 55
OneLLM: One Framework to Align All Modalities with Language

Paper • 2312.03700 • Published Dec 6, 2023 • 20
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

Paper • 2402.07865 • Published Feb 12 • 12
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

Paper • 2401.08740 • Published Jan 16 • 12

Diffusion Paper

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published 7 days ago • 36

