Collections

Collections including paper arxiv:2404.19753

DOCCI: Descriptions of Connected and Contrasting Images

Paper • 2404.19753 • Published Apr 30 • 11

InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation

Paper • 2404.19427 • Published Apr 30 • 71
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

Paper • 2404.19759 • Published Apr 30 • 24
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Paper • 2404.19752 • Published Apr 30 • 22
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting

Paper • 2404.19758 • Published Apr 30 • 10

Papers - Image - Annotation UI

DOCCI: Descriptions of Connected and Contrasting Images

Paper • 2404.19753 • Published Apr 30 • 11

Papers - Image - Annotation Pipeline

DOCCI: Descriptions of Connected and Contrasting Images

Paper • 2404.19753 • Published Apr 30 • 11

Papers - Image - Datasets - DOCCI

DOCCI: Descriptions of Connected and Contrasting Images

Paper • 2404.19753 • Published Apr 30 • 11

Papers - University of North Carolina Chapel Hill

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Paper • 2404.09967 • Published Apr 15 • 20
DOCCI: Descriptions of Connected and Contrasting Images

Paper • 2404.19753 • Published Apr 30 • 11

Papers - University - Princeton University

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Paper • 2404.07413 • Published Apr 11 • 36
Allowing humans to interactively guide machines where to look does not always improve a human-AI team's classification accuracy

Paper • 2404.05238 • Published Apr 8 • 3
Cognitive Architectures for Language Agents

Paper • 2309.02427 • Published Sep 5, 2023 • 8
Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings

Paper • 2305.13571 • Published May 23, 2023 • 2

Papers - Image - Frechet Inception Distance (FID)

https://machinelearningmastery.com/how-to-implement-the-frechet-inception-distance-fid-from-scratch/

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

Paper • 2310.03502 • Published Oct 5, 2023 • 77
GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
Music Consistency Models

Paper • 2404.13358 • Published Apr 20 • 12
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Paper • 2404.14507 • Published Apr 22 • 21

Papers - Image - Coco Testing

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

Paper • 2310.03502 • Published Oct 5, 2023 • 77
Transferable and Principled Efficiency for Open-Vocabulary Segmentation

Paper • 2404.07448 • Published Apr 11 • 11
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11 • 30
COCONut: Modernizing COCO Segmentation

Paper • 2404.08639 • Published Apr 12 • 27

Papers - Google

Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23 • 86
Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27 • 24
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Paper • 2403.18818 • Published Mar 27 • 25
TC4D: Trajectory-Conditioned Text-to-4D Generation

Paper • 2403.17920 • Published Mar 26 • 16
