Collections including paper arxiv:2301.07093

Papers - Image - Keypoint

GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3

Papers - Image - Inpainting

GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3

Papers - Image - Object Detection - YOLO

GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
YOLO-World: Real-Time Open-Vocabulary Object Detection

Paper • 2401.17270 • Published Jan 30 • 32

Papers - Inference - Scheduled Sampling

Scheduled sampling improves visual quality: the rough concept location and outline are decided in the early stages, followed by fine-grained details in later stages.

GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
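
The note on scheduled sampling above can be illustrated with a minimal sketch. It assumes the scheme described in the GLIGEN paper, where a gate value is set to 1 (grounding on) for the first fraction tau of the denoising steps and to 0 (grounding off) for the rest. The function name `gate_schedule` and its parameters are hypothetical and not taken from any library.

```python
def gate_schedule(num_steps: int, tau: float) -> list[float]:
    """Return one gate value per denoising step (hypothetical helper).

    Assumption: the gate is 1.0 for the first tau * num_steps steps,
    where layout and rough outline are decided, and 0.0 afterwards,
    where the model refines fine-grained details without grounding.
    """
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    # round() avoids off-by-one results from float products such as 0.29 * 100
    cutoff = round(tau * num_steps)
    return [1.0 if step < cutoff else 0.0 for step in range(num_steps)]


if __name__ == "__main__":
    # With 10 steps and tau = 0.3, only the first 3 steps use grounding.
    print(gate_schedule(10, 0.3))
```

Setting tau to 1.0 keeps grounding on for every step, and setting it to 0.0 disables it entirely; values in between trade layout control against the quality of late-stage detail.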

Papers - Attention - Gated Self-Attention - Spatial Grounding

GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3

Papers - Text - Instruct - Grounding and Captions

GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3

Papers - Image - Glip

Core techniques: 1) unified grounding loss, 2) language-aware deep fusion, 3) pre-training on both detection and grounding data.

Grounded Language-Image Pre-training

Paper • 2112.03857 • Published Dec 7, 2021 • 3
GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
YOLO-World: Real-Time Open-Vocabulary Object Detection

Paper • 2401.17270 • Published Jan 30 • 32

Papers - Image - Dataset - LVIS

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11 • 30
COCONut: Modernizing COCO Segmentation

Paper • 2404.08639 • Published Apr 12 • 27
GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
Grounded Language-Image Pre-training

Paper • 2112.03857 • Published Dec 7, 2021 • 3

Papers - University - Columbia University

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11 • 30
GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation

Paper • 2404.13026 • Published Apr 19 • 23
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24 • 12

Papers - Image - Frechet Inception Distance (FID)

https://machinelearningmastery.com/how-to-implement-the-frechet-inception-distance-fid-from-scratch/

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

Paper • 2310.03502 • Published Oct 5, 2023 • 77
GLIGEN: Open-Set Grounded Text-to-Image Generation

Paper • 2301.07093 • Published Jan 17, 2023 • 3
Music Consistency Models

Paper • 2404.13358 • Published Apr 20 • 12
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Paper • 2404.14507 • Published Apr 22 • 21
