diffusion - a williamcstanford Collection

williamcstanford 's Collections

video segmentation

RL

LLMs

Autonomous agents

Transformer improvements

video understanding

brain

singing portraits

Depth Estimation

Cellular Automata DL

Code Understanding

diffusion

updated Jul 12

A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation

Paper • 2310.16656 • Published Oct 25, 2023 • 40
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

Paper • 2310.16825 • Published Oct 25, 2023 • 32
Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 40
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Paper • 2311.04145 • Published Nov 7, 2023 • 32
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 80
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 57
AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort

Paper • 2311.11243 • Published Nov 19, 2023 • 14
NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Paper • 2311.12229 • Published Nov 20, 2023 • 26
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer

Paper • 2311.12052 • Published Nov 18, 2023 • 32
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning

Paper • 2311.12631 • Published Nov 21, 2023 • 13
VideoBooth: Diffusion-based Video Generation with Image Prompts

Paper • 2312.00777 • Published Dec 1, 2023 • 21
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

Paper • 2312.04433 • Published Dec 7, 2023 • 9
Clockwork Diffusion: Efficient Generation With Model-Step Distillation

Paper • 2312.08128 • Published Dec 13, 2023 • 12
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 69
DreamTuner: Single Image is Enough for Subject-Driven Generation

Paper • 2312.13691 • Published Dec 21, 2023 • 26
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models

Paper • 2312.16693 • Published Dec 27, 2023 • 13
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM

Paper • 2401.01256 • Published Jan 2 • 19
Improving Diffusion-Based Image Synthesis with Context Prediction

Paper • 2401.02015 • Published Jan 4 • 6
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5 • 40
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

Paper • 2401.05675 • Published Jan 11 • 20
Object-Centric Diffusion for Efficient Video Editing

Paper • 2401.05735 • Published Jan 11 • 6
PALP: Prompt Aligned Personalization of Text-to-Image Models

Paper • 2401.06105 • Published Jan 11 • 46
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21 • 21
Learning Continuous 3D Words for Text-to-Image Generation

Paper • 2402.08654 • Published Feb 13 • 9
PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Paper • 2402.08714 • Published Feb 13 • 10
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48
Neural Network Diffusion

Paper • 2402.13144 • Published Feb 20 • 94
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Paper • 2402.19481 • Published Feb 29 • 20
StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

Paper • 2403.09055 • Published Mar 14 • 24
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging

Paper • 2407.07315 • Published Jul 10 • 6