AI Math: Diffusion - a Lirbi Collection

updated 2 days ago

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22 • 63
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22 • 34
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22 • 15
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20 • 56
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20 • 11
CODE: Confident Ordinary Differential Editing

Paper • 2408.12418 • Published Aug 22 • 4
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26 • 60
Foundation Models for Music: A Survey

Paper • 2408.14340 • Published Aug 26 • 42
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27 • 121
Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation

Paper • 2408.15991 • Published Aug 28 • 15
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model

Paper • 2408.16767 • Published Aug 29 • 29
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

Paper • 2310.16834 • Published Oct 25, 2023 • 4
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Paper • 2408.17253 • Published Aug 30 • 35
FLUX that Plays Music

Paper • 2409.00587 • Published Sep 1 • 31
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3 • 35
Diffusion Policy Policy Optimization

Paper • 2409.00588 • Published Sep 1 • 19
LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3 • 32
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4 • 89
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation

Paper • 2409.02245 • Published Sep 3 • 9
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2 • 94
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Paper • 2409.03718 • Published Sep 5 • 25
Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Paper • 2409.04005 • Published Sep 6 • 16
SongCreator: Lyrics-based Universal Song Generation

Paper • 2409.06029 • Published Sep 9 • 21
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Paper • 2409.06135 • Published Sep 10 • 14
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published Sep 11 • 19
Instant Facial Gaussians Translator for Relightable and Interactable Facial Rendering

Paper • 2409.07441 • Published Sep 11 • 10
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation

Paper • 2409.08240 • Published Sep 12 • 18
DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors

Paper • 2409.08278 • Published Sep 12 • 10
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Paper • 2409.08270 • Published Sep 12 • 9
Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos

Paper • 2409.08353 • Published Sep 12 • 10
InstantDrag: Improving Interactivity in Drag-based Image Editing

Paper • 2409.08857 • Published Sep 13 • 30
A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis

Paper • 2409.08947 • Published Sep 13 • 11
DrawingSpinUp: 3D Animation from Single Character Drawings

Paper • 2409.08615 • Published Sep 13 • 14
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Paper • 2409.09214 • Published Sep 13 • 47
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17 • 25
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17 • 28
OSV: One Step is Enough for High-Quality Image to Video Generation

Paper • 2409.11367 • Published Sep 17 • 13
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published Sep 17 • 17
SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Paper • 2409.11211 • Published Sep 17 • 8
Single-Layer Learnable Activation for Implicit Neural Representation (SL^{2}A-INR)

Paper • 2409.10836 • Published Sep 17 • 4
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks

Paper • 2409.09323 • Published Sep 14 • 5
Towards Diverse and Efficient Audio Captioning via Diffusion Models

Paper • 2409.09401 • Published Sep 14 • 6
Vista3D: Unravel the 3D Darkside of a Single Image

Paper • 2409.12193 • Published Sep 18 • 9
LVCD: Reference-based Lineart Video Colorization with Diffusion Models

Paper • 2409.12960 • Published Sep 19 • 22
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

Paper • 2409.12957 • Published Sep 19 • 18
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt

Paper • 2409.12892 • Published Sep 19 • 5
Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation

Paper • 2409.12532 • Published Sep 19 • 5
FlexiTex: Enhancing Texture Generation with Visual Guidance

Paper • 2409.12431 • Published Sep 19 • 11
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling

Paper • 2409.16160 • Published Sep 24 • 32
Tabular Data Generation using Binary Diffusion

Paper • 2409.13882 • Published Sep 20 • 3
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23 • 22
MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Paper • 2409.15273 • Published Sep 23 • 10
MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting

Paper • 2409.14393 • Published Sep 22 • 7
SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending

Paper • 2409.13926 • Published Sep 20 • 5
Self-Supervised Audio-Visual Soundscape Stylization

Paper • 2409.14340 • Published Sep 22 • 2
MuCodec: Ultra Low-Bitrate Music Codec

Paper • 2409.13216 • Published Sep 20 • 22
Portrait Video Editing Empowered by Multimodal Generative Priors

Paper • 2409.13591 • Published Sep 20 • 15
Colorful Diffuse Intrinsic Image Decomposition in the Wild

Paper • 2409.13690 • Published Sep 20 • 12
V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians

Paper • 2409.13648 • Published Sep 20 • 9
Temporally Aligned Audio for Video with Autoregression

Paper • 2409.13689 • Published Sep 20 • 7
Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Paper • 2409.14677 • Published Sep 23 • 14
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Paper • 2409.18124 • Published Sep 26 • 31
Pixel-Space Post-Training of Latent Diffusion Models

Paper • 2409.17565 • Published Sep 26 • 19
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image

Paper • 2409.17280 • Published Sep 25 • 9
DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion

Paper • 2409.17145 • Published Sep 25 • 13
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors

Paper • 2409.17058 • Published Sep 25 • 11
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published Sep 27 • 25
Image Copy Detection for Diffusion Models

Paper • 2409.19952 • Published Sep 30 • 12
Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration

Paper • 2410.00418 • Published Oct 1 • 9
SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs

Paper • 2410.00337 • Published Oct 1 • 10
DressRecon: Freeform 4D Human Reconstruction from Monocular Video

Paper • 2409.20563 • Published Sep 30 • 7
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation

Paper • 2410.00890 • Published Oct 1 • 18
Cottention: Linear Transformers With Cosine Attention

Paper • 2409.18747 • Published Sep 27 • 15
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1 • 144
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation

Paper • 2410.01731 • Published Oct 2 • 15
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection

Paper • 2410.01647 • Published Oct 2 • 28
HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration

Paper • 2410.01723 • Published Oct 2 • 4
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Paper • 2410.02416 • Published Oct 3 • 25
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation

Paper • 2410.01680 • Published Oct 2 • 32
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control

Paper • 2410.00316 • Published Oct 1 • 4
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

Paper • 2410.04364 • Published Oct 6 • 27
Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published Oct 7 • 15
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

Paper • 2410.04932 • Published Oct 7 • 9
RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models

Paper • 2409.19989 • Published Sep 30 • 17
Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach

Paper • 2410.03160 • Published Oct 4 • 4
SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Paper • 2410.05255 • Published Oct 7 • 4
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9 • 41
Pyramidal Flow Matching for Efficient Video Generative Modeling

Paper • 2410.05954 • Published Oct 8 • 37
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation

Paper • 2410.05591 • Published Oct 8 • 13
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning

Paper • 2410.05664 • Published Oct 8 • 7
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Paper • 2410.06885 • Published Oct 9 • 40
Diversity-Rewarded CFG Distillation

Paper • 2410.06084 • Published Oct 8 • 10
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models

Paper • 2410.08207 • Published Oct 10 • 18
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation

Paper • 2410.09009 • Published Oct 11 • 13
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation

Paper • 2410.08159 • Published Oct 10 • 24
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Paper • 2410.07303 • Published Oct 9 • 17
Progressive Autoregressive Video Diffusion Models

Paper • 2410.08151 • Published Oct 10 • 15
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler

Paper • 2410.05651 • Published Oct 8 • 13
Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14 • 52
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Paper • 2410.10774 • Published Oct 14 • 24
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Paper • 2410.10792 • Published Oct 14 • 26
Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies

Paper • 2410.10803 • Published Oct 14 • 6
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Paper • 2410.11795 • Published Oct 15 • 16
Constant Acceleration Flow

Paper • 2411.00322 • Published 24 days ago • 22
In-Context LoRA for Diffusion Transformers

Paper • 2410.23775 • Published 25 days ago • 10
Minimum Entropy Coupling with Bottleneck

Paper • 2410.21666 • Published 27 days ago • 4
Task Vectors are Cross-Modal

Paper • 2410.22330 • Published 26 days ago • 11
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale

Paper • 2410.20280 • Published 29 days ago • 21
Continuous Speech Synthesis using per-token Latent Diffusion

Paper • 2410.16048 • Published Oct 21 • 28
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Paper • 2410.19355 • Published about 1 month ago • 23
SMITE: Segment Me In TimE

Paper • 2410.18538 • Published Oct 24 • 15
Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Paper • 2410.17891 • Published Oct 23 • 15
DPLM-2: A Multimodal Diffusion Protein Language Model

Paper • 2410.13782 • Published Oct 17 • 19
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion

Paper • 2410.13674 • Published Oct 17 • 15
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published 17 days ago • 48
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published 17 days ago • 16
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

Paper • 2411.04989 • Published 17 days ago • 13
Controlling Language and Diffusion Models by Transporting Activations

Paper • 2410.23054 • Published 25 days ago • 16
DreamPolish: Domain Score Distillation With Progressive Geometry Generation

Paper • 2411.01602 • Published 21 days ago • 9
Constrained Diffusion Implicit Models

Paper • 2411.00359 • Published 24 days ago • 5
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Paper • 2411.02336 • Published 20 days ago • 23
Scaling Properties of Diffusion Models for Perceptual Tasks

Paper • 2411.08034 • Published 12 days ago • 13
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings

Paper • 2411.08017 • Published 12 days ago • 11
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published 13 days ago • 60
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published 13 days ago • 28
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation

Paper • 2411.08033 • Published 12 days ago • 21
Generative World Explorer

Paper • 2411.11844 • Published 6 days ago • 63
Stylecodes: Encoding Stylistic Information For Image Generation

Paper • 2411.12811 • Published 5 days ago • 7
Stable Flow: Vital Layers for Training-Free Image Editing

Paper • 2411.14430 • Published 3 days ago • 10