SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation Paper • 2412.13462 • Published 19 days ago
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation Paper • 2409.17550 • Published Sep 26, 2024
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 17 days ago • 16
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network Paper • 2309.02836 • Published Sep 6, 2023
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping Paper • 2405.17251 • Published May 27, 2024 • 2
SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer Paper • 2301.12811 • Published Jan 30, 2023
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation Paper • 2405.14598 • Published May 23, 2024 • 11
SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation Paper • 2405.18503 • Published May 28, 2024 • 9