NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Paper • 2403.03100 • Published Mar 5 • 34
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models Paper • 2310.11954 • Published Oct 18, 2023 • 25
UniAudio: An Audio Foundation Model Toward Universal Audio Generation Paper • 2310.00704 • Published Oct 1, 2023 • 21
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers Paper • 2309.08532 • Published Sep 15, 2023 • 53
PromptTTS 2: Describing and Generating Voices with Text Prompt Paper • 2309.02285 • Published Sep 5, 2023 • 11
Pre-Trained Large Language Models for Industrial Control Paper • 2308.03028 • Published Aug 6, 2023 • 6
GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework Paper • 2305.10841 • Published May 18, 2023 • 2