MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published 8 days ago • 32
CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models Paper • 2412.10117 • Published Dec 13, 2024 • 2