An end-to-end (e2e) Voice Language Model by Fish Audio.
A tiny vision language model
Gradio demo of CogView-3-Plus
MaskGCT TTS Demo
Demo of EraX-NSFW-V1.0