Generate talking avatars from text-to-speech
Identify emotion from multilingual audio
Combine voice cloning and portrait lip-sync animation