Transcribe audio from a microphone, file, or YouTube link
Engage in multimedia chat with LLMs and other ML models
Generate images from text descriptions