MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences Paper • 2601.06789 • Published 7 days ago • 74
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model Paper • 2506.13642 • Published Jun 16, 2025 • 27
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis Paper • 2505.02625 • Published May 5, 2025 • 22
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7, 2025 • 52
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published Sep 10, 2024 • 60
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space Paper • 2402.17811 • Published Feb 27, 2024 • 1