Agentic Robot: A Brain-Inspired Framework for Vision-Language-Action Models in Embodied Agents Paper • 2505.23450 • Published May 29 • 9
SAMed-2: Selective Memory Enhanced Medical Segment Anything Model Paper • 2507.03698 • Published Jul 4 • 11
BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement Paper • 2412.14203 • Published Dec 16, 2024 • 1
On the Compositional Generalization of Multimodal LLMs for Medical Imaging Paper • 2412.20070 • Published Dec 28, 2024 • 47
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published Nov 6, 2024 • 50
Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs Paper • 2409.10994 • Published Sep 17, 2024 • 1
HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs Paper • 2311.09774 • Published Nov 16, 2023 • 1
TCBERT: A Technical Report for Chinese Topic Classification BERT Paper • 2211.11304 • Published Nov 21, 2022
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture Paper • 2409.02889 • Published Sep 4, 2024 • 55