The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16 • 30
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4 • 71
Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent Paper • 2407.21646 • Published Jul 31 • 18
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25 • 31
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25 • 31
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25 • 31 • 4
Learning Multi-Step Reasoning by Solving Arithmetic Tasks Paper • 2306.01707 • Published Jun 2, 2023 • 1
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Paper • 2407.11963 • Published Jul 16 • 43
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Paper • 2406.18629 • Published Jun 26 • 41