Exciting Papers
Our curated list of AI papers @Temus AI
Paper • 2310.04406 • Published • 8
Note: Top reasoning trick on HumanEval: MCTS + LLM + Feedback + Reflection @UIUC
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 99
Note: Our re-implementation code: https://github.com/fangyuan-ksgk/CoT-Reasoning-without-Prompting
Insight: Decoding-time reasoning is cheap and effective, and can bring out the 'inherent' reasoning capacity of a pre-trained LLM.
Drawback: Identifying the set of candidate answers, and where the answer sits in the generated text, remains the million-dollar question.
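To make the decoding-time idea concrete, here is a minimal sketch of the path-selection step. It assumes the scoring rule is the average gap between the top-1 and top-2 token probabilities over the answer tokens, and it assumes the answer span has already been located, which is exactly the open problem noted above. All names and numbers are hypothetical and are not taken from the linked repository.

```python
from typing import List, Sequence, Tuple

# For every token in a path's answer span: (top-1 probability, top-2 probability).
# Assumption: the answer span has already been found by some other means.
AnswerSpanProbs = Sequence[Tuple[float, float]]


def answer_confidence(span: AnswerSpanProbs) -> float:
    """Average top-1 minus top-2 probability margin over the answer tokens."""
    if not span:
        return 0.0
    return sum(p1 - p2 for p1, p2 in span) / len(span)


def pick_path(paths: List[Tuple[str, AnswerSpanProbs]]) -> str:
    """Return the decoded text of the path with the most confident answer."""
    return max(paths, key=lambda path: answer_confidence(path[1]))[0]


# Toy example: two continuations obtained by branching on the first token.
paths = [
    ("5", [(0.40, 0.35)]),  # direct answer, low margin
    ("I start with 3, get 5 more, so the answer is 8", [(0.95, 0.02)]),
]
best = pick_path(paths)
```

In a real setup the probabilities would come from the model's per-step output distribution for each of the top-k first-token branches.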
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6
Note: In-context-learning-based preference alignment, with performance on par with Supervised Fine-Tuning (SFT). Can be used to generate optimal preference pairs or to augment the preference dataset.
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 109
Note: Self-Discover solves a task in three steps: picking a reasoning structure, designing a stepwise reasoning plan, then implementing the thinking process to get the answer. The paper reports significant performance improvements.
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 143
Note: Meta's work on iterative self-improvement of LLMs.
Direct Language Model Alignment from Online AI Feedback
Paper • 2402.04792 • Published • 29
Note: A simplification of Meta's self-rewarding LLM. It relies on the LLM's innate capacity to understand the preference shown in the original labeled dataset and uses it to give thumbs up and thumbs down, which are then fed back into the model weights through DPO.
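As an illustration of the last step, the sketch below computes the standard DPO loss for a single preference pair, where the "thumbs up" response is the chosen one and the "thumbs down" response is the rejected one. The function name, the argument names, and the default `beta` are hypothetical; the inputs are assumed to be summed sequence log-probabilities under the policy and a frozen reference model.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, controls deviation from the reference
) -> float:
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(beta * margin))."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_ratio - rejected_ratio)
    # Numerically stable form of log(1 + exp(-x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# The policy prefers the chosen response more than the reference does,
# so the loss falls below log(2), its value when there is no preference.
loss = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

In the online setting described in the note, the pair would be sampled from the current policy and labeled by the LLM annotator before this loss is applied.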
Matryoshka Representation Learning
Paper • 2205.13147 • Published • 9
Recursive Introspection: Teaching Language Model Agents How to Self-Improve
Paper • 2407.18219 • Published • 3
Mixture of A Million Experts
Paper • 2407.04153 • Published • 4
LoQT: Low Rank Adapters for Quantized Training
Paper • 2405.16528 • Published • 3
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Paper • 2407.20183 • Published • 37
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Paper • 2407.19594 • Published • 19
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
Paper • 2408.07852 • Published • 14
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 42
Iterative Graph Alignment
Paper • 2408.16667 • Published • 2
Law of Vision Representation in MLLMs
Paper • 2408.16357 • Published • 92
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 31
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Paper • 2409.04593 • Published • 22
Agent Workflow Memory
Paper • 2409.07429 • Published • 27