Exciting Papers - a Ksgk-fy Collection

Exciting Papers

updated Sep 12, 2024

Our curated list of AI papers @Temus AI

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 10
Note Top reasoning trick on HumanEval: MCTS + LLM + Feedback + Reflection @UIUC
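To make the MCTS part of the note concrete, here is a minimal sketch of the standard UCT selection rule commonly used in Monte Carlo tree search. It is not taken from the paper's code: the function names, the dict layout for child nodes, and the exploration constant `c` are all assumptions for illustration. In an LLM setting, `total_value` would come from feedback or self-evaluation scores.

```python
import math

def uct_score(total_value, visits, parent_visits, c=1.0):
    """UCT score: mean value plus an exploration bonus (hypothetical helper)."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits, c=1.0):
    """Return the child dict (keys 'total_value', 'visits') with the highest UCT score."""
    return max(
        children,
        key=lambda ch: uct_score(ch["total_value"], ch["visits"], parent_visits, c),
    )
```

A rarely visited child with a decent value can outrank a well-explored one, which is what lets the search keep trying alternative reasoning or action branches.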
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
Note Our re-implementation code: https://github.com/fangyuan-ksgk/CoT-Reasoning-without-Prompting Insight: Decoding-time reasoning is cheap and effective, and can bring out the 'inherent' reasoning capacity of a pre-trained LLM. Drawback: Identifying the set of candidate answers and where they sit in the output remains the million-dollar question.
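A minimal sketch of the path-selection step, assuming the confidence measure is the average gap between the top-1 and top-2 token probabilities over the answer tokens. The function names and the `answer_token_probs` field are hypothetical and are not taken from the linked repository; the sketch also assumes the answer span has already been located, which is exactly the open problem noted above.

```python
def answer_confidence(answer_token_probs):
    """Mean (top-1 minus top-2) probability gap over the answer tokens.

    answer_token_probs: one list of probabilities per answer token
    (assumed to be already extracted from the decoded path).
    """
    gaps = []
    for probs in answer_token_probs:
        top = sorted(probs, reverse=True)
        gaps.append(top[0] - top[1])
    return sum(gaps) / len(gaps)

def pick_path(paths):
    """Choose the decoded path whose answer tokens are most confident."""
    return max(paths, key=lambda p: answer_confidence(p["answer_token_probs"]))
```

In use, each candidate path would come from branching on one of the top-k first tokens and decoding greedily from there; the path with the largest gap is returned as the answer.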
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14, 2024 • 6
Note In-context-learning-based preference alignment, with performance on par with Supervised Fine-Tuning (SFT). Can be used to generate optimal preference pairs or to augment the preference dataset.
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6, 2024 • 117
Note Self-Discover approaches a task in three steps: picking a reasoning structure, designing a stepwise reasoning plan, then carrying out the thinking process to get the answer. The paper reports significant performance improvements.
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
Note Meta's work on iterative self-improvement of LLMs.
Direct Language Model Alignment from Online AI Feedback

Paper • 2402.04792 • Published Feb 7, 2024 • 34
Note A simplification of Meta's self-rewarding LLM: it relies on the LLM's innate capacity to understand the preferences shown in the original labeled dataset, uses that to give thumbs up and down, and feeds those judgments back into the model weights through DPO.
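For the last step, a minimal sketch of the standard DPO loss for a single preference pair, written with plain floats so it runs without a model. The function name, argument names, and `beta=0.1` default are assumptions for illustration; in the online-feedback setting, the chosen and rejected responses would be the ones the LLM annotator gave a thumbs up and thumbs down.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # How much more the policy prefers "chosen" over "rejected" than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    # -log(sigmoid(beta * margin)) rewritten as log(1 + exp(-beta * margin)).
    return math.log1p(math.exp(-beta * margin))
```

The loss equals log 2 when the policy and the reference agree, and drops below that as the policy shifts probability toward the preferred response.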
Matryoshka Representation Learning

Paper • 2205.13147 • Published May 26, 2022 • 24
Recursive Introspection: Teaching Language Model Agents How to Self-Improve

Paper • 2407.18219 • Published Jul 25, 2024 • 3
Mixture of A Million Experts

Paper • 2407.04153 • Published Jul 4, 2024 • 5
LoQT: Low Rank Adapters for Quantized Training

Paper • 2405.16528 • Published May 26, 2024 • 3
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Paper • 2407.20183 • Published Jul 29, 2024 • 43
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge

Paper • 2407.19594 • Published Jul 28, 2024 • 21
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

Paper • 2408.07852 • Published Aug 14, 2024 • 16
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28, 2024 • 42
Iterative Graph Alignment

Paper • 2408.16667 • Published Aug 29, 2024 • 2
Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published Aug 29, 2024 • 95
LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Paper • 2409.04593 • Published Sep 6, 2024 • 26
Agent Workflow Memory

Paper • 2409.07429 • Published Sep 11, 2024 • 31