Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 38
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 41
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models Paper • 2410.03290 • Published Oct 4, 2024 • 7
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs Paper • 2410.04698 • Published Oct 7, 2024 • 13
ThinK: Thinner Key Cache by Query-Driven Pruning Paper • 2407.21018 • Published Jul 30, 2024 • 31
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models Paper • 2306.12420 • Published Jun 21, 2023 • 2
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment Paper • 2304.06767 • Published Apr 13, 2023 • 2
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts Paper • 2302.08958 • Published Feb 17, 2023
Active Prompting with Chain-of-Thought for Large Language Models Paper • 2302.12246 • Published Feb 23, 2023
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories Paper • 2306.05406 • Published Jun 8, 2023
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data Paper • 2302.12822 • Published Feb 24, 2023
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions Paper • 2311.09677 • Published Nov 16, 2023 • 3
Can We Verify Step by Step for Incorrect Answer Detection? Paper • 2402.10528 • Published Feb 16, 2024
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning Paper • 2403.17919 • Published Mar 26, 2024 • 16
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards Paper • 2402.18571 • Published Feb 28, 2024
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales Paper • 2405.20974 • Published May 31, 2024