-
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
Paper • 2411.11504 • Published • 19 -
Top-nσ: Not All Logits Are You Need
Paper • 2411.07641 • Published • 18 -
Adaptive Decoding via Latent Preference Optimization
Paper • 2411.09661 • Published • 10 -
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Paper • 2411.13476 • Published • 14
Collections
Discover the best community collections!
Collections including paper arxiv:2412.06769
-
The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines
Paper • 2408.01050 • Published • 8 -
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 44 -
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • 2409.02795 • Published • 71 -
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Paper • 2409.04593 • Published • 23
-
Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 37 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 102 -
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Paper • 2403.14624 • Published • 51 -
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Paper • 2402.12875 • Published • 13
-
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 183 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67 -
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper • 2403.13372 • Published • 62 -
InternLM2 Technical Report
Paper • 2403.17297 • Published • 30
-
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published • 18 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 46 -
How Far Are We from Intelligent Visual Deductive Reasoning?
Paper • 2403.04732 • Published • 19 -
Learning to Reason and Memorize with Self-Notes
Paper • 2305.00833 • Published • 4
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 17 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 11 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 65
-
Text-to-3D using Gaussian Splatting
Paper • 2309.16585 • Published • 31 -
FP8-LM: Training FP8 Large Language Models
Paper • 2310.18313 • Published • 33 -
Zephyr: Direct Distillation of LM Alignment
Paper • 2310.16944 • Published • 122 -
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Paper • 2312.06585 • Published • 28