Collections

Collections including paper arxiv:2412.08905

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21 • 57
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17 • 51
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 41
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20 • 52

Papers - Text - Training - Long Context

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 160
Phi-4 Technical Report

Paper • 2412.08905 • Published 7 days ago • 84

Papers - Phi - Technical Report

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 253
Phi-4 Technical Report

Paper • 2412.08905 • Published 7 days ago • 84

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 85
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16 • 18
Instruction-tuned Language Models are Better Knowledge Learners

Paper • 2402.12847 • Published Feb 20 • 25
DoRA: Weight-Decomposed Low-Rank Adaptation

Paper • 2402.09353 • Published Feb 14 • 26

Papers - Dataset Generation - Guide

AI Competitions and Benchmarks: Dataset Development

Paper • 2404.09703 • Published Apr 15 • 1
Phi-4 Technical Report

Paper • 2412.08905 • Published 7 days ago • 84

Papers - Training Research - Dataset Ordering

Instruction Tuning with Human Curriculum

Paper • 2310.09518 • Published Oct 14, 2023 • 3
Phi-4 Technical Report

Paper • 2412.08905 • Published 7 days ago • 84

Papers - Fine-tuning - SFT

InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26 • 30
sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28 • 40
Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15 • 82
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data

Paper • 2404.12195 • Published Apr 18 • 11

Papers - Fine-tuning - DPO

Refer to additional papers: https://link.springer.com/article/10.1007/s10994-014-5458-8 and https://link.springer.com/article/10.1007/BF00992696

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 49
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14 • 6
sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28 • 40
Dueling RL: Reinforcement Learning with Trajectory Preferences

Paper • 2111.04850 • Published Nov 8, 2021 • 2

To read... eventually

A collection of papers that I have read or plan to read, all in one place. Includes a wide range of topics.

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 124
Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6 • 12
LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25 • 65

Papers - CoT - Chain of Thought

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 37
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 102
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21 • 51
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Paper • 2402.12875 • Published Feb 20 • 13
