- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 85
- Pre-training Small Base LMs with Fewer Tokens
  Paper • 2404.08634 • Published • 34
- Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
  Paper • 2405.15319 • Published • 25
- Can LLMs Learn by Teaching? A Preliminary Study
  Paper • 2406.14629 • Published • 17
Collections including paper arxiv:2406.10023
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 82
- Aligning Teacher with Student Preferences for Tailored Training Data Generation
  Paper • 2406.19227 • Published • 24
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 24
- CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
  Paper • 2404.03820 • Published • 24
- mDPO: Conditional Preference Optimization for Multimodal Large Language Models
  Paper • 2406.11839 • Published • 37
- Pandora: Towards General World Model with Natural Language Actions and Video States
  Paper • 2406.09455 • Published • 14
- WPO: Enhancing RLHF with Weighted Preference Optimization
  Paper • 2406.11827 • Published • 14
- In-Context Editing: Learning Knowledge from Self-Induced Distributions
  Paper • 2406.11194 • Published • 15
- Understanding the performance gap between online and offline alignment algorithms
  Paper • 2405.08448 • Published • 14
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
  Paper • 2405.19332 • Published • 15
- Offline Regularised Reinforcement Learning for Large Language Models Alignment
  Paper • 2405.19107 • Published • 13
- Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
  Paper • 2406.00888 • Published • 30
- AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets
  Paper • 2404.05623 • Published • 3
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
  Paper • 2405.19332 • Published • 15
- BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
  Paper • 2406.12168 • Published • 7
- Deep Bayesian Active Learning for Preference Modeling in Large Language Models
  Paper • 2406.10023 • Published • 2
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 38
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
  Paper • 2403.15042 • Published • 25
- MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets
  Paper • 2403.03194 • Published • 12
- Orca-Math: Unlocking the potential of SLMs in Grade School Math
  Paper • 2402.14830 • Published • 24