Collections including paper arxiv:2305.14387

[lecture artifacts] aligning open language models

artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin

LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 13
tatsu-lab/alpaca-7b-wdiff

Text Generation • Updated May 22, 2023 • 182 • 54
lmsys/vicuna-13b-delta-v0

Text Generation • Updated Aug 1, 2023 • 48 • 454
anon8231489123/ShareGPT_Vicuna_unfiltered

Updated Apr 12, 2023 • 68 • 722

Papers - Training - Instruction-Following

Alpaca eval: https://github.com/tatsu-lab/alpaca_eval

UltraFeedback: Boosting Language Models with High-quality Feedback

Paper • 2310.01377 • Published Oct 2, 2023 • 5
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Paper • 2305.14387 • Published May 22, 2023 • 1

Papers - Fine-tuning - PPO

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2 • 19
UltraFeedback: Boosting Language Models with High-quality Feedback

Paper • 2310.01377 • Published Oct 2, 2023 • 5
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Paper • 2305.14387 • Published May 22, 2023 • 1

Papers - University - Stanford University

BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

Paper • 2403.18421 • Published Mar 27 • 21
Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27 • 23
stanford-crfm/BioMedLM

Text Generation • Updated Mar 28 • 3.32k • 390
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 44

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 3
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 44
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 140
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 14
