Collections
Collections including paper arxiv:2305.14387
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
  Paper • 2403.18421 • Published • 21
- Long-form factuality in large language models
  Paper • 2403.18802 • Published • 23
- stanford-crfm/BioMedLM
  Text Generation • Updated • 3.32k • 390
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 44

- Proximal Policy Optimization Algorithms
  Paper • 1707.06347 • Published • 3
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 44
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 140
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 14