Vignesh
Vigneshwaran
AI & ML interests
None yet
Recent Activity
updated a collection 16 days ago: RLHF
updated a collection 19 days ago: training
updated a collection 19 days ago: training
Organizations
Collections
3
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 62
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 40
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Best Practices and Lessons Learned on Synthetic Data for Language Models
  Paper • 2404.07503 • Published • 29
models
None public yet
datasets
None public yet