Mohamed Anwar
mootje
AI & ML interests
None yet
Recent Activity
liked a Space 10 days ago
ysharma/drag-and-drop-kanban-board
liked a Space 17 days ago
maldons77/ai-storyboard-creator
reacted to BramVanroy's post with an emoji over 1 year ago
The InstructGPT paper mentions that they insert 10% pretraining data during SFT, which they find improves the effect of PPO (IIUC). Has anyone else done later ablations on this? I've only seen the inverse suggested, mixing in SFT data during pretraining.
Organizations
None yet