AI & ML interests

Deep Learning, Computer Vision, Machine Learning

Recent Activity

pyimagesearch's activity

ariG23498
posted an update 10 days ago
Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
ariG23498
posted an update 13 days ago
ariG23498
posted an update about 2 months ago
ariG23498
posted an update 3 months ago
ariG23498
posted an update 3 months ago
ariG23498
posted an update 5 months ago