Hugging Face
Sahand Rezaei-Shoshtari
sahandrez
https://sahandrez.github.io/
AI & ML interests
Reinforcement Learning
Recent Activity
Updated a model 2 days ago: sahandrez/pairwise-reward-Qwen2.5-1.5B-ultrafeedback
Organizations
None yet
models
5
sahandrez/sft-Qwen2.5-1.5B-ultrafeedback-binarized-20241122-140310 • Updated 20 minutes ago
sahandrez/pairwise-reward-Qwen2.5-1.5B-ultrafeedback • Text Classification • Updated 2 days ago • 4
sahandrez/pairwise-reward-sft-zephyr-7b-sft-qlora-ultrafeedback • Updated Oct 14 • 2
sahandrez/pairwise-reward-zephyr-7b-sft-qlora-ultrafeedback • Updated Oct 13
sahandrez/sft-zephyr-7b-sft-qlora-ultrafeedback • Updated Oct 12 • 1
datasets
2
sahandrez/ultrafeedback_kto • Updated Sep 23 • 126k • 31
sahandrez/ultrafeedback_unpaired • Updated Sep 20 • 126k • 30