RLAIF

Team

community

AI & ML interests

None defined yet.

Recent Activity

nlile updated a dataset about 1 month ago

RLAIF/pretext-ui-harbor-runs-v0

nlile published a dataset about 1 month ago

RLAIF/pretext-ui-harbor-runs-v0

Asap7772 authored a paper 8 months ago

Personalized Preference Fine-tuning of Diffusion Models

Collections 3

models 80

datasets 135

RLAIF/pretext-ui-harbor-runs-v0

Viewer • Updated May 2 • 42.6k • 9.56k

RLAIF/webgpt

Viewer • Updated Dec 8, 2025 • 13.3k • 25

RLAIF/tldr

Viewer • Updated Dec 8, 2025 • 92.9k • 12

RLAIF/ultrafeedback-binarized

Viewer • Updated Dec 8, 2025 • 63.5k • 13

RLAIF/gm_toy_example

Viewer • Updated Nov 1, 2025 • 1.1k • 11

RLAIF/dpo_thinking_reddit_judge4_1e-6_0.02_4B_4B_with_gold_labels_kl_estimation

Viewer • Updated Sep 15, 2025 • 27k • 19

RLAIF/dpo_thinking_reddit_judge3_1e-6_0.02_4B_4B_with_gold_labels_kl_estimation

Viewer • Updated Sep 15, 2025 • 8k • 15

RLAIF/dpo_thinking_reddit_judge2_1e-6_0.02_4B_4B_with_gold_labels_kl_estimation

Viewer • Updated Sep 14, 2025 • 27k • 12

RLAIF/dpo_thinking_reddit_judge_1e-6_0.02_4B_4B_with_gold_labels_kl_estimation

Viewer • Updated Sep 14, 2025 • 27k • 8

RLAIF/dpo_thinking_reddit_offtheshelf_1e-6_0.02_4B_4B_with_gold_labels_kl_estimation

Viewer • Updated Sep 14, 2025 • 27k • 37

Collections 3

SynthLabsAI/ALP_DeepScaleR_1.5B_C16K

SynthLabsAI/ALP_R1_Qwen1.5B

RLAIF/CODE-BEHAVIOR-NUMINA-V1-Blocks

models 80

RLAIF/twitter_8EUB__5e-06_0.1_20_0.9_20_0.95

RLAIF/dpo_thinking_reddit_judge_last_minute_50_1e-6_0.02_4B_4B

RLAIF/dpo_thinking_reddit_judge_last_minute_150_1e-6_0.02_4B_4B

RLAIF/dpo_thinking_reddit_judge_last_minute_100_1e-6_0.02_4B_4B

RLAIF/dpo_thinking_reddit_judge_last_minute_200_1e-6_0.02_4B_4B

RLAIF/dpo_thinking_reddit_judge_last_minute_250_1e-6_0.02_4B_4B

RLAIF/grpo_reddit_judge_last_minute_16_64_8_3e-5_1e-6_4B

RLAIF/dpo_thinking_reddit_judge_full_1e-6_0.02_8B_4B

RLAIF/dpo_answer_reddit_judge_full_1e-6_0.02_4B_1.7B

RLAIF/dpo_answer_reddit_judge_full_1e-6_0.02_8B_4B

Team members 10
