AlexLI

AlexLINB

AI & ML interests

None yet

Recent Activity

replied to di-zhang-fdu's post about 1 month ago

LLaMA-O1-PRM and LLaMA-O1-Reinforcement will be released this weekend. We have implemented a novel reinforcement fine-tuning (RFT) pipeline that taught models to learn reasoning and reward labeling without human annotation.

reacted to di-zhang-fdu's post with 👀 about 1 month ago

upvoted a collection 5 months ago

Stable Diffusion 3

Organizations

None yet

AlexLINB's activity

replied to di-zhang-fdu's post about 1 month ago

Looking forward to it

reacted to di-zhang-fdu's post with 👀 about 1 month ago

Post

2605

LLaMA-O1-PRM and LLaMA-O1-Reinforcement will be released this weekend.
We have implemented a novel reinforcement fine-tuning (RFT) pipeline that taught models to learn reasoning and reward labeling without human annotation.

3 replies

upvoted a collection 5 months ago

Stable Diffusion 3

Collection

Stable Diffusion 3 and related models for text-to-image and image-to-image • 2 items • Updated 7 days ago • 95

updated a collection 8 months ago

AImodel

Collection

1 item • Updated May 21, 2024