Linfeng Song

freesunshine0316

https://freesunshine0316.github.io/

AI & ML interests

Researcher at Tencent AI Lab working on reasoning and RLAIF with LLMs, especially search + RL. Working on NLP since 2010.

freesunshine0316's activity

authored a paper about 2 months ago

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

Paper • 2410.06508 • Published Oct 9 • 10

upvoted a paper about 2 months ago

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

Paper • 2410.06508 • Published Oct 9 • 10

commented on a paper about 2 months ago

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

Paper • 2410.06508 • Published Oct 9 • 10

authored a paper 5 months ago

LiteSearch: Efficacious Tree Search for LLM

Paper • 2407.00320 • Published Jun 29 • 37

upvoted 2 papers 5 months ago

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Paper • 2407.00617 • Published Jun 30 • 7

LiteSearch: Efficacious Tree Search for LLM

Paper • 2407.00320 • Published Jun 29 • 37

commented on a paper 5 months ago

LiteSearch: Efficacious Tree Search for LLM

Paper • 2407.00320 • Published Jun 29 • 37

upvoted a paper 5 months ago

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 53

upvoted a paper 6 months ago

Stabilizing RLHF through Advantage Model and Selective Rehearsal

Paper • 2309.10202 • Published Sep 18, 2023 • 9

authored a paper 7 months ago

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 53

authored a paper about 1 year ago

Stabilizing RLHF through Advantage Model and Selective Rehearsal

Paper • 2309.10202 • Published Sep 18, 2023 • 9