
Seungone Kim PRO

seungone

https://seungonekim.github.io/

AI & ML interests

Large Language Models, LLM-as-a-Judge, Reward Model Overoptimization, Personalized Alignment

Recent Activity



authored a paper 3 days ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published 7 days ago • 12

upvoted a paper 3 days ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published 7 days ago • 12

commented on a paper 3 days ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published 7 days ago • 12

liked a dataset 16 days ago

RefineBench/RefineBench

Viewer • Updated 2 days ago • 1k • 501 • 4

updated a dataset 25 days ago

facebook/principia-collection

Viewer • Updated 25 days ago • 554k • 2.67k • 38

liked a dataset 25 days ago

facebook/principia-collection

Viewer • Updated 25 days ago • 554k • 2.67k • 38

published a dataset 25 days ago

facebook/principia-collection

Viewer • Updated 25 days ago • 554k • 2.67k • 38

upvoted a paper about 1 month ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28 • 15

liked 2 datasets 5 months ago

toloka/u-math

Viewer • Updated Dec 5, 2024 • 1.1k • 464 • 24

xw27/scibench

Viewer • Updated May 6, 2024 • 692 • 658 • 21

upvoted a paper 5 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1 • 79

upvoted a paper 6 months ago

Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published May 28 • 8

liked a dataset 6 months ago

TIGER-Lab/WebInstruct-verified

Viewer • Updated 6 days ago • 462k • 447 • 52

authored a paper 6 months ago

Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability

Paper • 2506.01789 • Published Jun 2 • 14

upvoted a paper 6 months ago

Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability

Paper • 2506.01789 • Published Jun 2 • 14

liked a dataset 6 months ago

hendrydong/gpqa_diamond_mc

Viewer • Updated Jan 3 • 198 • 828 • 2

authored a paper 6 months ago

Let's Predict Sentence by Sentence

Paper • 2505.22202 • Published May 28 • 19

upvoted a paper 6 months ago

Let's Predict Sentence by Sentence

Paper • 2505.22202 • Published May 28 • 19

authored 2 papers 6 months ago

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Paper • 2505.15277 • Published May 21 • 104

FREESON: Retriever-Free Retrieval-Augmented Reasoning via Corpus-Traversing MCTS

Paper • 2505.16409 • Published May 22 • 2
