Seungone Kim PRO

seungone

https://seungonekim.github.io/

AI & ML interests

Large Language Models, LLM-as-a-Judge, Reward Model Overoptimization, Personalized Alignment

Recent Activity

authored a paper about 1 month ago

Measuring Sycophancy of Language Models in Multi-turn Dialogues

authored a paper about 1 month ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

authored a paper about 1 month ago

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

upvoted a paper 2 months ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published Nov 27, 2025 • 15

upvoted a paper 3 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

upvoted a paper 7 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79

upvoted 3 papers 8 months ago

Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published May 28, 2025 • 8

Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability

Paper • 2506.01789 • Published Jun 2, 2025 • 15

Let's Predict Sentence by Sentence

Paper • 2505.22202 • Published May 28, 2025 • 19

upvoted 2 papers 9 months ago

Reasoning Models Better Express Their Confidence

Paper • 2505.14489 • Published May 20, 2025 • 20

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

Paper • 2505.10185 • Published May 15, 2025 • 26

upvoted a paper 12 months ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5, 2025 • 58

upvoted 6 papers about 1 year ago

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10, 2025 • 75

LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation

Paper • 2412.10424 • Published Dec 10, 2024 • 2

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 10

Revisiting In-Context Learning with Long Context Language Models

Paper • 2412.16926 • Published Dec 22, 2024 • 32

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 46

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 47

upvoted 2 articles over 1 year ago

Navigating Korean LLM Research #1: Models

Article • Oct 22, 2024 • 26

Navigating Korean LLM Research #2: Evaluation Tools

Article • Oct 23, 2024 • 8

upvoted 3 papers over 1 year ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21, 2024 • 44

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Paper • 2410.13232 • Published Oct 17, 2024 • 44

Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code

Paper • 2409.19715 • Published Sep 29, 2024 • 10