- WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs — Paper • 2406.18495 • Published Jun 26
- WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models — Paper • 2406.18510 • Published Jun 26
- AI2 Safety Toolkit — Collection: safety data, moderation tools, and safe LLMs • 6 items