Model Safety - a floom Collection

floom 's Collections

Coding

ICL

RL

Agents

NLU

RAG

Data Efficient Approaches

Personalization

sentence-transformer-models

Tool Use & more

Feedback Analysis

Memory

SSM

Efficient Serving/Inference

Synthetic Data Generation

Frontier research ideas

Model Safety

updated Apr 1

Evaluating Frontier Models for Dangerous Capabilities

Paper • 2403.13793 • Published Mar 20 • 7
Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21 • 12