Gabriele Sarti

gsarti

https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

upvoted a collection 5 days ago

Activation Oracles

10 items • Updated 3 days ago • 2

updated a collection 7 days ago

🔍 Interpretability & Analysis of LMs

Outstanding research in LM interpretability and evaluation, summarized • 135 items • Updated 7 days ago • 116

upvoted a paper 7 days ago

GIM: Improved Interpretability for Large Language Models

Paper • 2505.17630 • Published May 23 • 1

liked a dataset 7 days ago

datapizza-ai-lab/salaries

Preview • Updated 4 days ago • 557 • 44

liked a model 7 days ago

tencent/HY-WorldPlay

Image-to-Video • Updated 7 days ago • 3.47k • 435

liked a dataset 8 days ago

saracandu/smol_reas_traces

Viewer • Updated 7 days ago • 36k • 30 • 1

liked a model 10 days ago

allenai/Bolmo-1B

Text Generation • 1B • Updated 3 days ago • 499 • 37

liked a Space 11 days ago

The Eiffel Tower Llama

Explore the Eiffel Tower Llama experiment with open-source models

upvoted a collection 11 days ago

Sparse Auto-Encoders (SAEs) for Mechanistic Interpretability

A compilation of sparse auto-encoders trained on large language models. • 37 items • Updated 9 days ago • 14

updated a collection 11 days ago

👤 Implicit Personalization in Language Models

Works on detecting, attributing and controlling implicit personalization in language models • 20 items • Updated 11 days ago • 1

liked a dataset 11 days ago

Anthropic/AnthropicInterviewer

Viewer • Updated 17 days ago • 1.25k • 12k • 339

liked a model 18 days ago

EssentialAI/rnj-1-instruct

Text Generation • 8B • Updated 1 day ago • 455k • 289

updated a collection 23 days ago

👤 Implicit Personalization in Language Models

Works on detecting, attributing and controlling implicit personalization in language models • 20 items • Updated 11 days ago • 1

upvoted a paper 23 days ago

Accumulating Context Changes the Beliefs of Language Models

Paper • 2511.01805 • Published Nov 3 • 2

updated a Space 24 days ago

MIRAGE

Model Internals to generate RAG citations

updated a collection 24 days ago

👤 Implicit Personalization in Language Models

Works on detecting, attributing and controlling implicit personalization in language models • 20 items • Updated 11 days ago • 1

liked a dataset 24 days ago

proj-persona/PersonaHub

Viewer • Updated Sep 26 • 375k • 16.1k • 658

updated a collection 24 days ago

👤 Implicit Personalization in Language Models

Works on detecting, attributing and controlling implicit personalization in language models • 20 items • Updated 11 days ago • 1