Sparse Auto-Encoders (SAEs) for Mechanistic Interpretability - a dlouapre Collection

updated 6 days ago

A compilation of sparse auto-encoders trained on large language models.