---
license: mit
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- en
library_name: transformers
---
This is a set of sparse autoencoders (SAEs) trained on [Llama 3.1 8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) using the 10B sample of the [RedPajama v2 corpus](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2), which comes to roughly 8.5B tokens under the Llama 3 tokenizer. The SAEs are organized by hookpoint and can be loaded with the EleutherAI [`sae` library](https://github.com/EleutherAI/sae).

With the `sae` library installed, you can access an SAE like this:
```python
from sae import Sae
sae = Sae.load_from_hub("EleutherAI/sae-llama-3.1-8b-32x", hookpoint="layers.23.mlp")
```
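
For a rough sense of scale, the arithmetic below estimates how many latents each SAE has. It is a sketch under one assumption that this README does not state: that the `32x` suffix in the repository name is the expansion factor, meaning the ratio of SAE latents to the width of the hooked activation. The hidden size of 4096 is that of Llama 3.1 8B; the helper name `num_latents` is hypothetical and not part of the `sae` library.

```python
# Assumption: "32x" in the repository name is the expansion factor,
# i.e. (number of SAE latents) / (width of the hooked activation).
D_MODEL = 4096          # hidden size of Llama 3.1 8B
EXPANSION_FACTOR = 32   # assumed from the "-32x" suffix


def num_latents(d_in: int, expansion_factor: int) -> int:
    """Hypothetical helper: latent count for a given input width."""
    return d_in * expansion_factor


print(num_latents(D_MODEL, EXPANSION_FACTOR))  # 131072
```

If that assumption holds, each SAE maps a 4096-dimensional activation to 131,072 latents.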