|
--- |
|
library_name: saelens |
|
license: apache-2.0 |
|
datasets: |
|
- Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024 |
|
--- |
|
|
|
# Llama-3-8B-Instruct SAEs (Layer 25, Post-MLP Residual Stream)
|
|
|
## Introduction |
|
|
|
We trained a Gated SAE on the post-MLP residual stream of layer 25 of the [Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model. The SAE has 65,536 hidden features, an expansion factor of 16 over the 4,096-dimensional residual stream.
|
|
|
The SAE was trained on 500M tokens from the [OpenWebText corpus](https://huggingface.co/datasets/Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024), pre-tokenized with the Llama-3 tokenizer at a context size of 1024.
|
|
|
Feature visualizations are hosted on [Neuronpedia](https://www.neuronpedia.org/llama3-8b-it), and the Weights & Biases training run is available [here](https://wandb.ai/jiatongg/sae_semantic_entropy/runs/ruuu0izg?nw=nwuserjiatongg).
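
For reference, the training setup above roughly corresponds to a SAELens runner configuration like the sketch below. This is a minimal, hypothetical reconstruction: the field names follow recent SAELens releases, and hyperparameters not stated in this card (learning rate, batch size, sparsity coefficient, etc.) are omitted or assumed, so it should not be read as the exact configuration used for this run.

```python
from sae_lens import LanguageModelSAERunnerConfig, SAETrainingRunner

# Hypothetical configuration mirroring the setup described above.
# Field names follow recent SAELens releases and may differ from the
# version actually used; unstated hyperparameters are left at defaults.
cfg = LanguageModelSAERunnerConfig(
    architecture="gated",                       # Gated SAE
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    hook_name="blocks.25.hook_resid_post",      # post-MLP residual stream, layer 25
    hook_layer=25,
    d_in=4096,                                  # Llama-3-8B residual stream width
    expansion_factor=16,                        # 4096 * 16 = 65,536 SAE features
    dataset_path="Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024",
    is_dataset_tokenized=True,
    context_size=1024,
    training_tokens=500_000_000,                # 500M training tokens
    device="cuda",
)

SAETrainingRunner(cfg).run()
```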
|
|
|
## Load the SAE
|
|
|
|
|
This repository contains the following SAE:
|
- blocks.25.hook_resid_post |
|
|
|
This SAE can be loaded with SAELens as follows:
|
```python
from sae_lens import SAE

# Returns the SAE, its config dict, and (if available) the log feature sparsities.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res", "blocks.25.hook_resid_post"
)
```
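
Once loaded, the SAE can be applied to residual-stream activations cached from the base model, for example via TransformerLens. The snippet below is a minimal sketch rather than an official recipe; the prompt and device handling are illustrative, and it assumes TransformerLens exposes the instruct model under the Hugging Face name shown.

```python
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

device = "cuda" if torch.cuda.is_available() else "cpu"

sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res", "blocks.25.hook_resid_post", device=device
)
model = HookedTransformer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", device=device
)

# Cache the post-MLP residual stream at layer 25 and pass it through the SAE.
_, cache = model.run_with_cache("The quick brown fox jumps over the lazy dog.")
resid = cache["blocks.25.hook_resid_post"]       # [batch, seq, 4096]

with torch.no_grad():
    feature_acts = sae.encode(resid)             # [batch, seq, 65536]
    reconstruction = sae.decode(feature_acts)    # [batch, seq, 4096]

print(feature_acts.shape, reconstruction.shape)
```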
|
|
|
## Citation |
|
|
|
``` |
|
@misc{jiatong_han_2024,
  author    = {Jiatong Han},
  title     = {llama-3-8b-it-res (Revision 53425c3)},
  year      = 2024,
  url       = {https://huggingface.co/Juliushanhanhan/llama-3-8b-it-res},
  doi       = {10.57967/hf/2889},
  publisher = {Hugging Face}
}
|
``` |