namgoodfire committed · Commit 348ec1b · verified · Parent(s): 5fd1483

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED
@@ -6,7 +6,7 @@ base_model:
 - meta-llama/Llama-3.3-70B-Instruct
 ---
 
-### Model Information
+## Model Information
 
 The Goodfire SAE (Sparse Autoencoder) for Llama 3.3 70B is an interpreter model designed to analyze and understand
 the internal representations of Llama-3.3-70B-Instruct. This SAE model is trained specifically on layer 50 of
@@ -16,7 +16,7 @@ allowing researchers and developers to gain insights into the model's internal
 As an open-source tool, it serves as a foundation for advancing interpretability research and enhancing control
 over large language model operations.
 
-### Intended Use
+## Intended Use
 
 By open-sourcing SAEs for leading open models, especially large-scale
 models like Llama 3.3 70B, we aim to accelerate progress in interpretability research.
@@ -28,7 +28,7 @@ foundations and uncovers new applications.
 
 #### Feature labels
 
-### How to use
+## How to use
 
 ```python
 import torch
@@ -262,7 +262,7 @@ logits, kv_cache, features = llama_3_1_8b.forward(
 print(llama_3_1_8b.tokenizer.decode(logits[-1].argmax(-1)))
 ```
 
-### Responsibility & Safety
+## Responsibility & Safety
 
 Safety is at the core of everything we do at Goodfire. As a public benefit
 corporation, we’re dedicated to understanding AI models to enable safer, more reliable
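
For context on the README's "How to use" section, the encode/decode round trip a sparse autoencoder performs can be sketched as below. This is a minimal illustrative sketch, not the Goodfire SAE itself: the toy dimensions, random weights, and the `encode`/`decode` helper names are assumptions for illustration only. A real SAE loads trained weights and runs on layer-50 activations of Llama-3.3-70B-Instruct.

```python
import numpy as np

# Toy sizes, NOT the real model's: d_model stands in for the hidden size at
# layer 50, d_sae for the number of learned interpretable features.
d_model, d_sae = 8, 32

rng = np.random.default_rng(0)
W_enc = rng.standard_normal((d_model, d_sae)) * 0.1  # encoder weights
b_enc = np.zeros(d_sae)
W_dec = rng.standard_normal((d_sae, d_model)) * 0.1  # decoder weights
b_dec = np.zeros(d_model)

def encode(x):
    # ReLU yields a non-negative (and, after training, sparse) feature code.
    return np.maximum(x @ W_enc + b_enc, 0.0)

def decode(f):
    # Reconstruct the original activation vector from the feature code.
    return f @ W_dec + b_dec

x = rng.standard_normal(d_model)   # stand-in for a residual-stream activation
features = encode(x)               # shape (d_sae,)
x_hat = decode(features)           # shape (d_model,)
print(features.shape, x_hat.shape)
```

An SAE is trained so that `x_hat` closely reconstructs `x` while `features` stays sparse; those feature activations are what interpretability work then inspects and steers.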