ebetica committed on
Commit
79e88cb
·
verified ·
1 Parent(s): ba4e709

Upload README.md with huggingface_hub

  1. README.md +99 -90
README.md CHANGED
@@ -1,146 +1,155 @@
  ---
- license:
- - mit
  - other
  license_link: https://github.com/Biohub/esm/blob/main/THIRD_PARTY_NOTICE.md
- language:
- - en
- tags:
- - biology
- - esm
- - protein
- - sparse-autoencoder
- - interpretability
- - protein-embeddings
- - feature-extraction
- - protein-language-model
- - unsupervised-learning
  - transformers
-
  ---
- # ESMC Sparse Autoencoder (SAE) Explanation

- This model card provides an overview of the intended use of the ESMC SAE models and examples of how to access them, but it does not have a specific model or model weights. To access the individual SAE model cards for any of the models, use the format below:

- ```py
- https://huggingface.co/Biohub/esmc-<modelsize>-2024-12-sae-sweep<layer><number>-k64-codebook<size>
- ```

- The ESMC sparse autoencoders (SAEs) are unsupervised neural networks trained to decompose the learned internal representations from the [ESMC model variants](https://huggingface.co/collections/biohub/esmc-model-family) into a sparse set of more easily interpretable features, revealing what the model “sees” of the user’s protein input. Each feature represents a specific biologically relevant property of the protein, such as a zinc binding site, beta barrel structure, or transmembrane helix.

- The ESM Atlas is a map of 6.8 billion proteins covering the full breadth of life’s biodiversity and more than one billion predicted structures, built upon the [ESMC 6B](https://huggingface.co/biohub/esmc-6b-2024-12) SAEs to translate the model’s internal representations into \~16,000 interpretable biological features. Learn more about how to use the ESM Atlas [here](https://biohub.ai/esmc/atlas).

  ## Intended Use

- * Reconstructing ESMC embeddings into interpretable features
- * Visualizing feature activations on sequences and structures

  ## Usage

- The SAE model `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384` provides users with interpretable, agent-generated feature descriptions that are detailed [here](https://huggingface.co/datasets/biohub/ESMC-SAE-Features). Users can access these feature descriptions through the [ESM Atlas](https://biohub.ai/esmc/atlas) or through the [Biohub Platform](https://biohub.ai/).

- While all SAE models can be accessed through Hugging Face, only the 5 SAE models that have normalization statistics are available through the [Biohub Platform](https://biohub.ai/models/esmc). These SAE models are:

- * [esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384](https://huggingface.co/biohub/esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384)
- * [Esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536](https://huggingface.co/biohub/esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536)
- * [esmc-600m-2024-12-sae-sweep-layer27-k64-codebook16384](https://huggingface.co/biohub/esmc-600m-2024-12-sae-sweep-layer27-k64-codebook16384)
- * [esmc-600m-2024-12-sae-sweep-layer27-k64-codebook65536](https://huggingface.co/biohub/esmc-600m-2024-12-sae-sweep-layer27-k64-codebook65536)
- * [esmc-300m-2024-12-sae-sweep-layer23-k64-codebook65536](https://huggingface.co/biohub/esmc-300M-2024-12-sae-sweep-layer23-k64-codebook65536)

  You can access an SAE model through Hugging Face using the code below:

  ```py
  import torch
- from transformers import AutoModel, AutoModelForMaskedLM, AutoTokenizer

- # GFP sequence
- sequences = ["MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"]
 
- model = AutoModelForMaskedLM.from_pretrained(
-     "Biohub/esmc-6b-2024-12",
-     device_map="auto", # place model on GPU(s) if available
- ).eval()
- tokenizer = AutoTokenizer.from_pretrained("Biohub/esmc-6b-2024-12")

- # Load SAE(s)
- sae_models = []
- sae = AutoModel.from_pretrained("Biohub/esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536", device_map="auto")
- sae_models.append(sae)

- # Add SAE(s) to the ESMC model
- model.add_sae_models(sae_models)
-
- inputs = tokenizer(sequences, return_tensors="pt", padding=True)
- inputs = {k: v.to(model.device) for k, v in inputs.items()} # place inputs on device
  with torch.inference_mode():
-     output = model(**inputs)
-     print(output.sae_outputs) # Access SAE outputs
- ```

- For additional details about using SAE models, see the tutorials [here](https://colab.research.google.com/github/Biohub/esm/blob/main/cookbook/tutorials/8_protein_interpretation_sae.ipynb).

  ## Model Details

- ESMC SAEs are trained to reconstruct ESMC embeddings at a *residue level*, meaning for a protein of length L, there are L sparse vectors of SAE features. The embeddings are extracted from the hidden states after the multi-layered perceptron (MLP) layer. SAE models are trained on MLP and on state activations to provide different options for protein interpretability. Training SAE models on MLP layers can make individual features from each layer more interpretable while training the SAE models on state activations may provide a more global understanding. We use the [TopK](https://arxiv.org/abs/2412.06410) approach for training SAEs to control sparsity by only allowing the top k features at each position to be active.

  There are two critical hyperparameters for the TopK approach:

- * `k`: the number of active features per position
- * `codebook_size`: the total size of the codebook.


- The SAE codebook size (some fields use the term “dictionary”) determines how many distinct features the model can learn to represent. For protein models, the codebook size determines the balance between how accurately the model can reconstruct complex biological data and how easily interpretable those features are. As codebook size increases, the SAE can both reconstruct embeddings more faithfully and learn a higher number of specific features. However, the larger the codebook size the greater the computational expense and probability of detecting more features that are difficult to interpret or rarely activate.

- * Small codebook: The SAE is forced to group related concepts together. For example, a single feature might activate for all "metal-binding sites."
- * Large codebook: The model can "split" that general concept into more specific, granular features. Instead of one broad feature, you might get separate dedicated features for example "zinc-finger motifs," "iron-sulfur clusters," and "calcium-binding loops".

- ### Model naming

- Models are named as follows: `{esmc-model}-sae-{sweep|state|residual}-layer{layer}-k{k}-codebook{codebook_size}`

- For example, to get the sweep model at ESMC 6B using the target sparsity value (k) of 64 with the 65k codebook use the following code: `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536`. The example code below shows how to use this model naming convention to obtain a specific model variant. See the Usage section below to see an example.

- The codebook values are as follows: 8192, 16384, 32768, 65536, 131072.

- ### SAE models

- The table below lists the variations for the SAE models. Each SAE variant is trained on a base model, either MLP or state activations, a specific codebook size, and a specific layer of the base model. Thus, the first row of the table indicates that there are 81 (one model for every 1-90 transformer layers, plus an input embedding layer) SAE variants trained on ESMC 6B using state activations and a codebook of 16k features.

- | Model | MLP or State Activations | Codebooks | Layers |
- | :---- | :---- | :---- | :---- |
- | ESMC 6B | state | 16k | A model for every 1-80 transformer layers \+ input embedding layer |
- | ESMC 6B | state | 131k | A model for every 1-80 transformer layers \+ input embedding layer |
- | ESMC 6B | MLP | 131k | A model for every 1-80 transformer layers \+ 1 input embedding layer |
- | ESMC 600M | state | 16k | A model for every 1-37 transformer layers \+ 1 input embedding layer |
- | ESMC 600M | MLP | 131k | A model for every 1-37 transformer layers \+ 1 input embedding layer |
- | ESMC 300M | state | 16k | 1-31 transformer layers \+ 1 input embedding layer |
- | ESMC 300M | MLP | 131k | A model for every 1-31 transformer layers \+ 1 input embedding layer |

- ### Sweep variants

- In addition to the SAEs listed above, there are sweeps trained across the different ESMC model variants.The table below summarizes the ESMC variant and the layer used for training. We targeted a 75% depth after testing showed that middle-to-late layers yielded the most pertinent feature information, similar to findings from other large language learning models. For each layer, there are 30 pre-trained models covering every combination of five codebook sizes (8k ,16k, 32k, 65k, 131k) and six target sparsities (16, 32, 64 ,128, 256, 512).

- | Base model variant | Training layer |
- | :---- | :---- |
- | ESMC 6B | 60 |
- | ESMC 600M | 27 |
- | ESMC 300M | 23 |

- ### ESM Atlas

- The SAE model `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384` is the model that provides users with interpretable, agent-generated feature descriptions that are detailed [here](https://huggingface.co/datasets/biohub/ESMC-SAE-Features). Users can access these feature descriptions through the [ESM Atlas](https://biohub.ai/esmc/atlas) or through the Biohub Platform Biohub’s open platform with the code [here](https://github.com/evolutionaryscale/esm/blob/cookbook/snippets/sae_example.py).

- ### Normalization Statistics

- The following four SAE models include max/IDF normalization statistics:

- * `esmc-600m-2024-12-sae-sweep-layer27-k64-codebook16384`
- * `esmc-600m-2024-12-sae-sweep-layer27-k64-codebook65536`
- * `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384`
- * `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536`

- Normalization statistics are computed by running each model over UniRef90 and recording two quantities per feature: (1) the maximum activation value observed across the entire dataset, and (2) the Inverse Document Frequency (IDF), defined as log(N / f), where N is the total number of proteins and f is the number of proteins in which the feature was active (non-zero).

- At inference activations are normalized as: (activation / max) \* idf. Dividing by the maximum scales each feature's output to the range \[0, 1\], making features more comparable to each other. Multiplying by IDF then upweights rare, distinctive features and downweights ubiquitous ones, making it easier for users to distinguish biologically relevant features.

  ---
+ license:
+ - mit
  - other
  license_link: https://github.com/Biohub/esm/blob/main/THIRD_PARTY_NOTICE.md
+ library_name: transformers
+ language: en
+ tags:
+ - biology
+ - esm
+ - protein
+ - sparse-autoencoder
+ - interpretability
+ - protein-embeddings
+ - feature-extraction
+ - protein-language-model
+ - unsupervised-learning
  - transformers
  ---

+ # ESMC Sparse Autoencoders

+ This model card provides an overview of the intended use of the ESMC SAE models and examples of how to access them, but it does not have a specific model or model weights. To access each SAE model collection, use the links below:
+
+ - [ESMC SAEs for hidden states (all layers)](https://huggingface.co/collections/biohub/esmc-saes-for-hidden-states-all-layers)
+ - [ESMC SAEs for MLP outputs (all layers)](https://huggingface.co/collections/biohub/esmc-saes-for-mlp-outputs-all-layers)
+ - [ESMC SAEs for one layer (different sparsity / codebook size)](https://huggingface.co/collections/biohub/esmc-saes-for-one-layer-different-sparsity-codebook-size)
+
+ The ESMC sparse autoencoders (SAEs) are unsupervised neural networks trained to decompose the learned internal representations from the ESMC model variants into a sparse representation space comprising more biologically interpretable features, revealing what the model "sees" of the user's protein input. Each feature is encouraged to be approximately monosemantic (capturing one interpretable concept) through a large feature space combined with a sparsity constraint, and may represent a specific biologically relevant property of the protein, such as a zinc binding site, beta barrel structure, or transmembrane helix.

+ Building on top of the ESMC 6B SAEs, the ESM Atlas is a map of 6.8 billion proteins covering the full breadth of life's biodiversity and more than one billion predicted structures. The SAEs enable translation of the model's internal representations into ~16,000 interpretable biological features. Learn more about how to use the ESM Atlas on the Biohub Platform.

+ Read more about ESMC and SAEs in our [paper](https://biohub.ai/papers/esm_protein.pdf).

  ## Intended Use

+ - Decomposing ESMC embeddings into interpretable features
+ - Visualizing feature activations on sequences and structures

  ## Usage

+ The SAE model `ESMC-6B-sae-layer60-k64-codebook16384` provides users with interpretable, agent-generated feature descriptions. Users can access these feature descriptions through the ESM Atlas or through the Biohub Platform.

+ While all SAE models can be accessed through Hugging Face, only the following five SAE models are available through the Biohub Platform:

+ - `ESMC-6B-sae-layer60-k64-codebook16384`
+ - `ESMC-6B-sae-layer60-k64-codebook65536`
+ - `ESMC-600M-sae-layer27-k64-codebook16384`
+ - `ESMC-600M-sae-layer27-k64-codebook65536`
+ - `ESMC-300M-sae-layer23-k64-codebook65536`

  You can access an SAE model through Hugging Face using the code below:

  ```py
  import torch
+ from transformers import AutoModel, AutoTokenizer

+ sequence = "MGSNKSKPKDASQRRRSLEPAENVHGAGGGAFPASQTPSKPASADGHRGPSAAFAPAAAEPKLFGGFNSSDTVTSPQRAGPLAGGVTTFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFNSLQQLVAYYSKHADGLCHRLTTVCPTSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGSLLDFLKGETGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKEPEERPTFEYLQAFLEDYFTSTEPQYQPGENL"
 
+ model = AutoModel.from_pretrained("biohub/ESMC-6B", device_map="auto").eval()
+ tokenizer = AutoTokenizer.from_pretrained("biohub/ESMC-6B")
+ sae = AutoModel.from_pretrained(
+     "biohub/ESMC-6B-sae-k64-codebook16384",
+     allow_patterns=["config.json", "layer_30.safetensors", "layer_60.safetensors"],
+     device=model.device,
+ )
+ sae.initialize_layers([30, 60])
+ model.add_sae_models([sae.layers["30"], sae.layers["60"]])

+ inputs = tokenizer(sequence, return_tensors="pt", padding=True)
+ inputs = {k: v.to(model.device) for k, v in inputs.items()}

  with torch.inference_mode():
+     output = model(**inputs)

+ # sparse.coo tensor of shape (batch, seq_len, codebook_size)
+ print(output["sae_outputs"]["layer60"].shape)
+ ```

  ## Model Details

+ ESMC SAEs are trained to reconstruct ESMC embeddings at a residue level, meaning for a protein of length L, there are L sparse vectors of SAE features. For the models hosted on the Biohub platform, the embeddings are extracted from the hidden states. To provide different options for protein interpretability, our full set of SAE models contains both models that have been trained on hidden states and models that have been trained directly on the MLP outputs. Training SAE models on MLP outputs generates features specific to that layer's computation, while training the SAE models on hidden states may provide a more global understanding. We use the TopK approach for training SAEs to control sparsity by only allowing the top `k` features at each position to be active.

  There are two critical hyperparameters for the TopK approach:

+ - `k`: the number of active features per position
+ - `codebook_size`: total number of features the SAE can learn

+ With smaller codebooks, the SAE may group related concepts together. For example, a single feature might activate for all metal-binding sites. With larger codebooks, the model can split general concepts into more granular features. For example, the model may learn dedicated features for zinc-finger motifs, iron-sulfur clusters, and calcium-binding loops.
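
Editorial note: the roles of `k` and `codebook_size` can be illustrated with a toy TopK selection at a single residue position. This is a minimal sketch in plain Python, not the ESMC SAE implementation: the function name `topk_sparsify` and the toy values are hypothetical, and real TopK SAEs operate on encoder outputs inside the model (often with additional steps such as a ReLU) that this README does not describe.

```python
def topk_sparsify(pre_activations, k):
    """Keep the k largest values at one position and zero out the rest.

    pre_activations has one entry per codebook feature (length = codebook_size).
    Hypothetical helper for illustration only.
    """
    if k <= 0:
        return [0.0] * len(pre_activations)
    # Indices of the k largest pre-activations at this position.
    order = sorted(range(len(pre_activations)),
                   key=lambda i: pre_activations[i],
                   reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(pre_activations)]


# Toy setting: codebook_size = 8 features, k = 2 active features per position.
pre = [0.1, 2.5, -0.3, 0.9, 1.7, 0.0, 0.4, 0.2]
sparse = topk_sparsify(pre, k=2)
print(sparse)
```

With `k=2`, only the two strongest features (indices 1 and 4 here) remain non-zero. A larger codebook lengthens the vector while `k` keeps the number of active entries fixed.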
 
+ ### Model Naming

+ There are three different families of models based on the SAE training target.

+ **Hidden states — SAE model for every layer.** The first family is trained on hidden states at every layer of the respective ESMC models. The naming convention is:

+ ```
+ {esmc-model}-sae-k64-codebook16384
+ ```

+ The options for `esmc-model` are:

+ - `ESMC-300M`
+ - `ESMC-600M`
+ - `ESMC-6B`

+ **MLP outputs — SAE model for every layer.** The second family is trained on the per-layer MLP output (before the residual connection) at every layer of the respective ESMC models. The naming convention is:
+
+ ```
+ {esmc-model}-sae-mlp-k64-codebook131072
+ ```
+
+ The options for `esmc-model` are:
+
+ - `ESMC-300M`
+ - `ESMC-600M`
+ - `ESMC-6B`
+
+ **Layer-specific — SAE model for every combination of `k` and `codebook_size`.** The last family is trained on one specific layer of the respective ESMC models with different top-`k` and codebook sizes. With these models, we targeted a 75% depth after various analyses showed that representations at this depth are often the most generalizable to a variety of downstream tasks, similar to findings from other large language models. The naming convention is:
+
+ ```
+ {esmc-model}-sae-layer{layer_num}-k{k}-codebook{codebook}
+ ```

+ The options for `esmc-model` with the corresponding `layer_num`:

+ | `esmc-model` | `layer_num` |
+ | :---- | ----: |
+ | `ESMC-300M` | 23 |
+ | `ESMC-600M` | 27 |
+ | `ESMC-6B` | 60 |

+ The options for `k` are: 16, 32, 64, 128, 256, 512.

+ The options for `codebook` are: 8192, 16384, 32768, 65536, 131072.

+ For example, to load the SAE trained on hidden states at layer 60 in ESMC 6B with `k=64` and the 65k codebook, use `ESMC-6B-sae-layer60-k64-codebook65536`.
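
Editorial note: the layer-specific naming convention can be assembled programmatically. The sketch below only formats and validates names using the options listed in this README. The helper `layer_specific_sae_name` and the constant names are hypothetical, and it is an assumption (not confirmed here) that every listed combination is published under the `biohub/` organization.

```python
# Options as listed in the README for the layer-specific family.
LAYER_FOR_MODEL = {"ESMC-300M": 23, "ESMC-600M": 27, "ESMC-6B": 60}
K_OPTIONS = (16, 32, 64, 128, 256, 512)
CODEBOOK_OPTIONS = (8192, 16384, 32768, 65536, 131072)


def layer_specific_sae_name(esmc_model, k, codebook):
    """Build '{esmc-model}-sae-layer{layer_num}-k{k}-codebook{codebook}'.

    Hypothetical helper; raises ValueError for options not listed in the README.
    """
    if esmc_model not in LAYER_FOR_MODEL:
        raise ValueError(f"unknown esmc-model: {esmc_model!r}")
    if k not in K_OPTIONS:
        raise ValueError(f"unsupported k: {k}")
    if codebook not in CODEBOOK_OPTIONS:
        raise ValueError(f"unsupported codebook size: {codebook}")
    layer_num = LAYER_FOR_MODEL[esmc_model]
    return f"{esmc_model}-sae-layer{layer_num}-k{k}-codebook{codebook}"


name = layer_specific_sae_name("ESMC-6B", 64, 65536)
print(name)

# Every (k, codebook) combination for one base model.
all_6b = [layer_specific_sae_name("ESMC-6B", k, c)
          for k in K_OPTIONS for c in CODEBOOK_OPTIONS]
print(len(all_6b))
```

Six values of `k` and five codebook sizes give 30 names per base model.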
 
+ ## Feature Descriptions

+ The SAE model `ESMC-6B-sae-layer60-k64-codebook16384` is the model most heavily studied in our paper and was used to generate features for the ESM Atlas. We also created agent-generated feature descriptions for this model. Users can access these feature descriptions through the ESM Atlas or through the Biohub Platform API.

+ ## Normalization Statistics

+ Only the `ESMC-6B-sae-layer60-k64-codebook16384` model has accessible normalization statistics. The other Biohub-platform-hosted models also include the option to normalize the SAE features, but these statistics are not currently accessible.

+ Normalization statistics are computed by using each model to compute SAE features for all proteins in UniRef90 and recording two quantities per feature: (1) the maximum activation value observed across the entire dataset, and (2) the Inverse Document Frequency (IDF), defined as `log(N / f)`, where `N` is the total number of proteins and `f` is the number of proteins in which the feature was active (non-zero).

+ At inference, activations are normalized as `(activation / max) * idf`. Dividing by the maximum scales each feature's output to the range `[0, 1]`, making features more comparable to each other. Multiplying by IDF then upweights rare, distinctive features and downweights ubiquitous ones, making it easier to distinguish biologically relevant features. These statistics can be accessed through the feature-description API.
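
Editorial note: the max/IDF normalization can be worked through on a toy dataset. This is a sketch under stated assumptions, not the Biohub pipeline: the function names `feature_stats` and `normalize` are hypothetical, the data is made up, and the natural logarithm is assumed because the README does not state the base of `log`.

```python
import math


def feature_stats(per_protein_activations):
    """Return (max activation, IDF) for ONE feature.

    per_protein_activations: one list of residue-level activations per protein.
    Hypothetical helper; natural log is an assumption.
    """
    n_proteins = len(per_protein_activations)
    # (1) maximum activation observed across the whole dataset
    max_act = max((max(acts, default=0.0) for acts in per_protein_activations),
                  default=0.0)
    # (2) number of proteins in which the feature is active (non-zero)
    n_active = sum(1 for acts in per_protein_activations
                   if any(a != 0 for a in acts))
    if n_active == 0:
        raise ValueError("feature never activates; IDF is undefined")
    idf = math.log(n_proteins / n_active)
    return max_act, idf


def normalize(activation, max_act, idf):
    """Apply (activation / max) * idf."""
    return (activation / max_act) * idf


# Toy dataset: 4 proteins, feature active in 2 of them, maximum activation 4.0.
toy = [[0.0, 4.0, 1.0], [0.0, 0.0], [2.0], [0.0, 0.0, 0.0]]
max_act, idf = feature_stats(toy)
normalized = normalize(2.0, max_act, idf)
print(max_act, idf, normalized)
```

Here `N = 4` and `f = 2`, so the IDF is `log(2)`, and an activation of 2.0 is first scaled to 0.5 and then multiplied by that IDF. A feature active in every protein would get an IDF of `log(1) = 0`, which is how ubiquitous features are downweighted.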
 
+ ## Frontier Safety

+ Biohub has established a safety team to assess the benefits and potential risks of our models and tools prior to release, and develop mitigations where necessary. Informed by our risk assessments, we are releasing the source code and model weights for our ESMC SAEs. We are also releasing our ESM Atlas dataset openly.

+ Biohub.ai Platform: We implement guardrails that detect and restrict the use of keywords and sequences corresponding to controlled pathogens and toxins on our freely accessible platform. For further details regarding these guardrails, please refer to our Biohub platform Resources page.