 ---
 license: mit
 ---
+# ESMC SAE Model Card
+## Description
+This model card provides an overview of the intended use of the ESMC SAE models and examples of how to access them. To access the individual Hugging Face model cards for any of the models, use the format below:
+```text
+https://huggingface.co/Biohub/esmc-<modelsize>-2024-12-sae-sweep-layer<number>-k64-codebook<size>
+```
+The ESMC sparse autoencoders (SAEs) are unsupervised neural networks trained to decompose the learned internal representations of the [ESMC 6B model](https://huggingface.co/biohub/esmc-6b-2024-12) into a sparse set of more easily interpretable features, revealing what the model “sees” of the user’s protein input. Each feature represents a specific biologically relevant property of the protein, such as a zinc binding site, beta barrel structure, or transmembrane helix.
+The ESM Atlas is a map of 6.8 billion proteins covering the full breadth of life’s biodiversity and more than one billion predicted structures, built upon the ESMC SAEs to translate the model’s internal representations into \~16,000 interpretable biological features. Learn more about how to use the ESM Atlas [here](https://biohub.ai/esmc/atlas).
+For additional information, visit the [Biohub Platform](https://biohub.ai) for no-code tools, step-by-step tutorial notebooks, and detailed information on the models.
+## Intended Use
+* Reconstructing ESMC embeddings into interpretable features
+* Visualizing feature activations on sequences and structures
+## Model Details
+ESMC SAEs are trained to reconstruct ESMC embeddings at the *residue level*: for a protein of length L, there are L sparse vectors of SAE features. The embeddings are extracted from the hidden states after the multilayer perceptron (MLP) layer. SAE models are trained on both MLP and state activations to provide different options for protein interpretability. Training SAEs on MLP layers can make the individual features from each layer more interpretable, while training them on state activations may provide a more global understanding. We use the [TopK](https://arxiv.org/abs/2412.06410) approach to train the SAEs, which controls sparsity by allowing only the top k features at each position to be active.
+There are two critical hyperparameters for the TopK approach:
+* `k`: the number of active features per position
+* `codebook_size`: the total number of features in the codebook
+The SAE codebook size (some fields use the term “dictionary”) determines how many distinct features the model can learn to represent. For protein models, it sets the balance between how accurately the SAE can reconstruct complex biological data and how easily its features can be interpreted. As the codebook grows, the SAE can both reconstruct embeddings more faithfully and learn a larger number of specific features. However, a larger codebook also increases the computational cost and the likelihood of learning features that are difficult to interpret or that rarely activate.
+* Small codebook: The SAE is forced to group related concepts together. For example, a single feature might activate for all "metal-binding sites."
+* Large codebook: The model can "split" that general concept into more specific, granular features. Instead of one broad feature, you might get separate dedicated features such as "zinc-finger motifs," "iron-sulfur clusters," and "calcium-binding loops."
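The roles of `k` and `codebook_size` can be illustrated with a minimal, generic TopK SAE forward pass for a single residue position. This is a sketch under stated assumptions, not the ESMC SAE implementation: the function name, the toy weights, the decoder-bias subtraction, and the ReLU on the kept activations are all illustrative choices.

```python
def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Encode one embedding, keep the k largest features, and decode.

    x:     embedding for one residue (length d)
    w_enc: codebook_size rows of length d (encoder weights)
    b_enc: encoder bias (length codebook_size)
    w_dec: codebook_size rows of length d (one direction per feature)
    b_dec: decoder bias (length d)
    """
    # Assumption: the decoder bias is subtracted before encoding.
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [
        sum(w * c for w, c in zip(row, centered)) + b
        for row, b in zip(w_enc, b_enc)
    ]
    # Indices of the k largest pre-activations; all other features stay zero.
    top = sorted(range(len(pre_acts)), key=lambda i: pre_acts[i], reverse=True)[:k]
    features = [0.0] * len(pre_acts)
    for i in top:
        features[i] = max(pre_acts[i], 0.0)  # assumption: ReLU on kept values
    # Reconstruct the embedding from the active features only.
    recon = list(b_dec)
    for i in top:
        for j in range(len(recon)):
            recon[j] += features[i] * w_dec[i][j]
    return features, recon


# Toy example: embedding dimension 2, codebook_size 4, k 2 (made-up weights).
w = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
features, recon = topk_sae_forward(
    x=[1.0, 2.0], w_enc=w, b_enc=[0.0] * 4, w_dec=w, b_dec=[0.0, 0.0], k=2
)
print(features)  # only two of the four features are non-zero
print(recon)
```

In this toy setting, the length of `features` equals `codebook_size`, and at most `k` entries are non-zero for each position.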
+### Model naming
+Models are named as follows: `{esmc-model}-sae-{sweep|state|residual}-layer{layer}-k{k}-codebook{codebook_size}`
+For example, the sweep model for ESMC 6B with a target sparsity (k) of 64 and the 65k codebook is named `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536`. The Usage section below shows how to load a specific model variant by name.
+The available codebook sizes are 8192 (8k), 16384 (16k), 32768 (32k), 65536 (65k), and 131072 (131k).
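As a minimal sketch of the naming convention, the helper below assembles a model name from its parts. The function `sae_model_name` and the constant `VALID_CODEBOOK_SIZES` are hypothetical names introduced here for illustration; they are not part of any Biohub or Hugging Face library.

```python
# Hypothetical helper for illustration only.
VALID_CODEBOOK_SIZES = (8192, 16384, 32768, 65536, 131072)


def sae_model_name(esmc_model, variant, layer, k, codebook_size):
    """Build an SAE model name following the documented pattern."""
    if codebook_size not in VALID_CODEBOOK_SIZES:
        raise ValueError(f"unsupported codebook size: {codebook_size}")
    return f"{esmc_model}-sae-{variant}-layer{layer}-k{k}-codebook{codebook_size}"


name = sae_model_name("esmc-6b-2024-12", "sweep", layer=60, k=64, codebook_size=65536)
print(name)  # esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536
```

The helper only formats a string; it does not check that a repository with that name exists.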
+### SAE models
+The table below lists the available SAE variants. Each variant is defined by a base model, the activation type it was trained on (MLP or state), a codebook size, and a specific layer of the base model. For example, the first row of the table indicates that there are 81 SAE variants (one for each of the 80 transformer layers, plus one for the input embedding layer) trained on ESMC 6B using state activations and a codebook of 16k features.
+| Model | MLP or State Activations | Codebooks | Layers |
+| :---- | :---- | :---- | :---- |
+| ESMC 6B | state | 16k | One model per transformer layer (1-80) \+ 1 input embedding layer |
+| ESMC 6B | state | 131k | One model per transformer layer (1-80) \+ 1 input embedding layer |
+| ESMC 6B | MLP | 131k | One model per transformer layer (1-80) \+ 1 input embedding layer |
+| ESMC 600M | state | 16k | One model per transformer layer (1-37) \+ 1 input embedding layer |
+| ESMC 600M | MLP | 131k | One model per transformer layer (1-37) \+ 1 input embedding layer |
+| ESMC 300M | state | 16k | One model per transformer layer (1-31) \+ 1 input embedding layer |
+| ESMC 300M | MLP | 131k | One model per transformer layer (1-31) \+ 1 input embedding layer |
+### Sweep variants
+In addition to the SAEs listed above, there are sweeps trained across the different ESMC model variants. The table below lists each ESMC variant and the layer used for training. We targeted a depth of 75% after testing showed that middle-to-late layers yielded the most pertinent feature information, similar to findings from other large language models. For each layer, there are 30 pre-trained models covering every combination of five codebook sizes (8k, 16k, 32k, 65k, 131k) and six target sparsities (16, 32, 64, 128, 256, 512).
+| Base model variant | Training layer |
+| :---- | :---- |
+| ESMC 6B | 60 |
+| ESMC 600M | 27 |
+| ESMC 300M | 23 |
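The 30 sweep combinations per layer can be enumerated with a short sketch. The constants and the function `sweep_model_names` below are hypothetical names for illustration. The names are generated from the documented pattern, and this sketch does not verify that each one resolves to a repository.

```python
from itertools import product

# Values taken from the text above; the identifiers are illustrative.
CODEBOOK_SIZES = (8192, 16384, 32768, 65536, 131072)
TARGET_SPARSITIES = (16, 32, 64, 128, 256, 512)
SWEEP_LAYERS = {
    "esmc-6b-2024-12": 60,
    "esmc-600m-2024-12": 27,
    "esmc-300m-2024-12": 23,
}


def sweep_model_names(esmc_model):
    """List every codebook-size and sparsity combination for one sweep layer."""
    layer = SWEEP_LAYERS[esmc_model]
    return [
        f"{esmc_model}-sae-sweep-layer{layer}-k{k}-codebook{size}"
        for size, k in product(CODEBOOK_SIZES, TARGET_SPARSITIES)
    ]


names = sweep_model_names("esmc-6b-2024-12")
print(len(names))  # 30
print(names[0])    # esmc-6b-2024-12-sae-sweep-layer60-k16-codebook8192
```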
+### Spotlight model
+The SAE model `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384` is highlighted as a spotlight model: it comes with interpretable, agent-generated feature descriptions, which are detailed [here](https://huggingface.co/datasets/biohub/ESMC-SAE-Features). Users can access these feature descriptions through the [ESM Atlas](https://biohub.ai/esmc/atlas) or through Biohub’s open platform using the code [here](https://github.com/biohub/esm/blob/cookbook/snippets/sae_example.py).
+### Normalization Statistics
+The following four SAE models include max/IDF normalization statistics:
+* `esmc-600m-2024-12-sae-sweep-layer27-k64-codebook16384`
+* `esmc-600m-2024-12-sae-sweep-layer27-k64-codebook65536`
+* `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384`
+* `esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536`
+Normalization statistics are computed by running each model over UniRef90 and recording two quantities per feature: (1) the maximum activation value observed across the entire dataset, and (2) the Inverse Document Frequency (IDF), defined as log(N / f), where N is the total number of proteins and f is the number of proteins in which the feature was active (non-zero).
+At inference, activations are normalized as `(activation / max) * idf`. Dividing by the maximum scales each feature's output to the range [0, 1], making features more comparable to each other. Multiplying by the IDF then upweights rare, distinctive features and downweights ubiquitous ones, making it easier for users to distinguish biologically relevant features.
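The normalization can be worked through with made-up numbers. The sketch below assumes a natural logarithm and a feature that is active in at least one protein. The function names and all numeric values are illustrative and are not taken from the released statistics.

```python
import math


def idf(num_proteins, num_active_proteins):
    """Inverse document frequency: log(N / f), assuming natural log and f > 0."""
    return math.log(num_proteins / num_active_proteins)


def normalize_activation(activation, max_activation, idf_value):
    """Scale by the dataset-wide maximum, then weight by IDF."""
    return (activation / max_activation) * idf_value


# Two hypothetical features with the same raw activation and the same maximum.
n_proteins = 1000
rare_idf = idf(n_proteins, 10)     # active in 10 of 1000 proteins
common_idf = idf(n_proteins, 500)  # active in 500 of 1000 proteins

rare_score = normalize_activation(2.0, 4.0, rare_idf)
common_score = normalize_activation(2.0, 4.0, common_idf)
print(rare_score > common_score)  # True: the rarer feature is upweighted
```

Both features have the same max-scaled activation of 0.5, so the difference in the final scores comes entirely from the IDF term.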
+## Usage
+While all SAE models can be accessed through Hugging Face, only the five SAE models listed below are available through the [Biohub Platform](https://biohub.ai/models/esmc). They are the four models with normalization statistics (including the spotlight model) and one ESMC 300M model:
+* [esmc-300m-2024-12-sae-sweep-layer23-k64-codebook65536](https://huggingface.co/biohub/esmc-300M-2024-12-sae-sweep-layer23-k64-codebook65536)
+* [esmc-600m-2024-12-sae-sweep-layer27-k64-codebook16384](https://huggingface.co/biohub/esmc-600m-2024-12-sae-sweep-layer27-k64-codebook16384)
+* [esmc-600m-2024-12-sae-sweep-layer27-k64-codebook65536](https://huggingface.co/biohub/esmc-600m-2024-12-sae-sweep-layer27-k64-codebook65536)
+* [esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384](https://huggingface.co/biohub/esmc-6b-2024-12-sae-sweep-layer60-k64-codebook16384)
+* [esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536](https://huggingface.co/biohub/esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536)
+You can access an SAE model through Hugging Face using the code below:
+```py
+import torch
+from transformers import AutoModel, AutoModelForMaskedLM, AutoTokenizer
+
+# GFP sequence
+sequences = ["MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"]
+
+# Load the base ESMC model and its tokenizer
+model = AutoModelForMaskedLM.from_pretrained(
+    "Biohub/esmc-6b-2024-12",
+    device_map="auto",  # place model on GPU(s) if available
+).eval()
+tokenizer = AutoTokenizer.from_pretrained("Biohub/esmc-6b-2024-12")
+
+# Load SAE(s)
+sae_models = []
+sae = AutoModel.from_pretrained(
+    "Biohub/esmc-6b-2024-12-sae-sweep-layer60-k64-codebook65536",
+    device_map="auto",
+)
+sae_models.append(sae)
+
+# Add SAE(s) to the ESMC model
+model.add_sae_models(sae_models)
+
+# Tokenize the sequences and place the inputs on the model's device
+inputs = tokenizer(sequences, return_tensors="pt", padding=True)
+inputs = {k: v.to(model.device) for k, v in inputs.items()}
+
+# Run inference without tracking gradients
+with torch.inference_mode():
+    output = model(**inputs)
+
+print(output.sae_outputs)  # Access SAE outputs
+```
+For additional details about using SAE models, see the tutorials [here](https://colab.research.google.com/github/Biohub/esm/blob/main/cookbook/tutorials/8_protein_interpretation_sae.ipynb).