|
--- |
|
license: cc-by-nc-nd-4.0 |
|
extra_gated_fields:
  Name: text
  Company: text
  Country: country
  Specific date: date_picker
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
  I agree to include the authors of the code (Tianlai Chen and Pranam Chatterjee) as authors on manuscripts with data from designed peptides: checkbox
  I agree to share generated sequences and associated data with authors before publishing: checkbox
  I agree not to file patents on any sequences generated by this model: checkbox
  I agree to use this model for non-commercial use ONLY: checkbox
|
--- |
|
**PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling** |
|
![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F63df6223f351dc0745681f77%2FhkKA0GttGY5l3oVcKf0bR.png%3C%2Fspan%3E)%3C!-- HTML_TAG_END --> |
|
In this work, we introduce **PepMLM**, a purely target sequence-conditioned *de novo* generator of linear peptide binders. By employing a novel masking strategy that uniquely positions cognate peptide sequences at the terminus of target protein sequences, PepMLM tasks the state-of-the-art ESM-2 pLM with fully reconstructing the binder region, achieving perplexities that match or improve upon those of previously validated peptide-protein sequence pairs. After successful *in silico* benchmarking with AlphaFold-Multimer, we experimentally verify PepMLM’s efficacy by fusing model-derived peptides to E3 ubiquitin ligase domains, demonstrating endogenous degradation of target substrates in cellular models. In total, PepMLM enables the generative design of candidate binders to any target protein without requiring the target’s structure, empowering downstream programmable proteome editing applications.
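To make the masking strategy concrete, below is a minimal sketch of how a model input can be constructed: the target sequence is followed by a fully masked binder region that the model is asked to reconstruct. The target sequence and binder length here are hypothetical placeholders.

```python
# Minimal sketch of the PepMLM masking setup (illustrative values only).
target_seq = "MATLEKLMKAFESLKSF"  # hypothetical target protein sequence
binder_length = 12                # desired binder length, in residues

# "<mask>" is the mask token used by ESM-2 tokenizers; the model
# reconstructs the binder residues at these positions.
masked_input = target_seq + "<mask>" * binder_length
```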
|
|
|
- Demo: HuggingFace Space [Link](https://huggingface.co/spaces/TianlaiChen/PepMLM) (temporarily unavailable)
|
- Colab Notebook: [Link](https://colab.research.google.com/drive/1u0i-LBog_lvQ5YRKs7QLKh_RtI-tV8qM?usp=sharing) |
|
- Preprint: [Link](https://arxiv.org/abs/2310.03842) |
|
|
|
# Apply for Access |
|
As of February 2024, the model has been gated on HuggingFace. If you wish to use our model, please visit our page on the HuggingFace site ([Link](https://huggingface.co/ChatterjeeLab/PepMLM-650M)) and submit your access request there. An active HuggingFace account is necessary for both the application and subsequent modeling use. Approval of requests may take a few days, as we are a small lab with a manual approval process. |
|
|
|
Once your request is approved, you will need your personal access token to begin using the model (e.g., in the Colab notebook above). We appreciate your understanding.
|
|
|
- How to find your access token: https://huggingface.co/docs/hub/en/security-tokens |
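For example, one common way to authenticate is via the `huggingface_hub` library (shown here as an illustration; any supported method of supplying your token works):

```python
# Authenticate before loading the gated model.
from huggingface_hub import login

login(token="hf_...")  # replace with your personal access token
```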
|
|
|
```python
# Load the gated model and tokenizer directly
# (requires an approved access request and authentication, as above).
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("ChatterjeeLab/PepMLM-650M")
model = AutoModelForMaskedLM.from_pretrained("ChatterjeeLab/PepMLM-650M")
```
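As a usage illustration, here is a sketch that greedily fills the masked binder region with the loaded `model` and `tokenizer`. The greedy argmax decoding, `target_seq`, and `binder_length` below are assumptions for demonstration; the authors' actual sampling procedure is implemented in the Colab notebook linked above.

```python
import torch

target_seq = "MATLEKLMKAFESLKSF"  # hypothetical target protein sequence
binder_length = 12

# Append a fully masked binder region to the target sequence.
masked_input = target_seq + tokenizer.mask_token * binder_length
inputs = tokenizer(masked_input, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring residue at each masked position (greedy decoding).
mask_positions = inputs.input_ids[0] == tokenizer.mask_token_id
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
binder = tokenizer.decode(predicted_ids).replace(" ", "")
print(binder)  # candidate peptide binder sequence
```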
|
![Logo](logo.png) |