TCR-ESM: Employing protein language embeddings to predict TCR-peptide-MHC binding
Affiliation
Department of Biomedical Engineering, University of Arizona
Issue Date
2023-12-08
Keywords
Peptide embeddings
Protein language models
T-cell therapy
TCR specificity
TCR-pMHC interactions
Publisher
Elsevier B.V.
Citation
Yadav, S., Vora, D. S., Sundar, D., & Dhanjal, J. K. (2024). TCR-ESM: Employing protein language embeddings to predict TCR-peptide-MHC binding. Computational and Structural Biotechnology Journal, 23, 165-173.
Rights
© 2023 The Authors. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Cognate target identification for T-cell receptors (TCRs) is a significant barrier in T-cell therapy development, which may be overcome by accurately predicting TCR interaction with peptide-bound major histocompatibility complex (pMHC). In this study, we employed peptide embeddings learned from a large protein language model, Evolutionary Scale Modeling (ESM), to predict TCR-pMHC binding. The TCR-ESM model presented outperforms existing predictors. The complementarity-determining region 3 (CDR3) of the hypervariable TCR is located at the center of the paratope and plays a crucial role in peptide recognition. TCR-ESM models trained on paired TCR data with both CDR3α and CDR3β chain information perform significantly better than those trained on data with only CDR3β, suggesting that both TCR chains contribute to specificity; their relative importance, however, depends on the specific peptide-MHC targeted. The study also clarifies the importance of MHC information in TCR-peptide binding, which had so far remained inconclusive and was thought to depend on dataset characteristics. TCR-ESM also outperforms existing approaches on external datasets, suggesting generalizability. Overall, the study highlights the potential of deep learning for predicting TCR-pMHC interactions and for improving the understanding of factors driving TCR specificity. The prediction model is available at http://tcresm.dhanjal-lab.iiitd.edu.in/ as an online tool.
Note
Open access journal
ISSN
2001-0370
Version
Final Published Version
DOI
10.1016/j.csbj.2023.11.037
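Illustrative sketch

The abstract describes predicting TCR-pMHC binding from protein language model embeddings, and contrasts models given paired CDR3α and CDR3β chains (with or without MHC information) against models given CDR3β alone. The Python sketch below is not the authors' implementation. It only illustrates, under stated assumptions, how per-sequence embeddings for each input might be concatenated into one feature vector and scored. The names `toy_embed`, `build_features`, and `score` are hypothetical. `toy_embed` is a deterministic stand-in for a real ESM embedding, the example sequences are made-up strings rather than data from the study, and the logistic scorer with all-zero weights stands in for whatever trained classifier the paper actually uses.

```python
import hashlib
import math
from typing import List, Optional

EMBED_DIM = 8  # toy size (assumption); real language model embeddings are far larger


def toy_embed(sequence: str, dim: int = EMBED_DIM) -> List[float]:
    """Hypothetical stand-in for a protein language model embedding.

    Maps each residue to a fixed pseudo-random vector and mean-pools over
    the sequence, so any sequence length yields a vector of length `dim`.
    """
    if not sequence:
        raise ValueError("sequence must be non-empty")
    if not 0 < dim <= 32:
        raise ValueError("dim must be between 1 and 32 for this toy embedder")
    pooled = [0.0] * dim
    for residue in sequence:
        digest = hashlib.sha256(residue.encode("ascii")).digest()
        for i in range(dim):
            pooled[i] += digest[i] / 255.0
    return [value / len(sequence) for value in pooled]


def build_features(
    cdr3b: str,
    peptide: str,
    cdr3a: Optional[str] = None,
    mhc: Optional[str] = None,
) -> List[float]:
    """Concatenate embeddings of whichever inputs are available.

    CDR3-beta and the peptide are always used; CDR3-alpha and the MHC
    sequence are optional, mirroring the input settings named in the abstract.
    """
    features: List[float] = []
    for sequence in (cdr3a, cdr3b, peptide, mhc):
        if sequence is not None:
            features.extend(toy_embed(sequence))
    return features


def score(features: List[float], weights: List[float], bias: float = 0.0) -> float:
    """Logistic score in (0, 1); a placeholder for a trained classifier."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up example sequences (assumptions, not data from the study).
paired_with_mhc = build_features(
    cdr3b="CASSLGQAYEQYF", peptide="GLCTLVAML",
    cdr3a="CAVNTGNQFYF", mhc="YFAMYGEKVAHTHVDTLYVRYHYYTWAVLAYTWY",
)
beta_only = build_features(cdr3b="CASSLGQAYEQYF", peptide="GLCTLVAML")

print(len(paired_with_mhc), len(beta_only))
print(score(beta_only, weights=[0.0] * len(beta_only)))
```

Because each input is mean-pooled to a fixed length, adding the α chain or the MHC sequence only lengthens the feature vector, which makes it straightforward to compare input settings in an ablation of the kind the abstract reports.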