dc.contributor.author: Palacios, A.V.
dc.contributor.author: Acharya, P.
dc.contributor.author: Peidl, A.S.
dc.contributor.author: Beck, M.R.
dc.contributor.author: Blanco, E.
dc.contributor.author: Mishra, A.
dc.contributor.author: Bawa-Khalfe, T.
dc.contributor.author: Pakhrin, S.C.
dc.date.accessioned: 2024-08-03T03:56:03Z
dc.date.available: 2024-08-03T03:56:03Z
dc.date.issued: 2024-02-07
dc.identifier.citation: Andrew Vargas Palacios, Pujan Acharya, Anthony Stephen Peidl, Moriah Rene Beck, Eduardo Blanco, Avdesh Mishra, Tasneem Bawa-Khalfe, Subash Chandra Pakhrin, SumoPred-PLM: human SUMOylation and SUMO2/3 sites Prediction using Pre-trained Protein Language Model, NAR Genomics and Bioinformatics, Volume 6, Issue 1, March 2024, lqae011, https://doi.org/10.1093/nargab/lqae011
dc.identifier.issn: 2631-9268
dc.identifier.doi: 10.1093/nargab/lqae011
dc.identifier.uri: http://hdl.handle.net/10150/673179
dc.description.abstract: SUMOylation is an essential post-translational modification system with the ability to regulate nearly all aspects of cellular physiology. Three major paralogues, SUMO1, SUMO2 and SUMO3, form a covalent bond between the small ubiquitin-like modifier and lysine residues at consensus sites in protein substrates. Biochemical studies continue to identify unique biological functions for protein targets conjugated to SUMO1 versus the highly homologous SUMO2 and SUMO3 paralogues. Yet the field has failed to harness contemporary AI approaches, including pre-trained protein language models, to fully expand and/or recognize the SUMOylated proteome. Herein, we present a novel, deep learning-based approach called SumoPred-PLM for human SUMOylation prediction with sensitivity, specificity, Matthews correlation coefficient, and accuracy of 74.64%, 73.36%, 0.48 and 74.00%, respectively, on the CPLM 4.0 independent test dataset. In addition, this novel platform uses contextualized embeddings obtained from a pre-trained protein language model, ProtT5-XL-UniRef50, to identify SUMO2/3-specific conjugation sites. The results demonstrate that SumoPred-PLM is a powerful and unique computational tool to predict SUMOylation sites in proteins and accelerate discovery. © 2024 The Author(s). Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
dc.language.iso: en
dc.publisher: Oxford University Press
dc.rights: © The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution License.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: SumoPred-PLM: human SUMOylation and SUMO2/3 sites Prediction using Pre-trained Protein Language Model
dc.type: Article
dc.type: text
dc.contributor.department: Department of Computer Science, University of Arizona
dc.identifier.journal: NAR Genomics and Bioinformatics
dc.description.note: Open access journal
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
dc.eprint.version: Final Published Version
dc.source.journaltitle: NAR Genomics and Bioinformatics
refterms.dateFOA: 2024-08-03T03:56:03Z


Files in this item

Name: lqae011.pdf
Size: 1.385Mb
Format: PDF
Description: Final Published Version
