
dc.contributor.author: Krieger, Spencer
dc.contributor.author: Kececioglu, John
dc.date.accessioned: 2020-12-03T00:45:26Z
dc.date.available: 2020-12-03T00:45:26Z
dc.date.issued: 2020-07-13
dc.identifier.citation: Spencer Krieger, John Kececioglu, Boosting the accuracy of protein secondary structure prediction through nearest neighbor search and method hybridization, Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i317–i325, https://doi.org/10.1093/bioinformatics/btaa336 [en_US]
dc.identifier.issn: 1367-4803
dc.identifier.pmid: 32657384
dc.identifier.doi: 10.1093/bioinformatics/btaa336
dc.identifier.uri: http://hdl.handle.net/10150/649169
dc.description.abstract: Motivation: Protein secondary structure prediction is a fundamental precursor to many bioinformatics tasks. Nearly all state-of-the-art tools when computing their secondary structure prediction do not explicitly leverage the vast number of proteins whose structure is known. Leveraging this additional information in a so-called template-based method has the potential to significantly boost prediction accuracy. Method: We present a new hybrid approach to secondary structure prediction that gains the advantages of both template- and non-template-based methods. Our core template-based method is an algorithmic approach that uses metric-space nearest neighbor search over a template database of fixed-length amino acid words to determine estimated class-membership probabilities for each residue in the protein. These probabilities are then input to a dynamic programming algorithm that finds a physically valid maximum-likelihood prediction for the entire protein. Our hybrid approach exploits a novel accuracy estimator for our core method, which estimates the unknown true accuracy of its prediction, to discern when to switch between template- and non-template-based methods. Results: On challenging CASP benchmarks, the resulting hybrid approach boosts the state-of-the-art Q(8) accuracy by more than 2-10%, and Q(3) accuracy by more than 1-3%, yielding the most accurate method currently available for both 3- and 8-state secondary structure prediction. [en_US]
dc.description.sponsorship: National Science Foundation [en_US]
dc.language.iso: en [en_US]
dc.publisher: Oxford University Press (OUP) [en_US]
dc.rights: © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/). [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/ [en_US]
dc.title: Boosting the accuracy of protein secondary structure prediction through nearest neighbor search and method hybridization [en_US]
dc.type: Article [en_US]
dc.identifier.eissn: 1460-2059
dc.contributor.department: Univ Arizona, Dept Comp Sci [en_US]
dc.identifier.journal: BIOINFORMATICS [en_US]
dc.description.note: Open access article [en_US]
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu. [en_US]
dc.eprint.version: Final published version [en_US]
dc.source.journaltitle: Bioinformatics
dc.source.volume: 36
dc.source.issue: Supplement_1
dc.source.beginpage: i317
dc.source.endpage: i325
refterms.dateFOA: 2020-12-03T00:45:27Z


Files in this item

Name: btaa336.pdf
Size: 468.8Kb
Format: PDF
Description: Final Published Version


