    Learning parameter-advising sets for multiple sequence alignment

    Name:
    tcbb2017.pdf
    Size:
    2.465Mb
    Format:
    PDF
    Description:
    Final Accepted Manuscript
    Author
    DeBlasio, Dan
    Kececioglu, John
    Affiliation
    Computational Biology Department, Carnegie Mellon University
    Department of Computer Science, The University of Arizona
    Issue Date
    2017
    Keywords
    Multiple sequence alignment
    alignment scoring functions
    parameter values
    accuracy estimation
    parameter advising
    
    Publisher
    IEEE COMPUTER SOC
    Citation
    IEEE/ACM Transactions on Computational Biology and Bioinformatics 14:5, 1028-1041, 2017
    Journal
    IEEE/ACM Transactions on Computational Biology and Bioinformatics
    Rights
    © 2015 IEEE.
    Collection Information
    This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
    Abstract
    While the multiple sequence alignment output by an aligner strongly depends on the parameter values used for the alignment scoring function (such as the choice of gap penalties and substitution scores), most users rely on the single default parameter setting provided by the aligner. A different parameter setting, however, might yield a much higher-quality alignment for the specific set of input sequences. The problem of picking a good choice of parameter values for specific input sequences is called parameter advising. A parameter advisor has two ingredients: (i) a set of parameter choices to select from, and (ii) an estimator that provides an estimate of the accuracy of the alignment computed by the aligner using a parameter choice. The parameter advisor picks the parameter choice from the set whose resulting alignment has highest estimated accuracy. We consider for the first time the problem of learning the optimal set of parameter choices for a parameter advisor that uses a given accuracy estimator. The optimal set is one that maximizes the expected true accuracy of the resulting parameter advisor, averaged over a collection of training data. While we prove that learning an optimal set for an advisor is NP-complete, we show there is a natural approximation algorithm for this problem, and prove a tight bound on its approximation ratio. Experiments with an implementation of this approximation algorithm on biological benchmarks, using various accuracy estimators from the literature, show it finds sets for advisors that are surprisingly close to optimal. Furthermore, the resulting parameter advisors are significantly more accurate in practice than simply aligning with a single default parameter choice.
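The two ingredients of a parameter advisor described in the abstract, and the objective used to compare candidate parameter sets, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `advise`, `advisor_accuracy`, `align`, and `estimate` are hypothetical, the accuracy tables are assumed to be precomputed per benchmark, and ties in estimated accuracy are assumed to go to the first choice encountered.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple, TypeVar

Alignment = TypeVar("Alignment")


def advise(
    choices: Sequence[Hashable],
    align: Callable[[Hashable], Alignment],
    estimate: Callable[[Alignment], float],
) -> Tuple[Hashable, Alignment]:
    """Run the aligner once per parameter choice and return the choice
    (and its alignment) whose alignment has the highest estimated accuracy.

    `align` and `estimate` are placeholders for a real aligner and a real
    accuracy estimator; ties go to the earliest choice (an assumption).
    """
    if not choices:
        raise ValueError("the advisor needs at least one parameter choice")
    best_choice = choices[0]
    best_alignment = align(best_choice)
    best_score = estimate(best_alignment)
    for choice in choices[1:]:
        alignment = align(choice)
        score = estimate(alignment)
        if score > best_score:
            best_choice, best_alignment, best_score = choice, alignment, score
    return best_choice, best_alignment


def advisor_accuracy(
    param_set: Sequence[Hashable],
    estimated: Sequence[Dict[Hashable, float]],
    true: Sequence[Dict[Hashable, float]],
) -> float:
    """Average true accuracy of an advisor restricted to `param_set`.

    estimated[i][p] and true[i][p] hold the estimated and true accuracy of
    the alignment of training benchmark i under parameter choice p.
    """
    total = 0.0
    for est_row, true_row in zip(estimated, true):
        # The advisor sees only the estimates, never the true accuracies.
        picked = max(param_set, key=lambda p: est_row[p])
        total += true_row[picked]
    return total / len(estimated)
```

Learning an advisor set then amounts to searching for the `param_set` of a given size that maximizes `advisor_accuracy` on the training benchmarks, which is the problem the abstract reports to be NP-complete.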
    ISSN
    1545-5963
    EISSN
    1557-9964
    PubMed ID
    28991725
    DOI
    10.1109/TCBB.2015.2430323
    Version
    Final accepted manuscript
    Sponsors
    US National Science Foundation [IIS-1217886]; University of Arizona IGERT in Comparative Genomics through US National Science Foundation [DGE-0654435]

    Related articles

    • Accuracy estimation and parameter advising for protein multiple sequence alignment.
    • Authors: Kececioglu J, DeBlasio D
    • Issue date: 2013 Apr
    • Learning scoring schemes for sequence alignment from partial examples.
    • Authors: Kim E, Kececioglu J
    • Issue date: 2008 Oct-Dec
    • Adaptive Local Realignment of Protein Sequences.
    • Authors: DeBlasio D, Kececioglu J
    • Issue date: 2018 Jul
    • Reducing Alignment Time Complexity of Ultra-Large Sets of Sequences.
    • Authors: Rubio-Largo Á, Vanneschi L, Castelli M, Vega-Rodríguez MA
    • Issue date: 2017 Nov
    • An improved scoring method for protein residue conservation and multiple sequence alignment.
    • Authors: Nguyen KD, Pan Y
    • Issue date: 2011 Dec