

dc.contributor.author: Choi, Illyoung
dc.contributor.author: Ponsero, Alise J
dc.contributor.author: Bomhoff, Matthew
dc.contributor.author: Youens-Clark, Ken
dc.contributor.author: Hartman, John H
dc.contributor.author: Hurwitz, Bonnie L
dc.date.accessioned: 2019-07-30T21:38:47Z
dc.date.available: 2019-07-30T21:38:47Z
dc.date.issued: 2019-02-01
dc.identifier.citation: Illyoung Choi, Alise J Ponsero, Matthew Bomhoff, Ken Youens-Clark, John H Hartman, Bonnie L Hurwitz, Libra: scalable k-mer–based tool for massive all-vs-all metagenome comparisons, GigaScience, Volume 8, Issue 2, February 2019, giy165, https://doi.org/10.1093/gigascience/giy165 [en_US]
dc.identifier.issn: 2047-217X
dc.identifier.pmid: 30597002
dc.identifier.doi: 10.1093/gigascience/giy165
dc.identifier.uri: http://hdl.handle.net/10150/633586
dc.description.abstract: Background: Shotgun metagenomics provides powerful insights into microbial community biodiversity and function. Yet, inferences from metagenomic studies are often limited by dataset size and complexity and are restricted by the availability and completeness of existing databases. De novo comparative metagenomics enables the comparison of metagenomes based on their total genetic content. Results: We developed a tool called Libra that performs an all-vs-all comparison of metagenomes for precise clustering based on their k-mer content. Libra uses a scalable Hadoop framework for massive metagenome comparisons, Cosine Similarity for calculating the distance using sequence composition and abundance while normalizing for sequencing depth, and a web-based implementation in iMicrobe (http://imicrobe.us) that uses the CyVerse advanced cyberinfrastructure to promote broad use of the tool by the scientific community. Conclusions: A comparison of Libra to equivalent tools using both simulated and real metagenomic datasets, ranging from 80 million to 4.2 billion reads, reveals that methods commonly implemented to reduce compute time for large datasets, such as data reduction, read count normalization, and presence/absence distance metrics, greatly diminish the resolution of large-scale comparative analyses. In contrast, Libra uses all of the reads to calculate k-mer abundance in a Hadoop architecture that can scale to any size dataset to enable global-scale analyses and link microbial signatures to biological processes. [en_US]
dc.description.sponsorship: National Science Foundation [1640775] [en_US]
dc.language.iso: en [en_US]
dc.publisher: OXFORD UNIV PRESS [en_US]
dc.relation.url: https://academic.oup.com/gigascience/article/8/2/giy165/5266304 [en_US]
dc.rights: © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License. [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: metagenomics [en_US]
dc.subject: Hadoop [en_US]
dc.subject: k-mer [en_US]
dc.subject: distance metrics [en_US]
dc.subject: clustering [en_US]
dc.title: Libra: scalable k-mer-based tool for massive all-vs-all metagenome comparisons [en_US]
dc.type: Article [en_US]
dc.contributor.department: Univ Arizona, Dept Comp Sci [en_US]
dc.contributor.department: Univ Arizona, Dept Biosyst Engn [en_US]
dc.contributor.department: Univ Arizona, BIO5 Inst [en_US]
dc.identifier.journal: GIGASCIENCE [en_US]
dc.description.note: Open access journal [en_US]
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu. [en_US]
dc.eprint.version: Final published version [en_US]
dc.source.journaltitle: GigaScience
refterms.dateFOA: 2019-07-30T21:38:47Z
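
The abstract states that Libra compares metagenomes by k-mer abundance using cosine similarity, which normalizes for sequencing depth. The following is a minimal Python sketch of that general idea, not Libra's Hadoop implementation: the function names, the toy reads, and the choice of k=4 are all assumptions made for illustration, and any weighting or preprocessing Libra applies is not reproduced here.

```python
from collections import Counter
from math import sqrt


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a list of read strings."""
    counts = Counter()
    for read in reads:
        # Reads shorter than k contribute nothing (the range is empty).
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse k-mer abundance vectors."""
    # Only k-mers present in both profiles contribute to the dot product.
    dot = sum(count * b[kmer] for kmer, count in a.items() if kmer in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy samples; real metagenomes contain millions of reads.
sample_a = ["ACGTACGT", "TTGACGTA"]
sample_b = ["ACGTACGA", "GGGACGTT"]
# The same community as sample_a, sequenced ten times deeper.
deep_a = sample_a * 10

profile_a = kmer_counts(sample_a, k=4)
profile_b = kmer_counts(sample_b, k=4)
profile_deep_a = kmer_counts(deep_a, k=4)

print(cosine_similarity(profile_a, profile_b))
print(cosine_similarity(profile_a, profile_deep_a))
```

Because cosine similarity depends only on the direction of the abundance vector and not its length, scaling every count by the same factor leaves the score unchanged. This is why the deeper copy of the first sample is scored as essentially identical to the original, while the two distinct samples score strictly between 0 and 1.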


Files in this item

Name: giy165.pdf
Size: 2.669Mb
Format: PDF
Description: Final Published Version



Except where otherwise noted, this item's license is described as © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License.