Highly accurate long-read HiFi sequencing data for five complex genomes
Karalius, Joseph W
Landolin, Jane M
Hardigan, Michael A
Steiner, Cynthia C
Knapp, Steven J
Rank, David R
Affiliation: Univ Arizona, Arizona Genom Inst
Univ Arizona, Sch Plant Sci
Citation: Hon, T., Mars, K., Young, G., Tsai, Y. C., Karalius, J. W., Landolin, J. M., ... & Rank, D. R. (2020). Highly accurate long-read HiFi sequencing data for five complex genomes. Scientific Data, 7(1), 1-11.
Rights: © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
Collection Information: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at email@example.com.
Abstract: The PacBio® HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10-25 kb and accuracies greater than 99.5%. These accurate long reads can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. Currently, there is a need for sample data sets to both evaluate the benefits of these long accurate reads as well as for development of bioinformatic tools including genome assemblers, variant callers, and haplotyping algorithms. We present deep coverage HiFi datasets for five complex samples including the two inbred model genomes Mus musculus and Zea mays, as well as two complex genomes, octoploid Fragaria × ananassa and the diploid anuran Rana muscosa. Additionally, we release sequence data from a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and explore complex genome structure and evolution. Data were generated on the PacBio Sequel II System.
Note: Open access journal
Version: Final published version
- Linked read technology for assembling large complex and polyploid genomes.
- Authors: Ott A, Schnable JC, Yeh CT, Wu L, Liu C, Hu HC, Dalgard CL, Sarkar S, Schnable PS
- Issue date: 2018 Sep 4
- Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore.
- Authors: Lang D, Zhang S, Ren P, Liang F, Sun Z, Meng G, Tan Y, Li X, Lai Q, Han L, Wang D, Hu F, Wang W, Liu S
- Issue date: 2020 Dec 15
- MinION™ nanopore sequencing of environmental metagenomes: a synthetic approach.
- Authors: Brown BL, Watson M, Minot SS, Rivera MC, Franklin RB
- Issue date: 2017 Mar 1
- Effect of sequence depth and length in long-read assembly of the maize inbred NC358.
- Authors: Ou S, Liu J, Chougule KM, Fungtammasan A, Seetharam AS, Stein JC, Llaca V, Manchanda N, Gilbert AM, Wei S, Chin CS, Hufnagel DE, Pedersen S, Snodgrass SJ, Fengler K, Woodhouse M, Walenz BP, Koren S, Phillippy AM, Hannigan BT, Dawe RK, Hirsch CN, Hufford MB, Ware D
- Issue date: 2020 May 8
- Improved assembly and variant detection of a haploid human genome using single-molecule, high-fidelity long reads.
- Authors: Vollger MR, Logsdon GA, Audano PA, Sulovari A, Porubsky D, Peluso P, Wenger AM, Concepcion GT, Kronenberg ZN, Munson KM, Baker C, Sanders AD, Spierings DCJ, Lansdorp PM, Surti U, Hunkapiller MW, Eichler EE
- Issue date: 2020 Mar