Variance Component Selection With Applications to Microbiome Taxonomic Data.
Affiliation
Univ Arizona, Dept Epidemiol & Biostat
Univ Arizona, Dept Med, Div Pulm Allergy Crit Care & Sleep Med
Issue Date
2018-03-28
Keywords
Human Immunodeficiency Virus (HIV)
MM-algorithm
lasso
longitudinal study
lung microbiome
variable selection
variance component models
Publisher
FRONTIERS MEDIA SA
Citation
Zhai J, Kim J, Knox KS, Twigg HL III, Zhou H and Zhou JJ (2018) Variance Component Selection With Applications to Microbiome Taxonomic Data. Front. Microbiol. 9:509. doi: 10.3389/fmicb.2018.00509
Journal
FRONTIERS IN MICROBIOLOGY
Rights
© 2018 Zhai, Kim, Knox, Twigg, Zhou and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Microbiome data are summarized as counts or composition of the bacterial taxa at different taxonomic levels. An important problem is to identify the bacterial taxa that are associated with a response. One method is to test the association of a specific taxon with phenotypes in a linear mixed effect model, which incorporates phylogenetic information among bacterial communities. Another type of approach considers all taxa in a joint model and achieves selection via a penalization method, which ignores phylogenetic information. In this paper, we consider regression analysis by treating bacterial taxa at different levels as multiple random effects. For each taxon, a kernel matrix is calculated based on distance measures in the phylogenetic tree and acts as one variance component in the joint model. Taxonomic selection is then achieved by the lasso (least absolute shrinkage and selection operator) penalty on the variance components. Our method integrates biological information into the variable selection problem and greatly improves selection accuracy. Simulation studies demonstrate the superiority of our method over existing methods such as the group lasso. Finally, we apply our method to a longitudinal microbiome study of Human Immunodeficiency Virus (HIV) infected patients. We implement our method using the high-performance computing language Julia. Software and detailed documentation are freely available at https://github.com/JingZhai63/VCselection.
Note
Open access journal. UA Open Access Publishing Fund.
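The model outlined in the abstract (one kernel matrix per taxon, each entering the joint model as a variance component, with a lasso penalty on those components) can be illustrated with a minimal sketch. This is not the authors' Julia implementation. The function names, the Gower-centering step used to turn distances into kernels, the random stand-in distance matrices, and the choice to penalize the nonnegative variance components directly are all assumptions made for illustration.

```python
import numpy as np

def distance_to_kernel(D):
    """Turn a pairwise distance matrix into a PSD kernel via Gower centering.

    Assumption: this centering step is one common choice in kernel-based
    microbiome analyses; the abstract does not specify the transformation.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    K = -0.5 * J @ (D ** 2) @ J
    w, U = np.linalg.eigh((K + K.T) / 2)          # symmetrize, then decompose
    w = np.clip(w, 0.0, None)                     # drop negative eigenvalues
    return (U * w) @ U.T

def penalized_neg_loglik(y, X, beta, sigma2, kernels, lam):
    """Gaussian negative log-likelihood plus an L1 penalty.

    sigma2[0] is the residual variance; sigma2[1:] are the per-taxon
    variance components, one per kernel. Because the components are
    nonnegative, the L1 penalty reduces to their sum (hypothetical form).
    """
    n = y.shape[0]
    V = sigma2[0] * np.eye(n)
    for s2, K in zip(sigma2[1:], kernels):
        V += s2 * K                               # V = s0*I + sum_i s_i*K_i
    r = y - X @ beta
    _, logdet = np.linalg.slogdet(V)
    nll = 0.5 * (n * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(V, r))
    return nll + lam * np.sum(sigma2[1:])

# Toy data: random points stand in for phylogeny-based distances.
rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
kernels = []
for _ in range(3):
    pts = rng.normal(size=(n, 4))
    D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    kernels.append(distance_to_kernel(D))
y = rng.normal(size=n)
beta = np.zeros(2)
sigma2 = np.array([1.0, 0.5, 0.0, 0.0])           # two taxa "deselected"
print(penalized_neg_loglik(y, X, beta, sigma2, kernels, lam=1.0))
```

In this sketch, a taxon whose variance component is shrunk to exactly zero drops out of the covariance, which is how a penalty of this kind performs selection. Minimizing the objective is a separate step that is not shown here.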
ISSN
1664-302X
PubMed ID
29643839
Version
Final published version
Sponsors
NIH [K01DK106116, HG006139, GM105785, GM53275, UO1 HL121831, UO1 HL098960]; Arizona Biomedical Research Commission (ABRC) grant; NSF [DMS-1645093]
Additional Links
https://www.frontiersin.org/articles/10.3389/fmicb.2018.00509/full
DOI
10.3389/fmicb.2018.00509
Related articles
- Exact variance component tests for longitudinal microbiome studies.
- Authors: Zhai J, Knox K, Twigg HL 3rd, Zhou H, Zhou JJ
- Issue date: 2019 Apr
- Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
- Authors: Crider K, Williams J, Qi YP, Gutman J, Yeung L, Mai C, Finkelstain J, Mehta S, Pons-Duran C, Menéndez C, Moraleda C, Rogers L, Daniels K, Green P
- Issue date: 2022 Feb 1
- Phylogeny-guided microbiome OTU-specific association test (POST).
- Authors: Huang C, Callahan BJ, Wu MC, Holloway ST, Brochu H, Lu W, Peng X, Tzeng JY
- Issue date: 2022 Jun 7
- Testing in Microbiome-Profiling Studies with MiRKAT, the Microbiome Regression-Based Kernel Association Test.
- Authors: Zhao N, Chen J, Carroll IM, Ringel-Kulka T, Epstein MP, Zhou H, Zhou JJ, Ringel Y, Li H, Wu MC
- Issue date: 2015 May 7
- Part 1. Statistical Learning Methods for the Effects of Multiple Air Pollution Constituents.
- Authors: Coull BA, Bobb JF, Wellenius GA, Kioumourtzoglou MA, Mittleman MA, Koutrakis P, Godleski JJ
- Issue date: 2015 Jun