A Novel Approach on Differential Abundance Analysis for Matched Metagenomic Samples
Author: Lu, Wen Chi
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Embargo: Release after 17-Jan-2019
Abstract: Human microbial research has become increasingly popular in biomedical fields because of the important role the human microbiome plays in human health. One purpose of studying the human microbiome is to detect differentially abundant features across biological conditions in a limited group of subjects. Metagenomic analyses of human microbial communities are used extensively in biomedical applications because they support reliable comparisons across multiple metagenomes when several communities are considered. Next-generation sequencing technology helps to determine the taxonomic composition of the features or species present in human microbial communities. Statistical analysis often starts by generating Operational Taxonomic Units (OTUs) from the taxonomic compositions to classify groups of closely related microbes. Feature counts are often observed as matched count data with excess zeros. For such data, some differential abundance analysis methods model microbial abundance with Zero-Inflated Poisson (ZIP) or Zero-Inflated Negative Binomial (ZINB) regression. However, over-dispersion, together with the within-subject variation and correlation of matched count data, renders standard ZIP and ZINB regression inadequate. To account for the inherent within-subject variation and correlation, independent random effect terms are commonly included in the regressions. A robust method is therefore needed that accounts for the effect of matched samples and for correlated random effects while also handling the over-dispersion and excess zeros of count data. In this paper, a statistical method, the two-part correlated ZINB model with correlated random effects (cZINB), is proposed for testing matched samples with repeated measurements.
Degree Program: Graduate College