Efficient Exact Tests in Linear Mixed Models for Longitudinal Microbiome Studies
Multiple variance components
MetadataShow full item record
PublisherThe University of Arizona.
RightsCopyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
EmbargoRelease after 01-Jan-2017
AbstractMicrobiome plays an important role in human health. The analysis of association between microbiome and clinical outcome has become an active direction in biostatistics research. Testing the microbiome effect on clinical phenotypes directly using operational taxonomic unit abundance data is a challenging problem due to the high dimensionality, non-normality and phylogenetic structure of the data. Most of the studies only focus on describing the change of microbe population that occur in patients who have the specific clinical condition. Instead, a statistical strategy utilizing distance-based or similarity-based non-parametric testing, in which a distance or similarity measure is defined between any two microbiome samples, is developed to assess association between microbiome composition and outcomes of interest. Despite the improvements, this test is still not easily interpretable and not able to adjust for potential covariates. A novel approach, kernel-based semi-parametric regression framework, is applied in evaluating the association while controlling the covariates. The framework utilizes a kernel function which is a measure of similarity between samples' microbiome compositions and characterizes the relationship between the microbiome and the outcome of interest. This kernel-based regression model, however, cannot be applied in longitudinal studies since it could not model the correlation between the repeated measurements. We proposed microbiome association exact tests (MAETs) in linear mixed model can deal with longitudinal microbiome data. MAETs can test not only the effect of overall microbiome but also the effect from specific cluster of the OTUs while controlling for others by introducing more random effects in the model. The current methods for multiple variance component testing are based on either asymptotic distribution or parametric bootstrap which require large sample size or high computational cost. The exact (R)LRT tests, an computational efficient and powerful testing methodology, was derived by Crainiceanu. Since the exact (R)LRT can only be used in testing one variance component, we proposed an approach that combines the recent development of exact (R)LRT and a strategy for simplifying linear mixed model with multiple variance components to a single case. The Monte Carlo simulation studies present correctly controlled type I error and provided superior power in testing association between microbiome and outcomes in longitudinal studies. Finally, the MAETs were applied to longitudinal pulmonary microbiome datasets to demonstrate that microbiome composition is associated with lung function and immunological outcomes. We also successfully found two interesting genera Prevotella and Veillonella which are associated with forced vital capacity.
Degree ProgramGraduate College