Partitioning Components for Dimension Reduction for Compositional Data
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Embargo
Release after 01/12/2023
Abstract
Compositional data are vectors of proportions describing the relative abundance of each component to the total. The high dimensionality of many compositional data sets, often with more components than observations, has increased the demand for capturing observed patterns of variability in fewer dimensions. Current dimension reduction methods applicable to compositional data are either difficult to interpret or lack a statistical model. Amalgamation, the summation of two components, and subcomposition, a subset of the original components, both serve as straightforward and interpretable ways of combining components in all applications of compositional data analysis, and both reduce the number of components in the composition. This paper proposes achieving reduced dimensions by partitioning components, which simultaneously models the subcompositions and the amalgamation. The partition is selected by maximizing the posterior probability of the Partition Logistic Normal distribution developed by Aitchison (1986). This dimension reduction methodology is then extended to perturbation, which characterizes compositional change. Perturbation responses may capture treatment effects, age effects on skin microbiota, and changes across time. Reducing the dimensions of an observed perturbation aims to capture groups of components that were perturbed similarly. This paper provides a new reference component for correctly interpreting the perturbed components and proposes reducing the dimensions by partitioning the perturbed components given the latent variables in a Gaussian mixture model, which accounts for the uncertainty induced by estimating the compositional centers. These methods are applied to a skin microbiota study of an age perturbation, contrasting children and mothers.
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Biostatistics
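Illustrative note on the abstract's terminology: the operations named there (amalgamation, subcomposition, partition, and perturbation) can be sketched in a few lines of Python. This is only a minimal illustration of the standard definitions and assumes strictly positive components. The function names, the example vector, and the grouping are hypothetical and are not taken from the dissertation, and the sketch does not implement the partition selection or the Gaussian mixture model that the abstract describes.

```python
from typing import List, Sequence, Tuple


def closure(x: Sequence[float]) -> List[float]:
    """Rescale positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]


def amalgamate(x: Sequence[float], groups: Sequence[Sequence[int]]) -> List[float]:
    """Amalgamation: sum the components inside each group of indices."""
    return [sum(x[i] for i in g) for g in groups]


def subcomposition(x: Sequence[float], idx: Sequence[int]) -> List[float]:
    """Subcomposition: keep the chosen components and re-close them."""
    return closure([x[i] for i in idx])


def partition(
    x: Sequence[float], groups: Sequence[Sequence[int]]
) -> Tuple[List[float], List[List[float]]]:
    """Split x into one amalgamation (between groups) and one
    subcomposition per group (within groups)."""
    return amalgamate(x, groups), [subcomposition(x, g) for g in groups]


def perturb(x: Sequence[float], p: Sequence[float]) -> List[float]:
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, p)])


def perturbation_between(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Perturbation that carries composition x to composition y."""
    return closure([b / a for a, b in zip(x, y)])


# Hypothetical four-part composition, partitioned into two groups.
x = [0.1, 0.2, 0.3, 0.4]
groups = [[0, 1], [2, 3]]
between, within = partition(x, groups)

# Hypothetical perturbation that doubles the first two parts relative to the rest.
p = [2.0, 2.0, 1.0, 1.0]
y = perturb(x, p)
recovered = perturbation_between(x, y)
```

Here `between` holds the two amalgamated parts and `within` holds the two subcompositions, so a partition keeps both kinds of information at once. In the perturbation example, the first two components are perturbed by the same factor, which is the kind of similarly perturbed grouping the abstract aims to capture.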