Convergent, Fast, and Expressive Approximations of Entropy and Mutual Information
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Mutual information (MI) and differential entropy are fundamental quantities in information theory, underpinning numerous applications in machine learning and statistical inference. This thesis develops advanced methods for estimating both quantities, particularly in the context of Gaussian Mixture Models (GMMs) and variational inference, addressing significant challenges in uncertainty quantification and MI computation. First, the thesis investigates polynomial approximations to GMM entropy, which admits no closed form. It shows that existing approximation methods can fail to converge under certain conditions, introduces a novel Taylor series approximation that is guaranteed to converge to the true entropy of any GMM, and demonstrates that orthogonal polynomial series yield still more accurate approximations. Experimental results corroborate the theoretical findings, showcasing the computational efficiency and practical utility of these methods. Second, the thesis presents a novel approach to variational MI approximation that replaces costly nonconvex optimization with moment-matching operations. The approach applies to implicit models that lack closed-form likelihood functions and provides substantial computational speedups. Theoretical results are supported by numerical evaluations on both parameterized and implicit models, including a simulation-based epidemiology model, highlighting significant performance gains. Finally, the thesis extends variational MI estimators by incorporating Normalizing Flows, enhancing the flexibility of the variational distribution beyond the commonly used Gaussian assumption. These flow-based estimators are validated on large MI problems and diverse benchmarks, often outperforming traditional critic-based estimators, and their effectiveness is further demonstrated in Bayesian Optimal Experimental Design for online sequential decision making.
Type
Electronic Dissertation
text
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Applied Mathematics
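The core difficulty the abstract refers to is that the differential entropy of a Gaussian mixture has no closed form, so practitioners fall back on approximations such as Monte Carlo sampling or the polynomial series studied in the thesis. The following minimal sketch (not code from the thesis; all function names are illustrative) shows the Monte Carlo baseline H(p) = -E_p[log p(x)] for a one-dimensional GMM:

```python
import numpy as np

def gmm_logpdf(x, weights, means, stds):
    """Pointwise log-density of a 1-D Gaussian mixture."""
    x = np.asarray(x)[:, None]                      # shape (n, 1) vs (k,) components
    comp = (-0.5 * ((x - means) / stds) ** 2
            - np.log(stds) - 0.5 * np.log(2 * np.pi))
    return np.log(np.exp(comp) @ weights)           # mix component densities, then log

def mc_entropy(weights, means, stds, n=100_000, seed=0):
    """Monte Carlo estimate of the GMM's differential entropy, -E[log p(x)]."""
    rng = np.random.default_rng(seed)
    ks = rng.choice(len(weights), size=n, p=weights)  # draw component indices
    xs = rng.normal(means[ks], stds[ks])              # sample from chosen components
    return -np.mean(gmm_logpdf(xs, weights, means, stds))

# Two well-separated unit-variance components with equal weight.
weights = np.array([0.5, 0.5])
means = np.array([-2.0, 2.0])
stds = np.array([1.0, 1.0])
H = mc_entropy(weights, means, stds)
```

Such sampling-based estimates are unbiased but converge slowly and give no deterministic guarantees, which is the motivation for the convergent series approximations developed in the thesis.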