Comparison of Lasso Granger and PCMCI for Causal Feature Selection in Multivariate Time Series
Author
Peterson, Sayeh
Issue Date
2022
Advisor
Morrison, Clayton T.
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Causal feature selection and the reconstruction of interaction networks from observational multivariate time series are currently very active areas of research in many fields of science. There are two main reasons for this: increased access to extensive amounts of observational time series data in today’s era of big data, and research in fields where controlled experiments are impossible, unethical, or expensive, such as climate, Earth systems, or the human body. Correlation-based studies of pairwise association networks cannot be interpreted causally. The goal of causal network reconstruction goes beyond inferring association and directionality between two time series; the objective of causal discovery is to distinguish direct from indirect dependencies and common drivers among multiple time series. In this thesis, I compare Lasso Granger causality with the causal-inference-based PCMCI method for detecting causal associations in multivariate time series of various lengths, edge densities, noise levels, and dimensions. Lasso Granger causality is based on Granger causality, a concept introduced over half a century ago that employs classical multivariate regression. Lasso Granger remains a popular method for analyzing the temporal dependencies among time series. PCMCI is a more recent method based on causal inference that combines conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets. The preliminaries required for discussing the PCMCI method are introduced. Extensive synthetic data are generated to compare the two methods on multivariate time series with 6 to 81 total variables and lengths from 100 to 1000, at three levels of edge density and noise.
The results of this study show that PCMCI improves the reliability of the conditional independence tests by optimizing the choice of conditioning sets, and that it yields higher F1 score, precision, and recall than Lasso Granger causality while still controlling the false positive rate. An unexpected result was that the PCMCI algorithm also exploits sparsity in higher-dimensional data, such that its performance improves in higher dimensions with increased edge density. I also applied the Lasso Granger and PCMCI methods to a data set of dairy product prices in Italy and found that Lasso Granger identified fewer edges than PCMCI.
Type
text
Electronic Thesis
Degree Name
M.S.
Degree Level
masters
Degree Program
Graduate College
Statistics