dc.contributor.advisor: Morrison, Clayton T.
dc.contributor.author: Peterson, Sayeh
dc.creator: Peterson, Sayeh
dc.date.accessioned: 2022-06-09T02:36:47Z
dc.date.available: 2022-06-09T02:36:47Z
dc.date.issued: 2022
dc.identifier.citation: Peterson, Sayeh. (2022). Comparison of Lasso Granger and PCMCI for Causal Feature Selection in Multivariate Time Series (Master's thesis, University of Arizona, Tucson, USA).
dc.identifier.uri: http://hdl.handle.net/10150/665005
dc.description.abstract: Causal feature selection and reconstructing interaction networks in observational multivariate time series is currently a very active area of research in many fields of science. There are two main reasons for this: increased access to extensive amounts of observational time series data in today’s era of big data, and research in fields where controlled experiments are impossible, unethical, or expensive, such as climate, Earth systems, or the human body. Correlation-based studies on pairwise association networks cannot be interpreted causally. The goal of causal network reconstruction goes beyond inferring association and directionality between two time series; the objective of causal discovery is to distinguish direct from indirect dependencies and common drivers among multiple time series. In this thesis, I compare Lasso Granger causality with causal-inference-based PCMCI for detecting causal associations in multivariate time series of various lengths, edge densities, noise levels, and dimensions. Lasso Granger causality is based on Granger causality, a concept introduced over half a century ago that employs classical multivariate regression. Lasso Granger remains a popular method for analyzing the temporal dependencies among time series. PCMCI is a recent method based on causal inference that combines conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets. The preliminaries required prior to discussing the PCMCI method are introduced. Extensive synthetic data is generated for comparing the two methods on multivariate time series from 6 to 81 total variables with lengths from 100 to 1000 with three levels of edge density and noise.
The results of this study show that PCMCI improves the reliability of the conditional independence tests by optimizing the choice of conditioning sets and yields higher F1 score, precision, and recall than Lasso Granger causality while still controlling the false positive rate. An unexpected result of this study was that the PCMCI algorithm was also found to exploit sparsity in higher-dimensional data such that the performance improves in higher dimensions with increased edge density. I also applied the Lasso Granger and PCMCI methods to a data set of Italian dairy product prices and found that Lasso Granger found fewer edges than PCMCI.
dc.language.iso: en
dc.publisher: The University of Arizona.
dc.rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: causal inference
dc.subject: Lasso Granger
dc.subject: multivariate time series
dc.subject: PCMCI
dc.title: Comparison of Lasso Granger and PCMCI for Causal Feature Selection in Multivariate Time Series
dc.type: text
dc.type: Electronic Thesis
thesis.degree.grantor: University of Arizona
thesis.degree.level: masters
dc.contributor.committeemember: Watkins, Joseph C.
dc.contributor.committeemember: Zhang, Hao
thesis.degree.discipline: Graduate College
thesis.degree.discipline: Statistics
thesis.degree.name: M.S.
refterms.dateFOA: 2022-06-09T02:36:47Z


Files in this item

Name: azu_etd_19681_sip1_m.pdf
Size: 1.789Mb
Format: PDF
