Outlier Detection based on Robust Regression via Chance-Constrained Programming
Issue Date
2021
Keywords
Chance-constrained Programming
Kernel density estimation
Least quantile of squares
Outlier Detection
Robust Regression
Advisor
Zhang, Hao Helen
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Outlier detection is a critical step in data pre-processing that identifies heterogeneous points in the data. For high-dimensional and extremely noisy data, outlier detection poses many challenges, including estimating the number of outliers, providing probabilistic confidence statements on identified outliers, fitting a model that is robust against outliers in the data set, and achieving high breakdown points with guarantees. In this paper, we propose a novel chance-constrained outlier detection (CCOD) model that not only finds a robust fit to the data set without guessing the proportion of outliers, but also automatically offers a diagnostic criterion (i.e., the relative outlying probabilities) to detect outliers with confidence. The main idea is to first model a probabilistic least quantile of squares (LQS) problem using chance-constrained optimization, then reformulate the problem using kernel density estimation. Since the resulting kernel-based LQS is nonlinear and nonconvex, we further propose a tractable convex approximation, the so-called CCOD model, and use its optimization to develop two outlier detection algorithms. Through numerical results, we show that our CCOD model outperforms state-of-the-art LQS methodologies in terms of estimation accuracy, robustness, and computational time, and that it provides robust fits to large-scale data that were otherwise intractable via other methodologies.
Type
text
Electronic Thesis
Degree Name
M.S.
Degree Level
masters
Degree Program
Graduate College
Statistics
