SURVEILLANCE IN THE INFORMATION AGE: TEXT QUANTIFICATION, ANOMALY DETECTION, AND EMPIRICAL EVALUATION
chief complaint classification
Markov switching with jumps
text-based risk recognition
Committee Chair: Chen, Hsinchun
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: Deep penetration of personal computers, data communication networks, and the Internet has created a massive platform for data collection, dissemination, storage, and retrieval. Large amounts of textual data are now available at a very low cost. Valuable information, such as consumer preferences, new product developments, trends, and opportunities, can be found in this large collection of textual data. Growing worldwide competition, new technology development, and the Internet contribute to an increasingly turbulent business environment. Conducting surveillance on this growing collection of textual data could help a business avoid surprises, identify threats and opportunities, and gain competitive advantages.

Current text mining approaches, nonetheless, provide limited support for conducting surveillance using textual data. In this dissertation, I develop novel text quantification approaches to identify useful information in textual data, effective anomaly detection approaches to monitor time series data aggregated based on the text quantification approaches, and empirical evaluation approaches that verify the effectiveness of text mining approaches using external numerical data sources.

In Chapter 2, I present free-text chief complaint classification studies that aim to classify incoming emergency department free-text chief complaints into syndromic categories, a higher level of representation that facilitates syndromic surveillance. Chapter 3 presents a novel detection algorithm based on Markov switching with jumps models. This surveillance model aims at detecting different types of disease outbreaks based on the time series generated from the chief complaint classification system.

In Chapters 4 and 5, I study the surveillance issue in the context of business decision making. Chapter 4 presents a novel text-based risk recognition design framework that can be used to monitor the changing business environment. Chapter 5 presents an empirical evaluation study that looks at the interaction between news sentiment and numerical accounting earnings information. Chapter 6 concludes this dissertation by highlighting major research contributions and the relevance to MIS research.
Degree Program: Management Information Systems