Health Data Analytics: Data and Text Mining Approaches for Pharmacovigilance
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: Pharmacovigilance is defined as the science and activities relating to the detection, assessment, understanding, and prevention of adverse drug events (WHO 2004). Post-approval adverse drug events are a major health concern. They account for about 700,000 emergency department visits, 120,000 hospitalizations, and $75 billion in medical costs annually (Yang et al. 2014). However, certain adverse drug events are preventable if detected early. Timely and accurate pharmacovigilance in the post-approval period is an urgent goal of the public health system. The availability of various sources of healthcare data for analysis in recent years opens new opportunities for data-driven pharmacovigilance research. In attempting to leverage emerging healthcare big data, however, pharmacovigilance research faces several challenges. Most studies in pharmacovigilance focus on structured and coded data, and therefore miss important textual data from patient social media and clinical documents in EHRs. Most prior studies develop drug safety surveillance systems using a single data source with only one data mining algorithm. The performance of such systems is hampered by bias in the data and the pitfalls of the data mining algorithms adopted. In my dissertation, I address two broad research questions: 1) How do we extract rich adverse drug event related information from textual data for active drug safety surveillance? 2) How do we design an integrated pharmacovigilance system to improve the decision-making process for drug safety regulatory intervention? To these ends, the dissertation comprises three essays. The first essay examines how to develop a high-performance information extraction framework for patient reports of adverse drug events in health social media. I found that medical entity extraction, drug-event relation extraction, and report source classification are necessary components for this task.
In the second essay, I address the scalability issue of using social media for pharmacovigilance by proposing a distant supervision approach for information extraction. In the last essay, I develop a MetaAlert framework for pharmacovigilance that uses advanced text mining and data mining techniques to provide timely and accurate detection of adverse drug reactions. The models, frameworks, and design principles proposed in these essays not only advance pharmacovigilance research but also contribute more broadly to health IT, business analytics, and design science research.
Degree Program: Graduate College, Management Information Systems