Toward Enhancing Automated Credibility Assessment: A Model for Question Type Classification and Tools for Linguistic Analysis
Author
Moffitt, Kevin Christopher
Issue Date
2011
Keywords
Automated Linguistic Analysis
Credibility Assessment
Fraudulent Financial Reporting
Question Type
Advisor
Burgoon, Judee K.
Nunamaker, Jay F.
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
The three objectives of this dissertation were to develop a question type model for predicting linguistic features of responses to interview questions, create a tool for linguistic analysis of documents, and use lexical bundle analysis to identify linguistic differences between fraudulent and non-fraudulent financial reports. First, the Moffitt Question Type Model (MQTM) was developed to aid in predicting linguistic features of responses to questions. It focuses on three context-independent features of questions: tense (past vs. present vs. future), perspective (introspective vs. extrospective), and abstractness (concrete vs. conjectural). The MQTM was tested on responses to real-world pre-polygraph examination questions from interviews with guilty (n = 27) and innocent (n = 20) interviewees. The responses were grouped according to question type, and the linguistic cues from each group's transcripts were compared using independent samples t-tests, with the following results: future tense questions elicited more future tense words than either past or present tense questions, and present tense questions elicited more present tense words than past tense questions; introspective questions elicited more cognitive process words and affective words than extrospective questions; and conjectural questions elicited more auxiliary verbs, tentativeness words, and cognitive process words than concrete questions. Second, a tool for linguistic analysis of text documents, Structured Programming for Linguistic Cue Extraction (SPLICE), was developed to help researchers and software developers compute linguistic values for dictionary-based cues and for cues that require natural language processing techniques. SPLICE provides a graphical user interface (GUI) for researchers and an API for developers. Finally, an analysis of 560 lexical bundles detected linguistic differences between 101 fraudulent and 101 non-fraudulent 10-K filings. Phrases such as "the fair value of" and "goodwill and other intangible assets" were used at a much higher rate in fraudulent 10-Ks. A principal component analysis reduced the number of variables to 88 orthogonal components, which were used in a discriminant analysis that classified the documents with 71% accuracy. Findings in this dissertation suggest that the MQTM could be used to predict features of interviewee responses in most contexts and that lexical bundle analysis is a viable tool for discriminating between fraudulent and non-fraudulent text.
Type
Electronic Dissertation
text
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Management Information Systems
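
The lexical bundle analysis mentioned in the abstract compares how often recurring word sequences appear in fraudulent and non-fraudulent filings. As a rough illustration only, the sketch below counts fixed-length word sequences and normalizes them to a rate per 1,000 words. The tokenizer, the function names (tokenize, bundle_rates), the bundle length, and the normalization are all assumptions made for this example; the record does not describe how the dissertation or SPLICE actually computes these values.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hypothetical, deliberately simple tokenizer: keeps letters and digits,
    and allows internal hyphens or apostrophes (so "10-K" becomes "10-k").
    """
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())


def bundle_rates(text, n=4, per=1000):
    """Return the rate of each n-word sequence per `per` words of text.

    Assumption: rates are normalized by the total token count of the
    document so that long and short documents can be compared.
    """
    tokens = tokenize(text)
    if len(tokens) < n:
        return {}
    # Slide a window of n tokens across the document and count each sequence.
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {" ".join(gram): count * per / len(tokens)
            for gram, count in counts.items()}


# Made-up sentence for illustration; not taken from any 10-K filing.
sample = "The fair value of the asset exceeded the fair value of the liability."
rates = bundle_rates(sample, n=4)
print(rates["the fair value of"])
```

In a study like the one described, rates of this kind for a chosen set of bundles would presumably form the variables passed to later steps such as principal component analysis and discriminant analysis; those steps are not sketched here.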