    • GANNET: A machine learning approach to document retrieval

      Chen, Hsinchun; Kim, Jinwoo, 1963- (M.E. Sharpe, Inc., 1994-12)
      Information science researchers have recently turned to new artificial intelligence-based inductive learning techniques, including neural networks, symbolic learning, and genetic algorithms. An overview of the new techniques and their usage in information science research is provided. The algorithms adopted for a hybrid system based on genetic algorithms and neural networks, called GANNET, are presented. GANNET performed concept (keyword) optimization for user-selected documents during information retrieval using genetic algorithms. It then used the optimized concepts to perform concept exploration in a large network of related concepts through the Hopfield net parallel relaxation procedure. Based on a test collection of about 3,000 articles from DIALOG and an automatically created thesaurus, and using Jaccard's score as a performance measure, the experiment showed that GANNET improved Jaccard's scores by about 50% and helped identify the underlying concepts that best describe the user-selected documents.
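      The abstract names Jaccard's score as the performance measure but does not give its formula. A minimal sketch, assuming the standard set-based Jaccard coefficient |A ∩ B| / |A ∪ B| applied to keyword sets; the function name `jaccard_score` and the sample keywords are illustrative, not taken from the paper, and the paper's exact formulation may differ:

```python
def jaccard_score(keywords_a, keywords_b):
    """Standard Jaccard coefficient between two keyword collections.

    Assumption: documents are represented as sets of keywords; the
    paper may weight or aggregate scores differently.
    """
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        # Convention chosen here: two empty sets share nothing to compare.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical keyword sets for two documents.
doc1 = {"retrieval", "neural networks", "thesaurus"}
doc2 = {"retrieval", "thesaurus", "genetic algorithms"}
score = jaccard_score(doc1, doc2)  # 2 shared keywords out of 4 distinct
```

      A score of 1.0 means the two keyword sets are identical, and 0.0 means they share no keywords.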
    • A methodology for analyzing Web-based qualitative data

      Romano, Nicholas C.; Donovan, Christina; Chen, Hsinchun; Nunamaker, Jay F. (2003)
      The volume of qualitative data (QD) available via the Internet is growing at an increasing pace, and firms are anxious to extract and understand users' thought processes, wants and needs, attitudes, and purchase intentions contained therein. An information systems (IS) methodology to meaningfully analyze this vast resource of QD could provide useful information, knowledge, or wisdom that firms could use for a number of purposes, including new product development and quality improvement, target marketing, accurate "user focused" profiling, and future sales prediction. In this paper, we present an IS methodology for analysis of Internet-based QD consisting of three steps: elicitation; reduction through IS-facilitated selection, coding, and clustering; and visualization to provide at-a-glance understanding.
    • Quantifying Qualitative Data for Electronic Commerce Attitude Assessment and Visualization

      Romano, Nicholas C.; Bauer, Christina; Chen, Hsinchun; Nunamaker, Jay F. (2000)
      We propose a methodology to collect, quantify and visualize qualitative consumer data. We employ a Web-based Group Support System (GSS), GSweb, to elicit free-form comments and a prototype comment analysis support system to facilitate comment classification, categorization and visualization to measure attitudes. We argue that such a methodology is needed due to the proliferation of qualitative data, the limitations of qualitative data analysis and the dearth of methods to measure attitudes contained within free-form comments. We conducted two experiments to compare our methodology with two long-established traditional methods, Likert scale evaluations and first-week box office sales records. We found that our methodology provides equivalent and superior affective and evaluative attitude information, compared to Likert scale ratings. We also found that comment analysis more accurately reflected actual first-week box office sales than did Likert scale ratings. Comment analysis with the prototype tool was seventy-five percent more efficient than manual coding. We designed the prototype to generate visualizations to make sense of multiple attitude dimensions through at-a-glance understanding and comparative presentation. The methodology we propose overcomes drawbacks often associated with qualitative data analysis and offers marketers and researchers a method to measure attitudes from free-form comments. The results indicate that qualitative data in the form of free-form comments may be quantified and visualized to provide meaningful attitude assessment. Finally, we present future research directions to enhance data collection and the comment analysis support system.
    • Verifying the proximity and size hypothesis for self-organizing maps

      Lin, Chienting; Chen, Hsinchun; Nunamaker, Jay F. (M.E. Sharpe, Inc., 2000-12)
      The Kohonen Self-Organizing Map (SOM) is an unsupervised learning technique for summarizing high-dimensional data so that similar inputs are, in general, mapped close to one another. When applied to textual data, SOM has been shown to be able to group together related concepts in a data collection and to present major topics within the collection with larger regions. Research validating these two properties of SOM, called the Proximity and Size Hypotheses, is presented through a user evaluation study. Building upon previous research in automatic concept generation and classification, it is demonstrated that the Kohonen SOM was able to perform concept clustering effectively, based on its concept precision and recall scores as judged by human experts. A positive relationship between the size of an SOM region and the number of documents contained in the region is also demonstrated.
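      The mapping behavior described in this abstract can be illustrated with a small, generic Kohonen SOM. This is a minimal sketch using only the Python standard library, not the system evaluated in the paper; the grid size, linear decay schedules, Gaussian neighborhood, and all function names (`train_som`, `best_matching_unit`, `inputs_per_node`) are assumptions chosen for illustration:

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM on a list of equal-length float vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)    # neighborhood shrinks over time
        for x in rng.sample(data, len(data)):        # visit inputs in random order
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * radius * radius))
                    w = weights[r][c]
                    for i in range(dim):
                        # Pull the node toward the input, scaled by grid proximity.
                        w[i] += lr * h * (x[i] - w[i])
    return weights


def inputs_per_node(weights, data):
    """Count how many inputs map to each node (only nodes with hits appear)."""
    counts = {}
    for x in data:
        bmu = best_matching_unit(weights, x)
        counts[bmu] = counts.get(bmu, 0) + 1
    return counts


# Hypothetical toy data: two loose groups of 2-D points.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, rows=3, cols=3, epochs=20)
hits = inputs_per_node(som, data)
```

      Because neighboring nodes are updated together, inputs that are close in the original space tend to land on nearby nodes, which is the proximity property; counting inputs per node, as `inputs_per_node` does, is one simple way to begin examining how map area relates to the number of documents.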