• Applying Associative Retrieval Techniques to Alleviate the Sparsity Problem in Collaborative Filtering

      Huang, Zan; Chen, Hsinchun; Zeng, Daniel (ACM, 2004-01)
      Recommender systems are being widely applied in many application settings to suggest products, services, and information items to potential consumers. Collaborative filtering, the most successful recommendation approach, makes recommendations based on past transactions and feedback from consumers sharing similar interests. A major problem limiting the usefulness of collaborative filtering is the sparsity problem, which refers to a situation in which transactional or feedback data is sparse and insufficient to identify similarities in consumer interests. In this article, we propose to deal with this sparsity problem by applying an associative retrieval framework and related spreading activation algorithms to explore transitive associations among consumers through their past transactions and feedback. Such transitive associations are a valuable source of information to help infer consumer interests and can be explored to deal with the sparsity problem. To evaluate the effectiveness of our approach, we have conducted an experimental study using a data set from an online bookstore. We experimented with three spreading activation algorithms including a constrained Leaky Capacitor algorithm, a branch-and-bound serial symbolic search algorithm, and a Hopfield net parallel relaxation search algorithm. These algorithms were compared with several collaborative filtering approaches that do not consider the transitive associations: a simple graph search approach, two variations of the user-based approach, and an item-based approach. 
      Our experimental results indicate that spreading activation-based approaches significantly outperformed the other collaborative filtering methods as measured by recommendation precision, recall, the F-measure, and the rank score. We also observed the over-activation effect of the spreading activation approach, that is, incorporating transitive associations with past transactional data that is not sparse may “dilute” the data used to infer user preferences and lead to degradation in recommendation performance.
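The idea of exploring transitive associations through past transactions can be illustrated with a small sketch. This is a generic, simplified spreading activation over a consumer-item graph, not any of the three algorithms evaluated in the article; the function name `spread_activation`, the `hops` and `decay` parameters, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def spread_activation(purchases, start_consumer, hops=3, decay=0.5):
    """Rank unseen items for one consumer by spreading activation.

    Hypothetical illustration: activation starts at the consumer, moves
    across consumer-item links, and is multiplied by `decay` at every hop.
    """
    # Build an undirected bipartite graph: consumers <-> items.
    adj = defaultdict(set)
    for consumer, items in purchases.items():
        for item in items:
            adj[("c", consumer)].add(("i", item))
            adj[("i", item)].add(("c", consumer))

    frontier = {("c", start_consumer): 1.0}
    total = defaultdict(float)
    for _ in range(hops):
        nxt = defaultdict(float)
        for node, activation in frontier.items():
            for neighbor in adj[node]:
                nxt[neighbor] += activation * decay
        for node, activation in nxt.items():
            total[node] += activation
        frontier = nxt

    # Keep only items the consumer has not already bought.
    owned = purchases.get(start_consumer, set())
    scores = {
        name: activation
        for (kind, name), activation in total.items()
        if kind == "i" and name not in owned
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up sample data: ann and cat share no purchases directly.
purchases = {
    "ann": {"b1", "b2"},
    "bob": {"b2", "b3"},
    "cat": {"b3", "b4"},
}
print(spread_activation(purchases, "ann", hops=3))  # reaches b3 via bob
print(spread_activation(purchases, "ann", hops=5))  # also reaches b4 via cat
```

With three hops only items bought by direct neighbors are reachable; with five hops the item `b4` becomes reachable through the chain ann, bob, cat, which is the kind of transitive association a sparse data set would otherwise hide.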
    • Automatic Concept Classification of Text From Electronic Meetings

      Chen, Hsinchun; Hsu, P.; Orwig, Richard E.; Hoopes, L.; Nunamaker, Jay F. (ACM, 1994-10)
      In this research we adopted an artificial intelligence (AI) approach to designing an automatic concept classification tool for electronic brainstorming output. The role of AI techniques such as machine learning and neural networks computing in groupware development can be significant. Through extensive content analysis, concept space generation, and neural network-based concept classification, our system can generate a tentative list of the important ideas and topics represented in meeting comments. Participants then can examine the system’s suggested list and the underlying comments. They can also revise or augment the list to produce their final consensus list. Allowing the system to act as an “intelligent” aide for idea organization can alleviate some of the burdens of convergent tasks.
    • Automatically Detecting Deceptive Criminal Identities

      Wang, Gang; Chen, Hsinchun; Atabakhsh, Homa (ACM, 2004-03)
      Fear about identity verification has reached new heights since the terrorist attacks on Sept. 11, 2001, with national security issues related to detecting identity deception attracting more interest than ever before. Identity deception is an intentional falsification of identity in order to deter investigations. Conventional investigation methods run into difficulty when dealing with criminals who use deceptive or fraudulent identities, as the FBI discovered when trying to determine the true identities of the 19 hijackers involved in the attacks. Besides its use in post-event investigation, the ability to validate identity can also be used as a tool to prevent future tragedies. Here, we focus on uncovering patterns of criminal identity deception based on actual criminal records and suggest an algorithmic approach to revealing deceptive identities.
    • CopLink: Managing Law Enforcement Data And Knowledge

      Chen, Hsinchun; Zeng, Daniel; Atabakhsh, Homa; Wyzga, Wojciech; Schroeder, Jennifer (ACM, 2003-01)
      In response to the September 11 terrorist attacks, major government efforts to modernize federal law enforcement authorities’ intelligence collection and processing capabilities have been initiated. At the state and local levels, crime and police report data has been rapidly migrating from paper records to automated records management systems in recent years, making it increasingly accessible. However, despite the increasing availability of data, many challenges continue to hinder effective use of law enforcement data and knowledge, in turn limiting crime-fighting capabilities of related government agencies. For instance, most local police have database systems used by their own personnel, but lack an efficient manner in which to share information with other agencies. More importantly, the tools necessary to retrieve, filter, integrate, and intelligently present relevant information have not yet been sufficiently refined. According to senior Justice Department officials quoted on MSNBC, Sept. 26, 2001, there is “justifiable skepticism about the FBI’s ability to handle massive amounts of information,” and recent anti-terrorism initiatives will create more data overload problems. As part of nationwide, ongoing digital government initiatives, COPLINK is an integrated information and knowledge management environment aimed at meeting some of these challenges.
    • Element Matching in Concept Maps

      Marshall, Byron; Madhusudan, Therani (ACM, 2004)
      Concept maps (CM) are informal, semantic, node-link conceptual graphs used to represent knowledge in a variety of applications. Algorithms that compare concept maps would be useful in supporting educational processes and in leveraging indexed digital collections of concept maps. Map comparison begins with element matching and faces computational challenges arising from vocabulary overlap, informality, and organizational variation. Our implementation of an adapted similarity flooding algorithm improves matching of CM knowledge elements over a simple string matching approach.
    • Interactive Term Suggestion for Users of Digital Libraries: Using Subject Thesauri and Co-occurrence Lists for Information Retrieval

      Schatz, Bruce R.; Johnson, Eric H.; Cochrane, Pauline A.; Chen, Hsinchun (ACM, 1996)
      The basic problem in information retrieval is that large scale searches can only match terms specified by the user to terms appearing in documents in the digital library collection. Intermediate sources that support term suggestion can thus enhance retrieval by providing alternative search terms for the user. Term suggestion increases the recall, while interaction enables the user to attempt to not decrease the precision. We are building a prototype user interface that will become the Web interface for the University of Illinois Digital Library Initiative (DLI) testbed. It supports the principle of multiple views, where different kinds of term suggestors can be used to complement search and each other. This paper discusses its operation with two complementary term suggestors, subject thesauri and co-occurrence lists, and compares their utility. Thesauri are generated by human indexers and place selected terms in a subject hierarchy. Co-occurrence lists are generated by computer and place all terms in frequency order of occurrence together. This paper concludes with a discussion of how multiple views can help provide good quality Search for the Net. This is a paper about the design of a retrieval system prototype that allows users to simultaneously combine terms offered by different suggestion techniques, not about comparing the merits of each in a systematic and controlled way. It offers no experimental results.
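A co-occurrence list of the kind described in this abstract, terms ordered by how often they occur together with a given term, can be sketched in a few lines. This is a minimal illustration under assumed conditions (documents already split into index terms, document-level co-occurrence, ties broken alphabetically); the function name `cooccurrence_list` and the sample documents are hypothetical and are not taken from the DLI prototype.

```python
from collections import Counter


def cooccurrence_list(documents, term):
    """Return (other_term, count) pairs, most frequent co-occurrence first.

    Hypothetical sketch: two terms co-occur when they appear in the same
    document, and each document counts at most once per pair.
    """
    counts = Counter()
    for doc in documents:
        terms = set(doc)
        if term in terms:
            counts.update(terms - {term})
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up sample documents, each given as a list of index terms.
docs = [
    ["neural", "network", "retrieval"],
    ["retrieval", "thesaurus"],
    ["retrieval", "network", "index"],
]
print(cooccurrence_list(docs, "retrieval"))
```

A suggestion interface could offer the top entries of such a list as candidate search terms next to those taken from a thesaurus.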
    • A Knowledge-Based Approach to the Design of Document-Based Retrieval Systems

      Chen, Hsinchun; Dhar, Vasant (ACM, 1990)
      This article presents a knowledge-based approach to the design of document-based retrieval systems. We conducted two empirical studies investigating the users' behavior using an online catalog. The studies revealed a range of knowledge elements which are necessary for performing a successful search. We proposed a semantic network based representation to capture these knowledge elements. The findings we derived from our empirical studies were used to construct a knowledge-based retrieval system. We performed a laboratory experiment to calculate the search performance of our system. The experiment showed that our system out-performed a conventional retrieval system in recall and user satisfaction. The implications of our study to the design of document-based retrieval systems are also discussed in this article.
    • Online Query Refinement on Information Retrieval Systems: A Process Model of Searcher/System Interactions

      Chen, Hsinchun; Dhar, Vasant (ACM, 1990)
      This article reports findings of empirical research that investigated information searchers' online query refinement process. Prior studies have recognized the information specialists' role in helping searchers articulate and refine queries. Using a semantic network and a Problem Behavior Graph to represent the online search, our study revealed that searchers also refined their own queries in an online task environment. The information retrieval system played a passive role in assisting online query refinement, which was, however, one that confirmed Taylor's four-level query formulation model. Based on our empirical findings, we proposed using a process model to facilitate and improve query refinement in an online environment. We believe incorporating this model into retrieval systems can result in the design of more "intelligent" and useful information retrieval systems.
    • Partnership Reviewing: A Cooperative Approach for Peer Review of Complex Educational Resources

      Weatherley, John; Sumner, Tamara; Khoo, Michael; Hoffmann, Marcel (ACM, 2002)
      Review of digital educational resources, such as course modules, simulations, and data analysis tools, can differ from review of scholarly articles in the heterogeneity and complexity of the resources themselves. The Partnership Review Model, as demonstrated in two cases, appears to promote cooperative interactions between distributed resource reviewers, enabling reviewers to effectively divide up the task of reviewing complex resources with little explicit coordination. The shared structural outline of the resource made visible in the review environment enables participants to monitor other reviewers’ actions and to thus target their efforts accordingly. This reviewing approach may be effective in educational digital libraries that depend on community volunteers for most of their reviewing.
    • The Use of Dynamic Contexts to Improve Casual Internet Searching

      Leroy, Gondy; Lally, Ann M.; Chen, Hsinchun (ACM, 2003-07)
      Research has shown that most users’ online information searches are suboptimal. Query optimization based on a relevance feedback or genetic algorithm using dynamic query contexts can help casual users search the Internet. These algorithms can draw on implicit user feedback based on the surrounding links and text in a search engine result set to expand user queries with a variable number of keywords in two manners. Positive expansion adds terms to a user’s keywords with a Boolean “and,” negative expansion adds terms to the user’s keywords with a Boolean “not.” Each algorithm was examined for three user groups, high, middle, and low achievers, who were classified according to their overall performance. The interactions of users with different levels of expertise with different expansion types or algorithms were evaluated. The genetic algorithm with negative expansion tripled recall and doubled precision for low achievers, but high achievers displayed an opposed trend and seemed to be hindered in this condition. The effect of other conditions was less substantial.
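The two expansion types named in this abstract can be illustrated with a short sketch. It only shows how added terms might be attached to a user's keywords with a Boolean "and" or "not"; it does not implement the relevance feedback or genetic algorithm that selects the terms. The function name `expand_query`, the `AND`/`NOT` query syntax, and the sample terms are assumptions, since the abstract does not specify a query format.

```python
def expand_query(keywords, extra_terms, mode="positive"):
    """Attach extra terms to the user's keywords.

    Hypothetical sketch: "positive" joins each extra term with AND,
    "negative" joins each extra term with NOT.
    """
    if mode not in ("positive", "negative"):
        raise ValueError("mode must be 'positive' or 'negative'")
    operator = "AND" if mode == "positive" else "NOT"
    query = " ".join(keywords)
    for term in extra_terms:
        query += f" {operator} {term}"
    return query


# Made-up example: disambiguating a search for the animal.
print(expand_query(["jaguar", "speed"], ["animal"], mode="positive"))
print(expand_query(["jaguar", "speed"], ["car", "dealer"], mode="negative"))
```

Positive expansion narrows the result set toward the added context, while negative expansion excludes an unwanted context; the number of added terms is left variable, as in the abstract.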
    • Visual search and reading tasks using ClearType and regular displays: two experiments

      Dillon, Andrew; Kleinman, Lisa; Choi, Gil Ok; Bias, Randolph (ACM, 2006)
      Two experiments comparing user performance on ClearType and Regular displays are reported. In the first, 26 participants scanned a series of spreadsheets for target information. Speed of performance was significantly faster with ClearType. In the second experiment, 25 users read two articles for meaning. Reading speed was significantly faster for ClearType. In both experiments no differences in accuracy of performance or visual fatigue scores were observed. The data also reveal substantial individual differences in performance suggesting ClearType may not be universally beneficial to information workers.