Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Embargo: Release after 5/3/2012
Abstract: The amount of information on the Internet has proliferated rapidly in recent years as new technologies and applications have become popular. This broad, heterogeneous content poses a substantial challenge for knowledge discovery and information retrieval. The objective of this dissertation is to design and implement a systematic framework that helps users access the vast and varied information on the Web by combining techniques and algorithms from different domains. In this dissertation, we propose an effective Application Specific Knowledge Engine (ASKE) framework for building structured and semantic data repositories and supporting keyword search and semantic search. The framework is consistent with the architecture of most search engines. It enhances general search engines in three ways: the ability to retrieve various types of data, semantic data support, and post-retrieval analysis. The framework uses various techniques and algorithms that can facilitate knowledge discovery.

In the first part, we review the different types of data on the Internet and the approaches to retrieving them: structured and unstructured data, online community data, and Peer-to-Peer data. We then present an overview of the system architecture of the ASKE framework and discuss its core components in detail.

The following chapters investigate how the ASKE framework can be applied in two different domains: counter-terrorism and anti-piracy. We present research on developing a counter-terrorism knowledge portal that incorporates various data collections and post-retrieval analysis. The process of building the portal by following the ASKE framework is described, and the details of the data collections from Web sites and online forums are reported. In the anti-piracy domain, we mainly discuss building a Peer-to-Peer data collection and serving users with customized profiles. A case study of monitoring piracy of the movie Watchmen on typical Peer-to-Peer networks is also discussed.

This dissertation makes two main contributions. First, it demonstrates how information retrieval, Web mining, and other artificial intelligence techniques can be used in a heterogeneous environment. Second, it provides a feasible framework that helps users discover knowledge in their specific searching and browsing activities.
Degree Program: Graduate College
Systems & Industrial Engineering