MetadataShow full item record
PublisherThe University of Arizona.
RightsCopyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
AbstractXML has emerged as a new standard for information representation and exchange on the Internet. To efficiently process XML data, we propose the extended preorder numbering scheme, which determines the ancestor-descendant relationship between nodes in the hierarchy of XML data in constant time, and adapts to the dynamics of XML data by allocating extra space. Based on this numbering scheme, we propose sort-merge based algorithms, εA-Join and εε-Join, to process ancestor-descendant path expressions. The experimental results showed an order of magnitude performance improvement over conventional methods. We further propose the partition-based algorithms, which can be chosen by a query optimizer according to the characteristics of the input data. For complex path expressions with branches, we propose the Containment B⁺-tree (CB-tree) index and the IndexTwig algorithm. The CB-tree, which is an extension of the B⁺-tree, supports both the containment query and the reverse containment query. It is an effective indexing scheme for XML documents with or without a small number of recursions. The proposed IndexTwig algorithm works with any index supporting containment and reverse containment queries, such as the CB-tree. We also introduce a simplified output model, which outputs only the necessary result of a path expression. The output model enables the Fast Existence Test (FET) optimization to skip unnecessary data and avoid generating unwanted results. Also in this dissertation, we introduce techniques to process the predicates in XML path expressions using the EVR-tree. The EVR-tree combines the advantages of indexing on values or elements individually using B+-trees. It utilizes the high value selectivity and/or high structural selectivity, and provides ordered element access by using a priority queue. At the end of the dissertation, we introduce the XISS/R system, which is an implementation of the XML Indexing and Storage System (XISS) on top of a relational database. The XISS/R includes a web-based user interface and a XPath query engine to translate XPath queries into efficient SQL statements.
Degree ProgramGraduate College