Dynamic schema evolution in a heterogeneous database environment: A graph theoretic approach
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: The objective of this dissertation is to create a theoretical framework and mechanisms for automating dynamic schema evolution in a heterogeneous database environment. The structure, or schema, of a database changes over time. Managing dynamic schema evolution means accommodating changes to the schema without loss of existing data and without significantly affecting the day-to-day operation of the database. To address the problem of schema evolution in a heterogeneous database environment, we first propose a comprehensive taxonomy of schema changes and examine their implications. We then propose a formal methodology for managing schema evolution using graph theory, with a well-defined set of operators and graph-based algorithms for tracking and propagating schema changes. We show that these operators and algorithms preserve the consistency and correctness of the schema following the changes. The complete framework is embedded in a prototype software system called SEMAD (Schema Evolution Management ADvisor). We evaluate the usefulness of the system by conducting exploratory case studies in two different heterogeneous database domains: a university database environment and a scientific database environment used by atmospheric scientists and hydrologists. The results of the exploratory case studies supported the hypothesis that SEMAD helps database administrators in their tasks. The results indicate that SEMAD helps administrators identify and incorporate changes better than performing these tasks manually. An important overhead cost in SEMAD is the creation of the semantic data model, capturing the metadata associated with the model, and defining the mapping information that relates the model to the set of underlying databases. This task is a one-time effort that is performed at the beginning. Subsequent changes are captured incrementally by SEMAD.
However, the benefits of using SEMAD in dynamically managing schema evolution appear to offset this overhead cost.
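The abstract describes graph-based algorithms for tracking and propagating schema changes but does not give their details. As a rough illustration of the general idea only, the sketch below models schema elements as nodes of a directed graph and finds what a change touches by breadth-first traversal. The class `SchemaGraph`, its methods, and the element names such as `Student.name` are all hypothetical and are not taken from SEMAD or the dissertation.

```python
from collections import deque


class SchemaGraph:
    """Hypothetical dependency graph: an edge u -> v means v depends on u."""

    def __init__(self):
        self.dependents = {}  # node -> set of nodes that depend on it

    def add_node(self, node):
        self.dependents.setdefault(node, set())

    def add_dependency(self, source, dependent):
        self.add_node(source)
        self.add_node(dependent)
        self.dependents[source].add(dependent)

    def affected_by(self, changed):
        """Return every element reachable from `changed`, excluding itself."""
        seen = {changed}
        queue = deque([changed])
        while queue:
            node = queue.popleft()
            for dep in self.dependents.get(node, ()):
                if dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
        return seen - {changed}

    def delete_node(self, node):
        """Illustrative delete operator: drop the node and report what it affected."""
        affected = self.affected_by(node)
        self.dependents.pop(node, None)
        for deps in self.dependents.values():
            deps.discard(node)
        return affected


# Assumed example: one semantic-model attribute mapped to two local databases.
g = SchemaGraph()
g.add_dependency("Student.name", "registrar_db.students.full_name")
g.add_dependency("Student.name", "library_db.patrons.name")
g.add_dependency("registrar_db.students.full_name", "registrar_db.v_transcript")
g.add_dependency("Course.title", "registrar_db.courses.title")

# Elements an administrator would need to review if Student.name changes.
print(sorted(g.affected_by("Student.name")))
```

In this sketch, a change to a semantic-model element is propagated by following mapping edges to the underlying databases. A real system would also need the consistency checks that the abstract attributes to its operators, which this example does not attempt.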
Degree Program: Graduate College