    W7 MODEL OF PROVENANCE AND ITS USE IN THE CONTEXT OF WIKIPEDIA

    Name:
    azu_etd_11468_sip1_m.pdf
    Size:
    4.549Mb
    Format:
    PDF
    Author
    Liu, Jun
    Issue Date
    2011
    Keywords
    Collaboration
    Data provenance
    Ontology
    Wikipedia
    Advisor
    Ram, Sudha
    
    Publisher
    The University of Arizona.
    Rights
    Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
    Abstract
    Data provenance refers to the lineage or pedigree of data, including information such as its origin and the key events that affect it over the course of its lifecycle. In recent years, provenance has become increasingly important as more and more people use data that they themselves did not generate. Tracking data provenance helps ensure that data provided by many different providers and sources can be trusted and used appropriately. Data provenance also has several other critical uses, including data quality assessment, generation of data replication recipes, and data security management.
    One of the major objectives of our research is to investigate the semantics, or meaning, of data provenance. We describe a generic ontology of data provenance, called the W7 model, that represents these semantics. Formalized in the conceptual graph formalism, the W7 model represents provenance as a combination of seven interconnected elements: "what," "when," "where," "how," "who," "which" and "why." The W7 model is designed to be general and comprehensive enough to cover a broad range of provenance-related vocabularies. However, the W7 model alone, no matter how comprehensive, is insufficient for capturing all domain-specific provenance requirements. We hence present a novel approach to developing domain ontologies of provenance. This approach relies on various conceptual graph mechanisms, including schema definitions and canonical formation rules, and enables us to easily adapt and extend the W7 model to develop domain ontologies of provenance. The W7 model for data provenance has been widely adopted and adapted for use within Raytheon Missile Systems and the iPlant Collaborative, as well as the US Army's ATRAP IV (Asymmetric Threat Response and Analysis Program) system.
    We also developed a domain ontology of provenance for Wikipedia based on the W7 model. This domain ontology enables us to extract provenance for each Wikipedia article. We present a study in which we use this provenance to assess the quality of Wikipedia articles. Assessing and guaranteeing data quality has become a critical concern that, to a large extent, determines the future success and survival of Wikipedia: since its launch in 2001, the quality of Wikipedia has been continuously called into question because of various incidents of vandalism and misinformation. Our study shows that the quality of Wikipedia articles depends not only on the different types of contributors but also on how they collaborate. We identify a number of contributor roles based on the provenance. Based on these roles and the provenance, our research identifies several collaboration patterns that are preferable or detrimental for data quality, thus providing insights for designing tools and mechanisms to improve Wikipedia article quality.
    Type
    Electronic Dissertation
    text
    Degree Name
    Ph.D.
    Degree Level
    doctoral
    Degree Program
    Graduate College
    Management Information Systems
    Degree Grantor
    University of Arizona
    Collections
    Dissertations
