Author: Farrar, Scott O.
Advisor: Langendoen, D. Terence
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: The current research presents an ontology for linguistics suitable for implementation on the Semantic Web. It is shown that, by adhering to this model, data of the kind routinely collected by field linguists may be represented so as to facilitate automatic analysis and semantic search. The literature concerning typological databases, knowledge engineering, and the Semantic Web is reviewed. It is argued that the time is right for the integration of these three areas of research. Linguistic knowledge is discussed in the overall context of common-sense knowledge representation. A three-layer approach to meaning is assumed, one that includes conceptual, semantic, and linguistic levels of knowledge. In particular, the level of semantics is shown to be crucial for a notional account of grammatical categories such as tense, aspect, and case. The level of semantics is viewed as an encoding of common-sense reality. To develop the ontology, an upper model based on the Suggested Upper Merged Ontology (SUMO) is adopted, though elements from other ontologies are utilized as well. A brief comparison of available upper models is presented. It is argued that any ontology for linguistics should provide an account of at least (1) linguistic expressions, (2) mental linguistic units, (3) linguistic categories, and (4) discrete semantic units. The concepts and relations concerning these four domains are motivated as part of the ontology. Finally, an implementation for the Semantic Web is given by discussing the various data constructs necessary for markup (interlinear text, lexicons, paradigms, grammatical descriptions). It is argued that a characterization of the data constructs should not be included in the general ontology, but should be left up to the individual data provider to implement in XML Schema. A search scenario for linguistic data is discussed. It is shown that an ontology for linguistics provides the machinery for pure semantic search, that is, an advanced search framework whereby the user may use linguistic concepts, not just simple strings, as the search query.
Degree Program: Graduate College