dc.contributor.advisor: Hammond, Michael
dc.contributor.author: Chen, Yuan-Lu
dc.creator: Chen, Yuan-Lu
dc.date.accessioned: 2018-10-12T01:01:43Z
dc.date.available: 2018-10-12T01:01:43Z
dc.date.issued: 2018
dc.identifier.uri: http://hdl.handle.net/10150/630172
dc.description.abstract: Interlinear Glossed Text (IGT) is widely used in linguistic studies. In one common form of IGT, the first line is a sentence in the language of interest, the second line is a word-by-word translation annotated with relevant grammatical information, and the third line is an English translation. The innovation of the current work is to incorporate the gloss information of IGT data into neural net machine translation systems. Critically, when the Scottish Gaelic source data and the gloss data are combined as training data in a specific way, called the Parallel-Partial treatment, the performance of the systems improves significantly. The systems with the Parallel-Partial treatment outperform the baseline systems by 93% and outperform Google translation by 40%. The Parallel-Partial treatment lets the machine learn four sets of mappings: (1) from source sentences to target sentences, (2) from gloss lines to target sentences, (3) from gloss lines to source sentences, and (4) from source language words to gloss items. Moreover, the boosting effect of the Parallel-Partial treatment is consistent across different languages and across neural net machine translation systems with different hyper-parameter settings. How theoretical linguistics may work hand in hand with natural language processing, and how neural net machine learning may exploit linguistics, are important questions (Pater 2017). Beyond the practical goal of building better machine translation systems, the current work exemplifies how theoretical linguistics and natural language processing can work together successfully.
dc.language.iso: en
dc.publisher: The University of Arizona.
dc.rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
dc.subject: Interlinear Glossed Text
dc.subject: Machine Translation
dc.subject: Neural Machine Learning
dc.subject: Scottish Gaelic
dc.title: Improving Neural Net Machine Translation Systems with Linguistic Information
dc.type: text
dc.type: Electronic Dissertation
thesis.degree.grantor: University of Arizona
thesis.degree.level: doctoral
dc.contributor.committeemember: Carnie, Andrew
dc.contributor.committeemember: Fong, Sandiway
thesis.degree.discipline: Graduate College
thesis.degree.discipline: Linguistics
thesis.degree.name: Ph.D.
refterms.dateFOA: 2018-10-12T01:01:43Z


Files in this item

Name: azu_etd_16521_sip1_m.pdf
Size: 1.070Mb
Format: PDF
