dc.contributor.author: Nitschke, R.
dc.date.accessioned: 2022-03-17T01:56:58Z
dc.date.available: 2022-03-17T01:56:58Z
dc.date.issued: 2021
dc.identifier.citation: Nitschke, R. (2021, June). Restoring the Sister: Reconstructing a Lexicon from Sister Languages using Neural Machine Translation. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas (pp. 122-130).
dc.identifier.isbn: 9781954085442
dc.identifier.doi: 10.18653/v1/2021.americasnlp-1.13
dc.identifier.uri: http://hdl.handle.net/10150/663577
dc.description.abstract: The historical comparative method has a long history in historical linguistics. It describes a process by which historical linguists aim to reverse-engineer the historical developments of language families in order to reconstruct proto-forms and familial relations between languages. In recent years, there have been multiple attempts to replicate this process through machine learning, especially in the realm of cognate detection (List et al., 2016; Ciobanu and Dinu, 2014; Rama et al., 2018). So far, most of these experiments aimed at actual reconstruction have attempted the prediction of a proto-form from the forms of the daughter languages (Ciobanu and Dinu, 2018; Meloni et al., 2019). Here, we propose a reimplementation that instead uses modern related languages, or sisters, to reconstruct the vocabulary of a target language. In particular, we show that we can reconstruct the vocabulary of a target language from a fairly small data set of parallel cognates from different sister languages, using a neural machine translation (NMT) architecture with a standard encoder-decoder setup. This effort directly furthers the goal of using machine learning tools to help under-served language communities in their efforts at reclaiming, preserving, or reconstructing their own languages. © 2021 Association for Computational Linguistics
dc.language.iso: en
dc.publisher: Association for Computational Linguistics (ACL)
dc.rights: Copyright © 2021 Association for Computational Linguistics. Licensed on a Creative Commons Attribution 4.0 International License.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Restoring the Sister: Reconstructing a Lexicon from Sister Languages using Neural Machine Translation
dc.type: Proceedings
dc.type: text
dc.contributor.department: The University of Arizona
dc.identifier.journal: Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021
dc.description.note: Open access journal
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
dc.eprint.version: Final published version
dc.source.journaltitle: Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021
refterms.dateFOA: 2022-03-17T01:56:58Z


Files in this item

Name: 2021.americasnlp-1.13.pdf
Size: 347.2Kb
Format: PDF
Description: Final Published Version
