dc.contributor.advisor: Surdeanu, Mihai
dc.contributor.author: Ehsani, Sina
dc.creator: Ehsani, Sina
dc.date.accessioned: 2022-12-17T00:11:07Z
dc.date.available: 2022-12-17T00:11:07Z
dc.date.issued: 2022
dc.identifier.citation: Ehsani, Sina. (2022). OD-TQA: On-Demand Visual Augmentation for Textual Question Answering Task (Master's thesis, University of Arizona, Tucson, USA).
dc.identifier.uri: http://hdl.handle.net/10150/667289
dc.description.abstract: Textual Question Answering is a difficult task that has been studied for over a decade. With the rise of transformer networks, there has been an increase in the utilization of external knowledge (pre-trained models) on this task. However, these methodologies are missing a critical component: external visual comprehension. When asked a question, we as humans use imagination, in the form of vision and audio, to better understand the concepts of the question, and that is what we are doing in this study: providing machines with the necessary visualization to allow them to comprehend a given question and generate more pertinent answers. This is accomplished using Google's image search, which provides us with access to worldwide knowledge. A novel methodology for determining the best answer using on-demand visual grounding is presented, and various multimedia model designs are introduced and compared. Lastly, we demonstrate that the proposed solution outperforms the previous system without any pre-training, proving the benefits of the on-demand image retrieval concept for the textual question answering task.
dc.language.iso: en
dc.publisher: The University of Arizona.
dc.rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Data Augmentation
dc.subject: Multimedia Information Retrieval
dc.subject: Multimodal Deep Learning
dc.subject: Natural Language Processing
dc.subject: Question Answering
dc.subject: Visual Grounding
dc.title: OD-TQA: On-Demand Visual Augmentation for Textual Question Answering Task
dc.type: text
dc.type: Electronic Thesis
thesis.degree.grantor: University of Arizona
thesis.degree.level: masters
dc.contributor.committeemember: Bethard, Steven
dc.contributor.committeemember: Barnard, Kobus
thesis.degree.discipline: Graduate College
thesis.degree.discipline: Computer Science
thesis.degree.name: M.S.
refterms.dateFOA: 2022-12-17T00:11:07Z


Files in this item

Name: azu_etd_20172_sip1_m.pdf
Size: 4.630Mb
Format: PDF
