Me, myself, and ire: Effects of automatic transcription quality on emotion, sarcasm, and personality detection
Citation
John Culnan, Seongjin Park, Meghavarshini Krishnaswamy, and Rebecca Sharp. 2021. Me, myself, and ire: Effects of automatic transcription quality on emotion, sarcasm, and personality detection. In Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 250–256, Online. Association for Computational Linguistics.
Rights
Copyright © 2021 Association for Computational Linguistics. Licensed on a Creative Commons Attribution 4.0 International License.
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
In deployment, systems that use speech as input must make use of automated transcriptions. Yet, typically when these systems are evaluated, gold transcriptions are assumed. We explicitly examine the impact of transcription errors on the downstream performance of a multi-modal system on three related tasks from three datasets: emotion, sarcasm, and personality detection. We include three separate transcription tools and show that while all automated transcriptions propagate errors that substantially impact downstream performance, the open-source tools fare worse than the paid tool, though not always straightforwardly, and word error rates do not correlate well with downstream performance. We further find that the inclusion of audio features partially mitigates transcription errors, but that a naive usage of a multi-task setup does not. We make available all code and data splits needed to reproduce all of our experiments. © 2021 Association for Computational Linguistics.
Note
Open access journal
ISBN
9781954085183
Version
Final published version