Contrastive Hyperarticulation in Arabic Geminate and Singleton Stops: How Lexical Competition and Morphology Shape Closure Duration in Behavioral and Neural Speech
Author
Alshakhori, Mohammed Khodhor
Issue Date
2025
Keywords
Arabic gemination
Contrastive hyperarticulation
Lexical competition
Phonetic enhancement
Speech recognition
Speech Synthesis
Advisor
Hammond, Mike
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
This dissertation investigates contrastive hyperarticulation under lexical competition (a dynamic enhancement of phonetic cues that preserves perceptual distinctions between lexical competitors) in Qatifi Arabic, a dialect spoken in Saudi Arabia. Research on contrastive hyperarticulation has predominantly focused on languages with linear morphological structures, whereas its dynamics in non-linear systems have received comparatively little attention. Arabic, with its root-and-pattern morphology, offers an ideal yet underexplored domain for such an inquiry. By analyzing Qatifi Arabic, this dissertation advances our understanding of how contrastive hyperarticulation adapts to a distinct morphological system, offering new insights into the dynamics of contrastive hyperarticulation and non-linear morphological systems in lexical competition environments. The study analyzes contrastive hyperarticulation as a function of morphological derivational processes and lexical competition, testing its phonetic correlates in duration contrasts between geminate (double-consonant) and singleton (single-consonant) stops. The research also extends its investigation to neural speech systems, including speech synthesis and automatic speech recognition models. Specifically, it examines whether speech synthesis systems generate contrastive hyperarticulation in geminate-singleton pairs (lexically competing morphemes distinguished solely by duration) and evaluates how automatic speech recognition systems process these enhanced phonetic contrasts in sensitive contexts that include both competitors.
Over the past decade, contrastive hyperarticulation has garnered increasing attention in linguistic research, yet existing studies have focused overwhelmingly on languages with linear morphological structures. Seyfarth (2016) explored the contrastive hyperarticulation of English morphemic and non-morphemic forms, demonstrating that it is triggered by the phonological plans of morphological relatives. Other studies have examined the durational contrasts of geminate and singleton stops in Japanese, a mora-timed language, and Korean, a language with tonogenetic sound change (Sano 2018, Jeong 2024). Sano (2018) focused on how lexical competition factors motivate the enhancement of geminate and singleton stops. The findings highlight the role of factors such as manner of articulation, informativity levels, and the presence or absence of minimal pairs in shaping durational distinctions. Similarly, Wedel et al. (2018) emphasize that contrastive hyperarticulation of English stops operates primarily as a function of lexical competition, contrary to claims that emphasize phonological neighborhood density (Fox et al. 2015, Munson & Solomon 2004). However, existing research has yet to fully reconcile three key dimensions of this phenomenon. One line of inquiry has focused on contrastive enhancement of morphological units compared to non-morphological phonemes, but it has not systematically examined how morphemes in lexical competition conditions influence these enhancements (Losiewicz 1992, Plag et al. 2017, Smith et al. 2012, Walsh & Parker 1983, Yung Song et al. 2013). Another line has investigated geminate and singleton stops in competition-driven environments, yet the geminate phonemes in these studies lack morphological significance in the languages examined (Sano 2018). A third strand has highlighted the role of lexical competition in triggering contrasts in English stops (Nelson 2019, Wedel et al. 2018), but morphological influences fell outside its scope. Notably, most of these studies share a common focus on languages with linear morphological systems, where morphemes and word forms are strung together linearly, while non-concatenative systems remain overlooked.
Building on these studies, this dissertation contributes to our understanding of how lexical competition interacts with morphological complexity to shape contrastive hyperarticulation, focusing on geminate morphemes versus singleton units in Arabic’s non-linear root-and-pattern system. The study provides new insights into how contrastive hyperarticulation functions in morphologically rich languages, particularly under conditions of lexical competition.
The dissertation is structured around three studies. The first is a behavioral speech production study examining contrastive hyperarticulation in Qatifi Arabic geminate and singleton stops. It investigates whether the closure durations of geminate morphemes and singleton phonemes are influenced by minimal pair competition. The results reveal asymmetric enhancement patterns: singleton durations are shortened in minimal pair contexts under fast speech rates, while geminate durations remain stable. The disparity suggests that non-linear morphological systems may impose constraints on maintaining phonetic contrasts under lexical competition and morphological complexity. One possible explanation for this asymmetric behavior is that Arabic’s templatic morphology imposes constraints on geminate production: gemination may be produced as a by-product of the root-and-pattern system, making it less susceptible to durational enhancement than singletons.
The second study examines neural speech synthesis, specifically assessing whether the Variational Inference with Adversarial Learning (VITS-TTS) model (Kim et al. 2021) can generate contrastively hyperarticulated geminate and singleton stops, mirroring the patterns observed in the first study. Given that neural synthesis systems have demonstrated the ability to learn categorical phonological patterns from limited training data (Beguš 2021a,b), this experiment serves as a novel test of whether such models can generate contrastive hyperarticulation patterns based on lexical competition or distributional factors, thereby providing insights into how artificial systems operate under such conditions. While neural models excel at learning categorical distinctions (e.g., geminate vs. singleton), they often struggle with gradient phenomena such as contrastive durations (Hanzlíček 2024, Matoušek & Tihelka 2022). Testing whether such systems can replicate human-like enhancement patterns illuminates their capacity to encode phonetic informativity, that is, the dynamic adjustment of cues based on functional demands. This reveals whether artificial systems implicitly learn the statistical dependencies between lexical competition and phonetic realization, or merely approximate surface-level acoustic patterns. Using the same geminate and singleton tokens investigated in the first study, the study fine-tuned VITS-TTS on a specialized corpus and tested its ability to synthesize geminate and singleton consonant contrasts. While VITS-TTS successfully encoded categorical distinctions between geminates and singletons, it failed to generate hyperarticulated forms in minimal pair and non-minimal pair contexts. These results highlight the limitations of current neural architectures in modeling gradient phonetic phenomena, particularly those driven by lexical competition and morphology.
The third study evaluates how automatic speech recognition (ASR) systems adapt to competition pressure when resolving contrastively hyperarticulated minimal pairs in shared linguistic contexts, building on two prior findings. Study 1 reveals how orthographic constraints shape hyperarticulation production in Qatifi Arabic, while Study 2 demonstrates synthetic speech systems’ reliance on orthographic markers to disambiguate geminate-singleton contrasts in Modern Standard Arabic. The third study extends this by probing whether ASR models develop latent representations robust enough to recognize competing candidates under acoustic ambiguity. Specifically, it evaluates ASR models’ ability to: (1) decode contrastive hyperarticulation cues in the acoustic signal to distinguish geminate-singleton minimal pairs in a single context, and (2) reconcile these phonetic cues with near-identical orthographic representations that differ only by a geminate marker. For this study, the Transformer-based Whisper model (Radford et al. 2023) was fine-tuned on a fully diacritized Modern Standard Arabic dataset. The results demonstrate that Whisper successfully recognizes most minimal pairs in the tested contexts, indicating that the model learns latent representations that jointly encode contrastively hyperarticulated durational cues and orthographic markers. Successful recognition thus requires both bottom-up phonetic learning of enhanced acoustic cues such as separable latent features and top-down orthographic guidance through diacritics to resolve phonetic ambiguities.
Overall, the findings reveal asymmetric hyperarticulation in Qatifi Arabic, where singleton stops exhibit reduced durations under lexical competition, contrasting with the symmetric enhancement observed in linear morphological systems such as English and Japanese. This disparity suggests language-specific dynamics in contrast maintenance. Conversely, results from the synthesis model indicate an absence of hyperarticulated geminate and singleton phonemes, even when controlling for geminate stop distribution and lexical competition environments. This suggests that, unlike human speakers, the synthesis system does not spontaneously implement contrastive enhancement strategies under competitive lexical conditions. In contrast, the speech recognition system learns contrastively hyperarticulated forms effectively: it combines latent representations of diacritic-driven orthographic regularization with emergent, high-fidelity phonetic representations to resolve hyperarticulated contrasts.
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Linguistics
