Exploring the Acoustic Output of a Computational Model of Aging Voice
Author
Melendez, Diego
Issue Date
2025
Keywords
Age-related dysphonia
Cepstral peak prominence
Computational kinematic model of speech
Presbyphonia
Advisor
Samlan, Robin A.
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Purpose: This study investigated the acoustic output of a computational model of aging voice by comparing cepstral peak prominence (CPP) measured from real participants with CPP of simulations produced by a computational kinematic model of speech. The first aim was to determine how CPP measured from patient recordings relates to CPP of simulated vowels, and whether modeling the pyriform sinuses increased the correlation with patient recordings. The second aim was to determine whether decreasing epilaryngeal area increased the CPP of signals produced by the model.
Method: Eighteen laryngeal high-speed video and audio recordings were drawn from a larger study focused on improving voice production for adults with age-related dysphonia (NIDCD R21DC016356). Glottal area waveforms were segmented and used as the vibratory source in a computational speech production model. The vowels /a:/ and /i:/ were each simulated with and without the pyriform sinuses and at three settings of epilaryngeal input area. CPP of real and simulated vowels was compared. Spearman rank correlations were used to compare real and simulated values under each condition, including the presence or absence of the pyriform sinuses and the different epilaryngeal input areas. A repeated measures analysis of variance (ANOVA) was used to compare CPP across epilaryngeal areas at the vocal tract entry.
Results: Simulated vowels correlated weakly with real productions, and the correlation was higher with the pyriform sinuses included than excluded in three of the four comparisons. The correlations for males (present vs. absent) were as follows: (/a:/) rs = 0.455 vs. rs = 0.275 and (/i:/) rs = -0.006 vs. rs = 0.108. The correlations for females (present vs. absent) were as follows: (/a:/) rs = 0.268 vs. rs = 0.260 and (/i:/) rs = 0.200 vs. rs = 0.152. Narrowing the epilaryngeal area from 1.0 to 0.2 cm² increased CPP in 6 of 18 simulations for vowel /a:/, as well as for /i:/. Qualitative evaluation of spectrograms revealed variability in harmonic energy across conditions, which might partially explain the weak correlations.
Conclusion: Comparing real and simulated recordings of the vowels /a:/ and /i:/ revealed no strong relationship between real and simulated CPP. Some similarities in CPP were seen between real participants and simulations, but not consistently across comparisons. The relationship between real participants and simulations was stronger when the pyriform sinuses were included, though not in every case, and the presence of the pyriform sinuses altered CPP in some simulations. CPP changed in some simulations, but there was no strong relationship between CPP and changes to epilaryngeal area.
Type
text
Electronic Thesis
Degree Name
M.S.
Degree Level
masters
Degree Program
Graduate College
Speech, Language, & Hearing Sciences
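Illustrative note: the abstract's Method compares real and simulated CPP using Spearman rank correlations (rs). The sketch below shows one way such a coefficient can be computed, as a Pearson correlation on average ranks. It is not the author's analysis code. The function names `average_ranks` and `spearman_rs` are hypothetical, and the CPP values are invented purely for illustration; they are not data from the thesis.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's r_s: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("r_s is undefined when one sequence is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical CPP values in dB, invented for illustration only
# (NOT measurements from the thesis).
real_cpp = [10.2, 12.5, 9.8, 14.1, 11.0]
simulated_cpp = [15.0, 16.2, 15.5, 17.9, 14.8]

rs = spearman_rs(real_cpp, simulated_cpp)
print(f"r_s = {rs:.3f}")  # r_s = 0.600
```

Because rs depends only on rank order, it is insensitive to a constant offset between real and simulated CPP, so a weak rs reflects disagreement in ordering across participants and not a difference in overall level.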