Development and Implementation of a Predictive Model of Documentation Status to Examine Factors Related to Health Outcomes for Undocumented Immigrants
Author
Rivers, Patrick Sullivan
Issue Date
2023
Advisor
Marrero, David
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Background: The United States (US) is home to more than 8.6 million undocumented Hispanic immigrants. These individuals face significant barriers to accessing health services and are a subpopulation on whom relatively little research has been done. Undocumented Hispanics face unique vulnerabilities, and research on them should be done thoughtfully and with care. One proposed method of investigating health outcomes for this subpopulation, and the factors that influence these outcomes, is predictive modeling of documentation status.
Objectives: This dissertation is composed of three studies that aim to: 1) Synthesize the literature to assess health outcomes of undocumented immigrants with at least one chronic condition; 2) Create and internally validate a predictive model of documentation status utilizing two data sets by comparing demographic and health characteristics of individuals alongside their known documentation status; and 3) Apply the predictive model to a prospective cohort of essential workers, in which documentation status is unknown, to examine healthcare utilization for the treatment of SARS-CoV-2 infections by predicted undocumented status.
Methods: Aim one is explored through a scoping review of the literature related to chronic disease outcomes for undocumented Hispanic immigrants compared to documented immigrants or US-born citizens, following the Preferred Reporting Items for Systematic reviews and Meta-Analyses checklist for scoping reviews (PRISMA-ScR). A systematic search was conducted on October 10, 2019 in five databases (PubMed, MEDLINE, Web of Science, Embase, and CINAHL), with supplementary searches up to June 24, 2023. Studies were examined by two reviewers. Quantitative studies conducted in the US that compared health outcomes for undocumented Hispanic immigrants aged 18+ with those of documented Hispanic immigrants, US-born citizens, or both were included.
Aim two is examined by the development and internal validation of three multivariable prediction models of Hispanic immigrant documentation status utilizing data from two previous studies on health outcomes of individuals living near the Arizona-Mexico border. For the predictive models, multiple imputation by chained equations (MICE), random forest machine learning, and support vector machine learning algorithms were employed, with each model assessed for its accuracy, precision, recall, and Matthews correlation coefficient (MCC). Aim three is investigated by implementing the best-performing prediction model in a longitudinal cohort of essential workers and using multivariable logistic regression models to ascertain whether there were differences between predicted undocumented immigrants and the rest of the cohort, and to identify factors associated with healthcare utilization for the treatment of SARS-CoV-2 infection.
Results: Of 275 records identified through database search, 26 full-text articles were assessed, and eight articles were found to examine health outcomes for undocumented immigrants with at least one chronic condition. There was conflicting evidence as to whether undocumented immigrants have poorer health outcomes compared to documented immigrants or US-born citizens. For aim two, a combined sample of 473 individuals was used to build the prediction models. The model using MICE had the lowest accuracy (64.3%), precision (0.81), recall (0.74), and MCC (-0.07). The random forest model fared slightly better on all metrics, with an accuracy of 73.1%, precision of 0.82, recall of 0.86, and MCC of 0.002. The support vector machine was the strongest in all categories, with an accuracy of 90.7%, precision of 0.90, recall of 0.99, and MCC of 0.66. For aim three, predicted undocumented status of 1,074 Hispanic individuals from the longitudinal cohort was assigned using the support vector machine algorithm.
Across 466 SARS-CoV-2 infections that occurred in this group, there was no statistically significant difference in healthcare utilization between predicted undocumented immigrants and the rest of the cohort, and the only factors associated with healthcare utilization were number of illness symptoms, number of days spent in bed, and illness duration.
Conclusions: There are significant research gaps on undocumented immigrants, their health outcomes, and the factors that influence these health outcomes. One of the predictive models for documentation status developed in Aim 2 shows promise and warrants additional research. This work aims to make some incremental progress towards better understanding this subpopulation, the unique challenges they face, and their healthcare behaviors and needs.
Type
Electronic Dissertation
text
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Health Behavior Health Promotion
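Note on the reported metrics: the abstract evaluates each model by accuracy, precision, recall, and Matthews correlation coefficient (MCC). The sketch below shows how these four quantities are conventionally computed from a binary confusion matrix. It is an illustration only: the function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from the dissertation's data or code.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # MCC uses all four cells, so it stays near 0 when predictions carry
    # no information beyond the class balance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mcc": mcc}


# Hypothetical, made-up counts for an imbalanced sample (80 positives, 20 negatives).
example = binary_metrics(tp=64, fp=16, fn=16, tn=4)
print(example)
```

With these made-up counts, precision and recall are both 0.80 while MCC is 0, which illustrates why a model can report fairly high precision and recall on an imbalanced sample and still have an MCC near zero.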
