Using Machine Learning and Geospatial Data to Predict Groundwater Occurrence of Arsenic in the Colorado Plateau
| dc.contributor.advisor | Hoover, Joseph | |
| dc.contributor.author | Nuanez, Aaron Matthew | |
| dc.creator | Nuanez, Aaron Matthew | |
| dc.date.accessioned | 2025-06-30T21:23:01Z | |
| dc.date.available | 2025-06-30T21:23:01Z | |
| dc.date.issued | 2025 | |
| dc.identifier.citation | Nuanez, Aaron Matthew. (2025). Using Machine Learning and Geospatial Data to Predict Groundwater Occurrence of Arsenic in the Colorado Plateau (Master's thesis, University of Arizona, Tucson, USA). | |
| dc.identifier.uri | http://hdl.handle.net/10150/677713 | |
| dc.description.abstract | Geogenic arsenic is a naturally occurring groundwater contaminant that poses a public health risk and requires regulatory compliance for public water supply in the United States. Many efforts have been made to predict and map arsenic in groundwater using Geographic Information Systems (GIS) and machine learning (ML) methods. Previous research applied GIS, statistical, and ML methods to study the geographic distribution of arsenic in groundwater, yet these techniques have rarely been applied to rural and Tribal communities throughout the western United States generally, and the Colorado Plateau specifically, in an effort to confront great uncertainty regarding local groundwater quality. The goal of this project was to predict the occurrence of arsenic in the groundwater of the Colorado Plateau using GIS to highlight communities at risk of elevated arsenic in their groundwater. Using Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), we modeled the probability of groundwater arsenic (As) exceeding either 5 or 10 µg/L. Final models demonstrated accuracy between 75 and 85% with sensitivity and specificity exceeding 0.5. Notable predictor variables included water pH and Fe, calcite (A soil horizon), and average annual precipitation. Of the 512 community water systems (CWSs) on the Colorado Plateau, 97 service areas overlapped with locations likely to exceed 5 µg/L As (serving ~344,000 people), and 57 systems (serving ~245,000 people) overlapped with locations likely to exceed 10 µg/L As. These models provided the first high-resolution (<1 km) spatial model predicting As occurrence in the groundwater across the Colorado Plateau and highlight areas of potential groundwater As impacts for populations reliant on groundwater. | |
| dc.language.iso | en | |
| dc.publisher | The University of Arizona. | |
| dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | |
| dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | |
| dc.subject | Arsenic | |
| dc.subject | Colorado Plateau | |
| dc.subject | GIS | |
| dc.subject | Machine Learning | |
| dc.subject | Prediction | |
| dc.subject | Spatial Modeling | |
| dc.title | Using Machine Learning and Geospatial Data to Predict Groundwater Occurrence of Arsenic in the Colorado Plateau | |
| dc.type | text | |
| dc.type | Electronic Thesis | |
| thesis.degree.grantor | University of Arizona | |
| thesis.degree.level | masters | |
| dc.contributor.committeemember | Root, Robert | |
| dc.contributor.committeemember | McIntosh, Jennifer | |
| dc.description.release | Release after 06/25/2026 | |
| thesis.degree.discipline | Graduate College | |
| thesis.degree.discipline | Environmental Science | |
| thesis.degree.name | M.S. |