Cross-Validation Indicates Predictive Models May Provide an Alternative to Indicator Organism Monitoring for Evaluating Pathogen Presence in Southwestern US Agricultural Water
Affiliation: Department of Environmental Science, University of Arizona
Publisher: Frontiers Media S.A.
Citation: Belias, A., Brassill, N., Roof, S., Rock, C., Wiedmann, M., & Weller, D. (2021). Cross-Validation Indicates Predictive Models May Provide an Alternative to Indicator Organism Monitoring for Evaluating Pathogen Presence in Southwestern US Agricultural Water. Frontiers in Water.
Journal: Frontiers in Water
Rights: Copyright © 2021 Belias, Brassill, Roof, Rock, Wiedmann and Weller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
Collection Information: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at firstname.lastname@example.org.
Abstract: Pathogen contamination of agricultural water has been identified as a probable cause of recalls and outbreaks. However, variability in pathogen presence and concentration complicates the reliable identification of agricultural water at elevated risk of pathogen presence. In this study, we collected data on the presence of Salmonella and genetic markers for enterohemorrhagic E. coli (EHEC; PCR-based detection of stx and eaeA) in southwestern US canal water, which is used as agricultural water for produce. We developed models to predict the likelihood of pathogen contamination of this canal water and assessed their accuracy. Based on 169 samples from 60 surface water canals (each sampled 1–3 times), 36% (60/169) and 21% (36/169) of samples were positive for Salmonella presence and EHEC markers, respectively. Data on water quality parameters (e.g., generic E. coli level, turbidity), surrounding land use (e.g., natural cover, cropland cover), weather conditions (e.g., temperature), and sampling site characteristics (e.g., canal type) were collected as predictor variables. Separate conditional forest models were trained for Salmonella isolation and EHEC marker detection, and cross-validated to assess predictive performance. For Salmonella, turbidity, day of year, generic E. coli level, and % natural cover in a 500–1,000 ft (~150–300 m) buffer around the sampling site were the top 4 predictors identified by the conditional forest model. For EHEC markers, generic E. coli level, day of year, % natural cover in a 250–500 ft (~75–150 m) buffer, and % natural cover in a 500–1,000 ft (~150–300 m) buffer were the top 4 predictors. Predictive performance measures (e.g., area under the curve [AUC]) indicated that predictive modeling shows potential as an alternative method for assessing the likelihood of pathogen presence in agricultural water. Secondary conditional forest models with generic E. coli level excluded as a predictor showed a difference in AUC of < 0.01 compared to the original models (i.e., with generic E. coli level included as a predictor) for both Salmonella (AUC = 0.84) and EHEC markers (AUC = 0.92). Our data suggest that models that do not require microbiological data (e.g., indicator organism levels) show promise for real-time prediction of pathogen contamination of agricultural water (e.g., in surface water canals).
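The abstract reports AUC as its main measure of predictive performance. As a minimal sketch of what that number captures, the Python below computes AUC as the fraction of (positive, negative) sample pairs in which the positive sample receives the higher predicted probability. The function name `auc`, the labels, and the scores are all hypothetical illustrations; they are not data, code, or the conditional forest models from the study.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    Returns the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out fold: 1 = pathogen detected, 0 = not detected.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted probabilities from some model (not study data).
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]

fold_auc = auc(labels, scores)
print(f"AUC on the illustrative fold: {fold_auc:.3f}")
```

In a cross-validation setting, a value like this would be computed on each held-out fold and then summarized across folds. An AUC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of positive and negative samples.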
Note: Open access journal
Version: Final published version