Machine-Learning-Based Forest Classification and Regression (FCR) for Spatial Prediction of Liver Fluke Opisthorchis viverrini (OV) Infection in Small Sub-Watersheds
Author
Pumhirunroj, B.
Littidej, P.
Boonmars, T.
Bootyothee, K.
Artchayasawat, A.
Khamphilung, P.
Slack, D.
Affiliation
Department of Civil & Architectural Engineering & Mechanics, University of Arizona
Issue Date
2023-12-14
Keywords
forest-based classification and regression
machine learning
Opisthorchis viverrini
ordinary least square
Citation
Pumhirunroj, B.; Littidej, P.; Boonmars, T.; Bootyothee, K.; Artchayasawat, A.; Khamphilung, P.; Slack, D. Machine-Learning-Based Forest Classification and Regression (FCR) for Spatial Prediction of Liver Fluke Opisthorchis viverrini (OV) Infection in Small Sub-Watersheds. ISPRS Int. J. Geo-Inf. 2023, 12, 503. https://doi.org/10.3390/ijgi12120503
Rights
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Infection with the liver fluke Opisthorchis viverrini is partly explained by the suitability of habitats in sub-basin areas, which allows the intermediate host to remain in the watershed system in all seasons. Spatial monitoring of the fluke at the small-basin scale is important because it enables analysis of the individual factors that influence infection. A spatial mathematical model was weighted by nine spatial factors: X1 (index of land-use types), X2 (index of soil drainage properties), X3 (distance index from the road network), X4 (distance index from surface water resources), X5 (distance index from the flow accumulation lines), X6 (index of average surface temperature), X7 (average surface moisture index), X8 (average normalized difference vegetation index), and X9 (average soil-adjusted vegetation index). The analysis was divided into two steps: (1) at the sub-basin boundary level, an ordinary least square (OLS) model was used to select the spatial criteria related to human liver fluke infection by sub-watershed, and (2) at the infection risk position level, machine-learning-based forest classification and regression (FCR) was used to display the predicted infection risk locations along stream lines. The analysis produced four prototype models that take different independent variables as input. Model 1 and Model 2 gave the highest AUC (0.964); the variables that influenced infection risk the most were the distance to stream lines and the distance to water bodies, while the NDMI and NDVI factors rarely affected the accuracy. This FCR machine-learning approach can be applied to the analysis of infection risk areas at the sub-basin level, but independent variables must first be screened with a preliminary mathematical model weighted to the spatial units in order to obtain the most accurate predictions.
Note
Open access journal
ISSN
2220-9964
Version
Final Published Version
DOI
10.3390/ijgi12120503