Comparing Random Forest with Generalized Linear Regression: Predicting Conflict Events in Western Africa
Advisor: Ryckman, Kirssa C.
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: Despite the progress of conflict prediction models within the last two decades, most lack a spatial component. This study explores the intersection of geography and conflict prediction and discusses how spatial analysis and conflict prediction can be integrated. The objective is to compare a spatial machine learning method, Classification Random Forest, with a more traditional statistical method, Logistic Generalized Linear Regression, in order to assess their subnational predictive power for conflict occurrence in Western Africa. The two models are fitted to district-level data for Western Africa covering the years 2015-2017 and are used to generate an out-of-sample prediction for 2018. Overall accuracy is assessed using an F-1 score, which accounts for the sensitivity and precision of the models, to discover areas of over-prediction and under-prediction. The Random Forest model produced an F-1 score of 0.58582, while the Generalized Linear Regression model had an F-1 score of 0.61017. A significant difference between the two models was not detected. The Generalized Linear Regression model had better overall accuracy, but the Random Forest model correctly predicted more incidences of conflict. Five explanatory variables contributed to the predictive power of both models: Conflict Density, Road Density, Area of the District, Nighttime Lights, and Population Density. Given the success of the Generalized Linear Regression model, the next logical step is to explore local variation with a Geographically Weighted Regression model. Future research should also explore the effect of conflict hot spots on conflict prediction models.
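The F-1 score mentioned in the abstract is the harmonic mean of precision and sensitivity (recall). The following is a minimal Python sketch of that calculation, not the author's code. The function name `f1_score` and the confusion-matrix counts are hypothetical and are not taken from the thesis. They were chosen only to illustrate how a model can identify more conflict districts correctly and still receive a lower F-1 score because it over-predicts.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity (recall).

    tp: districts correctly predicted to have conflict
    fp: districts predicted to have conflict that had none (over-prediction)
    fn: districts with conflict that the model missed (under-prediction)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; NOT results from the thesis.
model_a = f1_score(tp=40, fp=20, fn=20)  # fewer hits, fewer false alarms
model_b = f1_score(tp=50, fp=60, fn=10)  # more hits, many more false alarms

print(f"model A F-1: {model_a:.3f}")
print(f"model B F-1: {model_b:.3f}")
```

In this made-up case, model B finds more true conflict districts than model A, but its larger number of false positives lowers its precision enough that its F-1 score comes out below model A's.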
Degree Program: Graduate College