On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models
Name:
Zheng_et_al-2018-Water_Resourc ...
Size:
1.847Mb
Format:
PDF
Description:
Final Published Version
Affiliation
Univ Arizona, Dept Hydrol & Atmospher Sci
Issue Date
2018-01-30
Publisher
AMER GEOPHYSICAL UNION
Citation
Zheng, F., Maier, H. R., Wu, W., Dandy, G. C., Gupta, H. V., & Zhang, T. (2018). On lack of robustness in hydrological model development due to absence of guidelines for selecting calibration and evaluation data: Demonstration for data‐driven models. Water Resources Research, 54, 1013–1030. https://doi.org/10.1002/2017WR021470
Journal
WATER RESOURCES RESEARCH
Rights
© 2018. American Geophysical Union. All Rights Reserved.
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
Note
6 month embargo; published online: 30 January 2018
ISSN
0043-1397
1944-7973
Version
Final published version
Sponsors
National Natural Science Foundation of China [51708491]; Australian Research Council through the Centre of Excellence for Climate System Science [CE110001028]
Additional Links
https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1002/2017WR021470
DOI
10.1002/2017WR021470
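
Illustrative sketch
The abstract's central point, that the way data are allocated to calibration and evaluation subsets can change how representative the evaluation subset is, especially for skewed runoff, can be illustrated with a small synthetic experiment. The sketch below is not taken from the paper: the lognormal "runoff" series, the 20% evaluation fraction, and the helper names `random_split`, `systematic_split`, and `mean_gap` are all assumptions made for illustration, and the two splitting schemes are simple stand-ins, not necessarily any of the four methods the authors tested.

```python
import random
import statistics


def random_split(values, eval_fraction, rng):
    """Allocate values to calibration/evaluation subsets uniformly at random."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_eval = int(round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def systematic_split(values, eval_fraction):
    """Sort the values and send every k-th one to the evaluation subset.

    A hypothetical stand-in for a splitting method that tries to keep the
    two subsets statistically similar.
    """
    step = int(round(1 / eval_fraction))
    calibration, evaluation = [], []
    for i, value in enumerate(sorted(values)):
        if i % step == step // 2:
            evaluation.append(value)
        else:
            calibration.append(value)
    return calibration, evaluation


def mean_gap(calibration, evaluation):
    """Absolute difference of subset means, relative to the overall mean."""
    overall = statistics.mean(calibration + evaluation)
    return abs(statistics.mean(calibration) - statistics.mean(evaluation)) / overall


# Synthetic, highly skewed "runoff" series (assumed lognormal, not real data).
rng = random.Random(0)
runoff = [rng.lognormvariate(0.0, 1.5) for _ in range(1000)]

# Repeat the random allocation to see how much the subset mismatch varies.
random_gaps = [mean_gap(*random_split(runoff, 0.2, rng)) for _ in range(200)]
systematic_gap = mean_gap(*systematic_split(runoff, 0.2))

print("random split, median gap:", statistics.median(random_gaps))
print("random split, largest gap:", max(random_gaps))
print("systematic split gap:", systematic_gap)
```

Comparing the spread of `random_gaps` with `systematic_gap` gives a rough sense of how much the calibration and evaluation subsets can differ under each scheme; with real catchment data the comparison would use model evaluation performance, as in the study, and not only subset means.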