dc.contributor.author: Maier, H.R.
dc.contributor.author: Zheng, F.
dc.contributor.author: Gupta, H.
dc.contributor.author: Chen, J.
dc.contributor.author: Mai, J.
dc.contributor.author: Savic, D.
dc.contributor.author: Loritz, R.
dc.contributor.author: Wu, W.
dc.contributor.author: Guo, D.
dc.contributor.author: Bennett, A.
dc.contributor.author: Jakeman, A.
dc.contributor.author: Razavi, S.
dc.contributor.author: Zhao, J.
dc.date.accessioned: 2024-08-18T05:33:42Z
dc.date.available: 2024-08-18T05:33:42Z
dc.date.issued: 2023-09
dc.identifier.citation: Maier, H. R., Zheng, F., Gupta, H., Chen, J., Mai, J., Savic, D., ... & Zhao, J. (2023). On how data are partitioned in model development and evaluation: Confronting the elephant in the room to enhance model generalization. Environmental Modelling & Software, 167, 105779.
dc.identifier.issn: 1364-8152
dc.identifier.doi: 10.1016/j.envsoft.2023.105779
dc.identifier.uri: http://hdl.handle.net/10150/674570
dc.description.abstract: Models play a pivotal role in advancing our understanding of Earth's physical nature and environmental systems, aiding in their efficient planning and management. The accuracy and reliability of these models heavily rely on data, which are generally partitioned into subsets for model development and evaluation. Surprisingly, how this partitioning is done is often not justified, even though it determines what model we end up with, how we assess its performance and what decisions we make based on the resulting model outputs. In this study, we shed light on the paramount importance of meticulously considering data partitioning in the model development and evaluation process, and its significant impact on model generalization. We identify flaws in existing data-splitting approaches and propose a forward-looking strategy to effectively confront the “elephant in the room”, leading to improved model generalization capabilities. © 2023 The Authors
dc.language.iso: en
dc.publisher: Elsevier Ltd
dc.rights: © 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/).
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Calibration
dc.subject: Data partitioning
dc.subject: Data splitting
dc.subject: Earth systems
dc.subject: Model development
dc.subject: Model evaluation
dc.subject: Uncertainty
dc.subject: Validation
dc.title: On how data are partitioned in model development and evaluation: Confronting the elephant in the room to enhance model generalization
dc.type: Article
dc.type: text
dc.contributor.department: Department of Hydrology and Atmospheric Sciences, University of Arizona
dc.identifier.journal: Environmental Modelling and Software
dc.description.note: Open access article
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
dc.eprint.version: Final Published Version
dc.source.journaltitle: Environmental Modelling and Software
refterms.dateFOA: 2024-08-18T05:33:42Z


Files in this item

Name: 1-s2.0-S1364815223001652-main.pdf
Size: 1.949Mb
Format: PDF
Description: Final Published Version
