Bridging the Gap Between the Physical-Conceptual Approach and Machine Learning for Modeling Hydrological Systems
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: There is a long history of developing computer-based physical-conceptual catchment-scale rainfall-runoff (RR) models in hydrology. These models continue to be refined to incorporate more physical processes and finer spatial resolution; one example is the NOAA National Water Model (NWM), which facilitates spatio-temporal prediction. On the other hand, generic Machine Learning (ML) based Gated Recurrent Neural Networks are currently the most accurate and extrapolatable predictive models available for time-series prediction in Earth and environmental science problems. Each of these modeling approaches has pros and cons, which leads to the research question of how to achieve a suitable tradeoff between model complexity and physical interpretability at a given level of predictive accuracy.
In the first part of this dissertation, I will summarize the experience gained through efforts to improve the NWM model architecture for streamflow prediction in semi-arid environments, and explore why an ML-based approach can provide important support for the development of process-based approaches. The advantage of applying ML-based modeling technologies is demonstrated by using the Long Short-Term Memory Network (LSTM) architecture to model the dynamics of snowpack accumulation and melt. This approach is shown to significantly outperform the physics-based SNOW-17 model in terms of predictive accuracy, computational efficiency, and spatial transferability. While ML-based approaches have achieved great success, their lack of physical explainability impedes widespread acceptance of such models by the hydrological science community.
Hence, in the second part of my dissertation, I propose a specific type of system/network “node”, referred to as a Mass Conserving Perceptron (MCP), that can potentially form the basis for “interpretable” ML-based learning of RNN-type directed-graph representations of mass/energy conserving physical systems directly from data. The MCP node incorporates several desirable features, including (i) recurrence, so that the dynamical state-variable evolution of system memory can be represented, (ii) the ability to impose conservation constraints at the nodal level, (iii) the ability to learn unobserved losses of mass/energy from each node, (iv) LSTM-like gating so that the state-variable time-constants can be dynamically adjusted based on current context, and (v) the ability to learn the forms of the flux equations governing the behaviors of the system. Finally, as a proof of concept, several MCP-based model architectures are tested using a 40-year daily rainfall-runoff dataset from the Leaf River catchment. The resulting performance is comparable to that obtainable using non-mass-conservative data-driven approaches, including linear time-series modeling and NN-based modeling (time-delay ANNs and LSTMs). Further, the MCP-based approach facilitates relatively easy exploration of hypotheses regarding the nature of the data-generating process (the physical system), and helps balance model complexity against interpretability while maintaining a certain level of predictive accuracy.
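To make the idea of a mass-conserving recurrent node more concrete, the following is a minimal sketch written for illustration only. It is not the MCP formulation from the dissertation, which the abstract does not give. The function names (`mcp_step`, `simulate`), the scalar gate parameters, the sigmoid gate form, and the rescaling rule are all assumptions. The sketch shows features (i) to (iv) in the simplest possible setting: a single state variable updated recurrently, an output gate and a loss gate that depend on the current state, and an update in which every unit of mass entering the node is accounted for as storage, outflow, or loss.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mcp_step(state, inflow, w_out, b_out, w_loss, b_loss):
    """One time step of a hypothetical mass-conserving node.

    All parameter names are illustrative assumptions, not the
    dissertation's notation.
    """
    # State-dependent gates: the fraction of storage released as output
    # and the fraction lost (e.g. to an unobserved sink) at this step.
    g_out = sigmoid(w_out * state + b_out)
    g_loss = sigmoid(w_loss * state + b_loss)

    # Rescale if needed so the two fractions never sum to more than 1,
    # which keeps the stored mass non-negative.
    total = g_out + g_loss
    if total > 1.0:
        g_out, g_loss = g_out / total, g_loss / total

    outflow = g_out * state
    loss = g_loss * state

    # Mass balance: new storage = old storage + inflow - outflow - loss.
    new_state = state + inflow - outflow - loss
    return new_state, outflow, loss


def simulate(inflows, state0=0.0, params=(0.05, -2.0, 0.01, -3.0)):
    """Run the node over a sequence of inflows (params are arbitrary)."""
    state = state0
    outflows, losses = [], []
    for u in inflows:
        state, o, l = mcp_step(state, u, *params)
        outflows.append(o)
        losses.append(l)
    return state, outflows, losses


if __name__ == "__main__":
    inflows = [5.0, 0.0, 12.0, 3.0, 0.0, 0.0]
    final_state, outflows, losses = simulate(inflows, state0=10.0)
    mass_in = 10.0 + sum(inflows)
    mass_out = final_state + sum(outflows) + sum(losses)
    print(f"mass in = {mass_in:.6f}, mass accounted for = {mass_out:.6f}")
```

In this sketch the gate parameters are fixed by hand; in a learning setting they would be fitted to data, and feature (v), learning the form of the flux equations, would require a more flexible gate function than the single sigmoid assumed here.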
Degree Program: Graduate College