Procedures for the Characterization of Commercially-Available Volumetric Water Content Sensors to Augment Training Ensembles for Machine Learning Models with Synthetic Analog Datasets in Data-Sparse Regions
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Machine Learning (ML) models are an increasingly popular method of producing predictions of real-world physical processes, such as forecasting stream stage heights after precipitation events, or the numerous meteorological models used in generating hurricane track forecasts. However, the development of ML models typically requires a significant amount of validated, real-world data inputs, which may be sparse or unavailable in specific regions due to a lack of instrumentation coverage. The limitation imposed by a lack of input data coverage or availability may be overcome by either augmenting, or replacing, input datasets with synthetic-analogs that are representative of the potential domain of actual in situ measurements. At a minimum, generating a synthetic-analog dataset requires an understanding of the real-world dynamics and properties governing the targeted physical process for the ML model, to create an adequate model ensemble from which representative data can be sampled. However, there is a greater need to measure and quantify any latent characteristics of instrumentation, such as sensor detection range, noise and bias, used in data collection intended to be used with the ML model post-development, to ensure that training data inputs match what may be seen in situ. This paper summarizes a method of characterizing commercially available soil volumetric water content (VWC) probes to use in the production of large synthetic-analog datasets for the training and validation of deep-drainage flux and root-water uptake prediction ML models, derived from vadose (unsaturated) zone numerical model ensembles generated using Hydrus-1D.
Type
text
Electronic Thesis
Degree Name
M.S.
Degree Level
masters
Degree Program
Graduate College
Hydrology
