W4: Creative and Systematic Model Development and Evaluation

Organised by Tony Jakeman, Teemu Kokkonen, Neil Crout


Title: Systematic Simplification of Mechanistic Models

Authors: Neil Crout, Glen Cox, James Gibbons

Abstract: Models of environmental systems are often complex, reflecting the complexity of the systems they try to describe. This complexity is difficult to manage and, especially in the face of limited observed data, there is a risk that models become over-parameterised, with the result that predictions are less reliable than they need be. One approach to investigating the influence of model complexity on prediction accuracy is to compare the performance of alternative (simpler) model formulations. However, this is difficult to achieve in practice, as the process of simplification can be time-consuming, with the consequence that only a few alternative formulations can be investigated. Automatic, or perhaps semi-automatic, methods of model simplification are potentially useful in addressing this problem. This paper makes the case for such methods and discusses some of the issues arising from their use, with a practical example for a mechanistic model of plant uptake of radiocaesium.

Title: On questions arising for any model simplification process

Authors: Perminov V.D.

Abstract: Models based on direct statistical modeling (in ecology they are often called individual-based models) allow the solution of more complex problems than the more usual mean-field deterministic models. As a rule this is achieved by the inclusion of a great number of object properties and processes which usually have been little studied, or not studied at all. It seems that the way to simplify such models is obvious: one only has to reduce the number of properties and processes taken into account. However, under conditions of insufficient knowledge about these properties and processes, many questions arise, for example

These questions and possible answers to them will be the main subject of my workshop talk.

Title: Ten Interactive Steps in Model Development Applied to Process-Based Modelling of Estuarine Biogeochemistry

Authors: Barbara Robson

Abstract: There have been a number of recent calls for a more systematic approach to model development and verification, including a paper in this conference (Jakeman et al.) outlining ten steps that constitute best practice in model development. This paper demonstrates how these steps can be applied in the context of a process-based dynamic simulation model of estuarine nitrogen and phosphorus cycling and primary production, drawing on several examples, but chiefly on the example of a model developed for a macrotidal tropical system, Fitzroy Estuary and Keppel Bay, on the coast of Queensland, Australia. The model is able to reproduce spatially and temporally resolved nutrient dynamics on a seasonal timescale and has contributed to our understanding of the system and how it mediates between the catchment and the Great Barrier Reef Lagoon.

Title: Model Abstraction In Hydrologic Modeling

Authors: Yakov Pachepsky, Andrey Guber, Rien van Genuchten, Thomas Nicholson, Ralph Cady, Jirka Simunek, Timothy Gish, Diederik Jacques

Abstract: Model abstraction (MA) is a methodology for reducing the complexity of a simulation model while maintaining the validity of the simulation results with respect to the question that the simulation is being used to address. MA explicitly deals with uncertainties in model structure and in model parameter sources. It has been researched in various knowledge fields that actively use modeling. We present (a) the taxonomy of model abstraction techniques being applied in subsurface hydrologic modeling, and (b) the systematic and comprehensive procedure of MA implementation, including (1) defining the context of the modeling problem, (2) defining the need for the model abstraction, (3) selecting applicable MA techniques, (4) identifying MA directions that may give substantial gain, and (5) simplifying the base model in each direction. The need for MA may stem from (a) difficulties in obtaining a reliable calibration of the base model, (b) error propagation making the key outputs uncertain, (c) inexplicable results from the base model, (d) excessive resource requirements of the base model, (e) the intent to include the base model in a larger multimedia environmental model, (f) the need to make the modeling process more transparent and tractable, and (g) the need to justify the use of a simple model when a complex model is available. The example illustrates the application of MA in field-scale simulations of water flow in variably saturated soils and sediments. MA can (a) improve the reliability of modeling results, (b) make the use of data more efficient, (c) enable risk assessments to be run and analyzed with much quicker turnaround, with the potential for allowing further analyses of problem sensitivity and uncertainty, and (d) enhance communication, as simplifications may make the description of the problem more easily relayed to and understood by others, including decision-makers and the concerned public.

Title: Applying Bayesian Model Averaging to Mechanistic Models

Authors: James Gibbons, Glen Cox, Neil Crout, Andy Wood, Jim Craigon, Stephen Ramsden

Abstract: We investigate model averaging and Bayesian Model Averaging (BMA), as an alternative to model selection, in a mechanistic model context. Model averaging is relevant when there is a set of similarly performing models with differences in predictions. Predictions are combined, by weighting with factors related to model performance, resulting in ensemble predictions. BMA applies model averaging in a Bayesian framework where the model weights are Posterior Model Probabilities (PMPs). We describe several approximation methods for calculating PMPs and consider a full Bayesian approach implemented using a Markov Chain Monte Carlo (MCMC) method and a Metropolis-Hastings algorithm. We also describe a simplified BMA approach requiring only the maximum likelihood parameter estimates and a Laplace approximation of the integrated likelihoods. BMA is illustrated with the Absalom model, a mechanistic model which predicts the plant uptake of radiocaesium from contaminated soils. Using five model selection criteria (AIC, BIC, Residual Sum of Squares (RSS), MDL and ICOMP), ten models were selected for averaging. The set of models included the full Absalom model and nine further models in which model variables had been replaced by a constant. The model predictions and ensemble predictions were compared using a calibration data set and an independent data set. The PMPs estimated using the MCMC approach and the Laplace approximation strongly weighted models with fewer parameters. The BIC-based PMP estimates ranked the models in the same order as the Laplace approximation, but gave more weight to the models with more parameters. The AIC-based estimates of the PMPs differed considerably from the other methods. However, in terms of RSS all the methods produced similarly performing predictions. Individual predictions differed among models and the prediction ensembles captured this uncertainty. The simplified BMA approach performed as well as the full approach. We conclude that BMA is a valuable approach in mechanistic model development.
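
The BIC-based weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes equal prior model probabilities and the common approximation that a model's PMP is proportional to exp(-BIC/2); the function names and all numbers are hypothetical and are not taken from the paper.

```python
import math

def bic_weights(bic_values):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probabilities for all models and the common
    approximation PMP_k proportional to exp(-0.5 * BIC_k).
    """
    best = min(bic_values)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    raw = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(raw)
    return [r / total for r in raw]

def ensemble_prediction(predictions, weights):
    """Weighted average of the individual model predictions."""
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical BIC values and point predictions for three candidate models.
bics = [102.3, 100.0, 107.9]
preds = [0.41, 0.38, 0.52]
weights = bic_weights(bics)
ensemble = ensemble_prediction(preds, weights)
print(weights, ensemble)
```

The model with the lowest BIC receives the largest weight, and the ensemble prediction always lies between the smallest and largest individual predictions.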

Title: Towards parsimony: generating alternative model formulations for assessment by model selection criteria.

Authors: Glen Cox, James Gibbons, Jim Craigon, Stephen Ramsden, Andy Wood, Neil Crout

Abstract: Mechanistic models which are to be used for prediction should be parsimonious if overfitting and consequent poor performance are to be avoided. To assist the identification of parsimonious models, several model selection criteria have been developed (e.g. AIC, BIC, MDL and ICOMP), which balance a model's goodness-of-fit with some measure of the model's complexity. These criteria require a set of alternative model formulations to compare; however, generating alternatives of large mechanistic models is often not straightforward, and can be very time-consuming. We describe a procedure which automatically generates alternative model formulations, based upon an original model structure, by systematically replacing model variables or inputs with constant values. It should be noted that this approach is not intended to provide definitive answers regarding the best model formulation; rather, we envisage it being used to inform model development as part of an iterative process. To illustrate the approach, we present the results of its application to a radiocaesium plant-uptake model. In this case 1024 alternative model formulations were created, and the values of five different criteria (RSS, AIC, BIC, MDL and ICOMP) evaluated for each. The lowest values of RSS and AIC occurred for the same model, in which the pH, MCaMg and CEChumus variables were replaced. This model was a better predictor of the parameterisation dataset than the original model, suggesting that the pH input variable was introducing noise into the system. The lowest values of BIC, MDL and ICOMP all occurred for a different model in which two additional variables (Kdhumus and RIPclay) were replaced. Both of the reduced models selected by the selection criteria were better predictors of an independent dataset than the original model.
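
The enumeration idea described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the toy model, the data, the choice of the mean as the replacement constant, and the use of the number of retained variables as the complexity term k are all hypothetical, and no recalibration is performed. The Gaussian-error forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n) are used. Note that 1024 = 2^10, which would be consistent with ten replaceable variables, although the abstract does not state that number.

```python
import itertools
import math

def toy_model(inputs):
    """Hypothetical stand-in for a mechanistic model (not the radiocaesium model)."""
    return 2.0 * inputs["a"] + 0.5 * inputs["b"] + 0.01 * inputs["c"]

def generate_formulations(variables):
    """Yield every subset of variables to be replaced by a constant."""
    for r in range(len(variables) + 1):
        for subset in itertools.combinations(variables, r):
            yield frozenset(subset)

def evaluate(model, rows, observed, replaced, constants):
    """Return (RSS, AIC, BIC) for one formulation."""
    n = len(rows)
    rss = 0.0
    for row, obs in zip(rows, observed):
        # Swap each replaced variable for its constant value.
        inputs = {k: (constants[k] if k in replaced else v) for k, v in row.items()}
        rss += (model(inputs) - obs) ** 2
    k = len(rows[0]) - len(replaced)  # assumption: complexity = retained variables
    rss_safe = max(rss, 1e-12)        # guard against log(0)
    aic = n * math.log(rss_safe / n) + 2 * k
    bic = n * math.log(rss_safe / n) + k * math.log(n)
    return rss, aic, bic

# Hypothetical input data and observations.
rows = [
    {"a": 1.0, "b": 2.0, "c": 5.0},
    {"a": 2.0, "b": 1.0, "c": 4.0},
    {"a": 3.0, "b": 4.0, "c": 6.0},
    {"a": 4.0, "b": 3.0, "c": 5.5},
]
observed = [3.1, 4.4, 8.2, 9.4]
variables = sorted(rows[0])
# Assumption: each replaced variable is fixed at its mean over the dataset.
constants = {v: sum(r[v] for r in rows) / len(rows) for v in variables}

results = {
    replaced: evaluate(toy_model, rows, observed, replaced, constants)
    for replaced in generate_formulations(variables)
}
best_by_bic = min(results, key=lambda key: results[key][2])
print(len(results), sorted(best_by_bic))
```

With n replaceable variables the loop visits 2^n formulations, so the cost of a single model run largely determines how many variables can be screened exhaustively.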

Title: Good Hydrological Modelling Practices in Data-poor Regions

Authors: Barry Croke

Abstract: Management of water resources in data-poor regions requires integrated assessment of the impacts of land use and management on water resources. A key component of this is modelling the hydrological response of a catchment. In data-poor regions, not only are there limited data available for developing, calibrating and testing models, but often such data have significant errors that need to be accounted for. Analysis of datasets prior to application within a model is a vital component of such work, as calibration of a model will attempt to correct for such errors as well as represent the response of the system. The objectives of a hydrological model in integrated assessment are different from those for developing hydrological understanding. In particular, relative changes in hydrologic response are often of more interest than estimation of the actual response. In addition, the spatial and temporal scales adopted need only resolve the response characteristics of interest. The performance of a hydrological model in such instances need only be comparable to the accuracy of the other components of the integrated model. Assessment of the model performance needs to take this into account, and should be designed to reflect the requirements of the hydrological model. While the Nash-Sutcliffe efficiency (NSE) is often used to assess model performance, this is not necessarily the best indicator for simulating the effect of management options on water resources, as the duration and volume of low flows may be of more importance. Alternative indicators that may be adopted include a measure of the fit to the flow duration curve, or the NSE calculated using transformed flow values (e.g. logarithm of flow). The concepts are applied in a study of the hydrologic response of the Mae Chaem catchment in Northern Thailand, as part of a project investigating the impacts of agroforestry mosaics on catchment response.
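
The two indicators named in this abstract can be sketched in a few lines. The NSE is the standard formula, one minus the ratio of the sum of squared errors to the sum of squared deviations of the observations from their mean. The small offset `eps` added before taking logarithms is an assumption introduced here to handle zero flows, and the flow values are hypothetical.

```python
import math

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def log_nse(observed, simulated, eps=0.01):
    """NSE on log-transformed flows; eps (an assumed offset) avoids log(0)."""
    log_obs = [math.log(o + eps) for o in observed]
    log_sim = [math.log(s + eps) for s in simulated]
    return nse(log_obs, log_sim)

# Hypothetical daily flows: one high-flow event followed by a long recession.
obs = [50.0, 20.0, 5.0, 1.0, 0.5, 0.2]
sim = [48.0, 22.0, 6.0, 2.0, 1.5, 1.0]
print(nse(obs, sim), log_nse(obs, sim))
```

Because the logarithm compresses high flows, errors in the recession limb weigh more heavily in `log_nse` than in `nse`, which is why the transformed score is better suited to low-flow questions.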

Title: The development of a farming systems model (APSIM): a disciplined approach.

Authors: Dean Holzworth

Abstract: The Agricultural Production Systems Simulator (APSIM) is a mature and stable modelling framework used widely in Australia in the domain of farming systems research and extension. It is capable of simulating a diverse range of farming systems including broadacre dryland and irrigated cropping, smallholder farming, on-farm agroforestry systems including the interaction of trees and crops, and, through collaboration with other groups, integrated stock and cropping enterprises. APSIM was developed primarily as a research tool to investigate on-farm management practices, natural resource issues including salinity and solute movement, climate risk studies looking at modifying farm practices, and climate change scenarios, to name but a few. In recent times commercialisation activities have sought to bring the power of APSIM directly to consultants and farmers in a useful way. This paper details APSIM's construction over the past 15 years including its conception, specification, construction, performance and usage. APSIM's development is very much a collaborative effort between multiple organisations and involves numerous scientists, modellers and software developers. Even though APSIM's development continues, this paper chronicles the development effort to date and details some of the lessons learnt along the way.

Title: Simple Models for Supporting Shellfish Resource Area Management Decisions

Authors: Andrew Gronewold, Robert Wolpert, Kenneth Reckhow

Abstract: Shellfishing resource area management plans apply conservative criteria for opening and closing shellfish growing areas in order to protect human and environmental health. Closure criteria typically include recent rainfall event duration and intensity, while reopening decisions include a subjective evaluation of days since the last precipitation event, event intensity, and bacterial water quality monitoring results. These criteria are based on historic relationships between stormwater runoff and high pathogen concentrations in receiving waters; however, the implicit causal relationships among precipitation intensity, lag between precipitation events, land use patterns, receiving water quality, and subsequent shellfish contamination are poorly understood. Because short-term protection of human health takes priority over long-term restoration of impaired shellfishing areas, effective implementation of a shellfishing resource area management plan does not necessitate explicit understanding of the relationship between runoff and shellfish contamination. Long-term water quality restoration is, however, a stated goal of the Total Maximum Daily Load (TMDL) Program, and development of simple modeling tools which can effectively forecast water quality improvements based on simulated land use pattern changes is vital to the Program's success. This paper includes the two-stage development of a simple precipitation-driven pathogen water quality model. Precipitation events and antecedent periods of dryness serve as predictors, and model results are compared to historic shellfish resource area management decisions. The first stage model is intended to serve as a quantitative decision support tool for shellfishing resource area managers who must consider impacts of frequent resource area closures on the commercial shellfishing industry while protecting human health with limited water quality sampling and analysis resources. The second stage model is an extension of the first stage model leading to a pathogen TMDL support tool using a Bayesian network with expert elicitation to establish parameter prior distributions and subsequent model updating with water quality modeling results.

Title: Data-adaptive reduction of complex process-based models

Authors: Knut Bernhardt

Abstract: Many process-based mathematical models of natural systems are relatively complex due to the multitude of processes they incorporate, e.g. in the form of coupled differential equations. Although most of these processes may be of some relevance for the observed system dynamics, their individual importance is typically hard to quantify because of nonlinear interactions between them. This complexity makes a model difficult to use or verify and obscures key processes of the system. A primary goal of model building is therefore to make models as simple as possible while still matching or predicting measured datasets. The high level of abstraction needed to do this, however, is often difficult to accomplish, and an alternative may be the simplification of already existing model systems. The approach to simplifying models presented here is based on the observation that the relevant dynamics of many investigated systems is bound to a low-dimensional subspace and is also often limited to a small collection of dynamical modes. Thus, we formulate the problem of model reduction as the search for a new low-dimensional deterministic model structure that is able to reproduce the main features of the complex model's output. The approach combines the nonlinear projection of model-generated time series onto a new feature space with a genetic programming algorithm to build the structure of the new model, whose variables match the state variables spanning the low-dimensional space. Thus, the resulting reduced model is a process-based condensed form of the original model: its state variables are nonlinear transformations of the original ones and the main features of the dynamics are captured by processes based on these new variables. Results for some simple test models are promising and demonstrate the usefulness of the approach to generate interpretable reduced models that incorporate key processes of the underlying system.

Title: Construction of a degree-day snow model in the light of the ten iterative steps in model development

Authors: Teemu Kokkonen, Harri Koivusalo, Tony Jakeman, John Norton

Abstract: Jakeman et al. (2005) discuss minimum standards for model development and reporting and offer an outline of ten iterative steps to be used in model development. They present the main steps and give examples of what each step might include (especially what choices are to be made), without attempting the formidable task of compiling a comprehensive checklist of the model-development process. This study reports the construction of a simple degree-day snowmelt model in the light of the ten iterative steps. Such a modelling approach has been widely used in operational hydrology, where the motivation is to produce snowmelt discharge predictions that are as reliable as possible for streamflow forecasting. Meteorological and snow cover data were available from a research site in southern Finland. Measurements included daily precipitation and air temperature records for the period extending from Dec 1, 1996 to Apr 30, 2000, and in the same period snow water equivalent was observed at approximately weekly intervals. These data were used in the development, parameterisation and diagnostic checking of the model in the manner presented in the ten steps.
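
For readers unfamiliar with the degree-day approach, a generic textbook form can be sketched as follows. This is not the model constructed in the paper: the parameter names and values (degree-day factor `ddf`, threshold temperatures `t_snow` and `t_melt`) and the input series are hypothetical. Precipitation accumulates as snow below a threshold temperature, and melt is proportional to the number of degrees above a melt threshold, limited by the snow available.

```python
def degree_day_swe(precip, temp, ddf=3.0, t_snow=0.0, t_melt=0.0, swe0=0.0):
    """Simulate daily snow water equivalent (mm) with a degree-day scheme.

    precip: daily precipitation (mm); temp: daily mean air temperature (deg C).
    ddf: degree-day factor (mm per deg C per day), an illustrative value.
    """
    swe = swe0
    series = []
    for p, t in zip(precip, temp):
        if t < t_snow:
            swe += p  # precipitation falls as snow and accumulates
        # Potential melt grows linearly with temperature above t_melt,
        # but cannot exceed the snow currently stored.
        melt = min(swe, ddf * max(t - t_melt, 0.0))
        swe -= melt
        series.append(swe)
    return series

# Hypothetical five-day record.
precip = [5.0, 0.0, 10.0, 0.0, 0.0]
temp = [-2.0, -1.0, -3.0, 1.0, 4.0]
swe_series = degree_day_swe(precip, temp)
print(swe_series)  # [5.0, 5.0, 15.0, 12.0, 0.0]
```

In a calibration setting, parameters such as `ddf` would be estimated by comparing the simulated series with observed snow water equivalent, such as the approximately weekly observations described above.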

Title: Water Balance Modelling in Bowen, Queensland, and the Ten Iterative Steps In Model Development and Evaluation

Authors: Wendy Welsh

Abstract: The viability of irrigated horticulture in a coastal catchment near Bowen in Queensland, Australia, is dependent on groundwater. The summer-dominant rainfall is extremely variable, ranging from 255 to 2358 mm/year, and there are no large surface water storages. A model was sought to improve understanding of the local hydrology and assist with management of the groundwater. The 220 km² area is data-rich, with 260 observation bores plus stream gauging, metering of irrigation bores and detailed land use mapping. A water balance model based on Darcy's Law, which describes laminar water flow through soils, and using only the historical data, was developed. The method is Geographic Information System (GIS)-based and provides both spatial and temporal results. The study proved cost- and time-effective and provided important insights into the groundwater dynamics of the area. The method is generally applicable to data-rich aquifers. The model development has been documented with reference to the ten iterative steps in model development and evaluation of Jakeman et al. (in prep), who suggest guidelines for good practice in model development, documentation and application to enhance the credibility and usefulness of the information and insights from modelling.

References: Jakeman, A.J., Letcher, R.A. and Norton, J.P. (in prep). Ten iterative steps in model development and evaluation.

Title: Ten interactive steps in model development: the construction, application and ongoing development of the CatchMODS water quality model

Authors: Lachlan Newham, Tony Jakeman

Abstract: The CatchMODS water quality model was constructed to assess the impacts of catchment-scale land and water management activities on streamflow, sediment and nutrient fluxes. The model integrates several sub-models to enable the development and evaluation of management scenarios aimed at reducing pollutant inputs. The innovation of the system is the integration of otherwise separate modelling approaches to enable biophysical and economic assessment of management options. Outputs from the model are used to improve the focus of on-ground remediation, targeted to specific stream reaches and subcatchment areas, as well as to encourage sustainable catchment management practices. The model was initially developed for application in the Ben Chifley Dam catchment in New South Wales, Australia. Its development was accompanied by close collaboration with a variety of local management organisations. This paper discusses the development of the model for the Ben Chifley Dam catchment in the context of the Ten Steps to model development described in Jakeman et al. (in press). The paper also describes progress in the ongoing development of the model, viz. the addition of pathogen and urban land use sub-models and a shift towards greater temporal resolution. In particular, the challenges that these changes present in ensuring rigorous model development are discussed.

Title: Four models for resource management: which mathematical method is more appropriate?

Authors: Sergei Schreider

Abstract: This presentation outlines four projects concerned with the application of different mathematical modelling methods in the field of natural resource management for sustainable development. The first project is related to applications of partial equilibrium optimization in combined economics/water allocation modelling. The integrated model unifies a Linear Programming-based regional economic model with a water trading module and the water allocation model for the Goulburn-Murray System (GSM) based on a network LP. The project employed the REALM optimization software environment, especially designed for water allocation modelling. The paper discusses the methodology and results of this integration exercise for one of the most important regions of irrigated agriculture in Australia: the Goulburn Valley. The second project presented here is devoted to the application of computable general equilibrium to the integration of hydrological and economic models. The core Operations Research technique used in this approach is non-linear convex optimization. The major advantage of the proposed methodology is that it allows researchers to consider a single optimization framework instead of using separate optimization modules for economics and water allocation.

The first two cases employ the notion of resource competition. However, cooperation is also a very important tool for effective water resource management. The third project presented here is devoted to the application of cooperative game theory (the theory of games based on coalitions) to reducing the phosphorus pollution resulting from pasture fertilization in the Glenelg-Hopkins catchment located in western Victoria, Australia. Finally, the presentation discusses the application of a numerical PDE solution to growth modeling of invasive species in the coastal lagoons of NSW, Australia.

The presentation discusses which mathematical methods are more appropriate for modeling in each individual case and focuses on the first three steps of the checklist provided in the workshop’s background Position Paper. These are:

  1. Specification of modeling objectives;
  2. Conceptualisation of the system and specification of data and other prior knowledge;
  3. Selection of the model features: nature, family, form of uncertainty specification.