Hydrologic Modeling For Runoff Forecasting

HOSHIN GUPTA

1 INTRODUCTION

The problem of forecasting streamflow levels given precipitation data has received the time and attention of a great many hydrologists. Models developed for this purpose have ranged from simple to extremely complex. The simplest ones are based on input-output regression-type relationships, while the most complex ones attempt to represent the detailed water and energy balance physics occurring in the watershed. The complex models are motivated largely by experimental evidence that the subwatershed-scale components of the rainfall-runoff process are strongly nonlinear, time variable, and spatially distributed. However, the processes of aggregation, attenuation, loss, and delay tend to result in an overall watershed response that is far less complex than the point-scale behavior. The effects of subwatershed-scale variability tend to be smoothed and poorly observable (to varying degrees) in the overall watershed-scale response. Thus, while remarkable progress has been made in understanding the physics of how precipitated water moves once it reaches the ground, the level of model complexity required to provide accurate runoff forecasts for any chosen watershed remains unclear. Even less clear is how this complexity varies with climatology, watershed size, and geologic and physiographic characteristics of the landscape.

2 MODELING AND COMPLEXITY

In the absence of such clarity, a wide variety of hydrologic models have found their way into the literature (Singh, 1995). The essential difference among these models is the manner in which the underlying processes that transform precipitation into streamflow are conceptualized. The more complex models have been motivated by the scientific pursuit of knowledge and are based on painstaking research into the physics of subwatershed-scale hydrologic processes. Such models attempt, in particular, to account for the spatially and temporally varying nature of watershed inputs (precipitation, solar radiation, etc.), losses (evapotranspiration), and characteristics (topography, permeability, vegetation, etc.). We shall refer to this modeling approach as "physics based." Perhaps the best-known exponent of this approach is the Système Hydrologique Européen (SHE) model (Abbott, 1986). More recent developments are the soil-vegetation-atmosphere-transfer schemes (SVATS) used for climate studies, such as the Biosphere Atmosphere Transfer Scheme (BATS) (Dickinson, 1993), the Simple Biosphere Model 2 (SiB2) (Randall, 1996), and the Variable Infiltration Capacity 2-Layer Model (VIC-2L) (Liang, 1994).

At the other end of the spectrum, the simplest models have been motivated by engineering considerations based on a real need to provide quick and accurate forecasts of streamflow levels in the simplest possible way, particularly wherever human interests are at stake (such as flood-prone locations). Such models attempt to establish direct regression-like relationships between the input and output time series; generally, the streamflow value is regressed on values of precipitation and streamflow at previous times. We shall refer to this modeling approach as "systems theoretic." The most popular systems-theoretic methods have been the ARMAX (auto-regressive moving average with exogenous inputs) (Box, 1976; Salas, 1980) and the ANN (artificial neural network) (Hsu, 1995, 1997).
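As an illustration of the systems-theoretic approach, the following is a minimal sketch that fits a linear ARX (autoregressive with exogenous input) model by ordinary least squares, regressing the current flow on the two previous flows and the two previous rainfall values. The function names and the model orders (na = nb = 2) are illustrative assumptions; production ARMAX and ANN tools provide far more complete identification procedures.

    import numpy as np

    def fit_arx(q, p, na=2, nb=2):
        """Fit q[t] = a1*q[t-1] + ... + a_na*q[t-na]
                    + b1*p[t-1] + ... + b_nb*p[t-nb]  by least squares.
        q : observed streamflow series; p : observed precipitation series."""
        q, p = np.asarray(q, float), np.asarray(p, float)
        t0 = max(na, nb)
        # Each regression row holds the na most recent flows and nb most recent rains.
        X = np.array([np.concatenate([q[t - na:t][::-1], p[t - nb:t][::-1]])
                      for t in range(t0, len(q))])
        theta, *_ = np.linalg.lstsq(X, q[t0:], rcond=None)
        return theta

    def forecast_one_step(theta, q, p, na=2, nb=2):
        """One-step-ahead flow forecast from the most recent flows and rains."""
        x = np.concatenate([np.asarray(q, float)[-na:][::-1],
                            np.asarray(p, float)[-nb:][::-1]])
        return float(x @ theta)

Because such a model can be refit in seconds as new observations arrive, systems-theoretic methods of this kind remain attractive for operational one-step-ahead forecasting.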

A third category of models, of intermediate complexity, is based on attempts to conceptualize the simplified (lumped) watershed-scale behavior resulting from the integrated effect of the subwatershed-scale hydrologic processes. Such models typically use simple linear and nonlinear tank components (reservoirs) to represent the primary soil moisture zones in the watershed and describe the manner in which moisture exchanges among these stores take place. We shall refer to such models as "conceptual." It is important to note that such models are based on speculative conjecture as to how best to partition the watershed into components and how to represent the integrated behavior of each component. This, and the fact that conceptual models are relatively simple to program into a computer, has encouraged a great deal of intellectual experimentation, resulting in a proliferation of conceptual models with widely differing structures.

At the simple end, we have methods such as the API (antecedent precipitation index) and UHG (unit hydrograph), which partition the watershed response into precipitation excess and infiltration (based on an antecedent soil moisture index) and use linear equations to transform the precipitation excess into streamflow forecasts. Models at the intermediate level include the HEC-1 model (U.S. Army Corps of Engineers, 1973, 1985). At the complex level, we have methods with numerous components, such as the Stanford watershed model (SWM) (Crawford, 1966), the Institute of Hydrology Distributed Model (IHDM) (Beven, 1987), the Kineros model (Woolhiser, 1990), the Sacramento soil moisture accounting model (SAC-SMA) (Burnash, 1973), and TOPMODEL (Beven, 1979). Within the United States, the most widely used of these may well be the API, UHG, and SAC-SMA models, because they are extensively used by various regional offices of the U.S. National Weather Service for flood forecasting. Such models are currently being built into more general "modeling systems," such as the advanced hydrologic prediction system (AHPS) of the U.S. National Weather Service and the modular modeling system (MMS) of the U.S. Geological Survey. These systems allow the user to build up a complete model by selecting components from libraries containing several alternative conceptual representations.
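To make the tank, or reservoir, idea concrete, the following is a minimal sketch of a lumped two-store conceptual model: an upper soil-moisture store partitions rainfall into saturation excess and evapotranspiration loss, and the excess is routed through quick and slow linear reservoirs. The structure, parameter names, and default values are illustrative assumptions for exposition, not a published model.

    import numpy as np

    def two_tank_model(precip, pet, s_max=100.0, k_quick=0.3, k_slow=0.05, split=0.6):
        """Illustrative lumped conceptual model (daily time step).
        precip, pet : precipitation and potential evapotranspiration (mm/day)
        s_max       : soil-moisture store capacity (mm)
        k_quick, k_slow : linear-reservoir recession coefficients (1/day)
        split       : fraction of excess routed through the quick reservoir"""
        soil, quick, slow = 0.5 * s_max, 0.0, 0.0    # initial store contents (mm)
        q_sim = np.zeros(len(precip))
        for t in range(len(precip)):
            soil += precip[t]
            excess = max(soil - s_max, 0.0)           # saturation excess
            soil -= excess
            soil -= min(pet[t] * soil / s_max, soil)  # moisture-limited ET loss
            quick += split * excess                   # partition excess between stores
            slow += (1.0 - split) * excess
            q_quick, q_slow = k_quick * quick, k_slow * slow  # linear reservoir outflows
            quick -= q_quick
            slow -= q_slow
            q_sim[t] = q_quick + q_slow               # total streamflow (mm/day)
        return q_sim

Despite its crudeness, a structure of this kind reproduces both the fast storm response and the slow recession between storms, which is precisely the behavior the conceptual approach aims to capture with a handful of parameters.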

Finally, the last half-decade has seen the emergence of a subclass of conceptual models that seek to strike a reasonable and parsimonious balance among three issues: (a) scientific understanding (physics), (b) speculative conjecture about the nature of integrated watershed-scale processes (conceptualization), and (c) the level of model complexity that can actually be supported by the available watershed response data (i.e., the systems-theoretic issues of observability and identifiability). Examples of such models are the IHACRES model [see, e.g., Jakeman (1990, 1993)] and the related HyMod model under development at the University of Arizona (Boyle, 2000). For want of a better term, we follow Wheater (1993) in referring to these as "hybrid" models.

The three mechanisms of scientific understanding, conceptualization, and data-supportable complexity can be likened to the legs of a stool, which must be of proper and complementary length so that the seat is balanced and can perform its intended function (Fig. 1). The key issue in selecting an appropriate model is this intended function. A physics-based model such as SHE and some conceptual models such as TOPMODEL and Kineros may be clearly appropriate for detailed watershed modeling and for testing hypotheses about watershed behavior under perturbed conditions. On the other hand, Hsu et al. (1995) have shown that simple ANN-type systems-theoretic models can give one-step-ahead forecasts that are more accurate than those given by conceptual models, while requiring relatively minor computational resources and being quick and easy to build. However, if the intended function is both accurate operational streamflow forecasting and insight into evolving watershed behavior, the emerging evidence suggests that hybrid models such as IHACRES and HyMod, which merge the strengths of the conceptual and systems-theoretic approaches, may prove to be the optimal choice.

Figure 1 Issues influencing model development and selection.

3 MODEL PARAMETER ESTIMATION, CALIBRATION, AND EVALUATION

The selected model must be made specific to a watershed by estimating values for its parameters. In the case of physically based models and some conceptual models, approximate values (or ranges of values) for many of the parameters can sometimes be estimated from maps or field measurements. However, because all such models involve conceptualization (simplification of, and distortion from, reality), the parameter estimates obtained in this manner can invariably be improved by calibration to historical input-output data. In the case of systems-theoretic models, the only method for inferring structural complexity and parameter values is an automated computer-based identification procedure. Because each of the available systems-theoretic modeling approaches (such as ARMAX and ANN) is generally accompanied by well-established procedures for model building and parameter estimation, those procedures will not be described here. The discussion below focuses on parameter estimation for the other three categories of models via a procedure called model calibration.

The model calibration process involves five interrelated components: (a) data set, (b) constraints, (c) measures of closeness, (d) parameter adjustment procedure, and (e) evaluation procedure. Each of these components is discussed in turn.
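As a minimal sketch of how components (c) and (d) interact, the code below uses the root-mean-square error between observed and simulated flows as the measure of closeness and a general-purpose local optimizer as the parameter adjustment procedure. The model_run interface, starting values, and bounds are hypothetical placeholders; in practice, the response surfaces of conceptual models often contain multiple optima, so global search procedures are commonly preferred over local gradient methods.

    import numpy as np
    from scipy.optimize import minimize

    def rmse(q_obs, q_sim):
        """Measure of closeness: root-mean-square error of the simulated flows."""
        return float(np.sqrt(np.mean((np.asarray(q_obs) - np.asarray(q_sim)) ** 2)))

    def calibrate(model_run, q_obs, theta0, bounds):
        """Adjust the parameter vector theta to minimize the closeness measure.
        model_run(theta) -> simulated flow series (hypothetical interface)."""
        result = minimize(lambda theta: rmse(q_obs, model_run(theta)),
                          theta0, bounds=bounds, method="L-BFGS-B")
        return result.x, result.fun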

Data Set

The input-output data set to be used for inferring model parameters must be carefully selected from the historical record to be representative of the behavior of the watershed. Two issues are important here: data quality and data quantity. Data quality has two subissues that must be considered. The first is simply that the data must be checked for accuracy and reliability (i.e., errors in measurement and/or recording). To take a trivial example, if the precipitation records indicate a large storm event but the flow records do not show a response (or vice versa), we might suspect the accuracy of the data. The second subissue is related to data informativeness; i.e., the data must be representative of the important characteristic modes of watershed behavior. For example, if the purpose of the model is flood forecasting, the data must certainly contain several significant storm events. These data will provide information about the parameters related to the partitioning of precipitation into flow components having different recession rates. However, the data must also contain several representative interstorm periods so that information regarding the parameters controlling streamflow recession as well as rates of evaporation loss can be deduced. There have been only a few studies investigating this issue. Gupta (1985) used a theoretical analysis to show that "threshold-type" parameters are best identified when the data are selected to ensure that the model behavior tends to switch across the threshold numerous times; surprisingly, the amount of time spent in each mode of behavior is largely irrelevant. Yapo (1996) studied the reliability of parameter estimates of a conceptual rainfall-runoff model using 40 years of data and clearly demonstrated that the most reliable results are provided by using "wet" years for calibration.
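The accuracy check described above (a large storm with no flow response, or vice versa) can be roughed out automatically. The sketch below flags such events; the thresholds are illustrative assumptions that would need tuning to the watershed and time step.

    import numpy as np

    def flag_suspect_events(precip, flow, rain_thresh=25.0, window=3, rise_factor=1.05):
        """Flag time steps with heavy rain (> rain_thresh, in mm) after which
        streamflow fails to rise by rise_factor within `window` time steps."""
        precip, flow = np.asarray(precip, float), np.asarray(flow, float)
        suspect = []
        for t in np.where(precip > rain_thresh)[0]:
            after = flow[t + 1 : t + 1 + window]
            if after.size and after.max() < rise_factor * flow[t]:
                suspect.append(int(t))   # big storm, no flow response: check the records
        return suspect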

The aforementioned studies also addressed the issue of data quantity (length). Gupta (1985) showed theoretically that a (daily) data set of approximately 3 years in length is desirable for model calibration, and that additional data will provide only marginal gains unless they contain significantly new information. Yapo (1996) found, however, that for the Leaf River in Mississippi, the SAC-SMA conceptual model requires at least 8 to 10 years of data for reliable calibration results to be obtained, suggesting that the variability of information in a hydrologic data set may extend over approximately a decade.

Having selected the calibration period data set, the next important decision is the selection of an appropriate length for the "buffer" period. A buffer period is a short data segment at the very beginning of the data set for which the measures of closeness (see below) are not computed. The intention is to minimize any potential bias in the calibration procedure caused by uncertain initialization of the model state variables. Because a watershed model tends to average and attenuate inputs, it will also attenuate the impact of initialization errors over time. A buffer period of 90 to 180 days beginning near the end of a long recession and approximately a week or two before the end of the dry season seems to be a good choice.
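A minimal sketch of the buffer idea, assuming the measure of closeness is a root-mean-square error: the model is run over the entire record, but the first `buffer` time steps are simply excluded from the computation so that initialization errors cannot bias the parameter estimates. The 120-day default is an arbitrary placeholder within the 90-to-180-day range suggested above.

    import numpy as np

    def rmse_with_buffer(q_obs, q_sim, buffer=120):
        """Measure of closeness computed only after the buffer (warm-up) period,
        so that uncertain state initialization does not bias calibration."""
        q_obs, q_sim = np.asarray(q_obs, float), np.asarray(q_sim, float)
        return float(np.sqrt(np.mean((q_obs[buffer:] - q_sim[buffer:]) ** 2)))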
