## Regression I

Regression is a method to estimate the trend in the climate equation (Eq. 1.1). Assume that outlier data do not exist or have already been removed with the help of an extreme value analysis (Chapter 6). Then the climate equation is a regression equation,

$$X(T) = X_{\text{trend}}(T) + S(T) \cdot X_{\text{noise}}(T).$$

One choice is to write Xtrend(T) as a function with parameters to be estimated. A simple example is the linear function (Section 4.1), which has two parameters, intercept and slope. A second example is the nonlinear regression model (Section 4.2). The other choice is to estimate Xtrend(T) nonparametrically, without reference to a specific model. Nonparametric regression (Section 4.3) is also called smoothing.

Trend is a property of genuine interest in climatology; it describes the mean state. This chapter also deals with quantifying S(T), the variability around the trend, as a second property of climate. Regression methods can be used to measure climate changes: their size and timing. For that aim, the ramp regression (Section 4.2.1) constitutes a useful parametric model of climate changes.

We compare the bootstrap with the classical approach to determine error bars and CIs for estimated regression parameters. The difficulties imposed by the data are non-Gaussian distributions, persistence and uneven spacing. We meet another difficulty: uncertain timescales. This leads to adaptations of the bootstrap (Section 4.1.7), where the resampling procedure is extended to also include the time values, t(i).

The present chapter studies regression as a tool for quantifying the time-dependence of Xtrend(T), the relation between trend and time in univariate time series. A later chapter (Regression II) uses regression to analyse the relation in bivariate time series, between one time-dependent climate variable, X(T), and another, Y(T).

M. Mudelsee, Climate Time Series Analysis, Atmospheric and Oceanographic Sciences Library 42, DOI 10.1007/978-90-481-9482-7_4, © Springer Science+Business Media B.V. 2010

### 4.1 Linear regression

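Linear regression uses a straight-line model,

$$X_{\text{trend}}(T) = \beta_0 + \beta_1 T.$$

The climate equation without the outlier component is then written in discrete time as a linear regression equation,

$$X(i) = \beta_0 + \beta_1 T(i) + S(i) \cdot X_{\text{noise}}(i), \qquad i = 1, \ldots, n.$$

T is called the predictor or regressor variable, X the response variable, and $\beta_0$ (intercept) and $\beta_1$ (slope) the regression parameters.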
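To make the roles of intercept, slope, variability and noise concrete, the following minimal sketch draws one synthetic series from a straight-line trend plus scaled noise. The function name `simulate_linear_series`, the parameter values and the choice of Gaussian white noise are illustrative assumptions, not taken from the text; real climate noise may be non-Gaussian and persistent.

```python
import numpy as np

def simulate_linear_series(t, beta0, beta1, s, rng):
    """Return x(i) = beta0 + beta1 * t(i) + s(i) * x_noise(i).

    Assumption for illustration: x_noise is Gaussian white noise with
    zero mean and unit variance (no persistence).
    """
    t = np.asarray(t, dtype=float)
    # s may be a scalar (constant variability) or one value per time point
    s = np.broadcast_to(np.asarray(s, dtype=float), t.shape)
    x_noise = rng.standard_normal(t.size)
    return beta0 + beta1 * t + s * x_noise

# Hypothetical example: 200 unevenly spaced time points
rng = np.random.default_rng(42)
t = np.sort(rng.uniform(0.0, 100.0, size=200))
x = simulate_linear_series(t, beta0=2.0, beta1=0.05, s=0.5, rng=rng)
```

Setting `s=0.0` switches the noise off and returns the pure straight-line trend, which is a convenient check when testing an estimation routine.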

#### 4.1.1 Weighted least-squares and ordinary least-squares estimation

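In a simple, theoretical setting, where the variability S(i) is known and Xnoise(i) has no serial dependence, the linear regression model can be fitted to data $\{t(i), x(i)\}_{i=1}^{n}$ by minimizing the weighted sum of squares,

$$SSQW(\beta_0, \beta_1) = \sum_{i=1}^{n} \frac{\left[ x(i) - \beta_0 - \beta_1 t(i) \right]^2}{S(i)^2},$$

yielding the weighted least-squares (WLS) estimators

$$\hat{\beta}_0 = \left[ \sum_{i=1}^{n} \frac{x(i)}{S(i)^2} - \hat{\beta}_1 \sum_{i=1}^{n} \frac{t(i)}{S(i)^2} \right] \Big/ W,$$

$$\hat{\beta}_1 = \left\{ \left[ \sum_{i=1}^{n} \frac{t(i)}{S(i)^2} \right] \left[ \sum_{i=1}^{n} \frac{x(i)}{S(i)^2} \right] \Big/ W - \sum_{i=1}^{n} \frac{t(i)\, x(i)}{S(i)^2} \right\} \times \left\{ \left[ \sum_{i=1}^{n} \frac{t(i)}{S(i)^2} \right]^2 \Big/ W - \sum_{i=1}^{n} \frac{t(i)^2}{S(i)^2} \right\}^{-1},$$

where

$$W = \sum_{i=1}^{n} \frac{1}{S(i)^2}.$$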
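A minimal sketch of the WLS fit for a straight line follows. It uses the standard closed-form solution of the weighted normal equations, with each point weighted by the inverse of its squared variability. The function name `wls_line` and the example numbers are illustrative assumptions; the sketch presumes that S(i) is known and strictly positive and that the noise has no serial dependence.

```python
import numpy as np

def wls_line(t, x, s):
    """Weighted least-squares estimates (beta0_hat, beta1_hat) for a straight line.

    Each point is weighted by 1 / s(i)^2, where s(i) is the known
    variability (standard deviation) at time t(i).
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    s = np.asarray(s, dtype=float)
    if np.any(s <= 0.0):
        raise ValueError("all s(i) must be strictly positive")
    w = 1.0 / s**2              # weights 1 / S(i)^2
    W = w.sum()                 # sum of the weights
    st = (w * t).sum()          # sum of t(i) / S(i)^2
    sx = (w * x).sum()          # sum of x(i) / S(i)^2
    stx = (w * t * x).sum()     # sum of t(i) x(i) / S(i)^2
    stt = (w * t * t).sum()     # sum of t(i)^2 / S(i)^2
    beta1 = (st * sx / W - stx) / (st**2 / W - stt)
    beta0 = (sx - beta1 * st) / W
    return beta0, beta1

# Noise-free check on hypothetical data: x = 1 + 2 t is recovered for any weights
b0, b1 = wls_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [1.0, 2.0, 1.0, 2.0])
```

For cross-checking, `np.polyfit(t, x, 1, w=1/s)` should give the same estimates, since NumPy applies its weights to the unsquared residuals.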

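In a practical setting, S(i) is often not known and has to be replaced by an estimate, $\hat{S}(i)$. If prior knowledge indicates that S(i) is constant, then one may take as estimator the square root of the residual mean square, MSE (Montgomery and Peck 1992),

$$\hat{S}(i) = \hat{S} = \left\{ MSE \right\}^{1/2} = \left\{ \sum_{i=1}^{n} \left[ x(i) - \hat{\beta}_0 - \hat{\beta}_1 t(i) \right]^2 \Big/ (n - 2) \right\}^{1/2}.$$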
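When the variability is constant, the weights cancel and the fit reduces to ordinary least squares (OLS). The sketch below fits the straight line by OLS and then estimates the constant variability as the square root of the residual mean square, dividing by n − 2 because two parameters were estimated. The function name `ols_line_with_s` and the data values are illustrative assumptions.

```python
import numpy as np

def ols_line_with_s(t, x):
    """OLS fit of a straight line plus the residual-based estimate of constant S.

    Returns (beta0_hat, beta1_hat, s_hat), where s_hat = sqrt(MSE) and
    MSE = (sum of squared residuals) / (n - 2).
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    n = t.size
    if n < 3:
        raise ValueError("need at least 3 points to estimate S with n - 2 degrees of freedom")
    t_mean, x_mean = t.mean(), x.mean()
    beta1 = ((t - t_mean) * (x - x_mean)).sum() / ((t - t_mean) ** 2).sum()
    beta0 = x_mean - beta1 * t_mean
    residuals = x - beta0 - beta1 * t
    mse = (residuals**2).sum() / (n - 2)   # residual mean square
    return beta0, beta1, np.sqrt(mse)

# Hypothetical four-point example
beta0_hat, beta1_hat, s_hat = ols_line_with_s([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0])
```

The estimate `s_hat` can then stand in for the unknown constant S(i), for example when constructing classical error bars for the regression parameters.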