## N K

SSQ ({A,, Bj}IK=l) = £ X(i) - £ [A, cos (2n£T(i)) (5.16)

This is in fact a regression and does not require even time spacing. However, the solution is simple if the spacing (d) is constant, n is even

and n

For other frequencies, these expressions are approximate to O (1/n).

If the frequencies and other parameters of the harmonic process are unknown, which is more realistic, then we may try to find those "hidden periodicities" with a search technique called periodogram analysis. Assume from now on even n and also even spacing, d(i) = d > 0. (Uneven spacing is treated in Section 5.2.4.) The one-sided periodogram, I(fj), is then given by

where a, and Bj are the least-squares estimators for a particular frequency, fj.

The periodogram is calculated at trial frequencies, fj = 1/(nd),..., 1/(2d). The idea is that where fj is close to a true (but unknown) frequency of the harmonic process, the periodogram has a peak.

The expectation of the periodogram of the harmonic process (Eq. 5.15), for all f > 0, is (Bartlett 1955)

E [I(f)] = 2dS2 + d(2n)-1 ^ (A2 + j j=i sin(nn(f + fj))2 (5.20)

The covariance of the periodogram of the harmonic process, for all f1 > 0 and f2 > 0, is

In periodogram analysis of a harmonic process with true frequency f', the expected peak of I (f) at around f, its width, its decay to a value of

2 dS2, and so forth, are determined by the terms within square brackets in Eq. (5.20), the sinusoids. Because the periodogram is evaluated only at discrete frequencies, fj = 1/(nd), 2/(nd),..., the peak at f = f ' may be missed. The advantage of having a larger sample size n is to search with a finer grid. However, having a larger sample size does not decrease the coefficient of variation of the periodogram, that is, the ratio of the standard deviation, and the expectation (Eq. 5.20). That means, the periodogram is not a consistent estimator. The point of selecting f1 or f2 from 0,1/(—d),..., 1/(2d) is that then COV [I(f1),1 (f2)] vanishes, which allows construction of (multiple) parametric statistical tests for the existence of peri-odogram peaks against a white-noise background (Fisher 1929; Siegel 1980). Those tests employ the fact that for Gaussian distributed X(i) (no periodic components, K = 0), I(f) is chi-squared distributed; the degrees of freedom are 1 for f = 0,1/(2d) and 2 elsewhere (Priestley 1981: Section 6.1.3 therein). See the background material for more pe-riodogram tests.

Also processes other than the harmonic (Eq. 5.15), for example, continuous-time processes, can be analysed using the periodogram (Priestley 1981: Section 6.2 therein). Despite the appealing property that asymptotically (for — ^ to) the periodogram is also here an unbiased estimator of the spectrum, h(f), it has several serious drawbacks.

1. The data size, —, for achieving acceptable levels of bias reduction for the periodogram may be extraordinarily high (Thomson (1982: p. 1058 therein) reports high bias values for — as large as 1.2 ■ 106).

2. The periodogram is not a consistent estimator of h(f) (its estimation standard error does not approach zero as — ^ to).

3. The periodogram has decreasing (with —) covariance between two neighbouring frequencies. This brings some erratic behaviour of the periodogram curve, which makes peak detection difficult.

In view of those points, the importance of the periodogram for climate time series analysis, for detecting peaks in the power spectral density function of Xnoise(T), is rather small. It can provide answers for

discrete-time, harmonic processes, which may be found in climate driving mechanisms (daily, annual and Milankovitch cycles). But, as said, the sampled record of such a forced climate is likely influenced also by other mechanisms, making peak detection more difficult. A further limitation is that the parametric periodogram tests often assume properties (white-noise background, Gaussian distribution) that are not realistic for climatic noise (Chapters 1 and 2). This could, however, in principle be overcome by tests based on resampling, see Section 5.2.3.1.

Spacing, d, and data size, n, determine the frequency search grid of the periodogram as /j = 1/(nd),2/(nd),..., 1/(2d). (This applies to even n. For odd n, the maximum frequency is fj = (n — 1)/[2(nd)].) In the case that the total record duration, t(n) —1(1), is pre-determined (e.g., when a stalagmite has been sampled over its entire length), we can, by doing additional measurements, increase n to a value n', decrease d to a value d', hold [(n — 1)d] = [(n' — 1)d'] constant and therefore study higher frequencies, up to (2d')-1. In the case we wish to have a finer search grid, we have to use a longer record because the minimum frequency resolution follows the relation A/j = (nd)-1 w [t(n) — t(1)]-1. This frequency value is also denoted as fundamental Fourier frequency.

The advantage of the periodogram, in comparison with other spectrum estimation methods (Sections 5.2.2, 5.2.3 and 5.2.4), lies in its high frequency resolution (small A/j). Its use has therefore been advocated in a series of papers (Muller and MacDonald 1995, 1997a,b,c, 2000) dealing with the so-called "problem of the 100-ka cycle." These authors aimed at showing that the dominant spectral peak in late Pleistocene ice-volume proxy records (¿18O) is at Tperiod = 95 ka instead of 100 ka. Although high frequency resolution is certainly desirable, we caution against over-interpreting periodograms of climate time series. We rather recommend to view the periodogram as one, extreme end of the smoothing technique (namely unsmoothed) in spectral analysis. The smoothing technique, described in the following (Sections 5.2.2, 5.2.3 and 5.2.4), trades resolution for standard error reduction and may lead to more reliable estimates of the power spectral density function.

5.2.2 Welch's Overlapped Segment Averaging

The periodogram is a "na-ive" (Percival and Walden 1993) estimator, h(/), of the power spectral density function. To overcome the unfavourable variance property of the periodogram when (mis-)applied to estimate continuous spectra, Welch (1967) advanced the idea of Bartlett (1950) to divide a time series {t(i), x(i)}™=1 into different segments, calculate the periodograms segment-wise and average them to obtain a reduced estimation variance. Welch (1967) allowed the segments to overlap

M/wvwwvw

0 200 400 600 800 1000

0 200 400 600 800 1000

50 00.01

0 200 400 600 800 1000

0.01

0.01

0.02

0.02

0.02

0.02

Figure 5.4. Welch's Overlapped Segment Averaging. a Time series of the obliquity of Earth's rotational axis over the past 1.024 Ma (n = 1024, d(i) = d =1 ka). The record was produced (Berger and Loutre 1991) by solving the astronomical, kinetic equations (many-body system). The record is segmented as follows: b segment I, points 1 to 512; c segment II, points 257 to 768; and d segment III, points 513 to 1024. The periodograms (Eq. 5.19) are calculated for segments I (f), II (g) and III (h). The average of the periodograms (e) has a maximum at Tperiod = 512/13 ka ~ 39.4 ka. Only a part of the frequency interval 0 to 1/2 ka-1 is shown in e—h; deg, degrees.

(for example, by 50%), and the method is called "Welch's Overlapped Segment Averaging" or WOSA procedure (Fig. 5.4). Overlapping has the positive effect that the number of segments, and therefore the number of averaged periodograms, is increased.

The negative effect of using WOSA (number of segments, —seg > 1) is that the frequency points, where the periodograms are calculated, are spaced wider than for —seg = 1. More precisely, the formula is Af = (—seg + 1)/(2—d) > 1/(—d) for —seg > 1. This is the smoothing problem in the spectral domain, the trade-off between spectral estimation variance and frequency resolution. As said in the previous section, a position that advocates undersmoothing with the extreme value —seg = 1 seems too extreme for estimating spectra of climatic processes.

Welch (1967) considered also tapering (weighting) the data points X(i) within segments. Tapering is treated in the following two sections. WOSA is not the only method to obtain better bias and variance properties of spectrum estimates. Section 5.3 gives some background and references.

### 5.2.3 Multitaper estimation

Spectral smoothing can be accomplished with a general data weighting technique called tapering (Algorithm 5.1 and Fig. 5.5). Consider the continuous-time process X(T) sampled as {T(i),X(i)}n=1. The taper is a real function wk(T); k indexes the tapers; the discrete-time version is given by {T(i),wk(i)}n=1 and has the property wk(i)2 = 1. The tapered process is then given by {T(i), wk(i) ■ X(i)}™=1. Consider further a modified version of the periodogram (Eq. 5.19),

A smoothed spectral estimate is then obtained by averaging a number of K modified periodograms, which are calculated from the tapered time series, {t(i),wk(i) ■ x(i)}™=1 with k = 0,... ,K — 1; see Algorithm 5.1.

 Step 1 Data {i(i),x(i)>? =i Step 2 Tapers, indexed by k = 0,... , K — 1 {wk (i)}n=1 Step 3 Tapered data, k = 0,... , K — 1 {t(i),wk (i) x(i)}n=i Step 4 Calculate modified periodograms from tapered data, k = 0,... , K — 1 Ik (fj ) Step 5 Average periodograms to obtain smoothed spectral estimate ) = K- 1 EfcL-1 Ik(fj)

Algorithm 5.1. Smoothed spectral estimation with tapering.

Algorithm 5.1. Smoothed spectral estimation with tapering.

The periodogram is an unsmoothed spectral estimate (K = 1, w0(i) = n-1/2). The suggestion of Bartlett (1950) was to use K > 1 and non-overlapping, uniform tapers (Fig. 5.5). The recommendation of Welch (1967) was to have overlap (for example, 50%) and to allow tapers that gradually approach zero such as the Welch taper (Fig. 5.5).

It was the breakthrough of Thomson (1982) to formulate taper construction as an optimization problem, in a local least-squares sense. The solution he obtained are denoted as kth order discrete prolate spheroidal

5.2 Spectral estimation Bartlett

WOSA

Multitaper

0 500 1000 0 500 1000 0 500 1000

500 1000 0 500 1000

0 500

Time, T

1000

0 500 1000 Time, T

Time, T

Figure 5.5. Tapers for spectral estimation. Shown are the functions wk(T), k = 0,... ,K — 1, for averaging K = 3 (modified) periodograms, respectively eigenspectra; here 0 < T < 1024 and n = 1024. The Bartlett type corresponds to non-overlapping and the WOSA type to (here) 50% overlapping segments. The WOSA type is shown with a uniform taper (dashed lines) and a normalized Welch taper (solid lines). The non-normalized Welch taper in continuous time is given by, for example, w'0(T) = 1 — [(T — 256)/256]2 for 0 < T < 512; the normalized Welch taper in discrete time by w0(i) = w'0(i)w'0(i)2. The dpss multitaper functions have as additional parameter a resolution bandwidth of 2W = 4/(nd); that is, wk(i) = (i); for convenience of presentation these discrete functions are shown as continuous plots.

sequences (dpss). The dpss had been previously described by Slepian (1978). Their calculation may be numerically difficult (Section 5.4). The dpss tapers, v!!'w(i), depend on k,n and a parameter termed resolution bandwidth. The dpss have more "wiggles" than Bartlett's or Welch's suggestions (Fig. 5.5). The intuitive reason is that this leads to a smaller dependence among the modified periodograms ("eigenspectra" in the terminology of Thomson (1982, 1990a)) and, hence, to a reduced variance of their average. The resolution bandwidth, 2W, is defined via W = jw/(nd) with jw (not necessarily an integer) in the range from 2 to 4 (Percival and Walden 1993: Section 7.1 therein) and higher (Thomson (1990a: p. 545 therein) considers values up to 20). The resolution bandwidth limits the maximum number of eigenspectra, K < 2ndW. A larger W value has therefore the positive effect that more eigenspectra can be averaged and the spectral estimation variance reduced. On the other hand, a smaller W value lets fine details in h(/) be seen better. To summarize, the combination of the multitaper parameters K and 2W determines estimation variance and spectral resolution.

Thomson (1982: Section 13 therein) developed a statistical test for the existence of a line component in the spectrum against a smooth background of arbitrary shape, which is considered to be better applicable to climate time series than periodogram tests (Section 5.2.1). The idea is to compare spectral power (variance) at a frequency with the average background variance around (±W) that frequency; if the variance ratio, F(/), is high, then the hypothesis of an existing line component is accepted. Under Gaussian background processes, X(i), the variance ratio follows an F distribution. Thomson's recipe is as follows.

The eigenvalue problem w f sin (nn(/ — /')) Ura,w/) / = \«.,w r7nw(/) (524)

has as solution Un'w(/) the discrete prolate spheroidal wave functions; the eigenvalues are An-' (Slepian 1978). The Fourier transform of the Un (/) are the dpss, v^!' (i). The eigenvalues are between 0 and 1. Let the "eigencoefficients" (Thomson 1982) of a sample be given by w

Yk(/) = (A!'w)-1 | U!'w(v) ■ Y(/ + v) dv, (5.25) -w where Y(/) is the discrete Fourier transform of the sample, {X(i)}!=1.

Line components in a spectrum relate to a nonzero mean function consisting of a number of sinusoids (after detection, the sinusoidal portion can be transferred to Xtrend(i)). For a single line component, the mean, ), can be estimated as

£(/) = £ Ufcn'W(0) ■ Yk(/) £ [ufcnW(0)] . (5.26)

This determines the numerator of the variance ratio. The denominator is determined by the eigencoefficients and the discrete prolate spheroidal wave functions at / = 0. The variance ratio, finally, is

0 0