NPP | 0.25 (94) | 0.35 (99) | 0.39 (99) | 0.05 (63)

^a Numbers in parentheses give percent confidence level estimates, which are based on correlations with 1000 synthetic time series having first-order autoregressive characteristics of the tree-ring indicators. Lag 0 or lag +1 correlations significant above the 90% level in both calibration periods are in bold.

^Sensitivity of indicator to local conditions as described by Villalba et al. (2000): CAT, coastal Alaskan temperature; NPT, northern Patagonian temperature; SWP, southwest U.S. Palmer Drought Severity Index (PDSI); and NPP, northern Patagonian precipitation.
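The confidence levels described in the table footnote can be estimated with a Monte Carlo test along the following lines. This is a minimal sketch, not the authors' code: the function names, the use of |r| (a two-sided test), and the simple estimator for the lag-1 coefficient are assumptions made for illustration.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series (simple moment estimator)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def ar1_series(n, phi, rng):
    """Unit-variance AR(1) series of length n with lag-1 coefficient phi."""
    eps = rng.standard_normal(n) * np.sqrt(1.0 - phi**2)
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x

def confidence_level(proxy, target, n_surrogates=1000, seed=0):
    """Percent of AR(1) surrogates whose |r| with target is below the observed |r|.

    Surrogates share the proxy's length and lag-1 autocorrelation
    (assumed reading of the footnote; the original test may differ in detail).
    """
    rng = np.random.default_rng(seed)
    r_obs = abs(np.corrcoef(proxy, target)[0, 1])
    phi = lag1_autocorr(proxy)
    r_null = np.array([
        abs(np.corrcoef(ar1_series(len(proxy), phi, rng), target)[0, 1])
        for _ in range(n_surrogates)
    ])
    return 100.0 * float(np.mean(r_null < r_obs))
```

A value such as 94 would then mean that 94% of the synthetic red-noise series correlate less strongly with the target than the real indicator does.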


tically here have not shifted in space, and if the targeted single mode of SST continues to be the desired reconstruction target, then these results may suggest that an additional level of data screening is necessary. For example, Villalba et al. (2000) make this assumption to argue that the tree-ring data indicate different temporal variability before and after 1850. Ten of the fifteen tree-ring indicators support this assumption for the 1856-1990 period by calibrating in the same manner over both halves of the full period (Table 2). However, several indicators (1, 6, 8, 10, and 13) were significantly correlated during one calibration period but not during the other. This result suggests either a real change in the relationship between tree-ring indicator and SST or that the observed relationship is weak. We cannot eliminate the influence of real changes in either the proxy-SST relationship or the proxy-local climate relationship on long timescales. Further steps to test the stationarity assumption may include data-screening experiments with the current proxy data set, incorporation of additional tree-ring indicators from locations sensing the identified mode of climate variability, and quantitative estimation of the sensitivity of tree-ring indicators to local climate variables. Analysis of the physical differences between these sites and nearby sites with significant contributions to the reconstruction may indicate reasons for calibration failures and may suggest additional locations from which proxy data may be usefully incorporated or collected. These steps may all serve to reduce or better estimate the observational and mapping error in the reconstruction procedure through better understanding of the physical and biological processes underlying the statistical relationships observed here.

FIGURE 5 Calibration statistics for reconstruction using the 1923-90 calibration interval. All statistics are calculated by using 1923-90 data from Kaplan et al. (1998) (here termed KaSSTa) and the tree-ring-based SST reconstruction (TrSSTa) results. (a) Field correlation (correlation units). (b) Root-mean-squared (RMS) difference between TrSSTa and KaSSTa (°C). (c) RMS variance in TrSSTa (°C). (d) Theoretical error for TrSSTa (°C).

### 4.3.4. Analysis

We first examine results using 1923-90 as a calibration period, reserving data for 1856-1922 for verification exercises. Four statistics calculated for each point in the analysis grid are shown in Figs. 5 and 6 for calibration and verification periods, respectively. Correlation between reconstructed (TrSSTa) and verification (KaSSTa) fields shows where the reconstruction has skill regardless of signal amplitude (panel a). The root-mean-squared (RMS) difference between TrSSTa and KaSSTa gives the actual error in the reconstructed fields by comparison of withheld observed data and reconstruction results (panel b). Panel (c) shows the RMS variance in the reconstructed field; this map may be compared to Fig. 2 to assess the extent to which the analysis resolves variance. Panel (d) gives the theoretical error in TrSSTa averaged over the verification period. This map may be compared to the RMS difference described previously to determine consistency of the reconstruction's theoretical error estimate.
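The first three of these grid-point statistics are simple to compute once the reconstructed and observed anomaly fields are on a common grid. The sketch below assumes both fields are NumPy arrays of shape (time, lat, lon); the function name and array layout are illustrative choices, not part of the original analysis. The theoretical error of panel (d) comes from the analysis error covariance and is not reproduced here.

```python
import numpy as np

def field_statistics(recon, obs):
    """Grid-point statistics for two anomaly fields of shape (time, lat, lon).

    Returns (correlation, RMS difference, RMS amplitude of recon), each of
    shape (lat, lon). Grid points with zero variance would give NaN correlation.
    """
    recon = np.asarray(recon, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ra = recon - recon.mean(axis=0)           # remove time mean at each point
    oa = obs - obs.mean(axis=0)
    cov = (ra * oa).mean(axis=0)
    corr = cov / (ra.std(axis=0) * oa.std(axis=0))          # panel (a)
    rms_diff = np.sqrt(((recon - obs) ** 2).mean(axis=0))   # panel (b)
    rms_recon = np.sqrt((recon ** 2).mean(axis=0))          # panel (c)
    return corr, rms_diff, rms_recon
```

Because correlation is insensitive to amplitude, a reconstruction that is a damped copy of the observations has correlation 1 but a nonzero RMS difference, which is why panels (a) and (b) are read together.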

### 4.3.5. Verification

### 4.3.5.1. Consistency of A Priori Assumptions and A Posteriori Results

Calibration results (Fig. 5) for the 1923-90 period show that correlation between TrSSTa and KaSSTa reaches 0.4-0.6 over much of the eastern tropical Pacific and in the centers of the Pacific subtropical gyres. This result indicates that ca. 20-25% of the variance in KaSSTa was calibrated in these regions, consistent with the map of reconstructed RMS variance in TrSSTa (Fig. 5c), which has roughly one-fourth to one-fifth of the amplitude of KaSSTa (Fig. 2a) over this period. Areas of minimum correlation correspond to regions in which the map H has little amplitude. Retrieval of small-amplitude TrSSTa is consistent with the large error in resolution of H shown in Fig. 4; the error plots (Figs. 5b and 5d) indicate that the mapped variance in the eastern equatorial Pacific and in the subtropical gyres is only partially resolved. Comparison of the actual RMS error and average theoretical error estimates (Figs. 5b and 5d) shows that the analysis procedure produces self-consistent errors.

FIGURE 6 Verification statistics for reconstruction using the 1923-90 calibration interval. All statistics are calculated by using 1856-1922 data from Kaplan et al. (1998) (here termed KaSSTa) and the tree-ring-based SST reconstruction (TrSSTa) results. (a) Field correlation (correlation units). (b) Root-mean-squared (RMS) difference (°C). (c) RMS variance in TrSSTa (°C). (d) Theoretical error for TrSSTa (°C).

### 4.3.5.2. Comparison with Withheld Historical SST Data

Results (Fig. 6) computed over the 1856-1922 verification period are similar to those shown in Fig. 5, suggesting that TrSSTa has captured limited but verifiable climatic information. Correlations have shrunk in amplitude by 0.1-0.2 units in the regions in which H has nonzero amplitude, but are similar in spatial pattern and strength to the results shown for the calibration period. This result suggests that the resolved pattern is robustly defined, at least for the interval 1856-1990, and serves as an initial check of analysis assumptions. An overlay of the correlation map shown in Fig. 4 with the verification correlation of TrSSTa and KaSSTa clearly shows that regions with the best reconstruction skill are also regions where the resolved map has a large amplitude; regions of minimal reconstruction skill correspond to regions of minimal map amplitude (Fig. 7 [see color insert]). Actual and theoretical error estimates are approximately equal (Figs. 6b and 6d), suggesting that we have not overcalibrated the reconstruction by retaining modes that cannot be verifiably reconstructed. However, note that the theoretical error in the North Pacific region has a slightly different structure than the actual error.

### 4.3.5.3. Sensitivity to the Calibration Period

We exchange the calibration and verification periods chosen for the previous experiment to estimate the dependence of the map from SST to tree-ring indicators on the calibration interval. The same four statistics are plotted for an 1856-1922 calibration period and a 1923-90 verification period in Figs. 8 and 9, respectively. Calibration results (Fig. 8) give slightly improved skill, but resolve the same pattern of variability. The verification statistics are similar to those shown in Fig. 5. We may also compare the reconstructed time series [Eq. (10) with one mode of variability retained; see also Section 4.3.6] formed by using each of the chosen calibration intervals (1856-1922 and 1923-90) over their respective verification periods (1923-90 and 1856-1922). The reconstructions correlate with r = 0.76 (1923-90) and r = 0.84 (1856-1922); over the pre-observational interval 1001-1855, correlation between the two reconstructions is r = 0.76. For 11-year running averages, the two reconstructions correlate with r ≈ 0.9 over the interval 1856-1990 and the full interval 1001-1990. However, there appears to be a tradeoff in skill between the North Pacific and eastern equatorial Pacific regions depending on the calibration period (compare Figs. 5 and 6 to Figs. 8 and 9). This finding suggests some sensitivity of H to the calibration interval chosen, which can affect the skill level by about 0.1 correlation unit in regions where TrSSTa has verifiable skill.

FIGURE 8 Calibration statistics for reconstruction using the 1856-1922 calibration interval (as in Fig. 5).

FIGURE 9 Verification statistics for reconstruction using the 1856-1922 calibration interval (as in Fig. 6).

### 4.3.5.4. Comparison with Benchmark and Noise Reconstructions

In addition to checking the sensitivity of the results to map stability, we can also compare TrSSTa to an instrumental data-based reconstruction and a reconstruction based on calibrated red noise. Both experiments employ synthetic proxy data. In the first experiment, we reconstruct the SST field using SST, precipitation, and Palmer Drought Severity Index (PDSI) data from locations near the tree-ring indicator sampling sites (benchmark). In the second experiment, we perform the reconstruction based on randomly generated time series with normal distribution, unit variance, and the lag-1 autocorrelation statistics of the tree-ring indicators (noise). The verification results obtained for these two experiments, using 1923-90 as a calibration period and reconstructing one pattern, are shown in Figs. 10 and 11, respectively. These results may be compared to those already shown in Fig. 6.

The calibrated skill in these experiments (results not shown) has spatial structure similar to that of TrSSTa (Fig. 5). Not surprisingly, the benchmark experiment has higher calibrated skill than TrSSTa, especially along the coasts of the Americas, and there is very little loss of skill (<0.1 correlation unit) between calibration and verification periods (results not shown). More surprisingly, our procedure was content to calibrate the noise proxy data at skill levels very similar to those of TrSSTa (results not shown). The true skill of the experiments, however, becomes apparent in the comparisons with withheld observed SST (Figs. 10a and 11a). While TrSSTa seems to capture the larger scale, open ocean signal and has less skill near the proxy sampling sites (Figs. 5-8), the benchmark experiment is more skillful in resolving smaller scale, coastal phenomena, with less skill in the subpolar North Pacific. We speculate that this may be due to temporal signal integration or other smoothing effects introduced by statistical development of the tree-ring chronologies, which may highlight the large-scale oceanographic signal. The noise experiment provides no verifiable skill, as expected (Fig. 11a). Because of the artificial calibrated skill, the reconstruction error is underestimated (Figs. 11b and 11d). However, since the noise reconstruction error (Fig. 11b) is only slightly larger than that of TrSSTa (Figs. 6b and 9b), the magnitudes of the respective reconstructed variances are similar (Figs. 6c, 9c, and 11c) and small relative to those of the benchmark experiment (Fig. 10c). These results suggest that TrSSTa has verifiable skill beyond that expected from red noise. However, the observational and mapping errors are large and must be correctly specified [Eq. (9)] to obtain reconstructions with amplitudes that are consistent with such errors. Verification exercises are required to clearly distinguish artificial skill from climatic information.

FIGURE 10 Verification statistics for benchmark reconstruction using calibration for 1923-90 (as in Fig. 6).
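The contrast between artificial calibrated skill and verified skill in the noise experiment can be illustrated with a toy calculation. The sketch below is not the reconstruction procedure used in this chapter: it substitutes ordinary least-squares regression for the reduced-space analysis, uses a single white-noise target in place of the SST field, and assumes a lag-1 coefficient of 0.3 for the noise proxies. Only the period lengths (68 and 67 years) and the number of proxies (15) are taken from the text.

```python
import numpy as np

def ar1_noise(n_time, n_series, phi, rng):
    """Matrix (n_time, n_series) of independent unit-variance AR(1) series."""
    x = np.empty((n_time, n_series))
    x[0] = rng.standard_normal(n_series)
    scale = np.sqrt(1.0 - phi**2)
    for t in range(1, n_time):
        x[t] = phi * x[t - 1] + scale * rng.standard_normal(n_series)
    return x

def noise_skill(seed, n_cal=68, n_ver=67, n_proxies=15, phi=0.3):
    """Calibration and verification correlations for pure-noise proxies.

    The target is unrelated to the proxies by construction, so any
    calibration-period correlation is artificial skill.
    """
    rng = np.random.default_rng(seed)
    n = n_cal + n_ver
    target = rng.standard_normal(n)
    proxies = ar1_noise(n, n_proxies, phi, rng)
    ver = slice(0, n_ver)      # earlier period, withheld
    cal = slice(n_ver, n)      # later period, used for fitting
    coef, *_ = np.linalg.lstsq(proxies[cal], target[cal], rcond=None)
    fitted = proxies @ coef
    r_cal = np.corrcoef(fitted[cal], target[cal])[0, 1]
    r_ver = np.corrcoef(fitted[ver], target[ver])[0, 1]
    return r_cal, r_ver
```

Averaged over many seeds, the calibration correlation in this toy setting is clearly positive while the verification correlation scatters around zero, which is the behavior that makes withheld-data verification necessary.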

We have shown results calibrating only the leading pattern of covariance between the SST and the tree-ring indicators. This choice for the number of climatic patterns recoverable from this set of proxy data was motivated by PC and SVD analyses (Section 4.3.3, Fig. 3). Here, we examine the change in the results with increasing rank of H to see if additional patterns may be verifiably reconstructed. The calibration and verification skills for retention of two and three patterns are shown in Fig. 12. Over the calibration period 1923-90 (Figs. 12a and 12b), employing two or three patterns improves results marginally in the eastern equatorial Pacific and in the North Pacific. Verification results for the period 1856-1922 (Figs. 12c and 12d) show that skill is not increased by the additional patterns, although some redistribution of skill between the tropics and extratropics occurs. The changes in skill are similar in amplitude to those observed with reversed calibration and verification periods (Figs. 6 and 9). These observations are consistent with the choice of a rank 1 map for the CFR. Similarly, benchmark experiments retaining two and three patterns (results not shown) do not significantly improve on the verification statistics shown in Fig. 10. This result suggests that the number of patterns resolved in the present reconstruction is limited by the scarcity of proxy data employed as well as by the observational error. However, note also that while there is no increase in reconstruction skill with the rank of H, there is little or no deterioration either, since additional modes contribute little variance to the reconstruction [Eq. (6); second term of Eq. (9)]. Hence, minimization of S makes the choice of the rank of H not as subjective as it might initially appear.

FIGURE 12 Calibration and verification correlation maps for reconstructions using 1923-90 calibration and two and three retained patterns (correlation units). (a) Calibration correlation (1923-90), two patterns retained. (b) Calibration correlation (1923-90), three patterns retained. (c) Verification correlation (1856-1922), two patterns retained. (d) Verification correlation (1856-1922), three patterns retained.
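The idea of retaining one, two, or three patterns can be sketched as a truncated singular value decomposition of the proxy-SST cross-covariance. This is a generic illustration under stated assumptions, not the chapter's definition of H: the function name, the array layout (time in rows), and the use of the raw cross-covariance matrix are all hypothetical choices.

```python
import numpy as np

def truncated_covariance_map(sst, proxies, rank):
    """Rank-limited cross-covariance between proxies and an SST field.

    sst:     array (n_time, n_grid) of SST anomalies
    proxies: array (n_time, n_proxies) of proxy indicators
    Returns (cov_k, scf): the rank-`rank` approximation of the
    (n_proxies, n_grid) cross-covariance and the squared covariance
    fraction carried by each singular mode.
    """
    sst_a = sst - sst.mean(axis=0)
    prx_a = proxies - proxies.mean(axis=0)
    cov = prx_a.T @ sst_a / (len(sst_a) - 1)
    u, s, vt = np.linalg.svd(cov, full_matrices=False)
    scf = s**2 / np.sum(s**2)
    cov_k = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return cov_k, scf
```

When one mode dominates `scf`, the rank-2 and rank-3 approximations differ little from the rank-1 map, which is in line with the observation that extra modes add little variance to the reconstruction.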

The preceding results suggest that while TrSSTa has skill over certain parts of the Pacific basin and in regions far removed from the tree-ring indicators, it does not have skill in all parts of the Pacific basin. In addition, as noted in Section 4.2 and shown here, variance resolved is a function of the quality of the map from climate record to proxy, as well as the proxy observational error. The first component is determined in the calibration stage of the procedure; the second component depends on the number of proxy indicators available for analysis as a function of time. In the most recent centuries of the reconstruction, all tree-ring indicators are available, and H is a full matrix. The analysis reconstructs anomalies with variance similar to that of the calibration period (e.g., Figs. 5c and 6c). However, as we proceed further back in time, fewer chronologies are available for analysis, and H becomes increasingly sparse. The observational error variance R grows [Section 4.3.4; Eq. (12)], and the analysis approaches zero or climatology, with the error almost the magnitude of the true or calibration period variance. However, since in this example most of the error in the reconstruction is due to error in H, and since we resolve only a single pattern of climate variability, the reconstruction error remains large even into the relatively well-observed modern period. Additional proxy indicators sensitive to this leading mode of SST variability will be required to reduce the mapping error in the calibration stage, as well as to produce more accurate and precise reconstructions back through time.
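The statement that the analysis relaxes toward climatology as the observational error grows can be seen in a scalar analog of optimal estimation. The sketch below is a textbook one-variable case, not the chapter's Eqs. (9) or (12): a single proxy y = h·x + e is assumed, with prior (climatological) variance c for x and error variance r for e; all names are illustrative.

```python
def scalar_analysis(h, c, r):
    """Scalar optimal estimate of x from y = h*x + e.

    h: sensitivity of the proxy to the climate variable
    c: prior (climatological) variance of x
    r: observational error variance of the proxy
    Returns (gain, reconstructed variance, analysis error variance).
    """
    gain = h * c / (h**2 * c + r)            # weight applied to the proxy
    recon_var = gain**2 * (h**2 * c + r)     # variance of the estimate gain*y
    error_var = c - recon_var                # equals c*r / (h**2 * c + r)
    return gain, recon_var, error_var

# Reconstructed variance shrinks and error approaches c as r grows.
for r in (0.1, 1.0, 10.0, 100.0):
    gain, recon_var, error_var = scalar_analysis(h=1.0, c=1.0, r=r)
    print(f"r={r:6.1f}  recon_var={recon_var:.3f}  error_var={error_var:.3f}")
```

In this analog, reconstructed variance plus error variance always equals the prior variance c, so a damped reconstruction with a large error bar is the expected outcome when proxies are few or noisy.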

### 4.3.6. Study of Climate Dynamics

### 4.3.6.1. Pacific Decadal SST Variability Inferred from Tree-Ring Indicators

Based on verification results (Figs. 5-8), we form an index from one of the most skillfully resolved regions, NINO3.4 (170°W-120°W, 5°S-5°N). Figure 13a shows the TrNINO3.4 SST anomaly; gray bars indicate 1σ error bars for this index. Since this reconstruction is composed of the time variation of a single mode of spatial variability, all such indices will be simple linear scalings of this time variation. We expect this index to be our best estimate of basin-scale SST variability, since the ratio of signal strength to analysis error is greatest in this region (Figs. 2 and 6). Correlation over the veri-

FIGURE 13 (a) NINO3.4 sea surface temperature (SST) anomaly (170°-120°W, 5°N-5°S) from tree-ring-based SST reconstruction (TrSSTa), A.D. 1001-1990. Units are degrees Celsius (°C). Shading indicates 1σ error estimates. (b) TrNINO3.4 index smoothed with a 31-year Gaussian filter. Overlain gray bars indicate the sign of the filtered index. Units are °C. (c) Number of tree-ring indicators available for analysis over this period.
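A low-pass filter of the kind mentioned in the caption for panel (b) can be sketched as follows. The caption does not define what "31-year" refers to, so this example assumes a 31-point window with a standard deviation of one-sixth of the window length, and it renormalizes the weights near the ends of the series; these are illustrative choices rather than the authors' settings.

```python
import numpy as np

def gaussian_lowpass(x, window=31, sigma=None):
    """Smooth an annual series with a truncated Gaussian kernel.

    window: number of years spanned by the kernel (odd, <= len(x))
    sigma:  kernel standard deviation in years (assumed window/6 if None)
    Weights are renormalized near the ends so no padding is needed.
    """
    if sigma is None:
        sigma = window / 6.0
    half = window // 2
    offsets = np.arange(-half, half + 1)
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    x = np.asarray(x, dtype=float)
    num = np.convolve(x, weights, mode="same")
    den = np.convolve(np.ones_like(x), weights, mode="same")
    return num / den
```

The sign of the filtered series, as marked by the gray bars in panel (b), can then be read off with `np.sign(gaussian_lowpass(index))`.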
