## Quality Control Decision Making Algorithms

The quality control outcomes of the various external data checks described previously are combined in a decision-making algorithm. The outcome of the decision-making algorithm is the overall indication of observation quality, which is used to select data for the assimilation. The decision-making algorithm is applied to each observed reporting level and, in the case of profile (and glider) observations, to the entire profile in the shape comparison test. Thus, for profile observations there are two indicators of data quality: one indicator for the overall profile shape, and a second indicator for each profile level. It is important to take into consideration results from all of the external data checks before the final quality decision is made. For example, an observation could fail the climate background check while at the same time pass the forecast background check. The observation would be rejected if the climate test was applied first in a serial fashion. Quality control decision-making algorithms, therefore, are necessarily complex and must combine outcomes from the different external test results appropriately. It helps if the external test outcomes are of the same form, such as probabilities or standard normal deviates.

A quality control decision-making algorithm in use at the U.S. Navy oceanographic centers is described here. The quality control outcomes from the various external data checks are in the form of probabilities of error. The majority of these probabilities are calculated according to Eq. (4.1), assuming a normal probability density function, but some are calculated using chi-square distribution functions (e.g., the aerosol contamination test). Given a set of error probabilities, the decision-making algorithm is summarized as follows:

$$
p_b = \min(p_g, p_r), \qquad
p_d = \min(p_c, p_x), \qquad
p_o = \begin{cases} p_b, & p_b < r_f \\ \min(p_b, p_d), & \text{otherwise} \end{cases}
$$

where $p_b$ is the composite background error probability, $p_d$ is the composite data-derived error probability, $p_g$ and $p_r$ are the global and regional forecast background error probabilities, $p_c$ and $p_x$ are the climate and cross validation error probabilities, $r_f$ is the forecast error threshold probability, and $p_o$ is the overall probability that the observation contains a random error. The forecast error probability threshold for the system is typically set to 0.99 (3 standard deviations).

The algorithm first determines if the observation is consistent with the model background fields by taking the minimum error probability of the global and regional forecasts. If the minimum background error probability is less than the prescribed forecast error tolerance limit, then the algorithm returns it as the overall probability of error for the observation. However, if the minimum model background error probability exceeds the forecast error threshold, then it is compared against the data-derived error defined as the minimum of the cross validation and climatology error probabilities. The overall observation error probability is returned as the minimum of the composite background and composite data-derived errors. In this way, cross validation and climate backgrounds determine data quality only if the observation is not consistent with the forecast. Experience has shown that requiring observations to always be consistent with climate backgrounds results in spurious rejection of valid observations during extreme events.
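The decision logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the operational Navy code: the function and argument names are hypothetical, and the handling of the case where the background probability exactly equals the threshold is an assumption.

```python
def overall_error_probability(p_global, p_regional, p_climate, p_cross,
                              r_forecast=0.99):
    """Combine external-check error probabilities into one overall value.

    All names are illustrative. r_forecast is the forecast error
    threshold probability (0.99 is the typical value quoted in the text).
    """
    # Composite background error probability: best agreement with
    # either the global or the regional forecast.
    p_background = min(p_global, p_regional)

    # Observation is consistent with a forecast background: done.
    if p_background < r_forecast:
        return p_background

    # Otherwise consult the data-derived checks (climatology and
    # cross validation) and return the smaller of the two composites.
    p_data = min(p_climate, p_cross)
    return min(p_background, p_data)


# An observation that fails the climate check but agrees with the
# regional forecast keeps the forecast-based probability.
print(overall_error_probability(0.80, 0.50, 0.999, 0.20))
```

Note that climatology and cross validation are consulted only when the observation disagrees with both forecasts, which is what protects valid observations made during extreme events.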

Once the overall probability of error for an observation has been determined, output from the various specific observing system quality control tests is simply added to the error probability using unique integer-valued flags. The quality control flags have three levels of severity: (1) information-only (<100); (2) cautionary (>100); and (3) fatal (>1,000). Observations with fatal errors are not used in the analysis. Information-only flagged observations are routinely used in the analysis, but the use of cautionary flagged observations is under user control via analysis namelist options. The ultimate decision to accept an observation into the analysis, however, is always based on the underlying error probability value obtained from the decision-making algorithm. If quality control flags have been appended, the underlying probability of error can always be recovered from the summation using some simple modular arithmetic.
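The flag encoding can be illustrated with a short sketch. It assumes the error probability lies in the half-open interval [0, 1) and occupies the fractional part of the stored value, while the integer part carries the flag. The function names, the treatment of boundary values (exactly 100 or 1,000), and the use of 0 to mean "no flag" are assumptions made for illustration only.

```python
def append_flag(p_error, flag):
    """Add an integer QC flag to an error probability (hypothetical helper)."""
    if not 0.0 <= p_error < 1.0:
        raise ValueError("error probability must lie in [0, 1)")
    if flag < 0 or int(flag) != flag:
        raise ValueError("flag must be a non-negative integer")
    return p_error + flag


def split_flagged_value(value):
    """Recover (flag, probability) from a flagged value via modulo 1."""
    flag, p_error = divmod(value, 1.0)
    return int(flag), p_error


def severity(flag):
    """Map a flag to a severity level; boundary handling is assumed."""
    if flag == 0:
        return "none"
    if flag > 1000:
        return "fatal"
    if flag > 100:
        return "cautionary"
    return "information-only"


stored = append_flag(0.42, 150)
flag, p_error = split_flagged_value(stored)
print(flag, round(p_error, 6), severity(flag))
```

Because the probability is stored in floating point, the recovered value agrees with the original only to within rounding error, so comparisons against thresholds should allow a small tolerance.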

### 4.5.1 Quality Control System Performance

Output from the U.S. Navy's fully automated real-time ocean data control system is summarized for satellite SST retrievals, sea ice concentration retrievals, altimeter sea surface height and significant wave height retrievals, and in situ observations at the surface and at depth from various sources. Quality control output for the satellite data is given for two monthly time periods during 2009 (June and December) to allow for examination of possible effects of seasonality, while output from quality control of the in situ data is shown for the entire 2009 year. The overall quality of the observations is summarized using an error probability frequency of occurrence in percent. The error probabilities are the outcomes of the quality control decision-making algorithm for single level observations and the overall probability of error for profile observations. Assuming a normal probability distribution function, the frequency of occurrence bins correspond to one standard deviation (p < 0.67), two standard deviation (p < 0.95), and three standard deviation (p < 0.99) departures from a zero mean. Probability frequencies indicated as p < 1.0 include probabilities greater than 0.99 plus observations flagged as being suspect from one or more of the specific external data checks described previously. Observations with error probabilities less than 0.99 are typically accepted into the analysis.
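The frequency-of-occurrence summary can be sketched as follows. The sketch assumes the bins are mutually exclusive, which is suggested by the tabulated percentages summing to roughly 100, and it places a probability equal to a bin edge in the higher bin. The function name is hypothetical.

```python
from bisect import bisect_right

BIN_EDGES = (0.67, 0.95, 0.99)
BIN_LABELS = ("p < 0.67", "p < 0.95", "p < 0.99", "p < 1.0")


def error_probability_frequencies(probabilities):
    """Return the percentage of observations falling in each bin."""
    counts = [0] * len(BIN_LABELS)
    for p in probabilities:
        # bisect_right sends p == edge to the next (higher) bin.
        counts[bisect_right(BIN_EDGES, p)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations supplied")
    return {label: 100.0 * c / total for label, c in zip(BIN_LABELS, counts)}


print(error_probability_frequencies([0.1, 0.5, 0.7, 0.96, 0.995]))
```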

In general, QC outcomes of the satellite SST retrievals indicate that the data are of good quality (Table 4.1). The frequencies of error probabilities within one standard deviation of the background field consistently include about 90% or more of the data for all satellite systems. Allowing for two background error standard deviations results in roughly 99% or more of the observations being included. There is some evidence of seasonality in the number of retrievals detected as coming from diurnal warming and aerosol contamination events for AATSR, GOES, METOP and MSG data. Sea ice concentration retrievals from the SSM/I and SSMI/S satellites are also of good quality: ~99% of the data fall within two standard deviations of the background field (Table 4.2). The number of sea ice retrievals rejected by the weather filter based on collocated SST shows a clear seasonality, with many more weather filter rejections in June than in December. Altimeter sea surface height (SSH) observations are also of good quality, with ~99% of the data within two standard deviations (Table 4.3). Altimeter significant wave height (SWH) observations appear to be of lower quality, but SWH rejections are mostly over land or ice covered seas (defined here as 33% sea ice concentration). Quality control of altimeter SWH retrievals is model based in the Navy system. A 6-hour forecast from a data assimilative run of the wave model is used to check newly received altimeter and buoy SWH observations for consistency, ensuring that the valid time of the forecast corresponds closely to the observed times of the data.

Table 4.1 Real-time QC outcomes for satellite SST retrievals

| Satellite | Month 2009 | Type | Count (×10⁶) | Diurnal | Aerosol¹ | p < 0.67 | p < 0.95 | p < 0.99 | p < 1.0 |
|---|---|---|---|---|---|---|---|---|---|
| AMSR-E² | Jun | - | 87.82 | - | - | 96.2 | 3.7 | 0.1 | 0.1 |
| | Dec | Day | 47.68 | 23,427 | - | 94.5 | 5.3 | 0.2 | 0.1 |
| | | Night | 55.59 | - | - | 95.5 | 4.3 | 0.1 | 0.0 |
| AATSR³ | Jun | Day | 220.35 | 364,910 | 30,656 | 93.0 | 6.3 | 0.5 | 0.3 |
| | | Night | 330.58 | - | 195,971 | 91.2 | 8.4 | 0.4 | 0.1 |
| | Dec | Day | 230.32 | 161,863 | 8,391 | 95.0 | 4.7 | 0.2 | 0.1 |
| | | Night | 317.16 | - | 42,313 | 91.9 | 7.6 | 0.4 | 0.0 |
| GOES-11 | Jun | Day | 26.93 | 258 | 12 | 89.8 | 10.1 | 0.1 | 0.0 |
| | | Night | 70.84 | - | 4 | 95.2 | 4.7 | 0.1 | 0.0 |
| | Dec | Day | 37.67 | | | 97.6 | 2.3 | 0.0 | 0.0 |
| | | Night | 88.80 | | | 95.8 | 4.1 | 0.1 | 0.0 |
| GOES-12 | Jun | Day | 19.06 | 1,043 | 7,083 | 96.7 | 3.2 | 0.1 | 0.0 |
| | | Night | 53.33 | - | 435,078 | 93.3 | 5.7 | 0.2 | 0.8 |
| | Dec | Day | 27.44 | 1,014 | 49 | 95.4 | 4.6 | 0.0 | 0.0 |
| | | Night | 66.30 | - | 12,519 | 93.1 | 6.7 | 0.2 | 0.0 |
| METOP GAC | Jun | Day | 5.46 | 938 | 2,541 | 97.6 | 2.3 | 0.1 | 0.1 |
| | | Night | 5.63 | - | 5,462 | 94.7 | 5.0 | 0.2 | 0.1 |
| | Dec | Day | 6.09 | 862 | 35 | 97.5 | 2.4 | 0.1 | 0.0 |
| | | Night | 5.89 | - | 144 | 95.4 | 4.4 | 0.2 | 0.0 |
| METOP LAC⁴ | Jun | Day | 106.52 | 28,165 | 86,935 | 96.2 | 3.5 | 0.1 | 0.1 |
| | | Night | 119.47 | - | 44,456 | 95.5 | 4.4 | 0.2 | 0.0 |
| | Dec | Day | 216.67 | 20,350 | 3,312 | 97.4 | 2.5 | 0.1 | 0.0 |
| | | Night | 234.74 | - | 9,060 | 94.5 | 5.3 | 0.2 | 0.0 |
| MSG⁵ | Jun | Day | 14.47 | 2,995 | 10,202 | 94.8 | 4.5 | 0.4 | 0.3 |
| | | Night | 73.28 | - | 13,343 | 94.8 | 4.8 | 0.2 | 0.2 |
| | Dec | Day | 12.23 | 25,999 | 759 | 95.3 | 4.2 | 0.2 | 0.3 |
| | | Night | 11.55 | - | 3,082 | 94.9 | 4.9 | 0.2 | 0.0 |
| NOAA-18 | Jun | Day | 4.71 | 148 | 14 | 90.7 | 8.6 | 0.6 | 0.1 |
| | | Night | 5.24 | - | 5,072 | 95.3 | 4.4 | 0.2 | 0.1 |
| NOAA-19 | Dec | Day | 5.08 | 11,919 | 36 | 88.9 | 10.2 | 0.6 | 0.3 |
| | | Night | 4.99 | - | 298 | 95.4 | 4.4 | 0.3 | 0.0 |

¹ Aerosol contamination calculated for Saharan Dust events in an area bounded by 10°S-30°N, 25°E-55°W

² AMSR-E not partitioned into day/night retrievals in June. AMSR-E data missing 16-17 June 06Z, 18 June 00-12Z, 20 June, 23 June, 25-26 June, 28-30 June, 29 Dec 12-24Z

³ AATSR data missing 16 June 00-06Z, 20 June 00-06Z, 26 June 00-18Z, 28 June 00-12Z, 29 June 12-18Z, 8 Dec 00-06Z, 24 Dec 06-12Z

⁴ METOP LAC data missing 19 Dec 00-12Z; 27 Dec 18-24Z

⁵ MSG data missing 6 June 12-18Z, 13 June 00-06Z, 15 June 00-06Z, 16-18 June, 20-21 June 00Z, 22 June 06-12Z, 23-24 June 00Z, 25-30 June, 15 Dec 06-12Z

Table 4.2 Real-time QC outcomes for satellite sea ice retrievals

| Satellite¹ | Month 2009 | Count (×10⁶) | Weather Filter² | p < 0.67 | p < 0.95 | p < 0.99 | p < 1.0 |
|---|---|---|---|---|---|---|---|
| F13³ | Jun | 5.23 | 570 | 97.1 | 2.1 | 0.5 | 0.3 |
| | Dec | - | - | - | - | - | - |
| F15 | Jun | 10.65 | 2,777 | 96.2 | 2.6 | 0.7 | 0.5 |
| | Dec | 11.63 | 1,048 | 94.5 | 3.6 | 1.1 | 0.8 |
| F16 | Jun | 16.78 | 17,070 | 96.5 | 2.4 | 0.6 | 0.5 |
| | Dec | 18.32 | 3,478 | 95.3 | 3.3 | 0.9 | 0.6 |
| F17 | Jun | 16.64 | 13,687 | 97.1 | 2.1 | 0.4 | 0.3 |
| | Dec | 18.87 | 3,652 | 95.5 | 3.2 | 0.8 | 0.5 |
| ShelfIce | Jun | 0.65 | - | 77.6 | 10.4 | 5.8 | 6.1 |
| | Dec | 0.44 | - | 74.7 | 16.7 | 5.7 | 2.9 |

¹ F13 and F15 are SSM/I satellites; F16 and F17 are SSMI/S satellites

² Weather filter based on collocated analyzed SST values (see text for details)

³ F13 data use discontinued in December

Table 4.4 gives QC outcomes for in situ SST observations from ships and buoys. Ship data are of lower quality than buoy data, with about 8% of the ship data being rejected across the different ship data types. Drifting buoy data are of higher quality than fixed buoy data, with fixed buoy data showing increased variability as indicated by the large percentage of data in the probability range of 0.67-0.95. Profile data QC is summarized in Table 4.5. Recall that profile levels with density inversions or vertical gradient information-only flags do not affect use of those data in the assimilation. The large number of TESAC data is a result of fixed buoys reporting both temperature and salinity using the WMO TESAC code form. These data report only a single or very few vertical levels and are of low quality, with less than 75% of the data occurring within two standard deviations of the background field. XBT observations have large occurrences of vertical gradient and instrumentation errors, which are probably due to inflexion point decimation of the profiles done prior to posting the data on the GTS. Argo data are of high quality, with more than 96% of the profiles accepted into the analysis. However, Argo profiles show a relatively high occurrence of depth errors (duplicate depths or depths not strictly increasing) and missing value errors (defined here in terms of temperature) that need to be investigated.

Table 4.3 Real-time QC outcomes for satellite altimeter SSH and SWH retrievals

| Satellite¹ | Type | Month 2009 | Count (×10⁶) | Ice Covered | Shallow Water | Land Area | Zero Value² | p < 0.67 | p < 0.95 | p < 0.99 | p < 1.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ENVISAT | SSH | Jun | 1.32 | - | - | - | - | 95.7 | 4.1 | 0.2 | 0.0 |
| | | Dec | 1.39 | - | - | - | - | 95.9 | 3.9 | 0.1 | 0.0 |
| | SWH | Jun | 0.87 | 106,945 | 1,483 | 12,631 | - | 70.1 | 3.3 | 0.1 | 26.5 |
| | | Dec | 1.36 | 48,265 | 1,589 | 25,094 | 8,931 | 68.7 | 3.0 | 0.1 | 28.2 |
| Jason 1 | SSH | Jun | 1.49 | - | - | - | - | 86.7 | 12.5 | 0.8 | 0.1 |
| | | Dec | 1.63 | - | - | - | - | 87.9 | 11.3 | 0.7 | 0.1 |
| | SWH | Jun | 1.48 | 73,119 | 68 | 21,664 | - | 80.5 | 10.8 | 1.2 | 7.5 |
| | | Dec | 2.02 | 12,971 | 4 | 38,751 | 27,754 | 84.5 | 6.6 | 0.5 | 8.4 |
| Jason 2 | SSH | Jun | 1.55 | - | - | - | - | 88.3 | 11.1 | 0.6 | 0.0 |
| | | Dec | 1.66 | - | - | - | - | 89.1 | 10.3 | 0.6 | 0.0 |
| | SWH | Jun | 1.48 | 95,920 | 1,960 | 21,286 | - | 65.3 | 3.2 | 0.1 | 31.4 |
| | | Dec | 2.30 | 4,735 | 1,816 | 34,629 | 27,767 | 67.6 | 2.9 | 0.2 | 29.3 |

¹ SWH observations not available 1-10 June

² Zero values are SWH retrievals reported as exactly zero

Table 4.4 Real-time QC outcomes for in situ surface temperature observations in 2009

| Type | Count (×10³) | p < 0.67 | p < 0.95 | p < 0.99 | p < 1.0 |
|---|---|---|---|---|---|
| Ship ERI | 210.3 | 55.5 | 27.0 | 9.0 | 8.5 |
| Ship Bucket | 32.1 | 47.2 | 31.4 | 12.6 | 8.8 |
| Ship Hull Contact | 309.2 | 53.6 | 28.5 | 10.2 | 7.7 |
| CMAN Station | 23.6 | 72.1 | 20.6 | 5.0 | 2.2 |
| Fixed Buoy | 2,657.3 | 83.5 | 13.3 | 2.6 | 0.7 |
| Drifting Buoy | 10,624.1 | 92.3 | 5.8 | 0.9 | 1.0 |

Table 4.5 Real-time QC outcomes for profile observations in 2009

| Type | Count¹ (×10³) | Density Inv.² | Vertical Grad.² | Inst. Error²,³ | Depth Error² | Missing Value²,⁴ | p < 0.67 | p < 0.95 | p < 0.99 | p < 1.0 |
|---|---|---|---|---|---|---|---|---|---|---|
| XBT | 18.9 | | 12,722 | 52,301 | 674 | 26 | 75.9 | 16.2 | 1.6 | 6.3 |
| Fixed Buoy | 502.5 | 19,000 | 3,922 | - | - | 1,163 | 81.3 | 16.3 | 1.5 | 0.9 |
| Drifting Buoy | 31.7 | | 207 | 5,743 | 6,374 | - | 84.3 | 8.1 | 1.9 | 5.7 |
| TESAC | 1,332.4 | 1,382 | 2,165 | 1,706 | 551 | 222 | 44.0 | 29.3 | 10.0 | 16.7 |
| Argo | 148.2 | 9,028 | 8,801 | 6,669 | 4,628 | 7,158 | 77.9 | 18.3 | 1.7 | 2.1 |

¹ Counts are number of profiles

² Counts are number of profile levels affected

³ Instrumentation error includes wire stretch, wire breaking, invalid upper ocean temperature response, profile spikes

⁴ Counts refer to missing temperature levels only

### 4.6 Internal Data Checks

Internal checks are those quality control procedures performed by the analysis system itself. These data consistency checks are best done within the assimilation algorithm since they require detailed knowledge of the background and observation error covariances, which are available only when the assimilation is being performed. The internal data checks are the last defense of the assimilation algorithm against bad observations. Data that contain gross and random errors have hopefully been removed prior to the assimilation in the sensibility and external data checks. The purpose of the internal data checks is to decide whether any marginal observations remaining in the assimilation data set are acceptable or unacceptable.

The need for quality control at this stage of the analysis/forecast system cannot be overemphasized. Any assimilation system based on the assumption of normality, no matter how sophisticated, is vulnerable to bad observations that do not fit a normal distribution. Further, since many GODAE forecasting systems use a sequential analysis-forecast cycle, it is difficult to remove the propagation of error through the forecast period that occurs when erroneous data have been assimilated. Once this happens the only option is to blacklist the bad observations, then back up and rerun the analysis-forecast cycle. This remedy will cause a delay in the production of the forecast, which can be a serious problem in operations since the forecast products are time critical.

The internal consistency checks are quite different from the cross validation procedure described in Sect. 4. In particular, each observation is compared with the entire set of observations used in the assimilation, not just nearby observations. A metric is devised to test whether observation innovations are likely or unlikely with respect to other observations and the specified background and observation error statistics. Once the decision to reject an observation is made in the internal data check it is necessary to intervene in the assimilation process to ensure that the rejected observation has no effect on the analysis. Typically, internal data checks are performed in variational analysis schemes, where the solution is obtained using iterative methods that can be interrupted and started up again. The internal data checks described below were developed for the Navy Atmospheric Variational Data Assimilation System (NAVDAS), described in Daley and Barker (2001). These checks have also been implemented in the Navy Coupled Ocean Data Assimilation (NCODA) system (Cummings 2005), which has recently been updated to a 3D variational analysis based on NAVDAS. The discussion below is adapted from Daley and Barker (2001, Chap. 9.3).

In an observation-based analysis system the analyzed increments (or correction vector) are computed according to

$$
x_a - x_b = B H^T \left( H B H^T + R \right)^{-1} \left[ y - H(x_b) \right] \tag{4.8}
$$

where $x_a$ is the analysis and $x_b$ is the forecast model background. On the right hand side of Eq. (4.8), $B$ is the background error covariance, $H$ is the forward operator, $R$ is the observation error covariance, $y$ is the observation vector, and $T$ indicates matrix transpose. The observation vector contains all of the synoptic temperature, salinity and velocity observations that are within the geographic and time domains of the forecast model grid and update cycle. When the analysis variable and the model prognostic variable are the same type, the forward operator $H$ is simply spatial interpolation of the forecast model grid to the observation location performed in three dimensions. Thus, $HBH^T$ is approximated directly by the background error correlation between observation locations, and $BH^T$ directly by the error correlation between observation and grid locations. The quantity $[y - H(x_b)]$ is referred to as the innovation vector (model-data misfits at the observation locations).
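A small numerical sketch may help make the roles of $B$, $H$, and $R$ concrete. It assumes a linear forward operator represented as a matrix that selects two points of a five-point, one-dimensional grid, a Gaussian background error correlation with unit variance, and uncorrelated observation errors. The grid, length scale, and all values are invented for illustration and are not taken from NCODA or NAVDAS.

```python
import numpy as np


def analysis_increment(B, H, R, y, xb):
    """Return x_a - x_b = B H^T (H B H^T + R)^{-1} [y - H x_b]."""
    innovation = y - H @ xb
    A = H @ B @ H.T + R
    # Solve the observation-space system rather than forming A^{-1}.
    return B @ H.T @ np.linalg.solve(A, innovation)


# Hypothetical 1-D grid with Gaussian background error correlations.
grid = np.arange(5.0)
length_scale = 1.5
B = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / length_scale) ** 2)

# Forward operator: observations located exactly at grid points 1 and 3.
H = np.zeros((2, 5))
H[0, 1] = 1.0
H[1, 3] = 1.0

R = 0.25 * np.eye(2)           # observation error variances
xb = np.full(5, 10.0)          # background state
y = np.array([10.8, 9.6])      # observations

xa = xb + analysis_increment(B, H, R, y, xb)
print(np.round(xa, 3))
```

The analysis is drawn toward each observation by an amount that depends on the ratio of background to observation error, and the correction spreads to neighboring grid points through $BH^T$.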

The first part of the internal data check uses a tolerance limit. Denote $A = HBH^T + R$ as the symmetric positive definite matrix of Eq. (4.8). Define $\hat{A} = \mathrm{diag}(A)$. Then, define the vector $\hat{d} = \hat{A}^{-1/2}[y - H(x_b)]$. The elements of $\hat{d}$ are the normalized innovations and should be distributed (over many realizations) in a normal distribution with a standard deviation equal to 1.0 if the background and observation error covariances have been specified correctly. Assuming this to be the case, tolerance limits (TL) are defined. Since $B$ and $R$ are never perfectly known, it is best to use a relatively high tolerance limit (say, TL = 4.0) in operations. The test statistic is designed to identify a marginally acceptable observation if its element of $\hat{d}$ is larger than the specified tolerance limit.

The second part of the internal data check is a consistency check. It compares marginally acceptable observations with every other observation. The procedure is a logical extension of the tolerance limit check described above. Define the vector $d^* = A^{-1/2}[y - H(x_b)]$. The elements of $d^*$, like those of $\hat{d}$, are dimensionless, normally distributed quantities. However, because $d^*$ involves the full covariance matrix $A$, it includes correlations between all of the observations. By comparing the vectors $\hat{d}$ and $d^*$ it can be shown which marginally acceptable observations are inconsistent with other observations and can therefore be rejected. The $d^*$ metric should increase (decrease) with respect to $\hat{d}$ when that observation is inconsistent (consistent) with other observations, as specified by the background and observation error statistics.

The internal data check is illustrated using the example given in Table 4.6 for 3 hypothetical observations considered marginally acceptable on the basis of a prescribed tolerance limit ($\hat{d}$) check value of 3.0 (Daley and Barker 2001). The $d^*$ metric for the first observation is reduced when additional, correlated ($\rho = 0.8$) observations more accurate than the background ($\varepsilon_o = 0.1$) are considered. In this case, the suspect observation, rejected individually on the basis of the tolerance limit check, is now determined to be consistent and is retained in the analysis ($d^* = 1.9$). However, if the additional data are negatively correlated ($\rho = -0.4$) while also being accurate ($\varepsilon_o = 0.1$), then the results indicate the suspect observation is much more unlikely than the tolerance limit check suggested and should be rejected ($d^* = 5.8$). Inaccurate observations relative to the background ($\varepsilon_o = 2.0$) show less sensitivity to correlations among observations but still give the same direction of change ($d^*$ vs. $\hat{d}$) as the accurate observations.
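The two normalized innovation vectors can be computed directly for a small problem. The sketch below assumes three observations with unit background error variance, a common background error correlation between every pair of observation locations, uncorrelated observation errors, and a symmetric inverse square root of $A$ obtained from an eigendecomposition. These choices are illustrative assumptions rather than the setup used by Daley and Barker (2001), so the printed values are not expected to reproduce Table 4.6 exactly; only the direction of change is being demonstrated.

```python
import numpy as np


def normalized_innovations(A, innovation):
    """Return (d_hat, d_star) for a symmetric positive definite A."""
    # Tolerance-limit metric: scale by the diagonal of A only.
    d_hat = innovation / np.sqrt(np.diag(A))
    # Consistency metric: scale by the full A^{-1/2} (symmetric root).
    eigvals, eigvecs = np.linalg.eigh(A)
    if np.any(eigvals <= 0.0):
        raise ValueError("A must be positive definite")
    d_star = eigvecs @ ((eigvecs.T @ innovation) / np.sqrt(eigvals))
    return d_hat, d_star


def equicorrelated_A(n_obs, rho, eps_o):
    """Hypothetical A = HBH^T + R with a common correlation rho."""
    hbht = np.full((n_obs, n_obs), rho)
    np.fill_diagonal(hbht, 1.0)
    return hbht + eps_o ** 2 * np.eye(n_obs)


for rho in (-0.4, 0.8):
    A = equicorrelated_A(3, rho, eps_o=0.1)
    # Innovations chosen so that every element of d_hat equals 3.0.
    innovation = 3.0 * np.sqrt(np.diag(A))
    d_hat, d_star = normalized_innovations(A, innovation)
    print(rho, np.round(d_hat, 2), np.round(np.abs(d_star), 2))
```

With positive correlation the three agreeing observations support one another and $|d^*|$ falls below $\hat{d}$; with negative correlation the same agreement is unexpected and $|d^*|$ rises above it.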

There are difficulties applying the consistency data check in practice since it requires calculating the entire $A^{-1/2}$ matrix, which is prohibitive for very large problems. Fortunately, there are some good approximations to this calculation that can be used (Daley and Barker 2001). However, other implementation issues remain. To reject an observation, a large constant is added to the appropriate diagonal element of the $HBH^T + R$ matrix. This modifies the matrix in such a way as to effectively prevent the rejected observation from affecting the analysis. However, if this operation is done during the descent iteration then the modified matrix is no longer consistent with the other vectors that have been evolving as part of the conjugate gradient solution. The descent can be restarted (very expensive) or the conjugate gradient solution vectors can be suitably altered to allow the descent to continue. In either case the tolerance limit and internal consistency checks can be applied multiple times during the descent as the solution resolves more and more of the observation innovations.

Table 4.6 Values of $|d^*|$ for $\hat{d}_1 = \hat{d}_2 = \hat{d}_3 = 3.0$

| | $\rho = -0.4$ | $\rho = 0.8$ |
|---|---|---|
| $\varepsilon_o = 0.1$ | 5.8 | 1.9 |
| $\varepsilon_o = 2.0$ | 3.5 | 2.4 |

$\hat{d}$, $d^*$ defined in text; $\rho$ correlation between observations; $\varepsilon_o$ observation error normalized by the background error
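The effect of adding a large constant to a diagonal element can be checked on a small dense system. The sketch below uses a direct solve rather than a conjugate gradient descent, and the matrix, innovations, and size of the constant are all hypothetical. It shows that the inflated system gives essentially the same observation-space weights as a system from which the rejected observation has been removed.

```python
import numpy as np


def solve_with_rejections(A, innovation, rejected, big=1.0e8):
    """Solve A z = innovation after inflating rejected diagonal elements.

    `rejected` is a list of observation indices; `big` is an assumed
    value for the "large constant" described in the text.
    """
    A_mod = A.copy()
    idx = np.asarray(rejected, dtype=int)
    A_mod[idx, idx] += big          # inflate only the chosen diagonal terms
    return np.linalg.solve(A_mod, innovation)


# Hypothetical observation-space matrix HBH^T + R for three observations.
A = np.array([[1.25, 0.50, 0.50],
              [0.50, 1.25, 0.50],
              [0.50, 0.50, 1.25]])
innovation = np.array([1.0, 5.0, -0.5])

z = solve_with_rejections(A, innovation, rejected=[1])

# Reference: drop observation 1 from the system entirely.
keep = [0, 2]
z_reduced = np.linalg.solve(A[np.ix_(keep, keep)], innovation[keep])

print(np.round(z, 6), np.round(z_reduced, 6))
```

Because the analysis increment is $BH^T z$, a near-zero element of $z$ means the rejected observation contributes essentially nothing to the analysis, while the remaining observations are weighted as if it had never been present.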

As discussed in Daley and Barker (2001), modifications to this procedure can be made for extreme events when the specified background error statistics are likely to be incorrect. Typically, error statistics in the assimilation are produced by averaging time series of innovations and forecast differences and reflect average, rather than extreme, conditions over the model domain. When changes are occurring in the ocean (such as an eddy shedding or frontal meander event) the background errors are likely to be larger than normal. In this case, a tolerance limit specified too low could reject good (and very important) data. One option for dealing with this is to make one pass through the tolerance limit check and compute the mode of the $\hat{d}$ values over some limited subareas of the analysis domain. The mode is a better statistic here because it is less susceptible to outliers than the mean. If the subarea mode is much greater than one, then it can be concluded that there are serious discrepancies between the observations and the background in that area. In such a case, to avoid spuriously rejecting good data, the subarea tolerance limit should be increased beyond the prescribed value.
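One way to sketch the subarea adjustment is shown below. The text does not say how the mode is estimated or by how much the tolerance limit should grow, so everything specific here is an assumption: the mode is taken from a histogram of normalized innovation magnitudes, the trigger value of 2.0 stands in for "much greater than one", and the limit is scaled by the mode. The function name and parameters are hypothetical.

```python
import numpy as np


def subarea_tolerance_limit(d_hat, base_tl=4.0, bin_width=0.5,
                            mode_trigger=2.0):
    """Return (tolerance_limit, mode) for one subarea.

    The histogram-based mode, the trigger value, and the scaling rule
    are illustrative assumptions, not part of the published procedure.
    """
    magnitudes = np.abs(np.asarray(d_hat, dtype=float))
    if magnitudes.size == 0:
        return base_tl, float("nan")

    # Bin edges start at zero and extend past the largest magnitude.
    edges = np.arange(0.0, magnitudes.max() + 2.0 * bin_width, bin_width)
    counts, edges = np.histogram(magnitudes, bins=edges)

    # Mode estimate: center of the most populated bin.
    k = int(np.argmax(counts))
    mode = 0.5 * (edges[k] + edges[k + 1])

    if mode > mode_trigger:
        return base_tl * mode, mode
    return base_tl, mode


# Most innovations cluster near 3 despite one large outlier at 8.
print(subarea_tolerance_limit([2.9, 3.0, 3.1, 3.05, 2.95, 0.5, 8.0]))
```

The single outlier at 8.0 does not move the mode, which illustrates why the mode is preferred over the mean for this purpose.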

If the assimilation of an observation has made the forecast issued from the analyzed state more accurate than a forecast valid at the same time but issued from a prior state, then the observation is considered to have a beneficial, positive impact. All assimilated observations are expected to have beneficial impacts on correcting the initial conditions and thereby improving the forecast issued from the analysis. However, if consistent non-beneficial impacts are found for a particular data type or observing system, then that may indicate data quality control issues, such as subtle instrument drift or calibration problems that otherwise are difficult to assess when considering the data in isolation. Thus, the adjoint-based data impact procedure is an effective tool to provide quantitative diagnostics of ocean data quality. The use of adjoint sensitivities in ocean data assimilation and ocean data quality control is still an active area of research and development.

### 4.8 Summary and Conclusions

Effective ocean data quality control is a difficult problem. Observations are imperfect and prone to error. Data with errors that are not described by the assimilation system through the error covariance matrices need to be eliminated prior to the analysis. Effective quality control, therefore, requires a set of pre-established, standardized test procedures, with results of the procedures clearly associated with the data values. Effectiveness in turn depends on the reliability of the standard(s) and on the choices made for measuring goodness of fit.

The need for observation quality control depends on the use being made of the observations. Users of quality controlled data sets have a wide range of views on the most appropriate standards and on the appropriate "tightness of fit" demanded by the quality control procedures (too tight increases the chance of erroneously rejecting anomalous features; too loose increases the chance of accepting bad data). Indicators of data quality must be useful for determining if the quality controlled observations are appropriate for a particular purpose. In this paper, observation quality control is performed as a prelude to assimilation of the observations in an ocean forecast system. Using this definition, the best ocean data quality scheme is that which leads to the best ocean forecast.

It is surprisingly difficult to demonstrate consistent impact from the quality control of individual observations in an analysis/forecast system. Quality control, however, is very important in data monitoring: collection of statistics on the performance of observing systems; detection of observing systems that are not performing as expected; and feedback to the data providers so that deficiencies are corrected. An integrated, end-to-end quality control system, therefore, must ensure that results of the quality control procedures are recorded for independent analysis and later use. If the quality control is carried out well, then it can reduce the duplication of effort among the users of ocean data, so that value added is not lost or misinterpreted. At a minimum, a comprehensive database of raw and processed observed values, independent estimates of the same quantities, and quality control outcomes is needed. The database would be used to look for "unexpected" behavior in observing systems, and allow users and operators of quality control systems to identify systematic problems in order to get errors in the data collection or data transmission corrected. At present, there are few agreed-upon standards for real-time ocean data quality control and very few cases where the procedures and results from the oceanographic centers have been compared. As the GODAE operational oceanographic community continues to develop a range of complex ocean analysis and prediction systems, it is important that procedures be developed for routinely assessing the effectiveness of ocean data quality control and for routinely exchanging statistics from the quality control processes at the operational centers. This process has begun with the GODAE QC intercomparison project (Smith 2003; Cummings et al. 2009), which initially is focusing on profile data types.

The fully automated ocean data quality control procedures described in this paper are limited to observation data types that are routinely assimilated in ocean forecast models. New ocean observing systems continue to be deployed and new failure modes of existing observing systems continue to be identified. Examples of new observing systems include HF coastal radars and microwave measurements of sea surface salinity from space. Examples of new instrument failure modes are the pressure and salinity sensor issues associated with the long-term, autonomous, deployments of the Argo profiling floats. New observation error models need to be developed for the automated quality control of new data types, and existing error models need to be updated to detect, and correct, new instrument failure modes. The validity of existing and new automated quality control procedures must be continually confirmed by formal statistical tests and by examining differences between automated and delayed-mode quality control outcomes on the same observation. The automated quality control system can be considered to have performed well if decisions made on observations in real-time are consistent with decisions made to modify or reject the same observations in delayed mode, where more rigorous scientific and expert manual intervention quality control methods are possible. Delayed mode quality control outcomes of the Argo profiling float array are readily available and can be used in this evaluation. This activity is an integral component of the GODAE QC intercomparison project, which includes participation from the following operational centers: Bureau of Meteorology in Australia, Coriolis Data Center in France, the Integrated Science Data Management Branch in Canada, Fleet Numerical Meteorology and Oceanography Center in the U.S.A, and the Met Office in the U.K.

Acknowledgements This work was funded by the National Ocean Partnership Program (NOPP) project, US GODAE: Global-Ocean Prediction with the Hybrid Coordinate Ocean Model, and by the Naval Research Laboratory 6.2 project, Observation Impact Using a Variational Adjoint System. The Program Executive Office for C4I and Space PMW-180 provided additional funding as part of the 6.4 project Ocean Data Assimilation for the Coupled Ocean Atmosphere Mesoscale Prediction System. I acknowledge Mark Ignaszewski from the Fleet Numerical Meteorology and Oceanography Center in Monterey, CA, and Krzysztof Sarnowski from the Naval Oceanographic Office in Stennis Space Center, MS, for their continuing assistance and support in the transition and maintenance of the Navy Coupled Ocean Data Assimilation Quality Control (NCODA_QC) system at the U.S. Navy operational centers.

References

Bailey R, Gronell A, Phillips H, Tanner E, Meyers G (1994) Quality control cookbook for XBT Data, CSIRO marine laboratories report 221. http://www.medssdmm.dfo-mpo.gc.ca/meds/Prog_Int/GTSPP/QC_e.htm

Baker NL, Daley R (2000) Observation and background adjoint sensitivity in the adaptive observation targeting problem. Q J Roy Meteor Soc 126:1431-1454

Boyer T, Levitus S (1994) Quality control and processing of historical oceanographic temperature, salinity, and oxygen data. NOAA Technical Report NESDIS 81. p 65

Corlett GK, Barton IJ, Donlon CJ, Edwards MC, Good SA, Horrocks LA, Llewellyn-Jones DT, Merchant CJ, Minnett PJ, Nightingale TJ, Noyes EJ, O'Carroll AG, Remedios JJ, Robinson IS, Saunders RW, Watts JG (2006) The accuracy of SST retrievals from AATSR: an initial assessment through geophysical validation against in situ radiometers, buoys and other SST data sets. Adv Space Res 37(4):764-769

Cummings JA (2005) Operational multivariate ocean data assimilation. Q J Roy Meteor Soc 131:3583-3604

Cummings JA, Brassington G, Keeley R, Martin M, Carval T (2009) GODAE ocean data quality control intercomparison project. Proceedings, Ocean Obs '09, Venice, Italy. p 5

Daley R, Barker E (2001) The NAVDAS sourcebook 2001. Naval Research Laboratory NRL/PU/7530-01-441, Monterey, p 160

Donlon C, Minnett P, Gentemann C, Nightingale TJ, Barton I, Ward B, Murray M (2002) Toward improved validation of satellite sea surface skin temperature measurements for climate research. J Clim 15:353-369

Donlon CJ, Robinson I, Casey KS, Vazquez-Cuervo J, Armstrong E, Arino O, Gentemann C, May D, LeBorgne P, Piolle, Barton I, Beggs H, Poulter DJS, Merchant CJ, Bingham A, Heinz S, Harris A, Wick G, Emery B, Minnett P, Evans R, Llewellyn-Jones D, Mutlow C, Reynolds R, Kawamura H, Rayner N (2007) The global ocean data assimilation experiment (GODAE) high resolution sea surface temperature pilot project (GHRSST-PP). Bull Am Meteorol Soc 88(8):1197-1213

Langland RH, Baker NL (2004) Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus 56A:189-201

May D, Osterman WO (1998) Satellite-derived sea surface temperatures: Evaluation of GOES-8 and GOES-9 multispectral imager retrieval accuracy. J Atmos Oceanic Technol 15:788-834

May D, Parmeter MM, Olszewski DS, McKenzie BD (1998) Operational processing of satellite sea surface temperature retrievals at the Naval Oceanographic Office. Bull Am Meteorol Soc 79:397-407

Merchant CJ, Embury O, Le Borgne P, Bellec B (2006) Saharan dust in nighttime thermal imagery: detection and reduction of related biases in retrieved sea surface temperature. Rem Sens Env 104(1):15-30

Merchant CJ, Le Borgne P, Marsouin A, Roquet H (2008) Optimal estimation of sea surface temperature from split-window observations. Rem Sens Env 112(5):2469-2484

Merchant CJ, Le Borgne P, Roquet H, Marsouin A (2009) Sea surface temperature from a geostationary satellite by optimal estimation. Rem Sens Env 113(2):445-457

Smith N (2003) Sixth session of the global ocean observing system steering committee (GSC-VI): GODAE report. IOC-WMO-UNEP/I-GOOS-VI/17