Andrew M. Moore

Abstract The use of adjoint methods in data assimilation is reviewed, and illustrative examples are presented.

### 14.1 Introduction

Adjoint operators are central to many operational data assimilation systems used for numerical weather prediction, and are gaining popularity in oceanography also. In this chapter we shall review the use of adjoint methods for data assimilation. We begin in Sect. 14.2 with an exploration of the concept of the adjoint of a linear operator, and the important properties that make it an indispensible tool for data assimilation. Familiar illustrative examples are used throughout to highlight the important ideas. The fundamental concepts underpinning 4-dimensional variational data assimilation (4D-Var) are reviewed in Sect. 14.3, and in Sect. 14.4 example 4D-Var calculations for the California Current System are presented using the Regional Ocean Modeling System (ROMS).

14.2 What Is an Adjoint Operator?

Adjoints exist only for linear operators. The concept of an adjoint operator is best illustrated by considering first the discrete form of linear operators and functions, namely matrices and vectors. The following is a brief exposé on adjoint operators, but an excellent in-depth description can be found in the classic text by Lanczos (1961).

Ocean Sciences Department, University of California, Santa Cruz, CA 95064, USA e-mail: [email protected]

A. Schiller, G. B. Brassington (eds.), Operational Oceanography in the 21st Century, 351

### 14.2.1 Spaces

Any continuous linear operator in function space has a discrete analog in the form of a matrix. Similarly, any continuous function has a discrete analog in the form of a vector. With this is mind, consider the N x M rectangular matrix A. The matrix A operates on the set of vectors u of length M and yields a set of vectors w of length N, so that w=Au. So we say that A maps from a space of dimension M ("M-space") to a space of dimension N ("N-space"). The adjoint of the operator A can be identified with the matrix transpose, namely AT. The formal connection between the matrix transpose and an adjoint operator will be made in Sect. 14.2.2, but for now identifying the adjoint as the matrix transpose will suffice. The adjoint AT is a M x N matrix and operates on the set of vectors v of length N to yield the set of vectors z of length M, namely z=ATv. So we say that the adjoint maps from N-space to M-space.

Suppose that we wish to solve the system of linear equations y=Ax given A and y. This represents a system of N equations for the M unknown elements of x. If N < M, the system is said to be underdetermined since there are fewer equations than unknowns. In this case we might ask whether a unique or meaningful solution exists for x? The answer is probably yes, and is given by the so-called "natural solution" for which the adjoint operator plays a critical role. Suppose we search for a solution of the form x=ATs. While the vector x resides in M-space, the vector s resides in N-space, so we are effectively restricting our search for solutions to N-space, the space in which the known vector y resides. We have now reduced the problem to solving y=AATs which is well posed since AAT is a N x N matrix that maps s to the N known elements of y. The solution x=ATs is called the natural solution, and s is referred to as the generating function. As we shall see, generating functions are important players in some approaches to data assimilation.

Let us consider a familiar geophysical example in which the natural solution plays a critical role. The vertical component of relative vorticity of a fluid element is given by g = dv/dx - d«/dy where (u,v) are the x andy components of velocity. Suppose we are given a field of values of g at discrete points (x,y) such as on the grid of a numerical model. This is the discrete analog of a function which we will denote by the vector g where each element of g represents a grid point value of g. Given the field g in N-space, how do we find the corresponding velocity components (u,v) in M-space, where M=2N in this example, and u and v are the vectors of grid point values of u and v respectively? This is an underdetermined linear system for which a natural solution exists of the form:

where A = ( - d/dy d/dx), and g = AATs = -(92/dy2 d2/dx2)s. If we identify s = —f we recover the familiar equation g = V2f relating vorticity g to stream function ^ arising from Helmholtz theorem for a horizontally non-divergent flow. This example reveals that stream function is the generating function for vorticity for a horizontally non-divergent flow.

Recall that the N x M matrix A has an operator equivalent A in function space, so for the case N < M the function A acts only on part of the function space. We say that only some of the dimensions of the function space are activated by A. In the discrete case at most only N of the possible M dimensions are activated by the matrix A. More generally, the activated dimensions of A correspond to the p < N eigenvectors of AAT with non-zero eigenvalues X.. The remaining non-activated dimensions with Xi=0, p < i < N are referred to as the "null space." The M x N adjoint operator AT identifies the activated part of the M-space and ignores the null space. The natural solution y = AATs therefore represents the solution that exists only in the activated dimensions of A. In the parlance of linear algebra, AAT is the projection onto the subspace spanned by the range of A. In the theory of linear differential equations, the natural solution is also called the particular integral. Solutions that reside in the null space satisfy the equation Ax = 0 and in the theory of linear differential equations are referred to as the complementary function. The general solution of any linear differential equation or equivalent discrete linear system is the sum of the particular integral (natural solution) and the complementary function. In both function space and discrete space the adjoint operator identifies the space in which these two parts of the general solution reside. As we shall see later, we can use this important property of adjoint operators for data assimilation.

Before concluding this discussion of vector spaces, it is instructive to consider the N x M matrix A where now N > M. The corresponding linear system y=Ax may be over determined or constrained since there are more equations than there are unknown elements of x. Recall that the adjoint AT maps vectors from N-space to M-space where the solution for x resides, so it is tempting to solve the system ATy=ATAx. In this case ATA is the projection on the subspace spanned by AT. It is easy to show that the solution of this system minimizes (Ax - y)T(Ax - y) and is the familiar least squares solution for an over determined system. Thus we see that the adjoint operator plays a critical role in identifying the least squares solution of over determined systems as well.

To complete the connection between operator adjoints and matrices, it is instructive to return to function space. In function space, we will denote a linear operator as A and the adjoint of the operator as A+. Consider the functions u and w so that w = Au which is the continuous analog of the discrete case considered in Sect. 14.2.1. For any two functions u and v, there will in general exist an inner-product and an associated norm which we will denote by {v,w}. An adjoint operator is always associated with a particular inner-product and is defined by {v,Au} = {A+v,u}, often referred to as the Green's identity. The adjoint operators of different inner-products are in fact linearly related. To illustrate, suppose we let {v,w} represent the inner-product of the Euclidean norm, and define a different inner-product as

(v,w) = {v,Mw} where for now M is a linear, self-adjoint (i.e. M+=M), invertible operator. The adjoint of A with respect to the new inner-product will be denoted At and is defined by the Green's identity (v,Au) = (Atv,u) = {M~l4+Mv,Mu}, which shows that At = M-A+M.

In the discrete case, the Green's identity is referred to as the bilinear identity, and the inner-product of function space is replaced by a dot-product, so that for the Euclidean norm we have {v,w} = vTw. For w=Au, the bilinear identity for the adjoint becomes vTAu = (ATv)Tu=uTATv, showing that for the Euclidean norm the discrete equivalent of the adjoint operator A+ is the matrix transpose AT. If A is a N x M matrix, v and u reside in N-space and M-space respectively. While the dot-product vTw=vT Au is evaluated in N-space it is unique since z=ATv and in M-space uTz=uTATv=vTAu = vTw.

Exercise 1: If AT is the adjoint of the operator represented by the square matrix A with respect to the Euclidean norm vTw, derive an expression for the adjoint operator A with respect to the norm vTMw where M is a symmetric, invertible matrix, and show that A = M-1ATM.

### 14.2.3 An Illustrative Example

The ideas of Sects. 14.2.1 and 14.2.2 are best illustrated using a simple yet familiar geophysical example. Consider a rectangular, homogeneous, flat bottomed ocean of undisturbed depth H, in the form of a rotating channel that spans the Cartesian domain 0 < x < 1, 0 < y < 1, that is periodic in x, and subject to zero normal flow boundary conditions on the circulation at y=0 and y = 1.

14.2.3.1 The Linear Shallow Water Equations

We will consider first the case of linear waves in an ocean in which the circulation is described by the linear shallow water equations:

where (u,v) are the components of velocity in the x and y directions, h is the sea surface displacement, f=fy) is the Coriolis parameter, H is a constant undisturbed depth, and g is the acceleration due to gravity. The zero normal flow boundary conditions correspond to v=0, at y = 0 and y = 1, while periodicity in x requires that u(0,y) = u(1,y), v(0,y)=v(1,y), and h(0,y) = h(1,y). Recall that the adjoint of d U dt — fv = —gdh/dx d v/dt + fu = -gdh/dy Bh/Bt + B (Hu)J d x + B (H v)J By = 0

Eqs. (14.1)-(14.3) depends on the choice of an inner-product. A natural inner-product for the shallow water equations is that which yields the energy norm: 