Data Assimilation in the Geosciences

This review provides the theoretical background needed to confront the problem of data assimilation head-on, together with several examples of practical applications.

Researchers in all disciplines of the geosciences (from the ionosphere, atmosphere, ocean, land and ecosystems down to the core of the Earth) have developed numerical models and have access to a regular flow of observations. They are therefore legitimately tempted to use data-assimilation methods to turn their models into powerful prediction and reanalysis machines.

Figure 1. (Fig. 9 in the review paper) Required data assimilation method versus model resolution and prediction time horizon; examples of corresponding natural phenomena are also shown for illustrative purposes. The degree of sophistication of the data assimilation grows commensurately with the increase in prediction time horizon and the decrease of the model grid size (i.e. increased resolution).

So, which data-assimilation method should they use? Both theoretical and practical issues are likely to hamper progress, and these issues apply to all sub-disciplines of the geosciences. We need to deal in practice with the high dimensionality and strong nonlinearity of the processes at play in earth-science models, and we must handle the uncertainties in the models. Figure 1 illustrates the extent and origins of these challenges. The constant increase in numerical model resolution implies resolving more and more small-scale processes (e.g., convection or turbulence) that are often inherently nonlinear and non-Gaussian.

The transition toward high-resolution models is only possible if accompanied by a corresponding data-assimilation method where the Gaussian and linear assumptions are relaxed. Along with the increase in model resolution, there is also a growing interest in making long-term forecasts. Long-term predictability arises from the interactions between the atmosphere and the more slowly varying components of the climate system, like the ocean, land surface and cryosphere, so that predictions are issued using fully coupled models. The productive use of data assimilation with coupled models necessitates the development of adequate coupled data assimilation methods that allow for a consistent and balanced propagation of the informational content of the observations across all model components. Data assimilation is also a formidable big-data problem: we need to integrate massive datasets efficiently and accurately with numerical models of high dimension.

This review paper aims at providing the theoretical background to confront head-on the problem of data assimilation together with several examples of practical applications as illustrations. It addresses the data-assimilation challenges the geosciences discipline is facing in the current era of high-power computing and vast data availability. This review can thus serve as an up-to-date guide for geoscientists who are facing the issue of combining data with numerical models.


Kindly contributed by Alberto Carrassi (1), Marc Bocquet (2), Laurent Bertino (1) and Geir Evensen (1, 3).

  1. Nansen Environmental and Remote Sensing Center – NERSC, Bergen, Norway.
  2. CEREA joint laboratory École des Ponts ParisTech and EdF R&D, Université Paris-Est, Champs-sur-Marne,
  3. IRIS, Bergen, Norway