Low-memory filtering for large-scale data assimilation
Bibov, Alexander (2017-05-22)
Doctoral dissertation
Lappeenranta University of Technology
Acta Universitatis Lappeenrantaensis
Permanent address of the publication:
https://urn.fi/URN:ISBN:978-952-335-077-9
Abstract
Data assimilation is the process of combining information acquired from a mathematical model with observed data in an attempt to increase the accuracy of both. Real-world phenomena are hard to model exactly. In addition, even when certain processes admit a very accurate mathematical description, the model often cannot be represented in a closed form.
These facts lead to the necessity of dealing with modelling errors in numerical simulations of real-life phenomena. One of the usual ways to improve the quality of simulated data is to use information from (possibly indirect) observations. The observations, in turn, are prone to measurement errors that should also be taken into consideration. Therefore, the task addressed by data assimilation can be roughly formulated as follows: given a prediction computed by a numerical simulation and the corresponding (possibly indirect) observation, provide an optimal estimate of the state of the system in question. Here optimality can have different meanings, but in the commonly assumed case the estimate must be unbiased, so that it reflects reality correctly, and must have minimal variance, which corresponds to noise reduction. The algorithms that solve the data assimilation task are called data assimilation methods. From this description it is apparent that data assimilation is in essence similar to fitting a model to data. The term "data assimilation" was borrowed from the meteorological community and is mostly used when the system employed to compute predictions has high dimension and is chaotic, i.e. sensitive to the initial state. This dissertation mainly focuses on such models, which is the reason for using this term here.
In this dissertation we consider data assimilation methods for the case where the dimension of the state space of the simulated system is too "large-scale" for the classical algorithms to be practicable. Here, if n is the dimension of the state space, we consider n "large" if an n-by-n matrix cannot fit into the available computer memory in the desired storage format (e.g. single or double precision). A common example of a large-scale model is an operational weather prediction system. At the time of writing, such systems have n ≈ 10⁹, which means that a 10⁹-by-10⁹ matrix stored in double-precision format (as is often required in scientific simulations) would occupy approximately 7,275,958 TB of memory, far beyond the capabilities of all modern supercomputers (when this text was written the fastest supercomputer was Sunway TaihuLight, located in the National Supercomputing Center in Wuxi, China, and it had only 1310 TB of RAM).
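The memory figure above follows from a few lines of arithmetic, sketched here (the only assumption is the binary convention of 2⁴⁰ bytes per terabyte):

```python
# Memory required to store an n-by-n matrix of double-precision (8-byte) entries.
n = 10**9                        # state dimension of an operational weather model
total_bytes = n * n * 8          # one 8-byte double per matrix entry
terabytes = total_bytes / 2**40  # binary terabytes: 1 TB = 2**40 bytes
print(f"{terabytes:,.0f} TB")    # ~7,275,958 TB
```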
The motivation for placing special emphasis on large-scale models is that the classical optimal data assimilation approaches employ covariance matrices of the state vectors (the vectors containing all parameters that fully represent a state of the model at a given time instance). This implies the need to store n-by-n matrices, which makes implementations of all such methods inefficient. In this dissertation we attempt to address this problem by considering low-memory approximations of the classical approaches. The approximations are by nature sub-optimal, but the way they are formulated alleviates the memory issues that arise in the classical algorithms.
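To illustrate the general idea behind such approximations (not the specific method of this work), a covariance matrix can be kept in a low-rank factored form P ≈ UUᵀ and applied to vectors without ever materializing the n-by-n array; the dimensions below are illustrative:

```python
import numpy as np

# Sketch: apply P ≈ U @ U.T to a vector using only the n-by-k factor U.
rng = np.random.default_rng(0)
n, k = 100_000, 20                 # k << n: storage is n*k numbers, not n*n
U = rng.standard_normal((n, k))    # low-rank factor of the covariance
v = rng.standard_normal(n)
Pv = U @ (U.T @ v)                 # O(n*k) work; the full P would need ~80 GB here
```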
In the present work we concentrate on low-memory approximations of the extended Kalman filter based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) unconstrained optimization scheme and present a family of stabilizing corrections that circumvent certain stability issues present in some previously known approaches based on this scheme. We also demonstrate that our stabilizing corrections imply better convergence properties and use this fact to formulate and solve the parallel filtering task, which is essentially a low-memory approximation of the fixed-lag Kalman smoother.
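The building block underlying L-BFGS-based filters is the standard two-loop recursion, which applies an inverse-Hessian approximation using only a short history of vector pairs. The sketch below shows the generic recursion, not the specific filters developed in this work:

```python
import numpy as np

def lbfgs_apply(v, s_list, y_list):
    """Apply the L-BFGS inverse-Hessian approximation to vector v using the
    stored update pairs (s_i, y_i); no n-by-n matrix is ever formed."""
    q = np.array(v, dtype=float)
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):  # newest to oldest
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    s, y = s_list[-1], y_list[-1]
    q *= (s @ y) / (y @ y)          # standard initial scaling H0 = gamma * I
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ q)           # oldest to newest
        q += (a - b) * s
    return q
```

With m stored pairs the cost is O(mn) in both time and memory, which is what makes L-BFGS-based covariance approximations feasible at n ≈ 10⁹.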
We study the performance of our methods using a synthetic model, the two-layer quasi-geostrophic model, which describes conservative wind motion over a cylindrical surface vertically divided into two layers. The model is a well-known test case and has been used extensively in ongoing research conducted at, e.g., the European Centre for Medium-Range Weather Forecasts, Reading, UK. We analyse the performance of our methods by comparing them against a set of competing low-memory data assimilation techniques such as the Variational Kalman Filter, the BFGS Low-Memory Kalman Filter, Weak-Constraint 4D-VAR, and a selection of ensemble-based algorithms.
Finally, we analyse the applicability of our approaches by considering the problem of estimating the intensity of algal blooming in the coastal regions of the Gulf of Finland during the spring and summer months. For this case we use high-resolution satellite images that provide the concentrations of chlorophyll in the gulf. However, the data taken at certain time instances are incomplete due to cloudiness, and therefore the task of estimating chlorophyll concentrations turns out to be a perfect candidate for data assimilation. In addition, the problem is naturally large-scale due to the resolution of the original data.
Collections
- Doctoral dissertations [1036]