In this chapter we explore the properties of Bayesian inversion from the perspective of an optimization problem which corresponds to maximizing the posterior probability; that is, to finding a maximum a posteriori (MAP) estimator, or mode of the posterior distribution. We demonstrate the properties of the point estimator resulting from this optimization problem, showing its positive and negative attributes, the latter motivating our work in the following three chapters. We also introduce, and study, basic gradient-based optimization algorithms.
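As a minimal sketch of the optimization viewpoint, assume a scalar linear-Gaussian model that is not taken from the chapter: data y = g·u + noise with noise N(0, gamma²) and prior u ~ N(0, sigma²). The function names and parameter values below are hypothetical. Maximizing the posterior is then equivalent to minimizing J(u) = (y − g·u)²/(2·gamma²) + u²/(2·sigma²), and plain gradient descent can be compared against the closed-form minimizer.

```python
def map_gradient_descent(y, g=2.0, gamma=0.5, sigma=1.0, step=0.05, iters=500):
    """Minimize J(u) = (y - g*u)^2 / (2*gamma^2) + u^2 / (2*sigma^2) by gradient descent."""
    def grad(u):
        # Derivative of the data misfit plus derivative of the prior penalty.
        return -g * (y - g * u) / gamma**2 + u / sigma**2

    u = 0.0  # start at the prior mean
    for _ in range(iters):
        u -= step * grad(u)
    return u


def map_closed_form(y, g=2.0, gamma=0.5, sigma=1.0):
    """Exact minimizer of J for this assumed linear-Gaussian model."""
    return g * y * sigma**2 / (g**2 * sigma**2 + gamma**2)


if __name__ == "__main__":
    print(map_gradient_descent(1.0), map_closed_form(1.0))
```

The step size is chosen below 2 divided by the curvature g²/gamma² + 1/sigma² so that the iteration contracts; for a non-quadratic objective no such simple rule is available.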
In this chapter we introduce Monte Carlo sampling and importance sampling. These are two general techniques for estimating expectations with respect to a given pdf π. Monte Carlo generates independent samples from π and combines them with equal weights, whilst importance sampling uses independent samples, weighted appropriately, from a different distribution. In quantifying the error in Monte Carlo and importance sampling, we will use a distance on random probability measures that reduces to total variation in the case of deterministic probability measures; and we will introduce the χ² divergence.
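The contrast between the two estimators can be sketched as follows. This is an illustrative example under assumptions that are not from the chapter: the target π is a standard Gaussian, the proposal is N(0, 2²), and the importance sampling estimator is the self-normalized variant. All function names are hypothetical.

```python
import math
import random


def normal_pdf(u, mean=0.0, std=1.0):
    """Density of N(mean, std^2) evaluated at u."""
    z = (u - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def monte_carlo(f, n, rng):
    """Equal-weight average of f over independent samples from the target N(0, 1)."""
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n)) / n


def importance_sampling(f, n, rng, proposal_std=2.0):
    """Self-normalized importance sampling with samples from the proposal N(0, proposal_std^2)."""
    samples = [rng.gauss(0.0, proposal_std) for _ in range(n)]
    # Unnormalized weights: target density divided by proposal density.
    weights = [normal_pdf(u) / normal_pdf(u, 0.0, proposal_std) for u in samples]
    total = sum(weights)
    return sum(w * f(u) for w, u in zip(weights, samples)) / total


if __name__ == "__main__":
    rng = random.Random(0)
    second_moment = lambda u: u * u  # exact expectation under N(0, 1) is 1
    print(monte_carlo(second_moment, 20000, rng))
    print(importance_sampling(second_moment, 20000, rng))
```

Both estimates should be close to 1; how close the importance sampling estimate is depends on how well the proposal covers the target.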
In this chapter we introduce the Bayesian approach to inverse problems in which the unknown parameter and the observed data are viewed as random variables. In this probabilistic formulation, the solution of the inverse problem is the posterior distribution on the parameter given the data. We will show that the Bayesian formulation leads to a form of well-posedness: small perturbations of the forward model or the observed data translate into small perturbations of the posterior distribution. Well-posedness requires a notion of distance between probability measures. We introduce the total variation and Hellinger distances, giving characterizations of them, and bounds relating them, that will be used throughout these notes. We prove well-posedness in the Hellinger distance.
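For discrete distributions the two distances are easy to compute directly. The sketch below assumes one common normalization, d_TV = ½ Σ|p − q| and d_H = (½ Σ(√p − √q)²)^½, under which d_TV/√2 ≤ d_H ≤ √d_TV; the chapter's own conventions may differ by constants. The function names are hypothetical.

```python
import math


def total_variation(p, q):
    """Total variation distance between two discrete distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hellinger(p, q):
    """Hellinger distance with the normalization that keeps it in [0, 1]."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.4, 0.4, 0.2]
    tv, h = total_variation(p, q), hellinger(p, q)
    # Under the assumed normalization: tv / sqrt(2) <= h <= sqrt(tv).
    print(tv / math.sqrt(2.0), h, math.sqrt(tv))
```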
The aim of these notes is to provide a clear and concise mathematical introduction to the subjects of Inverse Problems and Data Assimilation, and their interrelations, together with bibliographic pointers to literature in this area that goes into greater depth. The target audiences are advanced undergraduates and beginning graduate students in the mathematical sciences, together with researchers in the sciences and engineering who are interested in the systematic underpinnings of methodologies widely used in their disciplines.
This chapter demonstrates the use of optimization, namely the 3DVAR and 4DVAR methodologies, to obtain information from the filtering and smoothing distributions. We emphasize that the methods we present in this chapter do not provide approximations of the filtering and smoothing distributions; they simply provide estimates of the signal, given data, in the filtering (on-line) and smoothing (off-line) data scenarios.
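A minimal sketch of the filtering (on-line) case, assuming a scalar linear model v_{j+1} = a·v_j, direct observation of the state, and a fixed background variance c_hat; none of these choices come from the chapter, and the names are hypothetical. In this setting one common form of the 3DVAR update blends the model prediction with the new observation through a constant gain K = c_hat / (c_hat + gamma²), which is the minimizer of a quadratic cost penalizing both the data misfit and the deviation from the prediction.

```python
def three_dvar(observations, a=0.9, c_hat=1.0, gamma=0.5, m0=0.0):
    """Return 3DVAR point estimates of the signal, one per observation.

    Assumed model: prediction a * m, observation y = v + noise with noise variance gamma^2.
    """
    gain = c_hat / (c_hat + gamma**2)  # fixed gain, not updated over time
    m, estimates = m0, []
    for y in observations:
        prediction = a * m
        # Minimizer of (y - v)^2 / (2 gamma^2) + (v - prediction)^2 / (2 c_hat).
        m = (1.0 - gain) * prediction + gain * y
        estimates.append(m)
    return estimates


if __name__ == "__main__":
    print(three_dvar([1.0, 1.0]))
```

The output is a sequence of point estimates only; no covariance is propagated, which matches the remark that these methods do not approximate the filtering distribution itself.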
This chapter is devoted to the particle filter, a method that approximates the filtering distribution by a sum of Dirac masses. Particle filters provably converge to the filtering distribution as the number of particles, and hence the number of Dirac masses, approaches infinity. We focus on the bootstrap particle filter (BPF), also known as sequential importance resampling; it is linked to the material on Monte Carlo and importance sampling described in Chapter 5.
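The propagate, weight, resample cycle of a bootstrap particle filter can be sketched as follows. The example assumes a scalar linear-Gaussian state-space model (v_{j+1} = a·v_j + N(0, sigma²), y_j = v_j + N(0, gamma²)) chosen only for illustration; the model, parameter values and function names are hypothetical and not taken from the chapter.

```python
import math
import random


def simulate(n_steps, a=0.9, sigma=0.3, gamma=0.5, seed=1):
    """Generate a signal and noisy observations from the assumed model."""
    rng = random.Random(seed)
    v, signal, data = rng.gauss(0.0, 1.0), [], []
    for _ in range(n_steps):
        v = a * v + rng.gauss(0.0, sigma)
        signal.append(v)
        data.append(v + rng.gauss(0.0, gamma))
    return signal, data


def bootstrap_particle_filter(observations, n_particles=500, a=0.9,
                              sigma=0.3, gamma=0.5, seed=0):
    """Return the weighted particle mean after assimilating each observation."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Propagate each particle through the stochastic dynamics (the proposal).
        particles = [a * v + rng.gauss(0.0, sigma) for v in particles]
        # Weight by the observation likelihood, then normalize.
        weights = [math.exp(-0.5 * ((y - v) / gamma) ** 2) for v in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        means.append(sum(w * v for w, v in zip(weights, particles)))
        # Resample with replacement so all particles carry equal weight again.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


if __name__ == "__main__":
    signal, data = simulate(50)
    print(bootstrap_particle_filter(data)[:5])
```

After resampling, the equally weighted particles are the Dirac masses whose sum approximates the filtering distribution at that time.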
In this chapter we study Markov chain Monte Carlo (MCMC), a methodology that delivers approximate samples from a given target distribution π. The methodology applies to settings in which π is the posterior distribution in (1.2), but it is also widely used in numerous applications beyond Bayesian inference. As with Monte Carlo and importance sampling, MCMC may be viewed as approximating the target distribution by a sum of Dirac masses, thus allowing the approximation of expectations with respect to the target. Implementation of Monte Carlo presupposes that independent samples from the target can be obtained. Importance sampling and MCMC bypass this restrictive assumption: importance sampling by appropriately weighting independent samples from a proposal distribution, and MCMC by drawing correlated samples from a Markov kernel that has the target as invariant distribution.
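One simple instance of such a Markov kernel is random walk Metropolis, sketched below. The abstract does not say which MCMC algorithm the chapter uses, so this choice, the standard Gaussian target, the step size and the function names are all assumptions made for illustration.

```python
import math
import random


def log_target(u):
    """Log of an unnormalized standard Gaussian density (assumed target)."""
    return -0.5 * u * u


def random_walk_metropolis(n_samples, step=1.0, u0=0.0, seed=0):
    """Draw correlated samples whose invariant distribution is the target."""
    rng = random.Random(seed)
    u, chain = u0, []
    for _ in range(n_samples):
        proposal = u + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        log_ratio = log_target(proposal) - log_target(u)
        # Accept with probability min(1, target(proposal) / target(u)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            u = proposal
        chain.append(u)  # on rejection the current state is repeated
    return chain


if __name__ == "__main__":
    chain = random_walk_metropolis(20000)
    mean = sum(chain) / len(chain)
    var = sum((u - mean) ** 2 for u in chain) / len(chain)
    print(mean, var)
```

Only ratios of the target density enter the accept step, so the normalizing constant of a posterior is never needed.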
In this chapter we again adopt an optimization approach to the problem of Bayesian inference, but instead seek a Gaussian distribution p = N(μ, Σ) that minimizes some distance-like measure from the posterior π^y(u). However, rather than using a metric to define the distance, we use the Kullback–Leibler divergence introduced in Section 4.1.
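To illustrate why the Kullback–Leibler divergence is distance-like rather than a metric, the sketch below evaluates its closed form between two one-dimensional Gaussians and shows that swapping the arguments changes the value. The one-dimensional setting and the function name are assumptions for illustration; the example does not reproduce the chapter's optimization over (μ, Σ).

```python
import math


def kl_gaussian(m1, s1, m2, s2):
    """KL divergence of N(m1, s1^2) from N(m2, s2^2), i.e. D_KL(N1 || N2)."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2.0 * s2**2) - 0.5


if __name__ == "__main__":
    print(kl_gaussian(0.0, 1.0, 1.0, 2.0))  # D_KL(N(0,1) || N(1,4))
    print(kl_gaussian(1.0, 2.0, 0.0, 1.0))  # arguments swapped: a different value
```

Because the divergence is not symmetric, the order of the arguments is part of the definition of the resulting optimization problem.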
This chapter brings together the material in the first two parts of these notes, demonstrating how the principles and ideas underpinning the derivation of extended and ensemble Kalman filters for data assimilation can be used to design ensemble Kalman methods for inverse problems.
This concise introduction provides an entry point to the world of inverse problems and data assimilation for advanced undergraduates and beginning graduate students in the mathematical sciences. It will also appeal to researchers in science and engineering who are interested in the systematic underpinnings of methodologies widely used in their disciplines. The authors examine inverse problems and data assimilation in turn, before exploring the use of data assimilation methods to solve generic inverse problems by introducing an artificial algorithmic time. Topics covered include maximum a posteriori estimation, (stochastic) gradient descent, variational Bayes, Monte Carlo, importance sampling and Markov chain Monte Carlo for inverse problems; and 3DVAR, 4DVAR, extended and ensemble Kalman filters, and particle filters for data assimilation. The book contains a wealth of examples and exercises, and can be used to accompany courses as well as for self-study.