Structure-exploiting scalable methods for large-scale Bayesian inverse problems in high-dimensional parameter spaces
We review our recent efforts in developing scalable approaches to large-scale Bayesian statistical inverse problems. At the heart of our approach is the exploitation of Hessian information, which captures the local curvature and correlations of the posterior pdf and in turn greatly improves efficiency and scalability to high dimensions. First, we present a Gaussian approximation of the posterior in which we exploit the low-rank nature of the misfit Hessian to devise a scalable strategy for approximating infinite-dimensional Bayesian inverse problems. Second, we propose a complementary scaled stochastic Newton method that removes the dependence of the number of iterations required by the chain on the variance. Finally, we present our current work on a scalable randomized maximum likelihood method for MCMC that is capable of generating approximately independent samples from the Bayesian posterior in high-dimensional parameter spaces. We demonstrate our methods on large-scale full-waveform seismic inversion, inverse thermal fin design, and an inverse Helmholtz problem.
This is joint work with Carsten Burstedde, Omar Ghattas, James Martin, Georg Stadler, and Lucas Wilcox.
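The low-rank posterior update behind such a Gaussian approximation can be sketched in a few lines of numpy. This is an illustrative toy, not the authors' implementation: assuming an identity prior covariance and a synthetic rank-r prior-preconditioned misfit Hessian V Λ Vᵀ, the Sherman-Morrison-Woodbury identity expresses the posterior covariance as an identity-plus-low-rank correction; the dense matrices below are formed only to verify the identity.

```python
import numpy as np

# Hypothetical sizes: n parameters, rank-r misfit Hessian (illustrative).
rng = np.random.default_rng(0)
n, r = 200, 5

# Synthetic low-rank Hessian built from random orthonormal eigenvectors V
# and a rapidly decaying spectrum lam, mimicking a compact misfit Hessian.
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
lam = np.array([100.0, 50.0, 10.0, 5.0, 1.0])

# Sherman-Morrison-Woodbury with identity prior covariance:
#   C_post = (I + V diag(lam) V^T)^{-1} = I - V diag(lam/(1+lam)) V^T
D = np.diag(lam / (1.0 + lam))
C_post = np.eye(n) - V @ D @ V.T

# Verify against the direct dense inverse (only feasible for small n).
C_direct = np.linalg.inv(np.eye(n) + V @ np.diag(lam) @ V.T)
print(np.allclose(C_post, C_direct))  # True
```

Only V and lam need to be stored, so the cost scales with the rank r rather than the parameter dimension n.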
Parallel Tempering Applied to Reservoir History Matching
A key question in the management of a petroleum reservoir is whether we need additional wells and, if so, where they should be placed. If, through the solution of an inverse problem, we can determine approximately where the unproduced hydrocarbons are to be found and in what quantities, then we can assess the profitability of a new well. However, if there are multiple distinct interpretations as to where the unproduced hydrocarbons are, then we need to identify all of these interpretations and assign them probabilities so that we can carry out a probabilistic risk assessment of well profitability. In this work we assess the capability of a practical implementation of parallel tempering to identify multiple interpretations. In a practical implementation it is important that the user does not have to set tuning parameters.
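The swap mechanism that lets parallel tempering jump between distinct interpretations can be illustrated on a toy bimodal target. Everything below is an assumption for illustration (a 1-D target standing in for the history-match misfit surface, and a hand-set temperature ladder, which is exactly the kind of tuning parameter a practical implementation must choose automatically):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy bimodal log-posterior standing in for a multimodal misfit surface.
def log_post(x):
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

temps = [1.0, 2.0, 4.0, 8.0]          # illustrative temperature ladder
chains = [0.0] * len(temps)
cold_samples = []

for step in range(20000):
    # Random-walk Metropolis update within each tempered chain.
    for i, T in enumerate(temps):
        prop = chains[i] + rng.normal(0.0, 1.5)
        if np.log(rng.random()) < (log_post(prop) - log_post(chains[i])) / T:
            chains[i] = prop
    # Propose to swap the states of a random adjacent pair of chains.
    j = rng.integers(len(temps) - 1)
    log_a = (1.0 / temps[j] - 1.0 / temps[j + 1]) * (
        log_post(chains[j + 1]) - log_post(chains[j]))
    if np.log(rng.random()) < log_a:
        chains[j], chains[j + 1] = chains[j + 1], chains[j]
    cold_samples.append(chains[0])

# The cold (T=1) chain should now have visited both modes near -3 and +3.
```

The hot chains cross the barrier between modes easily, and the swap moves feed those crossings down to the cold chain, which is what allows multiple interpretations to be identified.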
Iterative ensemble smoothers and applications to reservoir history matching
The use of the ensemble smoother instead of the ensemble Kalman filter increases the nonlinearity of the update step during data assimilation and hence the need for iterative assimilation methods. We investigated the performance of several iterative ensemble smoothers, with particular focus on reservoir history matching problems. The iterative ensemble smoothers based on the standard Gauss-Newton and Levenberg-Marquardt formulations require explicit estimation of the sensitivity from the ensemble and were often not effective in reducing the objective function for large-scale history matching problems. An iterative ensemble smoother based on a modified Levenberg-Marquardt method (LM-EnRML), in which the ensemble approximation of the Hessian is altered to simplify computation and increase stability, often shows a much greater rate of convergence and a lower objective function at the final iteration for large-scale problems.
The performance of the iterative ensemble smoothers was evaluated using several examples, including a field application to a North Sea reservoir. The model parameters that are updated in the field example include permeability, porosity and net-to-gross ratio at each gridblock, vertical transmissibility at each gridblock for six layers, transmissibility multipliers of 53 faults, end-point water and gas relative permeabilities of four different reservoir zones, depths of water-oil contacts, and transmissibility multipliers between a few main fault blocks. The total number of model parameters is about 150,000. The LM-EnRML was able to achieve an improved data match compared to the manually history-matched model after three iterations. Updates from LM-EnRML do not introduce artifacts in the property fields, as the manually history-matched model does. The automated workflow is also much less labor intensive than manual history matching.
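The flavor of such derivative-free ensemble updates can be conveyed on a toy linear forward model. This is a sketch under stated assumptions, not the authors' algorithm: the forward model, sizes and noise level are illustrative, and the Levenberg-Marquardt-style damping is reduced here to a simple inflation of the ensemble data covariance (the actual LM-EnRML modification of the Hessian is more involved).

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear forward model g(m) = G m standing in for a reservoir
# simulator; all sizes and noise levels are illustrative.
n_m, n_d, n_e = 10, 4, 200
G = rng.standard_normal((n_d, n_m))
m_true = rng.standard_normal(n_m)
sd_obs = 0.1
d_obs = G @ m_true + rng.normal(0.0, sd_obs, n_d)

# Prior ensemble: columns are ensemble members.
M = rng.standard_normal((n_m, n_e))

def smoother_update(M, lam=0.0):
    """One ensemble-smoother step built from ensemble (cross-)covariances,
    with no explicit sensitivity matrix; lam is a crude damping factor
    (lam=0 gives the standard ensemble smoother update)."""
    D = G @ M                                   # predicted data ensemble
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_md = dM @ dD.T / (n_e - 1)                # parameter-data covariance
    C_dd = dD @ dD.T / (n_e - 1)                # data covariance
    R = (sd_obs ** 2) * np.eye(n_d)
    K = C_md @ np.linalg.inv((1.0 + lam) * C_dd + R)
    # Perturbed observations, one noise realization per member.
    D_pert = d_obs[:, None] + rng.normal(0.0, sd_obs, (n_d, n_e))
    return M + K @ (D_pert - D)

M_post = smoother_update(M, lam=0.1)
prior_misfit = np.linalg.norm(G @ M.mean(axis=1) - d_obs)
post_misfit = np.linalg.norm(G @ M_post.mean(axis=1) - d_obs)
# The updated ensemble mean should fit the observations far better.
```

The key point illustrated is that the sensitivity never appears explicitly: the update is assembled entirely from ensemble covariances, which is what makes such methods practical at the 150,000-parameter scale.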
A Variational Smoothing Filter for Sequential Inverse Problems
Once a system model, a model of any measuring apparatus that relates states to observations, and a prior assessment of uncertainty have been specified, the probability density of subsequent system states, conditioned upon the history of the observations, is a logical consequence of the assumptions.
When observations are made at discrete times, it is known that the evolving probability density is a solution of the Bayesian filtering equations. In the very special case of linear models and Gaussian noise these equations can be solved using a Kalman filter. The real problem, as we all know, is to perform a similar miracle in the general case. Following the progress made by other researchers, this work explores the consequences of seeking to approximate the evolving probability density using an ensemble of centres in a sum of Gaussian densities. In general this leads inexorably to a sequence of optimisation problems and related high-dimensional integrals. There are other problems too, related to the necessity of using a small number of densities in the approximation, the requirement to maintain sparsity of any matrices, and the need to compute first and, somewhat problematically, second derivatives. Adjoint methods, Taylor expansions, Gaussian random fields and Newton's method can be combined to, possibly, provide a solution.
Some progress has now been made, which will be explained, and the remaining severe difficulties described. The generalised Lorenz-96 equations, in which, in addition to the variables, many unknown parameters are also present, will be used, once more, by way of illustration. Results will be presented indicating that when the equations are correct and the uncertainty is quantified with sufficient accuracy, one can retrieve the entire system state (including the unobserved parameters) with very few observations of the variables. One is led to conjecture that the optimal solution of an inverse problem, when performed in a Bayesian setting, contains much valuable information that is usually missed.
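For reference, the Lorenz-96 equations are dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices. A minimal integrator might look like the sketch below; here F is the classic scalar forcing, whereas in the generalised setting described above such coefficients become the many unknown parameters to be retrieved.

```python
import numpy as np

def lorenz96_rhs(x, F):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, cyclic in i."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def step_rk4(x, F, dt=0.01):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_rhs(x, F)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, F)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, F)
    k4 = lorenz96_rhs(x + dt * k3, F)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Spin up a 40-variable system at the classic chaotic forcing F = 8.
x = 8.0 * np.ones(40)
x[0] += 0.01                 # small perturbation to trigger the chaos
for _ in range(2000):        # integrate for 20 time units
    x = step_rk4(x, F=8.0)
```

With F = 8 the trajectory is chaotic, which is what makes the state-and-parameter retrieval problem a demanding test case.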
It is normal not to be Gaussian
Some algorithms for the inverse stochastic modeling of hydraulic conductivity spatial heterogeneity will be discussed. The aim of the algorithms is the reproduction of spatial patterns that are difficult, if not impossible, to capture using a multiGaussian random function model, such as curvilinear channels. Most of the algorithms fall within the framework of the ensemble Kalman filter (EnKF); two such algorithms are the normal-score EnKF (NS-EnKF) and ensemble pattern matching (EnPAT); the first still relies on two-point statistics, while the second uses multiple-point statistics to reproduce curvilinear structures in the generated realizations. Advantages and pitfalls of the different approaches will be discussed.
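The normal-score idea is simple to state: transform each variable to standard-normal scores through its empirical CDF, apply the Gaussian update in score space, and back-transform through the inverse mapping. A minimal sketch of the forward transform, with an illustrative lognormal ensemble standing in for conductivity values:

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(3)

# Strongly non-Gaussian (lognormal) ensemble of conductivities at one
# gridblock; the size and distribution are illustrative.
k = rng.lognormal(mean=0.0, sigma=1.5, size=500)

# Normal-score transform: empirical quantiles -> standard-normal quantiles.
# The NS-EnKF performs the Kalman update on these Gaussian scores and then
# back-transforms through the inverse of the same empirical mapping.
p = (rankdata(k) - 0.5) / k.size
z = norm.ppf(p)

# The transformed scores are approximately standard normal even though
# k itself is heavily skewed.
```

The transform makes the marginals Gaussian, which is why NS-EnKF still relies on two-point statistics: the spatial dependence structure itself is untouched, which is the gap EnPAT addresses with multiple-point statistics.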
Design in Inverse Problems - Reducing the risk and uncertainty
We present an algorithmic framework for linear and nonlinear experimental design obtained by the minimization of an empirical estimate of the risk. The formulation leads to a bilevel stochastic optimization problem that is challenging to solve. We discuss the solution of the problem and show that for linear inverse problems this can be done efficiently. We also discuss techniques for the nonlinear case that are based on stochastic programming.
Finally, we demonstrate that design in inverse problems can yield significant advantages and reduce the risk as well as the cost of experiments.
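A tiny discrete instance conveys the bilevel structure: the inner (lower-level) problem is a regularized reconstruction solved in closed form, and the outer problem minimizes the empirical risk over candidate designs, here by a greedy search. Everything below (the candidate measurements, sizes, noise level, and the greedy heuristic itself) is an illustrative assumption, not the authors' algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)

# Candidate experiments: rows of A are possible measurements of an
# unknown x in R^5; we pick the k rows minimizing an empirical estimate
# of the reconstruction risk over prior samples.
n, n_cand, k, sd = 5, 12, 3, 0.05
A = rng.standard_normal((n_cand, n))
x_train = rng.standard_normal((n, 50))       # training models from the prior

def risk(rows):
    """Empirical risk: mean squared error of the Tikhonov reconstruction
    over the training models (the inner problem, solved in closed form)."""
    As = A[list(rows)]
    alpha = sd ** 2
    # x_hat = (As^T As + alpha I)^{-1} As^T y,  with y = As x + noise
    H = np.linalg.solve(As.T @ As + alpha * np.eye(n), As.T)
    y = As @ x_train + rng.normal(0.0, sd, (len(rows), x_train.shape[1]))
    return np.mean((H @ y - x_train) ** 2)

# Greedy outer loop over the discrete design space.
chosen = []
for _ in range(k):
    best = min((r for r in range(n_cand) if r not in chosen),
               key=lambda r: risk(chosen + [r]))
    chosen.append(best)
# chosen now holds the k selected experiment indices.
```

In the linear-Gaussian case the inner solve is closed-form, which is what makes the linear design problem tractable; for nonlinear forward models the inner problem itself requires iterative optimization, hence the stochastic programming techniques discussed in the talk.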
Model reduced variational data assimilation: An ensemble approach to model calibration
Data assimilation methods are used to combine the results of a large-scale numerical model with the available measurement information in order to obtain an optimal reconstruction of the dynamic behavior of the model state. Variational data assimilation, or "the adjoint method", has been used very often for data assimilation. This approach is especially attractive for model calibration problems in reservoir models. Using the available data, the uncertain permeability parameters in the model are identified by minimizing a cost function that measures the difference between the model results and the data. In order to obtain a computationally efficient procedure, the minimization is performed with a gradient-based algorithm, where the gradient is determined by solving the adjoint problem.
Variational data assimilation requires the implementation of the adjoint model. Even with the adjoint compilers that have recently become available, this is a tremendous programming effort, which hampers new applications of the method. We therefore propose another approach to variational data assimilation, based on model reduction, that does not require the implementation of the adjoint of (the tangent linear approximation of) the original model. Model-reduced variational data assimilation uses a proper orthogonal decomposition (POD) approach to determine a reduced model for the tangent linear approximation of the original nonlinear forward model. Linearisation is carried out using a numerical finite difference procedure. This is computationally feasible because it can be done in reduced space. Once the reduced model is available, its adjoint can be implemented very easily and the minimization can be carried out entirely in reduced space at negligible computational cost. If necessary, the procedure can be repeated a few times by generating new ensembles closer to the most recent estimate of the parameters.
In many reservoir models the adjoint states can be computed relatively easily and only the Jacobians with respect to the parameters are hard to obtain. This makes a balanced POD-based method very attractive.
In the presentation we will introduce the model-reduced variational data assimilation approach. The characteristics and performance of the method will be illustrated with a number of real-life data assimilation applications to reservoir models.
Accelerated Markov Chain Monte Carlo Methods for Bayesian Inversion
The Bayesian approach to inverse problems, in which the posterior probability distribution on an unknown field is sampled for the purposes of computing posterior expectations of quantities of interest, is starting to become computationally feasible for partial differential equation (PDE) inverse problems.
We study Bayesian inversion for a model elliptic PDE with an unknown diffusion coefficient. We first present the "plain" MCMC method based on combining MCMC sampling with linear-complexity multilevel solvers for the elliptic PDE. Our work-versus-accuracy bounds show that the complexity of this approach can be quite prohibitive. Two strategies for reducing the computational complexity are then proposed: first, a sparse, parametric and deterministic generalized polynomial chaos (gpc) "surrogate" representation of the forward response map of the PDE over the entire parameter space, and, second, a novel multilevel Markov chain Monte Carlo strategy which utilizes sampling from a multilevel discretization of the posterior and of the forward PDE.
For both of these strategies we provide asymptotic bounds on work versus accuracy, and hence asymptotic bounds on the computational complexity of the algorithms. We show that these methods can achieve the same level of accuracy as the plain MCMC method with a substantial reduction in complexity.
This is joint work with Christoph Schwab (ETH) and Andrew Stuart (Warwick).
Real-time Ensemble Control with Reduced-Order Modeling
The control of spatially distributed systems is often complicated by significant uncertainty about system inputs, both time-varying exogenous inputs and time-invariant parameters. Spatial variations of uncertain parameters can be particularly problematic in geoscience applications, making it difficult to forecast the impact of proposed controls. One of the most effective ways to deal with uncertainties in control problems is to incorporate periodic measurements of the system's states into the control process. Stochastic control provides a convenient way to do this, by integrating uncertainty, monitoring, forecasting, and control in a consistent analytical framework. This paper describes an ensemble-based approach to closed-loop stochastic control that relies on a computationally efficient reduced-order model. The use of ensembles of uncertain parameters and states makes it possible to consider a range of probabilistic performance objectives and to derive real-time controls that explicitly account for uncertainty. The process divides naturally into measurement updating, control, and forecasting steps carried out recursively and initialized with a prior ensemble that describes parameter uncertainty. The performance of the ensemble controller is investigated here with a numerical experiment based on a solute transport control problem. This experiment evaluates the performance of open- and closed-loop controllers with full- and reduced-order models, as well as the performance obtained with a controller based on perfect knowledge of the system and the nominal performance obtained with no control. The experimental results show that a closed-loop controller that relies on measurements consistently performs better than an open-loop controller that does not. They also show that a reduced-order forecasting model based on extensive off-line simulations gives nearly the same performance as a significantly more computationally expensive full-order model.
Finally, the experiment indicates that a moderate penalty on the variance of control cost yields a robust control strategy that reduces uncertainty about system performance with little or no increase in average cost. Taken together, these results confirm that reduced-order ensemble closed-loop control is a flexible and efficient control option for uncertain spatially distributed systems.
Minimization for generating an ensemble of conditional realizations from an ensemble of unconditional realizations
Ensemble-based methods of data assimilation, such as the ensemble Kalman filter (EnKF) and the ensemble smoother (ES), rely on an ensemble of samples to provide a Monte Carlo representation of the probability density. Each ensemble member is ideally an independent sample from the appropriate probability density. The objective of the analysis step in data assimilation is to update each ensemble member such that the resulting ensemble provides a Monte Carlo representation of the posterior pdf, i.e., after updating, each ensemble member is an independent sample from the posterior. Although Bayes' rule specifies how to update the pdf when data are assimilated, it does not specify how individual ensemble members should be updated; thus, while it is often relatively easy to sample from the prior, it is much more difficult to sample from the posterior. Here we look at the problem of transforming samples from the prior into samples from the posterior. Although there are exact methods for sampling from the posterior (e.g. Markov chain Monte Carlo or acceptance/rejection), these methods tend to be computationally demanding when evaluation of the likelihood function is expensive, as it is for most geoscience applications.
As an alternative, in this talk we seek a deterministic mapping of variables distributed according to the prior to variables distributed according to the posterior. The mapping can then be used for ensemble-based data assimilation to transform samples from the prior distribution to samples from the posterior distribution. Although any deterministic mapping might be equally useful, we will focus on mappings with the particular property of minimizing the expected value of the distance between the variables.
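In the Gaussian case the distance-minimizing map is available in closed form, which makes a useful sanity check: between two 1-D Gaussians the monotone (optimal transport) map is affine. A toy sketch under that assumption, with all numbers illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

# Scalar toy problem: prior m ~ N(0, 1), observation d = m + noise,
# noise ~ N(0, sd_obs^2). Numbers are illustrative.
sd_obs = 0.5
d = 1.2 + rng.normal(0.0, sd_obs)       # data from a "true" m = 1.2

# Conjugate posterior N(mu, s2).
s2 = 1.0 / (1.0 + 1.0 / sd_obs ** 2)
mu = s2 * d / sd_obs ** 2

# The monotone map between two 1-D Gaussians is affine:
#   T(m) = mu + sqrt(s2 / 1.0) * (m - 0.0)
# Applying T to prior samples yields exact posterior samples.
prior_samples = rng.standard_normal(100000)
post_samples = mu + np.sqrt(s2) * prior_samples
```

In higher dimensions and for non-Gaussian posteriors no such closed form exists, and constructing a map of this kind is precisely the problem addressed in the talk.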
Assimilation of data by a Multi-scale Ensemble Kalman Filter
The ensemble Kalman filter (EnKF) provides an approximate Monte Carlo solution to forecasting in hidden Markov chain models. Under Gauss-linear assumptions the EnKF is asymptotically correct as the number of ensemble members tends towards infinity. For finite-ensemble filters many complications are reported, even under Gauss-linear model assumptions. Some finite-sample results that shed light on these complications will be presented, with a focus on uncertainty quantification.
The number of ensemble members is critical for the performance of the filter. For large-scale problems with computationally demanding forward models, computationally efficient simplified surrogate models are often used in order to allow larger ensemble sizes. The introduction of these simplified models does not come without bias and precision costs, however. The multi-scale EnKF is designed to quantify these costs and to provide more representative forecasts and forecast intervals. The methodological basis of this multi-scale EnKF will be presented and discussed. The effect of using the filter will be demonstrated on a reservoir production conditioning problem using a synthetic petroleum reservoir model.
Speeding up convergence to equilibrium for diffusion processes
In many applications it is necessary to sample from a probability distribution that is known only up to a constant. A standard technique for doing this is to simulate a stochastic differential equation whose invariant measure is the probability measure from which we want to sample. There are (infinitely) many different diffusion processes that have the same invariant distribution. It is natural to choose a diffusion process that converges as quickly as possible to equilibrium, since fast convergence to equilibrium reduces the computational cost. In this talk we will present some recent results on optimizing the rate of convergence to equilibrium by adding an appropriate non-reversible perturbation to the standard reversible overdamped Langevin dynamics. This is joint work with T. Lelièvre and F. Nier.
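A minimal sketch of the construction for a Gaussian target (the antisymmetric matrix J and its strength gamma are illustrative choices, not from the talk): the added drift is of the form -gamma J grad V with J antisymmetric, which leaves the invariant measure unchanged while potentially accelerating convergence.

```python
import numpy as np

rng = np.random.default_rng(7)

# Target pi(x) ~ exp(-V(x)) with V(x) = |x|^2 / 2 (standard 2-D Gaussian).
# Reversible dynamics:     dX = -grad V(X) dt + sqrt(2) dW.
# Non-reversible version:  dX = -(I + gamma * J) grad V(X) dt + sqrt(2) dW,
# with J antisymmetric; both have pi as invariant measure.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
gamma = 3.0
A = np.eye(2) + gamma * J

def simulate(n_steps=100000, dt=0.01):
    """Euler-Maruyama discretization of the non-reversible dynamics."""
    x = np.zeros(2)
    out = np.empty((n_steps, 2))
    for i in range(n_steps):
        x = x - (A @ x) * dt + np.sqrt(2.0 * dt) * rng.standard_normal(2)
        out[i] = x
    return out

X = simulate()
# Long-run moments should match the N(0, I) target (up to discretization
# and Monte Carlo error), confirming the perturbation leaves pi invariant.
```

The rotational drift gamma J grad V pushes the process around level sets of V rather than against the target density, which is the mechanism behind the improved rate of convergence analysed in the talk.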
Multilevel Markov Chain Monte Carlo Methods
One of the key tasks in many areas of subsurface flow, most notably in radioactive waste disposal and oil recovery, is an efficient treatment of data uncertainties and the quantification of how these uncertainties propagate through the system. Similar questions arise also in other areas of science and engineering such as weather and climate prediction or manufacturing. The mathematical challenge associated with these questions is the solution of high-dimensional quadrature problems with integrands that involve the solution of PDEs with random coefficients.
Due to the heterogeneity of the subsurface and the complexity of the flow, each realisation of the integrand is very costly to compute, and so it is paramount to make existing uncertainty quantification tools more efficient. Recent advances in Monte Carlo type methods, based on deterministic Quasi-Monte Carlo sampling rules and on multilevel approaches, provide unprecedented opportunities for accurate uncertainty analyses in realistic three-dimensional subsurface flow applications.
In this talk we show how the multilevel framework can also be extended to Bayesian inverse problems. In particular, we present an abstract multilevel Metropolis-Hastings algorithm and carry out a complete analysis for a typical model problem in subsurface flow. As in the case of the forward problem, the analysis reduces to classical questions in regularity and finite element approximation error analysis. We can show significant gains over the standard Metropolis-Hastings estimator.
Numerical experiments confirm the analysis and demonstrate the effectiveness of the method with consistent reductions in computational time (for the same accuracy) of a factor of O(10-100).
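The telescoping idea underlying the multilevel estimator is easy to demonstrate in its plain Monte Carlo form. The sketch below is a toy (a hand-built level hierarchy whose discretization error decays like 2^(-l)); the multilevel Metropolis-Hastings algorithm replaces the i.i.d. sampling with Markov chains coupled across levels:

```python
import numpy as np

rng = np.random.default_rng(8)

def Q(theta, level):
    """Toy level-l approximation of a quantity of interest: the exact
    value is theta**2 and the discretization error decays like 2**(-l).
    In practice each level is a PDE solve whose cost grows with l."""
    return theta ** 2 + 2.0 ** (-level) * np.sin(50.0 * theta)

L = 5
# Most samples on the cheap coarse level, few on the expensive fine ones.
samples_per_level = [4000 // 2 ** l + 100 for l in range(L + 1)]

# Telescoping estimator: E[Q_L] = E[Q_0] + sum_l E[Q_l - Q_{l-1}].
est = 0.0
for l, n_l in enumerate(samples_per_level):
    theta = rng.standard_normal(n_l)
    if l == 0:
        est += np.mean(Q(theta, 0))
    else:
        est += np.mean(Q(theta, l) - Q(theta, l - 1))

print(est)   # close to E[theta^2] = 1
```

Because the level corrections Q_l - Q_{l-1} have rapidly shrinking variance, accuracy is set by the fine level while almost all of the work is done on the coarse one; this is the source of the O(10-100) savings reported above.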
Sparse quadrature for Bayesian inverse problems
We present a novel, quadrature-based, deterministic computational approach to Bayesian estimation for a broad class of inverse problems with uncertain, distributed parameters in ordinary or partial differential equations.
Based on Bayes’ theorem in function spaces, we present new sparsity results for the Bayesian posterior density and an adaptive Smolyak algorithm for the efficient, deterministic approximation of the infinite-dimensional integrals with respect to the posterior measure arising in Bayesian estimation.
Convergence rates for the adaptive Smolyak quadrature approximation are shown, computationally, to coincide with the best N-term approximation rates of the Bayesian posterior density, which, in turn, are proved [7, 6] to depend only on the sparsity class of the uncertain distributed parameter, but are bounded independently of the number of uncertain parameters activated by the adaptive algorithm.
Computational savings afforded by adaptivity in the forward solver are analyzed. Applications are studied, ranging from high-dimensional, parametric initial value problems from the biological systems sciences, to diffusion problems with lognormal Gaussian coefficients, and to problems with multiple spatial length-scales. Both intrusive as well as nonintrusive methods are considered. The limit of vanishing observation noise variance is also analysed.
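In one dimension the essential computation is a ratio of two prior-weighted integrals evaluated by a deterministic quadrature rule; the Smolyak construction then sparsifies the tensorization of such univariate rules to many dimensions. A toy conjugate-Gaussian sketch (the problem is chosen so the exact answer is known; it is not from the papers below):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Posterior expectation as a ratio of deterministic quadratures:
#   E[theta | d] = (1/Z) * Int theta L(d|theta) pi(theta) dtheta,
#   Z            =         Int L(d|theta) pi(theta) dtheta,
# with prior pi = N(0, 1) and a Gaussian likelihood (toy, conjugate setup).
d, sd = 0.8, 1.0
def likelihood(theta):
    return np.exp(-0.5 * ((d - theta) / sd) ** 2)

nodes, weights = hermegauss(40)          # probabilists' Gauss-Hermite rule
w = weights / np.sqrt(2.0 * np.pi)       # normalize: weights now sum to 1

Z = np.sum(w * likelihood(nodes))                       # evidence
post_mean = np.sum(w * nodes * likelihood(nodes)) / Z   # E[theta | d]

exact = d / (1.0 + sd ** 2)   # conjugate Gaussian posterior mean
```

The 40-node rule resolves this smooth integrand to near machine precision; the adaptive Smolyak algorithm automates the choice of which parameter dimensions and quadrature levels to refine, exploiting exactly the posterior sparsity established in the cited results.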
Work supported by the ERC under AdG 247277.
[1] M. Eigel, C.J. Gittelson, Ch. Schwab and E. Zander, Adaptive stochastic Galerkin FEM, Report 2013-01, Seminar for Applied Mathematics, ETH Zürich, 2013.
[2] A. Chkifa, A. Cohen, R. DeVore and Ch. Schwab, Sparse adaptive Taylor approximation algorithms for parametric and stochastic elliptic PDEs, ESAIM: Mathematical Modelling and Numerical Analysis, 47 (2013), 253–280.
[3] M. Hansen, Cl. Schillings and Ch. Schwab, Sparse approximation algorithms for high dimensional parametric initial value problems, Report 2013-10, Seminar for Applied Mathematics, ETH Zürich, 2013 (to appear in Proc. HPSC5, Hanoi, 2013, Springer LNCSE).
[4] Cl. Schillings and Ch. Schwab, Sparse, adaptive Smolyak algorithms for Bayesian inverse problems, Report 2012-37, Seminar for Applied Mathematics, ETH Zürich; Inverse Problems (2013), in press.
[5] Cl. Schillings, A note on sparse, adaptive Smolyak quadratures for Bayesian inverse problems, Report 2013-06, Seminar for Applied Mathematics, ETH Zürich; Oberwolfach Reports (2013), to appear.
[6] Cl. Schillings and Ch. Schwab, Sparsity in Bayesian inverse problems, in preparation, 2013.
[7] Ch. Schwab and A.M. Stuart, Sparse deterministic approximation of Bayesian inverse problems, Inverse Problems 28(4) (2012), doi:10.1088/0266-5611/28/4/045003.
[8] A.M. Stuart, Inverse problems: a Bayesian perspective, Acta Numerica (2010).