@@ -184,8 +184,8 @@ A MAP estimate, which is only representative for a tiny fraction of the paramete
This causes MAP signal estimates to be more prone to overfitting the noise as well as to perception thresholds than methods that take volume effects into account.
Variational Inference
---------------------
Metric Gaussian Variational Inference
-------------------------------------
One method that takes volume effects into account is Variational Inference (VI).
In VI, the posterior :math:`\mathcal{P}(\xi|d)` is approximated by a simpler, parametrized distribution, often a Gaussian :math:`\mathcal{Q}(\xi)=\mathcal{G}(\xi-m,D)`.
...
...
@@ -224,11 +224,33 @@ Thus, only the gradient of the KL is needed with respect to this, which can be e
We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution.
The particular structure of the covariance allows us to draw independent samples solving a certain system of equations.
This KL-divergence for MGVI is implemented in the class :class:`~nifty.7minimization.metric_gaussian_kl.MetricGaussianKL` within NIFTy7.
This KL-divergence for MGVI is implemented by
:func:`~nifty7.minimization.kl_energies.MetricGaussianKL` within NIFTy7.
It should be noted that MGVI can typically only provide a lower bound on the variance.
The demo `getting_started_3.py` for example not only infers a field this way, but also the power spectrum of the process that has generated the field.
The cross-correlation of field and power spectrum is taken care of in this process.
Posterior samples can be obtained to study this cross-correlation.
Geometric Variational Inference
-------------------------------
It should be noted that MGVI, as any VI method, can typically only provide a lower bound on the variance.
For non-linear posterior distributions :math:`\mathcal{P}(\xi|d)` an approximation with a Gaussian :math:`\mathcal{Q}(\xi)` in the coordinates :math:`\xi` is sub-optimal, as higher order interactions are ignored.
A better approximation can be achieved by constructing a coordinate system :math:`y = g\left(\xi\right)` in which the posterior is close to a Gaussian, and perform VI with a Gaussian :math:`\mathcal{Q}(y)` in these coordinates.
One useful coordinate system is obtained in case the metric :math:`M` of the posterior can be expressed as the pullback of the Euclidean metric using an invertible transformation :math:`g`:
.. math::
M = \left(\frac{\partial g}{\partial \xi}\right)^T \frac{\partial g}{\partial \xi} \ .
In general, such a transformation can only be constructed locally, i.E. in a neighbourhood of some expansion point :math:`\bar{\xi}`, denoted as :math:`g_{\bar{\xi}}\left(\xi\right)`. Using :math:`g_{\bar{\xi}}`, the Geometric Variational Inference [6]_ (GeoVI) scheme uses a zero mean, unit Gaussian :math:`\mathcal{Q}(y) = \mathcal{G}(y, 1)` approximation, which can be expressed in :math:`\xi` coordinates via the pushforward using the inverse transformation :math:`\xi = g_{\bar{\xi}}^{-1}(y)`:
The remaining task in geoVI is to obtain the optimal expansion point :math:`\bar{\xi}` suth that :math:`\mathcal{Q}_{\bar{\xi}}` matches the posterior as good as possible. Analogous to the MGVI algorithm, :math:`\bar{\xi}` is obtained by minimization of the KL-divergenge between :math:`\mathcal{P}` and :math:`\mathcal{Q}_{\bar{\xi}}` w.r.t. :math:`\bar{\xi}`. Furthermore the KL is represented as a stochastic estimate using a set of samples drawn from :math:`\mathcal{Q}_{\bar{\xi}}` which is implemented in NIFTy7 via :func:`~nifty7.minimization.kl_energies.GeoMetricKL`.
A visual comparison of the MGVI and GeoVI algorithm can be found in the demo script `mgvi_visualized.py`.
.. [6] P. Frank, R. Leike, and T.A. Enßlin (2021), "Geometric Variational Inference"; `[arXiv:2105.10470] <https://arxiv.org/abs/2105.10470>`_