Commit cc077788 authored by Philipp Frank, committed by Philipp Arras

documentation

parent 10e967de
Pipeline #102600 passed with stages
in 13 minutes and 54 seconds
......@@ -369,17 +369,17 @@ tackling new IFT problems. An example of concrete energy classes delivered with
NIFTy7 is :class:`~minimization.quadratic_energy.QuadraticEnergy` (with
position-independent metric, mainly used with conjugate gradient minimization).
For MGVI, NIFTy provides the :class:`~energy.Energy` subclass
:class:`~minimization.metric_gaussian_kl.MetricGaussianKL`,
which computes the sampled estimated of the KL divergence, its gradient and the
Fisher metric. The constructor of
:class:`~minimization.metric_gaussian_kl.MetricGaussianKL` requires an instance
For MGVI and GeoVI, NIFTy provides the constructors
:func:`~minimization.kl_energies.MetricGaussianKL` and
:func:`~minimization.kl_energies.GeoMetricKL`, respectively, which instantiate an
object containing the sampled estimate of the KL divergence, its gradient and the
Fisher metric. These constructors require an instance
of :class:`~operators.energy_operators.StandardHamiltonian`, an operator to
compute the negative log-likelihood of the problem in standardized coordinates
at a given position in parameter space.
Finally, the :class:`~operators.energy_operators.StandardHamiltonian`
can be constructed from the likelihood, represented by an
:class:`~operators.energy_operators.EnergyOperator` instance.
can be constructed from the likelihood, represented by a
:class:`~operators.energy_operators.LikelihoodOperator` instance.
Several commonly used forms of the likelihoods are already provided in
NIFTy, such as :class:`~operators.energy_operators.GaussianEnergy`,
:class:`~operators.energy_operators.PoissonianEnergy`,
......
......@@ -184,8 +184,8 @@ A MAP estimate, which is only representative for a tiny fraction of the paramete
This causes MAP signal estimates to be more prone to overfitting the noise as well as to perception thresholds than methods that take volume effects into account.
Variational Inference
---------------------
Metric Gaussian Variational Inference
-------------------------------------
One method that takes volume effects into account is Variational Inference (VI).
In VI, the posterior :math:`\mathcal{P}(\xi|d)` is approximated by a simpler, parametrized distribution, often a Gaussian :math:`\mathcal{Q}(\xi)=\mathcal{G}(\xi-m,D)`.
......@@ -224,11 +224,33 @@ Thus, only the gradient of the KL is needed with respect to this, which can be e
We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution.
The particular structure of the covariance allows us to draw independent samples solving a certain system of equations.
This KL-divergence for MGVI is implemented in the class :class:`~nifty.7minimization.metric_gaussian_kl.MetricGaussianKL` within NIFTy7.
This KL-divergence for MGVI is implemented by
:func:`~nifty7.minimization.kl_energies.MetricGaussianKL` within NIFTy7.
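As a rough illustration of the sample-based KL estimate described above, the following pure-Python sketch applies the idea to a one-dimensional toy problem. It does not use NIFTy. The Hamiltonian, the metric, the data value, the noise level, and all function names are assumptions made for this example, and the inner minimization is a plain golden-section search rather than one of the minimizers NIFTy provides.

```python
import math
import random

D_OBS = 2.0  # hypothetical data value (assumption)
SIGMA = 0.5  # hypothetical noise standard deviation (assumption)


def hamiltonian(xi):
    """Toy negative log-posterior (up to a constant): unit Gaussian prior
    on xi and Gaussian likelihood for d = exp(xi) + noise."""
    return 0.5 * xi**2 + 0.5 * ((D_OBS - math.exp(xi)) / SIGMA) ** 2


def metric(xi):
    """Prior contribution (1) plus the Fisher metric of the toy likelihood."""
    return 1.0 + (math.exp(xi) / SIGMA) ** 2


def draw_residuals(m, n_samples, rng):
    """Draw residuals from G(0, metric(m)^-1) and append their mirror images."""
    std = 1.0 / math.sqrt(metric(m))
    res = [rng.gauss(0.0, std) for _ in range(n_samples)]
    return res + [-r for r in res]


def kl_estimate(m, residuals):
    """Sample average of the Hamiltonian; terms independent of m are dropped."""
    return sum(hamiltonian(m + r) for r in residuals) / len(residuals)


def golden_section_minimize(fun, lo, hi, tol=1e-10):
    """Minimize a scalar function on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = fun(c), fun(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = fun(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = fun(d)
    return 0.5 * (a + b)


def mgvi_toy(m0=0.5, n_samples=50, n_iterations=5, seed=0):
    """Alternate between drawing residuals at the current mean and
    minimizing the sampled KL with respect to the mean."""
    rng = random.Random(seed)
    m = m0
    for _ in range(n_iterations):
        residuals = draw_residuals(m, n_samples, rng)
        m = golden_section_minimize(
            lambda x: kl_estimate(x, residuals), -2.0, 2.0
        )
    return m


if __name__ == "__main__":
    print("approximate posterior mean:", mgvi_toy())
```

The residuals are held fixed during each inner minimization and redrawn afterwards, which mirrors the alternation between sampling and optimization outlined in the text. Mirroring the residuals removes odd-order sampling noise from the estimate.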
It should be noted that MGVI can typically only provide a lower bound on the variance.
The demo `getting_started_3.py` for example not only infers a field this way, but also the power spectrum of the process that has generated the field.
The cross-correlation of field and power spectrum is taken care of in this process.
Posterior samples can be obtained to study this cross-correlation.
Geometric Variational Inference
-------------------------------
It should be noted that MGVI, like any VI method, can typically only provide a lower bound on the variance.
For non-linear posterior distributions :math:`\mathcal{P}(\xi|d)`, an approximation with a Gaussian :math:`\mathcal{Q}(\xi)` in the coordinates :math:`\xi` is sub-optimal, as higher-order interactions are ignored.
A better approximation can be achieved by constructing a coordinate system :math:`y = g\left(\xi\right)` in which the posterior is close to a Gaussian, and performing VI with a Gaussian :math:`\mathcal{Q}(y)` in these coordinates.
One useful coordinate system is obtained when the metric :math:`M` of the posterior can be expressed as the pullback of the Euclidean metric under an invertible transformation :math:`g`:

.. math::

    M = \left(\frac{\partial g}{\partial \xi}\right)^T \frac{\partial g}{\partial \xi} \ .

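The pullback relation can be checked numerically in one dimension, where a suitable transformation always exists globally: :math:`g(\xi) = \int_{\bar{\xi}}^{\xi} \sqrt{M(t)}\,\mathrm{d}t`. The sketch below is only an illustration under stated assumptions. The toy metric, the noise level, and the function names are hypothetical and not part of NIFTy, and the construction does not carry over to higher dimensions.

```python
import math

SIGMA = 0.5  # hypothetical noise standard deviation (assumption)


def metric(xi):
    """Toy 1D metric: unit prior term plus the Fisher term of a Gaussian
    likelihood with response exp(xi)."""
    return 1.0 + (math.exp(xi) / SIGMA) ** 2


def g(xi, xi_bar=0.0, n=2000):
    """g(xi) = integral of sqrt(M) from xi_bar to xi, computed with the
    composite Simpson rule (n must be even)."""
    h = (xi - xi_bar) / n
    total = math.sqrt(metric(xi_bar)) + math.sqrt(metric(xi))
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.sqrt(metric(xi_bar + k * h))
    return total * h / 3.0


def pullback_metric(xi, eps=1e-4):
    """(dg/dxi)^T (dg/dxi) in 1D, using a central finite difference."""
    dg = (g(xi + eps) - g(xi - eps)) / (2.0 * eps)
    return dg * dg


if __name__ == "__main__":
    for xi in (-1.0, 0.3, 1.2):
        print(xi, metric(xi), pullback_metric(xi))
```

Up to discretization error, the two printed values per row agree, which is the statement that :math:`M` is the pullback of the Euclidean metric under :math:`g`.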
In general, such a transformation can only be constructed locally, i.e., in a neighbourhood of some expansion point :math:`\bar{\xi}`; it is denoted :math:`g_{\bar{\xi}}\left(\xi\right)`. Using :math:`g_{\bar{\xi}}`, the Geometric Variational Inference [6]_ (GeoVI) scheme approximates the posterior with a zero-mean, unit-covariance Gaussian :math:`\mathcal{Q}(y) = \mathcal{G}(y, 1)`, which can be expressed in :math:`\xi` coordinates via the pushforward under the inverse transformation :math:`\xi = g_{\bar{\xi}}^{-1}(y)`:

.. math::

    \mathcal{Q}_{\bar{\xi}}(\xi) = \left(g_{\bar{\xi}}^{-1} * \mathcal{Q}\right)(\xi) = \int \delta\left(\xi - g_{\bar{\xi}}^{-1}(y)\right) \ \mathcal{G}(y, 1) \ \mathcal{D}y \ ,

where :math:`\delta` denotes the Dirac delta distribution.
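In practice, the pushforward amounts to drawing :math:`y` from a unit Gaussian and mapping it back through the inverse transformation. The following sketch illustrates this for a one-dimensional toy metric, where :math:`g_{\bar{\xi}}` can be written as an integral of :math:`\sqrt{M}` and inverted by bisection. The metric, the expansion point, and all names are assumptions for this example; this is not how NIFTy constructs or inverts the transformation.

```python
import math
import random

SIGMA = 0.5   # hypothetical noise standard deviation (assumption)
XI_BAR = 0.5  # hypothetical expansion point (assumption)


def metric(xi):
    """Toy 1D metric: unit prior term plus a Fisher term for response exp(xi)."""
    return 1.0 + (math.exp(xi) / SIGMA) ** 2


def g(xi, n=200):
    """Integral of sqrt(M) from XI_BAR to xi (composite Simpson rule),
    so that g(XI_BAR) = 0."""
    h = (xi - XI_BAR) / n
    total = math.sqrt(metric(XI_BAR)) + math.sqrt(metric(xi))
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.sqrt(metric(XI_BAR + k * h))
    return total * h / 3.0


def g_inverse(y, n_bisect=60):
    """Solve g(xi) = y by bisection. Because dg/dxi >= 1 for this metric,
    the root lies within |y| of XI_BAR, so the bracket below contains it."""
    lo = XI_BAR - abs(y) - 1.0
    hi = XI_BAR + abs(y) + 1.0
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_pushforward(n_samples, seed=0):
    """Draw y from G(y, 1) and map each draw to xi = g^{-1}(y)."""
    rng = random.Random(seed)
    return [g_inverse(rng.gauss(0.0, 1.0)) for _ in range(n_samples)]


if __name__ == "__main__":
    samples = sample_pushforward(100)
    print("sample mean in xi coordinates:", sum(samples) / len(samples))
```

Because the toy metric grows with :math:`\xi`, equal steps in :math:`y` correspond to larger steps in :math:`\xi` below the expansion point than above it, so the resulting distribution in :math:`\xi` is skewed and hence non-Gaussian.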
The remaining task in GeoVI is to obtain the optimal expansion point :math:`\bar{\xi}` such that :math:`\mathcal{Q}_{\bar{\xi}}` matches the posterior as well as possible. Analogous to the MGVI algorithm, :math:`\bar{\xi}` is obtained by minimizing the KL-divergence between :math:`\mathcal{P}` and :math:`\mathcal{Q}_{\bar{\xi}}` with respect to :math:`\bar{\xi}`. Furthermore, the KL is represented as a stochastic estimate using a set of samples drawn from :math:`\mathcal{Q}_{\bar{\xi}}`; this is implemented in NIFTy7 via :func:`~nifty7.minimization.kl_energies.GeoMetricKL`.
A visual comparison of the MGVI and GeoVI algorithms can be found in the demo script `mgvi_visualized.py`.
.. [6] P. Frank, R. Leike, and T.A. Enßlin (2021), "Geometric Variational Inference"; `[arXiv:2105.10470] <https://arxiv.org/abs/2105.10470>`_
......@@ -90,7 +90,7 @@ class LikelihoodOperator(EnergyOperator):
def get_metric_at(self, x):
"""Computes the Fisher information metric for a `LikelihoodOperator`
at `x` using the Jacobian of the coordinate transformation given by
`get_transformation`.
:func:`~nifty7.operators.operator.Operator.get_transformation`.
"""
dtp, f = self.get_transformation()
ch = ScalingOperator(f.target, 1.)
......
......@@ -110,17 +110,15 @@ class Operator(metaclass=NiftyMeta):
def get_transformation(self):
"""The coordinate transformation that maps into a coordinate system
where the metric of a likelihood is the Euclidean metric.
This is `None`, except when the object is considered a likelihood i.E.
for an instance of `EnergyOperator` with its metric being a proper
Fisher information metric, or a sum or nested sum thereof.
This is `None`, except when the object is an instance of
:class:`~nifty7.operators.energy_operators.LikelihoodOperator` or a
(nested) sum thereof.
Retruns
Returns
-------
np.dtype, or dict of np.dtype : The dtype(s) of the target space of the
transformation.
np.dtype, or dict of np.dtype : The dtype(s) of the target space of the transformation.
Operator : The transformation that maps from `domain` into the
Euclidean target space.
Operator : The transformation that maps from `domain` into the Euclidean target space.
"""
return None
......