Commit cc077788 authored by Philipp Frank, committed by Philipp Arras

documentation

parent 10e967de
...@@ -369,17 +369,17 @@ tackling new IFT problems. An example of concrete energy classes delivered with ...@@ -369,17 +369,17 @@ tackling new IFT problems. An example of concrete energy classes delivered with
NIFTy7 is :class:`~minimization.quadratic_energy.QuadraticEnergy` (with NIFTy7 is :class:`~minimization.quadratic_energy.QuadraticEnergy` (with
position-independent metric, mainly used with conjugate gradient minimization). position-independent metric, mainly used with conjugate gradient minimization).
For MGVI, NIFTy provides the :class:`~energy.Energy` subclass For MGVI and GeoVI, NIFTy provides the constructors
:class:`~minimization.metric_gaussian_kl.MetricGaussianKL`, :func:`~minimization.kl_energies.MetricGaussianKL` and
which computes the sampled estimated of the KL divergence, its gradient and the :func:`~minimization.kl_energies.GeoMetricKL`, respectively, which instantiate an
Fisher metric. The constructor of object containing the sampled estimate of the KL divergence, its gradient and the
:class:`~minimization.metric_gaussian_kl.MetricGaussianKL` requires an instance Fisher metric. These constructors require an instance
of :class:`~operators.energy_operators.StandardHamiltonian`, an operator to of :class:`~operators.energy_operators.StandardHamiltonian`, an operator to
compute the negative log-likelihood of the problem in standardized coordinates compute the negative log-likelihood of the problem in standardized coordinates
at a given position in parameter space. at a given position in parameter space.
Finally, the :class:`~operators.energy_operators.StandardHamiltonian` Finally, the :class:`~operators.energy_operators.StandardHamiltonian`
can be constructed from the likelihood, represented by an can be constructed from the likelihood, represented by a
:class:`~operators.energy_operators.EnergyOperator` instance. :class:`~operators.energy_operators.LikelihoodOperator` instance.
Several commonly used forms of the likelihoods are already provided in Several commonly used forms of the likelihoods are already provided in
NIFTy, such as :class:`~operators.energy_operators.GaussianEnergy`, NIFTy, such as :class:`~operators.energy_operators.GaussianEnergy`,
:class:`~operators.energy_operators.PoissonianEnergy`, :class:`~operators.energy_operators.PoissonianEnergy`,
......
...@@ -184,8 +184,8 @@ A MAP estimate, which is only representative for a tiny fraction of the paramete ...@@ -184,8 +184,8 @@ A MAP estimate, which is only representative for a tiny fraction of the paramete
This causes MAP signal estimates to be more prone to overfitting the noise as well as to perception thresholds than methods that take volume effects into account. This causes MAP signal estimates to be more prone to overfitting the noise as well as to perception thresholds than methods that take volume effects into account.
Variational Inference Metric Gaussian Variational Inference
--------------------- -------------------------------------
One method that takes volume effects into account is Variational Inference (VI). One method that takes volume effects into account is Variational Inference (VI).
In VI, the posterior :math:`\mathcal{P}(\xi|d)` is approximated by a simpler, parametrized distribution, often a Gaussian :math:`\mathcal{Q}(\xi)=\mathcal{G}(\xi-m,D)`. In VI, the posterior :math:`\mathcal{P}(\xi|d)` is approximated by a simpler, parametrized distribution, often a Gaussian :math:`\mathcal{Q}(\xi)=\mathcal{G}(\xi-m,D)`.
...@@ -224,11 +224,33 @@ Thus, only the gradient of the KL is needed with respect to this, which can be e ...@@ -224,11 +224,33 @@ Thus, only the gradient of the KL is needed with respect to this, which can be e
We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution. We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution.
The particular structure of the covariance allows us to draw independent samples solving a certain system of equations. The particular structure of the covariance allows us to draw independent samples solving a certain system of equations.
This KL-divergence for MGVI is implemented in the class :class:`~nifty.7minimization.metric_gaussian_kl.MetricGaussianKL` within NIFTy7. This KL-divergence for MGVI is implemented by
:func:`~nifty7.minimization.kl_energies.MetricGaussianKL` within NIFTy7.
It should be noted that MGVI can typically only provide a lower bound on the variance.
The demo `getting_started_3.py` for example not only infers a field this way, but also the power spectrum of the process that has generated the field. Geometric Variational Inference
The cross-correlation of field and power spectrum is taken care of in this process. -------------------------------
Posterior samples can be obtained to study this cross-correlation.
It should be noted that MGVI, as any VI method, can typically only provide a lower bound on the variance. For non-linear posterior distributions :math:`\mathcal{P}(\xi|d)` an approximation with a Gaussian :math:`\mathcal{Q}(\xi)` in the coordinates :math:`\xi` is sub-optimal, as higher order interactions are ignored.
A better approximation can be achieved by constructing a coordinate system :math:`y = g\left(\xi\right)` in which the posterior is close to a Gaussian, and performing VI with a Gaussian :math:`\mathcal{Q}(y)` in these coordinates.
One useful coordinate system is obtained in case the metric :math:`M` of the posterior can be expressed as the pullback of the Euclidean metric using an invertible transformation :math:`g`:
.. math::
M = \left(\frac{\partial g}{\partial \xi}\right)^T \frac{\partial g}{\partial \xi} \ .
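As a minimal sketch of the pullback relation above, the following pure-Python snippet builds :math:`M` from the Jacobian of a transformation and compares it with the closed-form result. The map ``g`` used here is a hypothetical two-dimensional shear chosen only for illustration; it is not the transformation used by GeoVI, and none of the helper names are part of NIFTy.

```python
def g(xi):
    """Hypothetical invertible map g: R^2 -> R^2 (illustration only)."""
    x1, x2 = xi
    return [x1, x2 + x1 * x1]


def jacobian(func, xi, eps=1e-6):
    """Central finite-difference Jacobian, J[i][j] = d func_i / d xi_j."""
    n = len(xi)
    cols = []
    for j in range(n):
        up = list(xi)
        down = list(xi)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = func(up), func(down)
        cols.append([(a - b) / (2 * eps) for a, b in zip(f_up, f_down)])
    # cols[j][i] holds d func_i / d xi_j; transpose to obtain J[i][j].
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]


def pullback_metric(func, xi):
    """Pullback of the Euclidean metric: M = J^T J."""
    J = jacobian(func, xi)
    n = len(xi)
    return [[sum(J[k][i] * J[k][j] for k in range(len(J)))
             for j in range(n)] for i in range(n)]


def analytic_metric(xi):
    """Closed form of J^T J for the shear above, J = [[1, 0], [2*x1, 1]]."""
    x1 = xi[0]
    return [[1.0 + 4.0 * x1 * x1, 2.0 * x1], [2.0 * x1, 1.0]]


if __name__ == "__main__":
    point = [0.5, -1.0]
    print(pullback_metric(g, point))
    print(analytic_metric(point))
```

By construction the resulting matrix is symmetric and positive semi-definite, as required of a metric.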
In general, such a transformation can only be constructed locally, i.e. in a neighbourhood of some expansion point :math:`\bar{\xi}`, denoted as :math:`g_{\bar{\xi}}\left(\xi\right)`. Using :math:`g_{\bar{\xi}}`, the Geometric Variational Inference [6]_ (GeoVI) scheme uses a zero mean, unit Gaussian :math:`\mathcal{Q}(y) = \mathcal{G}(y, 1)` approximation, which can be expressed in :math:`\xi` coordinates via the pushforward using the inverse transformation :math:`\xi = g_{\bar{\xi}}^{-1}(y)`:
.. math::
\mathcal{Q}_{\bar{\xi}}(\xi) = \left(g_{\bar{\xi}}^{-1} * \mathcal{Q}\right)(\xi) = \int \delta\left(\xi - g_{\bar{\xi}}^{-1}(y)\right) \ \mathcal{G}(y, 1) \ \mathcal{D}y \ ,
where :math:`\delta` denotes the Dirac delta distribution.
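The pushforward can be illustrated with a one-dimensional toy example: samples of the approximation are obtained by drawing :math:`y` from a standard Gaussian and applying the inverse transformation, and the corresponding density follows from the change-of-variables formula. The map ``g = sinh`` below is an assumption made purely for illustration (it is not the GeoVI transformation :math:`g_{\bar{\xi}}`), and all function names are hypothetical.

```python
import math
import random


def g(xi):
    """Hypothetical invertible 1-D transformation (illustration only)."""
    return math.sinh(xi)


def g_inv(y):
    """Inverse of g."""
    return math.asinh(y)


def g_prime(xi):
    """Derivative of g, needed for the change of variables."""
    return math.cosh(xi)


def standard_normal_pdf(y):
    """Density of the zero mean, unit Gaussian G(y, 1)."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def pushforward_pdf(xi):
    """Density of xi = g_inv(y) with y ~ G(y, 1): G(g(xi), 1) * |g'(xi)|."""
    return standard_normal_pdf(g(xi)) * abs(g_prime(xi))


def draw_samples(n, seed=0):
    """Draw y from the unit Gaussian and push each sample through g_inv."""
    rng = random.Random(seed)
    return [g_inv(rng.gauss(0.0, 1.0)) for _ in range(n)]


if __name__ == "__main__":
    samples = draw_samples(5)
    print(samples)
    print([pushforward_pdf(s) for s in samples])
```

Sampling only requires the inverse map, whereas evaluating the density additionally requires the derivative of ``g``.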
The remaining task in GeoVI is to obtain the optimal expansion point :math:`\bar{\xi}` such that :math:`\mathcal{Q}_{\bar{\xi}}` matches the posterior as well as possible. Analogous to the MGVI algorithm, :math:`\bar{\xi}` is obtained by minimization of the KL-divergence between :math:`\mathcal{P}` and :math:`\mathcal{Q}_{\bar{\xi}}` w.r.t. :math:`\bar{\xi}`. Furthermore, the KL is represented as a stochastic estimate using a set of samples drawn from :math:`\mathcal{Q}_{\bar{\xi}}`, which is implemented in NIFTy7 via :func:`~nifty7.minimization.kl_energies.GeoMetricKL`.
A visual comparison of the MGVI and GeoVI algorithms can be found in the demo script `mgvi_visualized.py`.
.. [6] P. Frank, R. Leike, and T.A. Enßlin (2021), "Geometric Variational Inference"; `[arXiv:2105.10470] <https://arxiv.org/abs/2105.10470>`_
...@@ -90,7 +90,7 @@ class LikelihoodOperator(EnergyOperator): ...@@ -90,7 +90,7 @@ class LikelihoodOperator(EnergyOperator):
def get_metric_at(self, x): def get_metric_at(self, x):
"""Computes the Fisher information metric for a `LikelihoodOperator` """Computes the Fisher information metric for a `LikelihoodOperator`
at `x` using the Jacobian of the coordinate transformation given by at `x` using the Jacobian of the coordinate transformation given by
`get_transformation`. :func:`~nifty7.operators.operator.Operator.get_transformation`.
""" """
dtp, f = self.get_transformation() dtp, f = self.get_transformation()
ch = ScalingOperator(f.target, 1.) ch = ScalingOperator(f.target, 1.)
......
...@@ -110,17 +110,15 @@ class Operator(metaclass=NiftyMeta): ...@@ -110,17 +110,15 @@ class Operator(metaclass=NiftyMeta):
def get_transformation(self): def get_transformation(self):
"""The coordinate transformation that maps into a coordinate system """The coordinate transformation that maps into a coordinate system
where the metric of a likelihood is the Euclidean metric. where the metric of a likelihood is the Euclidean metric.
This is `None`, except when the object is considered a likelihood i.E. This is `None`, except when the object is an instance of
for an instance of `EnergyOperator` with its metric being a proper :class:`~nifty7.operators.energy_operators.LikelihoodOperator` or a
Fisher information metric, or a sum or nested sum thereof. (nested) sum thereof.
Retruns Returns
------- -------
np.dtype, or dict of np.dtype : The dtype(s) of the target space of the np.dtype, or dict of np.dtype : The dtype(s) of the target space of the transformation.
transformation.
Operator : The transformation that maps from `domain` into the Operator : The transformation that maps from `domain` into the Euclidean target space.
Euclidean target space.
""" """
return None return None
......