**NIFTy** [1]_ [2]_ [3]_, "\ **N**\umerical **I**\nformation **F**\ield **T**\heor\ **y**\ ", is a versatile library designed to enable the development of signal inference algorithms that are independent of the underlying grids (spatial, spectral, temporal, …) and their resolutions.

Its object-oriented framework is written in Python, although it accesses libraries written in C++ and C for efficiency.

NIFTy offers a toolkit that abstracts discretized representations of continuous spaces, fields in these spaces, and operators acting on these fields into classes.

This allows for an abstract formulation and programming of inference algorithms, including those derived within information field theory (IFT).

NIFTy's interface is designed to resemble IFT formulae in the sense that the user implements algorithms in NIFTy independent of the topology of the underlying spaces and the discretization scheme.

Thus, the user can develop an algorithm on a subset of the problem, or on a space where its performance can be evaluated in detail, and then generalize it to more complex spaces and to the full problem.

The set of spaces on which NIFTy operates comprises point sets, *n*-dimensional regular grids, spherical spaces, their harmonic counterparts, and product spaces constructed as combinations of those.
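For illustration, a minimal sketch of this grid independence, assuming NIFTy 8's Python interface (package ``nifty8``; class and function names as used in the NIFTy demos):

.. code-block:: python

    import nifty8 as ift

    # A regular 2D grid; swapping this line for, e.g., ift.HPSpace(nside=64)
    # would move the same code to the sphere.
    position_space = ift.RGSpace((128, 128))
    harmonic_space = position_space.get_default_codomain()

    # A random field on the grid, and a harmonic transform acting on it;
    # neither line depends on the resolution chosen above.
    field = ift.from_random(position_space)
    HT = ift.HarmonicTransformOperator(harmonic_space, target=position_space)
    coefficients = HT.adjoint(field)  # maps the field into the harmonic domain

Changing the resolution, or the space itself, leaves the rest of such a script untouched.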

NIFTy takes care of numerical subtleties like the normalization of operations on fields and the numerical representation of model components, allowing the user to focus on formulating the abstract inference procedures and process-specific model properties.
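Examples of NIFTy applications can be found in the `NIFTy gallery (external link) <https://wwwmpa.mpa-garching.mpg.de/~ensslin/nifty-gallery/index.html>`_.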

References
----------

.. [1] Selig et al., "NIFTY - Numerical Information Field Theory. A versatile PYTHON library for signal inference", 2013, Astronomy and Astrophysics 554, 26; `[DOI] <https://ui.adsabs.harvard.edu/link_gateway/2013A&A...554A..26S/doi:10.1051/0004-6361/201321236>`_, `[arXiv:1301.4499] <https://arxiv.org/abs/1301.4499>`_

.. [2] Steininger et al., "NIFTy 3 - Numerical Information Field Theory - A Python framework for multicomponent signal inference on HPC clusters", 2017, accepted by Annalen der Physik; `[arXiv:1708.01073] <https://arxiv.org/abs/1708.01073>`_

.. [3] Arras et al., "NIFTy5: Numerical Information Field Theory v5", 2019, Astrophysics Source Code Library; `[ascl:1903.008] <http://ascl.net/1903.008>`_
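.. [4] Knollmüller and Enßlin, "Metric Gaussian Variational Inference", 2019; `[arXiv:1901.11033] <https://arxiv.org/abs/1901.11033>`_

.. [5] Frank et al., "Geometric Variational Inference", 2021; `[arXiv:2105.10470] <https://arxiv.org/abs/2105.10470>`_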

Variational Inference
---------------------

A maximum a posteriori (MAP) estimate, which is only representative for a tiny fraction of the parameter space, might be a poorer choice (with respect to an error norm) than a slightly worse location with slightly lower posterior probability that is, however, associated with a much larger volume (of nearby locations with similar probability).

This causes MAP signal estimates to be more prone to overfitting the noise, as well as to perception thresholds, than methods that take volume effects into account.

One method that takes volume effects into account is Variational Inference (VI).

In VI, the posterior :math:`\mathcal{P}(\xi|d)` is approximated by a simpler, parametrized distribution, often a Gaussian :math:`\mathcal{Q}(\xi)=\mathcal{G}(\xi-m,D)`.

The parameters of :math:`\mathcal{Q}`, the mean :math:`m` and the covariance :math:`D`, are obtained by minimizing an appropriate information distance measure between :math:`\mathcal{Q}` and :math:`\mathcal{P}`.

As a compromise between being optimal and being computationally affordable, the variational Kullback-Leibler (KL) divergence is used:
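.. math::

    \mathrm{KL}(m,D|d) = \left\langle \log \left( \frac{\mathcal{Q}(\xi)}{\mathcal{P}(\xi|d)} \right) \right\rangle_{\mathcal{Q}(\xi)} \ .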

NIFTy features two main alternatives for variational inference: Metric Gaussian Variational Inference (MGVI) and geometric Variational Inference (geoVI).

A visual comparison of the MGVI and geoVI algorithms can be found in `variational_inference_visualized.py <https://gitlab.mpcdf.mpg.de/ift/nifty/-/blob/NIFTy_8/demos/variational_inference_visualized.py>`_.

Metric Gaussian Variational Inference (MGVI)
--------------------------------------------

Minimizing the KL divergence with respect to all entries of the covariance :math:`D` is infeasible for fields.

Therefore, Metric Gaussian Variational Inference (MGVI, [4]_) approximates the posterior precision matrix :math:`D^{-1}` at the location of the current mean :math:`m` by the Bayesian Fisher information metric,
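.. math::

    M \approx \left\langle \frac{\partial \mathcal{H}(d,\xi)}{\partial \xi} \, \left( \frac{\partial \mathcal{H}(d,\xi)}{\partial \xi} \right)^\dagger \right\rangle_{\mathcal{P}(d,\xi)} \ ,

where :math:`\mathcal{H}(d,\xi) = -\log \mathcal{P}(d,\xi)` denotes the information Hamiltonian.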

In practice the average is performed over :math:`\mathcal{P}(d,\xi)\approx \mathcal{P}(d|\xi)\,\delta(\xi-m)` by evaluating the expression at the current mean :math:`m`.

This results in a Fisher information metric of the likelihood evaluated at the mean plus the prior information metric.

Therefore we will only have to infer the mean of the approximate distribution.

The only term within the KL-divergence that explicitly depends on it is the Hamiltonian of the true problem averaged over the approximation:
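.. math::

    \mathrm{KL}(m|d) \;\widehat{=}\; \left\langle \mathcal{H}(d,\xi) \right\rangle_{\mathcal{Q}(\xi)} \ ,

where :math:`\widehat{=}` denotes equality up to terms that do not depend on :math:`m`.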

We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution.

The particular structure of the covariance allows us to draw independent samples by solving a certain system of equations.

This KL-divergence for MGVI is implemented by :func:`~nifty8.minimization.kl_energies.MetricGaussianKL` within NIFTy8.

Note that MGVI typically provides only a lower bound on the variance.
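For illustration, a schematic MGVI optimization loop is sketched below. This is only a sketch: ``likelihood_energy`` is a placeholder for a likelihood energy operator built beforehand from the data and the forward model, and the exact signatures are assumed from the NIFTy demos and may differ between NIFTy versions.

.. code-block:: python

    import nifty8 as ift

    # likelihood_energy: placeholder for an energy operator representing
    # -log P(d|xi), built beforehand from data and forward model.
    # The controller steers the conjugate-gradient runs used for sampling.
    ic_samp = ift.GradientNormController(iteration_limit=100)
    hamiltonian = ift.StandardHamiltonian(likelihood_energy, ic_samp)
    position = ift.from_random(hamiltonian.domain)  # initial mean m

    minimizer = ift.NewtonCG(ift.GradientNormController(iteration_limit=10))
    for _ in range(5):  # alternate drawing samples and optimizing the mean
        kl = ift.MetricGaussianKL(position, hamiltonian,
                                  n_samples=5, mirror_samples=True)
        kl, _ = minimizer(kl)
        position = kl.position  # updated mean of the approximate Gaussian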

Geometric Variational Inference (geoVI)
---------------------------------------

For non-linear posterior distributions :math:`\mathcal{P}(\xi|d)` an approximation with a Gaussian :math:`\mathcal{Q}(\xi)` in the coordinates :math:`\xi` is sub-optimal, as higher order interactions are ignored.

A better approximation can be achieved by constructing a coordinate system :math:`y = g\left(\xi\right)` in which the posterior is close to a Gaussian, and performing VI with a Gaussian :math:`\mathcal{Q}(y)` in these coordinates.

This approach is called Geometric Variational Inference (geoVI).

It is discussed in detail in [5]_.

One useful coordinate system is obtained if the metric :math:`M` of the posterior can be expressed as the pullback of the Euclidean metric by :math:`g`:

.. math::

    M = \left(\frac{\partial g}{\partial \xi}\right)^T \frac{\partial g}{\partial \xi} \ .

In general, such a transformation exists only locally, i.e. in a neighbourhood of some expansion point :math:`\bar{\xi}`, denoted as :math:`g_{\bar{\xi}}\left(\xi\right)`.

Given :math:`g_{\bar{\xi}}`, the geoVI scheme uses a zero-mean, unit-covariance Gaussian :math:`\mathcal{Q}(y) = \mathcal{G}(y, 1)` as the approximation.

It can be expressed in :math:`\xi` coordinates via the pushforward by the inverse transformation :math:`\xi = g_{\bar{\xi}}^{-1}(y)`:
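.. math::

    \mathcal{Q}_{\bar{\xi}}(\xi) = \left( g_{\bar{\xi}}^{-1} * \mathcal{Q} \right)(\xi) = \int \delta\left(\xi - g_{\bar{\xi}}^{-1}(y)\right) \, \mathcal{G}(y, 1) \, \mathcal{D}y \ .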

geoVI chooses the expansion point :math:`\bar{\xi}` such that :math:`\mathcal{Q}_{\bar{\xi}}` matches the posterior as well as possible.

Analogous to the MGVI algorithm, :math:`\bar{\xi}` is obtained by minimization of the KL-divergence between :math:`\mathcal{P}` and :math:`\mathcal{Q}_{\bar{\xi}}` w.r.t. :math:`\bar{\xi}`.

Furthermore, the KL is represented as a stochastic estimate using a set of samples drawn from :math:`\mathcal{Q}_{\bar{\xi}}`, which is implemented in NIFTy8 via :func:`~nifty8.minimization.kl_energies.GeoMetricKL`.
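In the MGVI sketch above, switching to geoVI would then plausibly amount to constructing the KL energy with ``GeoMetricKL`` instead. Note that geoVI draws its samples by a non-linear optimization and therefore takes an additional sampling minimizer; the signature below is assumed from the NIFTy demos:

.. code-block:: python

    # Assumed API: geoVI additionally needs a minimizer for the
    # non-linear optimization that produces each sample.
    sampling_minimizer = ift.NewtonCG(
        ift.GradientNormController(iteration_limit=5))
    kl = ift.GeoMetricKL(position, hamiltonian, n_samples=5,
                         minimizer_samp=sampling_minimizer,
                         mirror_samples=True)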

Publications
------------

If you use MGVI or geoVI, the authors of the respective papers [4]_ [5]_ would greatly appreciate a citation.

The following BibTeX entries may be used when citing NIFTy-related publications:

.. parsed-literal::

    @article{nifty5,
        title={NIFTy5: Numerical Information Field Theory v5},
        author={Arras, Philipp and Baltac, Mihai and Ensslin, Torsten A and Frank, Philipp and Hutschenreuter, Sebastian and Knollmueller, Jakob and Leike, Reimar and Newrzella, Max-Niklas and Platz, Lukas and Reinecke, Martin and others},
        ...
        year={2019}
    }

.. parsed-literal::

    @software{nifty,
        author = {{Martin Reinecke, Theo Steininger, Marco Selig}},
        title = {NIFTy -- Numerical Information Field TheorY},
        ...
        date = {2018-04-05},
    }

.. parsed-literal::

    @article{nifty1,
        author = {{Selig}, M. and {Bell}, M.~R. and {Junklewitz}, H. and {Oppermann}, N. and {Reinecke}, M. and {Greiner}, M. and {Pachajoa}, C. and {En{\ss}lin}, T.~A.},
        title = "{NIFTY - Numerical Information Field Theory. A versatile PYTHON library for signal inference}",
        journal = {\aap},
        ...
        adsnote = {Provided by the SAO/NASA Astrophysics Data System}
    }

.. parsed-literal::

    @article{nifty3,
        author = {{Steininger}, T. and {Dixit}, J. and {Frank}, P. and {Greiner}, M. and {Hutschenreuter}, S. and {Knollm{\"u}ller}, J. and {Leike}, R. and {Porqueres}, N. and {Pumpe}, D. and {Reinecke}, M. and {{\v S}raml}, M. and {Varady}, C. and {En{\ss}lin}, T.},
        title = "{NIFTy 3 - Numerical Information Field Theory - A Python framework for multicomponent signal inference on HPC clusters}",
        ...
    }


