Typically, $s$ is a continuous field and $d$ a discrete data vector. In particular, $R$ is generally not invertible.
IFT aims at **inverting** the above non-invertible problem in the **best possible way** using Bayesian statistics.
## NIFTy
NIFTy (Numerical Information Field Theory) is a Python framework in which IFT problems can be tackled easily.
Main Interfaces:
- **Spaces**: Cartesian spaces, 2-spheres (HEALPix, Gauss-Legendre), and their respective harmonic spaces.
- **Fields**: Defined on spaces.
- **Operators**: Acting on fields.
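As a minimal sketch of how these three interfaces interact (assuming a recent release imported as `nifty8`; the shape, the scaling factor, and all variable names here are illustrative):

```python
import nifty8 as ift  # assumed version; interface details vary across releases

space = ift.RGSpace([128, 128])           # a 2D Cartesian space
field = ift.from_random(space, 'normal')  # a random field defined on that space
op = ift.ScalingOperator(space, 2.0)      # an operator acting on such fields
result = op(field)                        # the result is again a field on `space`
```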
## Wiener Filter: Formulae
### Assumptions
- $d = Rs + n$, with $R$ a linear operator.
- $\mathcal P (s) = \mathcal G (s,S)$, $\mathcal P (n) = \mathcal G (n,N)$ where $S, N$ are positive definite matrices.
### Posterior
The posterior is given by:
$$\mathcal P (s|d) \propto \mathcal P(s,d) = \mathcal G(d-Rs,N) \,\mathcal G(s,S) \propto \mathcal G (s-m,D) $$
where
$$\begin{align}
m &= Dj \\
D^{-1}&= (S^{-1} +R^\dagger N^{-1} R )\\
j &= R^\dagger N^{-1} d
\end{align}$$
Let us implement this in NIFTy!
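The following is a compact sketch of these formulae (assuming `nifty8`; the grid size, the power-spectrum parameters $P_0$, $k_0$, $\gamma$, the noise level, and the iteration settings are illustrative choices):

```python
import nifty8 as ift  # assumed version; interface details vary across releases

# Signal space and its harmonic counterpart
s_space = ift.RGSpace(256)
h_space = s_space.get_default_codomain()
HT = ift.HarmonicTransformOperator(h_space, target=s_space)

# Illustrative power spectrum P(k) = P0 (1 + (k/k0)^2)^(-gamma/2)
def power_spectrum(k):
    return 1e4 / (1. + (k / 5.)**2)**2  # P0 = 1e4, k0 = 5, gamma = 4

# Signal covariance S, diagonal in harmonic space;
# sampling_dtype enables draw_sample() below
Sh = ift.create_power_operator(h_space, power_spectrum, sampling_dtype=float)

# Response: harmonic transform followed by stripping the geometric information
R = ift.GeometryRemover(s_space) @ HT
N = ift.ScalingOperator(R.target, 0.1, sampling_dtype=float)  # noise covariance

# Mock data d = R s + n
sh = Sh.draw_sample()
d = R(sh) + N.draw_sample()

# Wiener filter: m = D j with D^{-1} = S^{-1} + R^dagger N^{-1} R
j = R.adjoint_times(N.inverse_times(d))
D_inv = Sh.inverse + R.adjoint @ N.inverse @ R
IC = ift.GradientNormController(iteration_limit=500, tol_abs_gradnorm=1e-3)
D = ift.InversionEnabler(D_inv, IC).inverse  # applies D via conjugate gradient
m = HT(D(j))                                 # posterior mean in signal space
```

The inversion of $D^{-1}$ is performed iteratively by a conjugate gradient, whose behavior is discussed next.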
## Wiener Filter: Example
- We assume statistical homogeneity and isotropy, so the signal covariance $S$ is diagonal in harmonic space and described by a one-dimensional power spectrum, assumed here to be $$P(k) = P_0\,\left(1+\left(\frac{k}{k_0}\right)^2\right)^{-\gamma /2}.$$
- The convergence of the conjugate gradient algorithm used to apply $D$ is governed by the condition number of the system matrix $A$, $$\kappa = \frac{\lambda_{\text{max}}}{\lambda_{\text{min}}},$$ where $\lambda_{\text{max}}$ and $\lambda_{\text{min}}$ are the largest and smallest eigenvalues of $A$, respectively.
- The larger $\kappa$, the slower the conjugate gradient converges.
- By default, the conjugate gradient algorithm solves $D^{-1} m = j$ for $m$, where $D^{-1}$ can be badly conditioned. If one knows a non-singular matrix $T$ for which $TD^{-1}$ is better conditioned, one can instead solve the equivalent problem:
$$\tilde A m = \tilde j,$$
where $\tilde A = T D^{-1}$ and $\tilde j = Tj$.
- In our case, $S^{-1}$ is responsible for the bad conditioning of $D$, depending on the chosen power spectrum. Thus, we choose
$$T = \mathcal F^\dagger S_h^{-1} \mathcal F,$$
as shown in the sketch below.
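Continuing the sketch above (reusing `D_inv`, `IC`, `Sh`, `HT`, and `j` from it), this preconditioner enters through the `approximation` argument of `InversionEnabler`, which expects an approximation to $D^{-1}$ and uses it to precondition the conjugate gradient:

```python
# T = F^dagger S_h^{-1} F approximates D^{-1} and preconditions the CG run
D = ift.InversionEnabler(D_inv, IC, approximation=Sh.inverse).inverse
m = HT(D(j))
```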
### Generate Mock Data
- Generate fields $s$ and $n$ with the given covariances, as sketched below.
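A minimal sketch of this step (again assuming `nifty8`; the trivial response and the covariance levels are illustrative):

```python
import nifty8 as ift  # assumed version; interface details vary across releases

space = ift.RGSpace(256)
S = ift.ScalingOperator(space, 1.0, sampling_dtype=float)  # signal covariance
N = ift.ScalingOperator(space, 0.1, sampling_dtype=float)  # noise covariance
R = ift.ScalingOperator(space, 1.0)                        # trivial response

s = S.draw_sample()  # s drawn from G(s, S)
n = N.draw_sample()  # n drawn from G(n, N)
d = R(s) + n         # mock data d = Rs + n
```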
...
NIFTy6 automatically calculates the necessary gradient from a generative model of the signal and the data and uses this to minimize the Hamiltonian.
However, MAP often provides unsatisfactory results for deep hierarchical Bayesian networks.
The reason for this is that MAP ignores the volume factors in parameter space, which are not to be neglected in deciding whether a solution is reasonable or not.
...
We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution.
The particular structure of the covariance allows us to draw independent samples by solving a certain system of equations.
This KL-divergence for MGVI is implemented in the class :class:`~minimization.metric_gaussian_kl.MetricGaussianKL` within NIFTy6.
The demo `getting_started_3.py`, for example, not only infers a field this way but also the power spectrum of the process that generated the field.
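As a rough sketch of such an MGVI loop (assuming NIFTy6; the toy generative model, the constant stand-in data, the sample count, and the controller settings are illustrative assumptions, not taken from `getting_started_3.py`):

```python
import nifty6 as ift  # the release named in the text; interfaces differ in later versions

# Illustrative toy model: latent white-noise parameters 'xi' in harmonic space,
# transformed to position space and observed with Gaussian noise
position_space = ift.RGSpace(128)
harmonic_space = position_space.get_default_codomain()
HT = ift.HarmonicTransformOperator(harmonic_space, position_space)
signal = HT.ducktape('xi')

R = ift.GeometryRemover(position_space)
signal_response = R(signal)
N = ift.DiagonalOperator(ift.full(R.target, 0.1))  # noise covariance
data = ift.full(R.target, 0.5)                     # constant stand-in "data"

# Likelihood and standard Hamiltonian; ic_sampling steers the CG runs
# that draw samples from the approximate posterior
likelihood = ift.GaussianEnergy(mean=data, inverse_covariance=N.inverse)(signal_response)
ic_sampling = ift.GradientNormController(iteration_limit=100)
H = ift.StandardHamiltonian(likelihood, ic_sampling)

# MGVI: alternately draw samples (MetricGaussianKL) and minimize the KL
minimizer = ift.NewtonCG(ift.GradientNormController(iteration_limit=20))
mean = ift.MultiField.full(H.domain, 0.)
for _ in range(5):
    KL = ift.MetricGaussianKL(mean, H, 5)  # 5 samples per KL estimate
    KL, _ = minimizer(KL)
    mean = KL.position

posterior_mean = signal(mean)  # approximate posterior mean of the signal
```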