@@ -61,12 +61,12 @@ $d$ is the data vector and $s$ is the physical field about which one wants to le

Bayesian reconstruction algorithms may be viewed in terms of three basic building blocks.

\begin{enumerate}

\item The \emph{likelihood}$\mathcal P(d|s)$ describes the measurement process.

\item The \emph{likelihood}$\mathcal P(d|s)$ describes the measurement process.

It incorporates all available information about the experiment.

It contains the data, the statistics of the data including the error bars and a description of the measurement device itself.

\item The \emph{prior}$\mathcal P(s)$ describes the knowledge the scientist has \emph{before} executing the experiment.

\item The \emph{prior}$\mathcal P(s)$ describes the knowledge the scientist has \emph{before} executing the experiment.

In order to define a prior one needs to make one's knowledge about the physical process which was observed by the measurement device explicit.

\item Finally, one needs an algorithmic and computational framework which is able to actually do the Bayesian inference, compute the \emph{posterior}$\mathcal P(s/d)$, given the above information.

\item Finally, one needs an algorithmic and computational framework which is able to actually do the Bayesian inference, compute the \emph{posterior}$\mathcal P(s/d)$, given the above information.

\end{enumerate}

It becomes clear that these three parts are separate from each other.

...

...

@@ -84,9 +84,9 @@ Since Torsten's group cannot help you a lot in implementing the likelihood, we h

\section*{Disclaimer}

\begin{itemize}

\item IFT is good for reconstructing fields, i.e.\ physical quantities (like temperature, flux, etc.) which are defined on a continuous space (like time, space, frequency/energy of electromagnetic radiation).

\item IFT is good for reconstructing fields, i.e.\ physical quantities (like temperature, flux, etc.) which are defined on a continuous space (like time, space, frequency/energy of electromagnetic radiation).

If you've got data, which tells you something about such a field, keep on reading. If not, you may use IFT and NIFTy as well, but there may be other methods available which fit your needs better.

\item The explanations below shall be pretty general in order to be applicable to a variety of problems.

\item The explanations below shall be pretty general in order to be applicable to a variety of problems.

Therefore, the text might be difficult to understand in its full generality.

However, it is not unlikely that your \lstinline{R} (which will be defined below) is just a linear function.

In this case everything becomes very simple, since the derivative of a linear function is just the function itself.

...

...

@@ -94,14 +94,14 @@ Since Torsten's group cannot help you a lot in implementing the likelihood, we h

You do not need to understand what I'm saying already at this point.

Just feel lucky and relax.

You may relax anyways.

\item Read the full guide before you start implementing something.

\item Read the full guide before you start implementing something.

\end{itemize}

\section*{Mathematical description of what NIFTy needs}

The abstract definition of what you'll need to implement is very short:

\begin{itemize}

\item The log-likelihood $\mathcal H (d|s) :=-\log\mathcal P (d|s)$ as a function of $s$.

\item Its derivative $\mathcal H' =\frac{\partial}{\partial s}\mathcal H (d|s)$.

\item The log-likelihood $\mathcal H (d|s) :=-\log\mathcal P (d|s)$ as a function of $s$.

\item Its derivative $\mathcal H' =\frac{\partial}{\partial s}\mathcal H (d|s)$.

\end{itemize}

That's it.

The rest of this guide explains what these formulae mean and how to compute them.

\item$R' =\left.\frac{d R}{d s}\right|_{s=\text{postition}}$ (needed for second-order minimization only).

\end{itemize}

...

...

@@ -166,11 +166,11 @@ Since $R$ is a function:

R: \mathbb R^{s.shape}\to\mathbb R^{d.shape},

\end{align*}

its gradient at the position \lstinline{s=position} is a linear map of the following shape:\footnote{There are various ways to think about derivatives and gradients of multi-dimensional functions.

A different view on gradients would be that at a given point $s=\text{position}$ the gradient is a matrix with \lstinline{s.size} columns and \lstinline{d.size} rows.

Obviously, it is not feasible to store such a matrix on a computer due to its size.

Therefore we think of this matrix in terms of a linear map which maps an array of shape \lstinline{s.shape} to an array of shape \lstinline{d.shape}.

This linear map shall be implemented on the computer in terms of a function.

Think of this map as a linear approximation to \lstinline{R} based at \lstinline{position}.}

A different view on gradients would be that at a given point $s=\text{position}$ the gradient is a matrix with \lstinline{s.size} columns and \lstinline{d.size} rows.

Obviously, it is not feasible to store such a matrix on a computer due to its size.

Therefore we think of this matrix in terms of a linear map which maps an array of shape \lstinline{s.shape} to an array of shape \lstinline{d.shape}.

This linear map shall be implemented on the computer in terms of a function.

Think of this map as a linear approximation to \lstinline{R} based at \lstinline{position}.}

@@ -214,14 +214,14 @@ And this is where NIFTy comes in. NIFTy has the notion of \enquote{spaces}.

In a nutshell, a NIFTy space comes with the following properties:

\begin{itemize}

\item Topology of the space supported by NIFTY:

\item Topology of the space supported by NIFTY:

\begin{itemize}

\item regular grid spaces of arbitrary dimensions (RGSpace)

\item two 2-sphere pixelations, namely HEALPix (HPSpace) and Gauss-Legendre (GLSpace)

\item unstructured spaces (UnstructuredDomain) (which do not have a specific topology and do not have pixel sizes).

\end{itemize}

\item Size of a pixel. This is called \enquote{distances} in NIFTy.

\item Size of a pixel. This is called \enquote{distances} in NIFTy.

\end{itemize}

What needs to be done now, is to make the three functions which were implemented above (\lstinline{R(s)}, \lstinline{R_prime(position, s)} and \lstinline{R_prime_adjoint(position, d)}) aware of the space on which \lstinline{position}, \lstinline{s} (these are the same) and \lstinline{d} are defined.

...

...

@@ -233,10 +233,10 @@ The method \lstinline{apply()} takes an instance of \lstinline{Field} (which is

Some tips exclusively for you:

\begin{itemize}

\item Please have a look at the method \lstinline{weight()} of \lstinline{Field}.

\item Please have a look at the method \lstinline{weight()} of \lstinline{Field}.

With it, you can easily multiply the field values with or divide by the volume of each pixel.

\item The methods \lstinline{from_raw()} and \lstinline{val()} convert a Field to a numpy array and vice versa.

\item Be aware of the fact that NIFTy fields are immutable.

\item The methods \lstinline{from_raw()} and \lstinline{val()} convert a Field to a numpy array and vice versa.

\item Be aware of the fact that NIFTy fields are immutable.

As soon as you pass a numpy array into a NIFTy field with \lstinline{from_raw()}, the \enquote{lock} flag will be set and the numpy array is immutable afterwards.

\lstinline{val()} returns an object reference to this numpy array which is locked and cannot be modified.

If you want to modify it, you may want to obtain an unlocked copy via \lstinline{val_rw()}.

...

...

@@ -260,14 +260,14 @@ All pixel volumes should be taken care of by you.

\section*{Example: $\gamma$-ray imaging}

The information a $\gamma$-ray astronomer would provide to the algorithm (in the simplest case):

\begin{itemize}

\item Data has Poissonian statistics.

\item Two functions: \lstinline{R(s)} and \lstinline{R_adjoint(d)} where $R$ applies a convolution with a Gaussian kernel of given size to \lstinline{s} and picks out the positions to which there exists a data point in \lstinline{d}. \lstinline{R_adjoint(d)} implements $R^\dagger$.

\item Data has Poissonian statistics.

\item Two functions: \lstinline{R(s)} and \lstinline{R_adjoint(d)} where $R$ applies a convolution with a Gaussian kernel of given size to \lstinline{s} and picks out the positions to which there exists a data point in \lstinline{d}. \lstinline{R_adjoint(d)} implements $R^\dagger$.

\end{itemize}

Why is this already sufficient?

\begin{itemize}

\item The Hamiltonian $\mathcal H$ is given by: $\mathcal H (d|s)=-\log(R(s))^\dagger d+\sum_i R(s)_i$.

\item The Hamiltonian $\mathcal H$ is given by: $\mathcal H (d|s)=-\log(R(s))^\dagger d+\sum_i R(s)_i$.

Implementing $R$ and stating that the data is Poissonian determines this form.

\item Since $R$ is a composition of a convolution and a sampling, both of which are linear operations, $R$ itself is a linear operator.\footnote{i.e.\ $R(\alpha s_1+s_2)=\alpha R(s_1)+ R(s_2)$.} Thus, $R' = R$ and $R'^\dagger= R^\dagger$.

\item Since $R$ is a composition of a convolution and a sampling, both of which are linear operations, $R$ itself is a linear operator.\footnote{i.e.\ $R(\alpha s_1+s_2)=\alpha R(s_1)+ R(s_2)$.} Thus, $R' = R$ and $R'^\dagger= R^\dagger$.

All in all, we need an implementation for $R$ and $R^\dagger$.