Commit 8103fbbd authored by Philipp Arras

Cosmetics

parent cc077788
@@ -84,7 +84,7 @@ LikelihoodOperator
A new subclass of `EnergyOperator` was introduced and all `EnergyOperator`s
that are likelihoods are now `LikelihoodOperator`s. A `LikelihoodOperator`
has to implement the function `get_transformation`, which returns a
coordinate transformation in which the Fisher metric of the likelihood becomes
the identity matrix.
......
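To see concretely what `get_transformation` promises, here is a small self-contained NumPy sketch (illustration only, not NIFTy code): for a Gaussian likelihood with noise covariance N, the standardizing map f(d) = L^{-1}(d - mu) with N = L L^T has Jacobian J = L^{-1}, and pulling the identity metric back through J reproduces the Fisher metric N^{-1}. This pullback is exactly what the `SandwichOperator` construction touched later in this commit computes.

```python
import numpy as np

# Toy check: the transformation f(d) = L^{-1} (d - mu), with N = L L^T,
# has Jacobian J = L^{-1}; the identity metric pulled back through J,
# i.e. J^T @ I @ J, equals the Fisher metric N^{-1} of the Gaussian likelihood.
rng = np.random.default_rng(42)
A = rng.normal(size=(4, 4))
N = A @ A.T + 4*np.eye(4)          # a positive definite noise covariance
L = np.linalg.cholesky(N)
J = np.linalg.inv(L)               # Jacobian of the standardizing map
pullback = J.T @ np.eye(4) @ J     # identity metric pulled back through f
assert np.allclose(pullback, np.linalg.inv(N))
```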
@@ -11,7 +11,7 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
-# Copyright(C) 2013-2020 Max-Planck-Society
+# Copyright(C) 2013-2021 Max-Planck-Society
#
# NIFTy is being developed at the Max-Planck-Institut fuer Astrophysik.
@@ -97,13 +97,13 @@ def main():
    data = signal_response(mock_position) + N.draw_sample_with_dtype(dtype=np.float64)
    # Minimization parameters
-    ic_sampling = ift.AbsDeltaEnergyController(
+    ic_sampling = ift.AbsDeltaEnergyController(name="Sampling (linear)",
        deltaE=0.05, iteration_limit=100)
-    ic_newton = ift.AbsDeltaEnergyController(
-        name='Newton', deltaE=0.5, iteration_limit=35)
+    ic_newton = ift.AbsDeltaEnergyController(name='Newton', deltaE=0.5,
+        convergence_level=2, iteration_limit=35)
    minimizer = ift.NewtonCG(ic_newton)
-    ic_sampling_nl = ift.AbsDeltaEnergyController(
-        name='Sampling', deltaE=0.5, iteration_limit=15, convergence_level=2)
+    ic_sampling_nl = ift.AbsDeltaEnergyController(name='Sampling (nonlin)',
+        deltaE=0.5, iteration_limit=15, convergence_level=2)
    minimizer_sampling = ift.NewtonCG(ic_sampling_nl)
    # Set up likelihood and information Hamiltonian
......
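For orientation, the controllers and minimizers configured above are typically combined with the sampled KL energies touched by this commit roughly as follows. This is a sketch only; `H` and `mean` stand for the StandardHamiltonian and the current latent mean, neither of which appears in this excerpt, and the loop count and sample number are illustrative.

```python
# Sketch of a geoVI iteration loop; H, mean and the sample count are
# assumptions for illustration, not part of this diff.
for _ in range(5):
    kl = ift.GeoMetricKL(mean, H, n_samples=6,
                         minimizer_samp=minimizer_sampling,
                         mirror_samples=True, comm=None, nanisinf=True)
    kl, convergence = minimizer(kl)   # minimize the stochastic KL estimate
    mean = kl.position                # updated expansion point
```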
@@ -224,7 +224,7 @@ Thus, only the gradient of the KL is needed with respect to this, which can be e
We stochastically estimate the KL-divergence and gradients with a set of samples drawn from the approximate posterior distribution.
The particular structure of the covariance allows us to draw independent samples by solving a certain system of equations.
This KL-divergence for MGVI is implemented by
:func:`~nifty7.minimization.kl_energies.MetricGaussianKL` within NIFTy7.
It should be noted that MGVI can typically only provide a lower bound on the variance.
@@ -246,7 +246,7 @@ In general, such a transformation can only be constructed locally, i.e. in a nei
.. math::

    \mathcal{Q}_{\bar{\xi}}(\xi) = \left(g_{\bar{\xi}}^{-1} * \mathcal{Q}\right)(\xi) = \int \delta\left(\xi - g_{\bar{\xi}}^{-1}(y)\right) \ \mathcal{G}(y, 1) \ \mathcal{D}y \ ,

where :math:`\delta` denotes the Kronecker delta.
The remaining task in geoVI is to obtain the optimal expansion point :math:`\bar{\xi}` such that :math:`\mathcal{Q}_{\bar{\xi}}` matches the posterior as well as possible. Analogous to the MGVI algorithm, :math:`\bar{\xi}` is obtained by minimization of the KL-divergence between :math:`\mathcal{P}` and :math:`\mathcal{Q}_{\bar{\xi}}` w.r.t. :math:`\bar{\xi}`. Furthermore, the KL is represented as a stochastic estimate using a set of samples drawn from :math:`\mathcal{Q}_{\bar{\xi}}`, which is implemented in NIFTy7 via :func:`~nifty7.minimization.kl_energies.GeoMetricKL`.
......
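The stochastic estimate mentioned above can be written explicitly. The following is a hedged reconstruction consistent with the surrounding text and with the sampled KL energies in this commit (valid up to terms that do not depend on the expansion point):

.. math::

    \mathrm{KL}(\bar{\xi}) \simeq \frac{1}{N} \sum_{i=1}^{N} \mathcal{H}\left(\bar{\xi} + r_i\right) + \mathrm{const.} \ ,

where :math:`\mathcal{H}` denotes the information Hamiltonian, the residual samples :math:`r_i` are drawn from the approximate posterior :math:`\mathcal{Q}_{\bar{\xi}}`, and the constant collects all terms independent of :math:`\bar{\xi}`.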
@@ -11,7 +11,7 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
-# Copyright(C) 2013-2020 Max-Planck-Society
+# Copyright(C) 2013-2021 Max-Planck-Society
# Authors: Philipp Frank
#
# NIFTy is being developed at the Max-Planck-Institut fuer Astrophysik.
@@ -21,6 +21,7 @@ from functools import reduce
from .. import random
from .. import utilities
+from ..domain_tuple import DomainTuple
from ..linearization import Linearization
from ..multi_field import MultiField
from ..operators.inversion_enabler import InversionEnabler
@@ -38,17 +39,6 @@ from ..utilities import myassert
from .energy import Energy
from .descent_minimizers import DescentMinimizer, ConjugateGradient
-def _is_prior_dtype_float(H):
-    real = True
-    dts = H._prior._met._dtype
-    if isinstance(dts, dict):
-        for k in dts.keys():
-            if not np.issubdtype(dts[k], np.float):
-                real = False
-    else:
-        real = np.issubdtype(dts, np.float)
-    return real
def _get_lo_hi(comm, n_samples):
    ntask, rank, _ = utilities.get_MPI_params_from_comm(comm)
@@ -109,7 +99,7 @@ class _SampledKLEnergy(Energy):
        self._comm = comm
        self._local_samples = local_samples
        self._nanisinf = bool(nanisinf)
        lin = Linearization.make_var(mean)
        v, g = [], []
        for s in self._local_samples:
@@ -183,12 +173,21 @@ class _SampledKLEnergy(Energy):
class _GeoMetricSampler:
    def __init__(self, position, H, minimizer, start_from_lin,
                 n_samples, mirror_samples, napprox=0, want_error=False):
        if not isinstance(H, StandardHamiltonian):
            raise NotImplementedError
-        if not _is_prior_dtype_float(H):
+        # Check domain dtype
+        dts = H._prior._met._dtype
+        if isinstance(H.domain, DomainTuple):
+            real = np.issubdtype(dts, np.float)
+        else:
+            real = all([np.issubdtype(dts[kk], np.float) for kk in dts.keys()])
+        if not real:
            raise ValueError("_GeoMetricSampler only supports real valued latent DOFs.")
+        # /Check domain dtype
        if isinstance(position, MultiField):
            self._position = position.extract(H.domain)
        else:
@@ -199,19 +198,18 @@ class _GeoMetricSampler:
        dtype, f_lh = tr
        scale = ScalingOperator(f_lh.target, 1.)
        if isinstance(dtype, dict):
            sampling = reduce((lambda a,b: a*b),
                              [dtype[k] is not None for k in dtype.keys()])
        else:
            sampling = dtype is not None
        scale = SamplingDtypeSetter(scale, dtype) if sampling else scale
        fl = f_lh(Linearization.make_var(self._position))
-        self._g = (Adder(-self._position) +
-                   fl.jac.adjoint@Adder(-fl.val)@f_lh)
+        self._g = (Adder(-self._position) + fl.jac.adjoint@Adder(-fl.val)@f_lh)
        self._likelihood = SandwichOperator.make(fl.jac, scale)
        self._prior = SamplingDtypeSetter(ScalingOperator(fl.domain,1.), np.float64)
        self._met = self._likelihood + self._prior
-        if napprox >=1:
+        if napprox >= 1:
            self._approximation = makeOp(approximation2endo(self._met, napprox)).inverse
        else:
            self._approximation = None
@@ -378,7 +376,7 @@ def MetricGaussianKL(mean, hamiltonian, n_samples, mirror_samples, constants=[],
                            local_samples, nanisinf)
def GeoMetricKL(mean, hamiltonian, n_samples, minimizer_samp, mirror_samples,
                start_from_lin = True, constants=[], point_estimates=[],
                napprox=0, comm=None, nanisinf=True):
    """Provides the sampled Kullback-Leibler used in geometric Variational
@@ -386,7 +384,7 @@ def GeoMetricKL(mean, hamiltonian, n_samples, minimizer_samp, mirror_samples,
    In geoVI a probability distribution is approximated with a standard normal
    distribution in the canonical coordinate system of the Riemannian manifold
    associated with the metric of the other distribution. The coordinate
    transformation is approximated by expanding around a point. In order to
    infer the expansion point, a stochastic estimate of the Kullback-Leibler
    divergence is minimized. This estimate is obtained by sampling from the
@@ -414,7 +412,7 @@ def GeoMetricKL(mean, hamiltonian, n_samples, minimizer_samp, mirror_samples,
        sample variation is counterbalanced.
    start_from_lin : boolean
        Whether the non-linear sampling should start using the inverse
        linearized transformation (i.e. the corresponding MGVI sample).
        If False, the minimization starts from the prior sample.
        Default is True.
    constants : list
@@ -425,7 +423,7 @@ def GeoMetricKL(mean, hamiltonian, n_samples, minimizer_samp, mirror_samples,
        (possibly) optimized for, corresponding to point estimates of these.
        Default is to draw samples for the complete domain.
    napprox : int
        Number of samples for computing the preconditioner for linear sampling.
        No preconditioning is done by default.
    comm : MPI communicator or None
        If not None, samples will be distributed as evenly as possible
@@ -444,7 +442,7 @@ def GeoMetricKL(mean, hamiltonian, n_samples, minimizer_samp, mirror_samples,
    during minimization and vice versa.
    DomainTuples should never be created using the constructor, but rather
    via the factory function :attr:`make`!
    Note on MPI and mirror_samples:
    As in MGVI, mirroring samples can help to stabilize the latent mean as it
    reduces sampling noise. But unlike MGVI, a mirrored sample involves an
......
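The `_GeoMetricSampler` above builds its metric as the prior identity plus the likelihood metric pulled back through the Jacobian of the transformation (`self._met = self._likelihood + self._prior`). A small NumPy sketch of that structure, with made-up matrices standing in for the NIFTy operators:

```python
import numpy as np

# Toy stand-ins (not NIFTy operators): J plays the role of fl.jac, the Jacobian
# of the likelihood transformation at the expansion point; F is the metric on
# its target (identity after standardization); the prior metric is the identity.
rng = np.random.default_rng(0)
J = rng.normal(size=(3, 5))             # Jacobian: target dim 3, latent dim 5
F = np.eye(3)                           # Euclidean metric on the target space
likelihood_metric = J.T @ F @ J         # what SandwichOperator.make(fl.jac, scale) encodes
prior_metric = np.eye(5)                # ScalingOperator(fl.domain, 1.)
met = likelihood_metric + prior_metric  # self._met = self._likelihood + self._prior

# The combined metric is symmetric positive definite, hence invertible and
# suitable for drawing Gaussian samples with covariance met^{-1}.
assert np.allclose(met, met.T) and np.all(np.linalg.eigvalsh(met) > 0)
```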
@@ -93,10 +93,11 @@ class LikelihoodOperator(EnergyOperator):
        :func:`~nifty7.operators.operator.Operator.get_transformation`.
        """
        dtp, f = self.get_transformation()
-        ch = ScalingOperator(f.target, 1.)
+        ch = None
        if dtp is not None:
-            ch = SamplingDtypeSetter(ch, dtp)
+            ch = SamplingDtypeSetter(ScalingOperator(f.target, 1.), dtp)
-        return SandwichOperator.make(f(Linearization.make_var(x)).jac, ch)
+        bun = f(Linearization.make_var(x)).jac
+        return SandwichOperator.make(bun, ch)
class Squared2NormOperator(EnergyOperator):
@@ -176,13 +177,13 @@ class VariableCovarianceGaussianEnergy(LikelihoodOperator):
    use_full_fisher : boolean
        Whether or not the proper Fisher information metric should be used as
        a `metric`. If False, the same approximation used in
        `get_transformation` is used instead.
        Default is True.
    """
-    def __init__(self, domain, residual_key, inverse_covariance_key, sampling_dtype,
-                 use_full_fisher = True):
+    def __init__(self, domain, residual_key, inverse_covariance_key,
+                 sampling_dtype, use_full_fisher=True):
        self._kr = str(residual_key)
        self._ki = str(inverse_covariance_key)
        dom = DomainTuple.make(domain)
@@ -190,7 +191,7 @@ class VariableCovarianceGaussianEnergy(LikelihoodOperator):
        self._dt = {self._kr: sampling_dtype, self._ki: np.float64}
        _check_sampling_dtype(self._domain, self._dt)
        self._cplx = _iscomplex(sampling_dtype)
-        self._use_fisher = use_full_fisher
+        self._use_full_fisher = use_full_fisher
    def apply(self, x):
        self._check_input(x)
@@ -201,7 +202,7 @@ class VariableCovarianceGaussianEnergy(LikelihoodOperator):
        res = 0.5*(r.vdot(r*i) - i.ptw("log").sum())
        if not x.want_metric:
            return res
-        if self._use_fisher:
+        if self._use_full_fisher:
            met = 1. if self._cplx else 0.5
            met = MultiField.from_dict({self._kr: i.val, self._ki: met*i.val**(-2)},
                                       domain=self._domain)
@@ -231,7 +232,7 @@ class VariableCovarianceGaussianEnergy(LikelihoodOperator):
        return None, res
    def get_transformation(self):
        """Note that for the metric of a `VariableCovarianceGaussianEnergy` no
        global transformation to Euclidean space exists. A local approximation
        invoking the residual is used instead.
        """
@@ -616,6 +617,3 @@ class AveragedEnergy(EnergyOperator):
        dtp, trafo = self._h.get_transformation()
        mymap = map(lambda v: trafo@Adder(v), self._res_samples)
        return dtp, utilities.my_sum(mymap)/np.sqrt(len(self._res_samples))
@@ -110,14 +110,14 @@ class Operator(metaclass=NiftyMeta):
    def get_transformation(self):
        """The coordinate transformation that maps into a coordinate system
        where the metric of a likelihood is the Euclidean metric.
        This is `None`, except when the object is an instance of
        :class:`~nifty7.operators.energy_operators.LikelihoodOperator` or a
        (nested) sum thereof.
        Returns
        -------
        np.dtype, or dict of np.dtype : The dtype(s) of the target space of the transformation.
        Operator : The transformation that maps from `domain` into the Euclidean target space.
        """
        return None
......
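As a usage note (a sketch, not part of this commit): the `(dtype, transformation)` pair returned by `get_transformation` is consumed the same way the `LikelihoodOperator` metric above consumes it, namely by pulling the Euclidean metric back through the transformation's Jacobian. Here `lh` is assumed to be some `LikelihoodOperator` and `x` a position field on its domain.

```python
import nifty7 as ift

# Sketch mirroring the pattern in this commit; `lh` and `x` are assumed inputs.
dtp, f = lh.get_transformation()             # dtype(s) and transformation operator
lin = f(ift.Linearization.make_var(x))       # linearize the transformation at x
metric = ift.SandwichOperator.make(lin.jac)  # pullback of the Euclidean metric
```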