# ift issues
https://gitlab.mpcdf.mpg.de/groups/ift/-/issues

## Add axis keyword to obj.vdot
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/17 (Theo Steininger, 2017-07-06)

## Add out-array parameter to numerical d2o operations
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/16 (Theo Steininger, 2017-07-06)

NumPy supports specifying an `out` array in order to avoid memory reallocation:
```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
# slow: allocates a new array for the result
a = a + b
# fast: writes the result directly into a
np.add(a, b, out=a)
```
## d2o: Contraction functions rely on non-degeneracy of distribution strategy
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/15 (Theo Steininger, 2017-07-06)

Several methods of the distributed_data_object rely on the fact that the distribution strategy behaves as if the local data were non-degenerate. Currently the non-distributor fixes this by returning trivial (local) results in the `_allgather` and `_Allreduce_sum` methods.
Affected d2o methods are at least: `_contraction_helper`, `mean`.

Fix: Move the functionality for `sum`, `prod`, etc. into the distributor.
## d2o: _contraction_helper does not work when using numpy keyword arguments
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/14 (Theo Steininger, 2017-07-06)

The `_contraction_helper` passes keyword arguments to the underlying numpy functions (`axis=`, `keepdims=`). The result of the `_contraction_helper`'s local computation is then an array and not a scalar, so the dtype check fails.
Fix: After solving theos/NIFTy#2, adapt to the case that the local run's result object is an array, and add a further distinction of cases, e.g. for something like `axis=0` with the slicing_distributor.

## Add `axis` keyword functionality to unary methods
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/13 (Theo Steininger, 2017-07-06)

Many numpy functions support the `axis` keyword in order to perform an operation only along certain directions of the array. The current implementation of d2o does not support this, e.g. for `all`, `any`, `sum`, etc.
Related to: theos/NIFTy#3
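For reference, this is the numpy behaviour that d2o would need to mirror; a minimal numpy-only sketch (no `distributed_data_object` involved):

```python
import numpy as np

a = np.arange(6).reshape((2, 3))
# reductions without axis collapse the whole array to a scalar
total = a.sum()                 # 15
# with axis, numpy reduces only along that direction
col_sums = a.sum(axis=0)        # array([3, 5, 7])
row_any = (a > 3).any(axis=1)   # array([False,  True])
```

For a slicing distribution strategy, a reduction along the distributed axis would need communication, while a reduction along the other axes stays node-local.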
## Add `source_rank` parameter to d2o.set_full_data()
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/12 (Theo Steininger, 2017-07-06)

A `source_rank` parameter should be added to the distributed_data_object in order to specify on which node the source data array resides.

## Move `flatten` method into the distributor
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/11 (Theo Steininger, 2017-07-06)

At the moment `flatten` is performed by the distributed_data_object itself. It thereby assumes that flattening the local arrays produces the right result. In general, with arbitrary distribution strategies, this is wrong.

## Profile the d2o.bincount method
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/10 (Theo Steininger, 2017-07-06)

The d2o.bincount method scales well with MPI parallelization, but compared to single-core np.bincount it has a rather big overhead.

## Semi-advanced indexing is not recognized
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/9 (Theo Steininger, 2017-07-06)

```python
a = np.arange(24).reshape((3, 4, 2))
obj = distributed_data_object(a)
```

Semi-advanced indexing on the plain numpy array,

```python
a[(2, 1, 1), 1]
```

yields

```
array([[18, 19],
       [10, 11],
       [10, 11]])
```

The "indexinglist" scheme in d2o expects either scalars or numpy arrays as tuple elements, and therefore `obj[(2, 1, 1), 1]` raises an `AttributeError`. However, `obj[np.array((2, 1, 1)), 1]` works.

Solution: Parse the elements and, when in doubt, cast them to numpy arrays.
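A minimal sketch of such a parsing step (the helper name is hypothetical, not d2o's actual code): walk the index tuple and cast any list/tuple element to a numpy array, leaving scalars and slices untouched.

```python
import numpy as np

def normalize_index(key):
    # Hypothetical helper: cast tuple/list entries of an index tuple
    # to numpy arrays, so that downstream code only ever sees scalars,
    # slices, or ndarrays.
    if not isinstance(key, tuple):
        key = (key,)
    return tuple(np.asarray(k) if isinstance(k, (list, tuple)) else k
                 for k in key)

a = np.arange(24).reshape((3, 4, 2))
key = normalize_index(((2, 1, 1), 1))
# after normalization, the semi-advanced index behaves like the explicit array form
assert (a[key] == a[np.array((2, 1, 1)), 1]).all()
```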
## Add support for `from array` indexing
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/8 (Theo Steininger, 2017-07-06)

When building the kdict from pindex and kindex, something of the following form must be done (a == kindex, b == pindex):
```python
a = np.arange(16) * 2
b = np.array([[3, 2], [1, 0]])
a[b]
# array([[6, 4],
#        [2, 0]])
```
Currently this is solved using a hack:

```python
p.apply_scalar_function(lambda z: obj[z])
```

This functionality could easily be added to the `get_data` interface.
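For a plain numpy array the hack and the desired direct indexing agree; a small numpy-only illustration, using `np.vectorize` as a stand-in for the element-wise `apply_scalar_function` call:

```python
import numpy as np

a = np.arange(16) * 2           # kindex-like lookup table
b = np.array([[3, 2], [1, 0]])  # pindex-like index array

# the hack: apply the lookup element by element
hack = np.vectorize(lambda z: a[z])(b)

# the desired direct "from array" indexing
direct = a[b]

assert (hack == direct).all()
```

The direct form avoids one Python-level function call per element, which is why moving it into `get_data` would pay off.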
## The d2o_librarian will fail when mixing different MPI comms
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/7 (Theo Steininger, 2017-07-06)

Every local librarian instance on a node of an MPI cluster simply increments its internal counter by one when a new d2o is registered. This gets out of sync when only a part of the full cluster is covered by a special comm.
Possible solution: The individual librarians store the id of 'their' d2o and communicate a common id for their dictionary.

Con: Involves MPI communication.
## d2o cumsum and flatten rely on certain features of distribution strategy
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/6 (Theo Steininger, 2017-07-06)

cumsum and flatten assume: if the shape of the d2o changes through flattening, the distribution strategy was "slicing".

## Add function d2o.arange
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/5 (Theo Steininger, 2017-07-06)

## Add unit tests for `copy=True/False` functionality
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/4 (Theo Steininger, 2017-07-06)

## Move obj.bincount and obj.unique into distributor and make them more efficient
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/3 (Theo Steininger, 2016-05-26)

For efficiency, use Allreduce instead of allgather in bincount.
Use `fast-summation` in obj.unique (<http://materials.jeremybejarano.com/MPIwithPython/collectiveCom.html#fastsum>).
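The reason Allreduce suffices for bincount: per-process bincounts computed with a common `minlength` combine by plain elementwise summation, which is exactly an `Allreduce` with `MPI.SUM`. A numpy-only sketch of the reduction step (no actual MPI involved; in a real run the common `minlength` would itself come from an Allreduce of the local maxima):

```python
import numpy as np

# data as it might be split across three processes
chunks = [np.array([0, 1, 1]), np.array([2, 0]), np.array([1, 4])]
full = np.concatenate(chunks)

# local bincounts with a common minlength, as each rank would compute them
minlength = full.max() + 1
local_counts = [np.bincount(c, minlength=minlength) for c in chunks]

# Allreduce(MPI.SUM) amounts to an elementwise sum of the local results
global_counts = np.sum(local_counts, axis=0)
assert (global_counts == np.bincount(full)).all()
```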
## Add tensor-/outer-dot to d2o
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/2 (Theo Steininger, 2016-05-26)

## Data types for the Grand Unification
https://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/290 (Martin Reinecke, 2020-04-08)

```
Space:
  structured or unstructured set of points (current NIFTy's "Domain")

SpaceTuple:
  external product of zero or more Spaces (current NIFTy's "DomainTuple")
  SpaceTuple = (Space, Space, ...)

SpaceTupleDict:
  dictionary (with string keys) of one or more SpaceTuples
  SpaceTupleDict = {name1: SpaceTuple1, name2: SpaceTuple2, ...}
  (current NIFTy's "MultiDomain")

Domain:
  This is an abstract concept. Can currently be represented by SpaceTuple or SpaceTupleDict.

Special domains:
  ScalarDomain: empty SpaceTuple

Operator:
  - Operators take an Operand defined on an input domain, transform it in some way,
    and return a new Operand defined on a (possibly different) target domain.
  - Operators can be concatenated, as long as the domains at the interface are
    identical. The result is another Operator.

Operand:
  - Operand objects represent fields and potentially their Jacobians and metrics.
  - An Operand object can be asked for its "value" and (if configured accordingly)
    its Jacobian. If the target domain of an Operand object is scalar, a metric
    may also be available.
  - Applying an Operator to an Operand object will always return another Operand object.


class Operator(object):
    @property
    def domain(self):
        # return input domain

    @property
    def target(self):
        # return output domain

    def __call__(self, other):
        # if isinstance(other, Operand), return an Operand object
        # else return an Operator object

    def __matmul__(self, other):
        return self(other)


class LinearOperator(Operator):
    # more or less analogous to the current LinearOperator


class Operand(object):
    # this unifies current NIFTy's Field, MultiField, and Linearization classes

    @property
    def domain(self):
        # if no Jacobian is present, return None, else the Jacobian's domain

    @property
    def target(self):
        # return the domain on which the value of the Operand is defined. This is
        # also the Jacobian's target (if a Jacobian is defined).

    @property
    def val(self):
        # return a low-level data structure holding the actual values (currently a
        # numpy.ndarray or a dictionary of numpy.ndarrays). Read-only.

    def val_rw(self):
        # return a writeable copy of `val`

    def fld(self):
        # return an Operand that only contains the value content of this object.
        # Its Jacobian and potential higher derivatives will be `None`.

    @property
    def jac(self):
        # return a Jacobian LinearOperator if possible, else None

    @property
    def want_metric(self):
        # return True or False

    @property
    def metric(self):
        # if self.jac is None, raise an exception
        # if self.target is not ScalarDomain, raise an exception
        # if not self.want_metric, raise an exception
        # if the metric cannot be computed, raise an exception
        # return metric
```
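The `__call__` dispatch described above can be illustrated with a tiny runnable sketch (toy classes, not the proposed implementation): applying an Operator to an Operand yields an Operand, while applying it to another Operator yields a composed Operator.

```python
class Operand:
    def __init__(self, val):
        self.val = val

class Operator:
    def __init__(self, fn):
        self._fn = fn

    def __call__(self, other):
        if isinstance(other, Operand):
            return Operand(self._fn(other.val))   # Operand in -> Operand out
        # Operator in -> composed Operator out
        return Operator(lambda x: self._fn(other._fn(x)))

    def __matmul__(self, other):
        return self(other)

double = Operator(lambda x: 2 * x)
inc = Operator(lambda x: x + 1)
composed = double @ inc           # still an Operator
result = composed(Operand(3))     # Operand with val 2 * (3 + 1) = 8
```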
## [post Nifty6] new design for the `Field` class?
https://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/250 (Martin Reinecke, 2019-12-02)

I'm wondering whether we can solve most of our current problems regarding fields and multifields by introducing a single new `Field` class that
- unifies the current `Field` and `MultiField` classes (a classic field would have a single dictionary entry with an empty string as name)
- is immutable
- can contain scalars or array-like objects to describe field content (maybe even functions?)
If we have such a class we can again demand that all operations on fields require identical domains, and we can still avoid writing huge zero-filled arrays by writing scalar zeros instead.
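A minimal, self-contained sketch of what such a class could look like (all names hypothetical, not NIFTy code): an immutable mapping from names to scalar-or-array values, where a classic field is the single entry with key `""`, and where a plain scalar stands in for a constant (e.g. zero-filled) array.

```python
import numpy as np

class Field:
    """Immutable name -> (scalar | ndarray) mapping; hypothetical sketch."""
    def __init__(self, values):
        self._values = dict(values)

    @classmethod
    def classic(cls, arr):
        # a classic single-domain field: one entry with an empty-string key
        return cls({"": arr})

    def __add__(self, other):
        # demand identical keys ("identical domains")
        if self._values.keys() != other._values.keys():
            raise ValueError("domains/keys must be identical")
        return Field({k: self._values[k] + other._values[k]
                      for k in self._values})

    def __getitem__(self, key):
        return self._values[key]

# a scalar 0. stands in for a huge zero-filled array
zero = Field({"a": 0., "b": 0.})
f = Field({"a": np.arange(3), "b": 2.5})
g = f + zero   # no large allocation needed for the zero part
```

Broadcasting takes care of mixing scalar and array entries, so the "scalar zeros" trick falls out of numpy semantics for free.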
@reimar, @kjako, @parras please let me know what you think.