ift issues
https://gitlab.mpcdf.mpg.de/groups/ift/-/issues
2017-07-06T05:50:22Z

Improve obj.searchsorted such that it supports scalars, numpy arrays and distributed_data_objects as input...
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/18
2017-07-06T05:50:22Z, Theo Steininger

...and return a distributed_data_object if the input was distributed.

Add axis keyword to obj.vdot
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/17
2017-07-06T05:50:22Z, Theo Steininger

Add tensor-/outer-dot to d2o
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/2
2016-05-26T11:09:02Z, Theo Steininger

Move obj.bincount and obj.unique into distributor and make them more efficient.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/3
2016-05-26T11:09:09Z, Theo Steininger

For efficiency, use Allreduce instead of allgather in bincount.
Use `fast-summation` in obj.unique <http://materials.jeremybejarano.com/MPIwithPython/collectiveCom.html#fastsum>

Add unit tests for `copy=True/False` functionality
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/4
2017-07-06T05:50:23Z, Theo Steininger

Add function d2o.arange
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/5
2017-07-06T05:50:23Z, Theo Steininger

d2o cumsum and flatten rely on certain features of distribution strategy
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/6
2017-07-06T05:50:23Z, Theo Steininger

cumsum and flatten assume that if the shape of the d2o changes through flattening, the distribution strategy was "slicing".

The d2o_librarian will fail when mixing different MPI comms
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/7
2017-07-06T05:50:23Z, Theo Steininger

Every local librarian instance on a node of an MPI cluster increments its internal counter by one when a new d2o is registered. The counters get out of sync when only a part of the full cluster is covered by a special comm.
Possible solution: The individual librarians store the id of 'their' d2o and communicate a common id for their dictionary.
Con: Involves MPI communication.

Add support for `from array` indexing
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/8
2017-07-06T05:50:23Z, Theo Steininger

When building the kdict from pindex and kindex, something of the following form must be done (a==kindex, b==pindex):

    a = np.arange(16)*2
    b = np.array([[3,2],[1,0]])

    In [1]: a[b]
    Out[1]:
    array([[6, 4],
           [2, 0]])

Currently, this is solved using a hack:

    p.apply_scalar_function(lambda z: obj[z])

This functionality could easily be added to the get_data interface.
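
The lookup described above is ordinary NumPy integer-array indexing. A minimal sketch, using only NumPy and no d2o objects (the names `kindex`, `pindex`, `direct` and `via_lambda` are illustrative, chosen to follow the issue text), suggests that indexing with the array gives the same result as applying the lookup element by element, which is roughly what the lambda-based hack does:

```python
import numpy as np

# Values from the issue: a plays the role of kindex, b of pindex.
kindex = np.arange(16) * 2
pindex = np.array([[3, 2], [1, 0]])

# Direct "from array" indexing: the result has the shape of pindex.
direct = kindex[pindex]

# Element-wise equivalent of the lambda-based workaround.
via_lambda = np.vectorize(lambda z: kindex[z])(pindex)

print(direct)
print(np.array_equal(direct, via_lambda))
```

Supporting the array form directly would avoid one Python-level function call per element.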

Semi-advanced indexing is not recognized
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/9
2017-07-06T05:50:23Z, Theo Steininger

    a = np.arange(24).reshape((3, 4, 2))
    obj = distributed_data_object(a)

Semi-advanced indexing

    a[(2,1,1),1]

yields

    array([[18, 19],
           [10, 11],
           [10, 11]])

The "indexinglist" scheme in d2o expects either scalars or numpy arrays as tuple elements and therefore:

    obj[(2,1,1),1] -> AttributeError

However,

    obj[np.array((2,1,1)), 1]

works.

Solution: Parse the elements and, when in doubt, cast them to numpy arrays.
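
The proposed solution could look roughly like the following sketch. `normalize_index` is a hypothetical helper, not part of d2o, and it is demonstrated on a plain NumPy array because the internals of the d2o indexing scheme are not shown in the issue:

```python
import numpy as np


def normalize_index(key):
    """Cast list/tuple elements of an index tuple to numpy arrays.

    Hypothetical helper, not part of d2o: scalars, slices, Ellipsis,
    None and existing arrays pass through unchanged.
    """
    if not isinstance(key, tuple):
        key = (key,)
    return tuple(
        np.asarray(k) if isinstance(k, (list, tuple)) else k
        for k in key
    )


a = np.arange(24).reshape((3, 4, 2))
key = normalize_index(((2, 1, 1), 1))
result = a[key]
print(result)
```

After normalization every tuple element is either a scalar-like object or a numpy array, which is the form the issue says the indexing scheme expects.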

Profile the d2o.bincount method
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/10
2017-07-06T05:50:23Z, Theo Steininger

The d2o.bincount method scales well with MPI parallelization, but compared to single-core np.bincount it has a rather large overhead.

Move `flatten` method into the distributor.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/11
2017-07-06T05:50:23Z, Theo Steininger

At the moment `flatten` is performed by the distributed_data_object itself. It thereby assumes that flattening the local arrays produces the right result. In general, with arbitrary distribution strategies, this is wrong.

Add `source_rank` parameter to d2o.set_full_data()
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/12
2017-07-06T05:50:23Z, Theo Steininger

A source_rank parameter should be added to the distributed_data_object in order to specify on which node the source data array resides.

Add `axis` keyword functionality to unary methods.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/13
2017-07-06T05:50:23Z, Theo Steininger

Many numpy functions support the `axis` keyword in order to perform an operation only along certain directions of the array. The current implementation of d2o does not support this, e.g. for `all`, `any`, `sum`, etc.
Related to: theos/NIFTy#3

d2o: _contraction_helper does not work when using numpy keyword arguments
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/14
2017-07-06T05:50:22Z, Theo Steininger

The `_contraction_helper` passes keyword arguments to the underlying numpy functions (axis=, keepdims=). The result of the _contraction_helper's local computation is then an array and not a scalar. Therefore the dtype check fails.
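
The failure mode can be illustrated with NumPy alone. In this sketch the two `local_*` arrays stand in for the per-process chunks of a distribution sliced along axis 0; this layout is an assumption made for illustration, and no d2o or MPI calls are involved. Without keywords the local result is a scalar, with `axis=` it is an array, and how the local results have to be combined depends on the axis:

```python
import numpy as np

full = np.arange(12).reshape((4, 3))
# Pretend two processes each hold two rows (slicing along axis 0).
local_0, local_1 = full[:2], full[2:]

# No keyword: each local result is a scalar.
scalar_result = np.sum(local_0)

# axis=0 contracts the distributed axis: the local results are arrays
# that must be summed element-wise across processes.
axis0 = np.sum(local_0, axis=0) + np.sum(local_1, axis=0)

# axis=1 contracts a non-distributed axis: the local results are arrays
# that must be concatenated instead.
axis1 = np.concatenate([np.sum(local_0, axis=1), np.sum(local_1, axis=1)])

print(np.ndim(scalar_result), axis0, axis1)
```

The two combination rules are one way to read the case distinction that a contraction over a sliced distribution would need.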
Fix: After solving theos/NIFTy#2, adapt to the case that the local run's result object is an array and make a further distinction of cases, e.g. for something like axis=0 for the slicing_distributor.

d2o: Contraction functions rely on non-degeneracy of distribution strategy
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/15
2017-07-06T05:50:22Z, Theo Steininger

Several methods of the distributed_data_object rely on the distribution strategy behaving as if the local data were non-degenerate. Currently the non-distributor fixes this by returning trivial (local) results in the _allgather and the _Allreduce_sum methods.
Affected d2o methods are at least: _contraction_helper, mean
Fix: Move the functionality for sum, prod, etc. into the distributor.

Add out-array parameter to numerical d2o operations.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/16
2017-07-06T05:50:22Z, Theo Steininger

NumPy supports specifying an out array in order to avoid memory reallocation.

    a = np.array([1,2,3,4])
    b = np.array([5,6,7,8])
    # slow:
    a = a + b
    # fast:
    np.add(a,b,out=a)
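
The difference between the two forms can be checked directly in NumPy. The sketch below uses only standard NumPy calls; the variable names `c`, `allocates_new` and `reuses_buffer` are illustrative. It shows that `a + b` produces a new array, while `np.add(..., out=a)` writes into the existing buffer and returns that same array object:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# a + b allocates a new array that does not share memory with a.
c = a + b
allocates_new = not np.shares_memory(c, a)

# np.add with out=a writes the result into a's existing buffer.
result = np.add(a, b, out=a)
reuses_buffer = result is a

print(allocates_new, reuses_buffer, a)
```

A d2o counterpart would presumably apply the same idea to each local array, so that no process has to allocate a new buffer.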