D2O issues

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues 2018-03-20T14:57:13Z

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/25 Meta-issue: how to deal with open D2O issues? 2018-03-20T14:57:13Z Martin Reinecke

Meta-issue: how to deal with open D2O issues?

@theos, @ensslint: There are currently 22 open issues in D2O, and I don't expect that Theo will have the time to work on them. Unfortunately, no one else has the necessary knowledge. Any suggestions on how to proceed here?

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/24 Missing file in last commit? 2018-03-18T10:55:45Z Martin Reinecke

Missing file in last commit?

Is it possible that you forgot to add a file `random.py` in your last commit? As things are, D2O now imports Python's default `random` module, but this probably doesn't address the problems with MPI-parallel seeding you mentioned.

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/23 "float128"-error on Windows 10 64bit 2017-06-22T13:16:50Z Christoph Lienhard

"float128"-error on Windows 10 64bit

I get an error when using D2O-related functions in NIFTy on my Windows machine:

    site-packages\d2o-1.1.0-py2.7.egg\d2o\dtype_converter.py", line 54, in __init__
        [np.dtype('float128'), MPI.LONG_DOUBLE],
    TypeError: data type "float128" not understood

As far as I understand, numpy.float128 does not exist on every system (for some reason).

Edit: same problem with "complex256":

    \site-packages\d2o-1.1.0-py2.7.egg\d2o\distributed_data_object.py", line 1898, in _to_hdf5
        if self.dtype is np.dtype(np.complex256):
    AttributeError: 'module' object has no attribute 'complex256'

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/22 `imag` and `real` break memory view 2017-07-06T05:50:22Z Theo Steininger
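Regarding the `float128` report (issue 23): one possible way to avoid both errors is to register the extended-precision types only when the local NumPy build actually provides them. The sketch below is not taken from D2O; the helper name `optional_dtypes` is hypothetical, and it only relies on `getattr` on the `numpy` module.

```python
import numpy as np


def optional_dtypes(names=("float128", "complex256")):
    """Return the requested dtypes that this NumPy build provides.

    Hypothetical helper: names that are missing on the current platform
    (as reported for Windows in the issue) are silently left out.
    """
    found = {}
    for name in names:
        scalar_type = getattr(np, name, None)  # None if the type does not exist
        if scalar_type is not None:
            found[name] = np.dtype(scalar_type)
    return found


# Which entries appear here depends on the platform.
available = optional_dtypes()
print(sorted(available))
```

A dtype table built from such a dictionary would simply lack the long-double entries on platforms without them, so neither the `TypeError` nor the `AttributeError` would be triggered at import time.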

`imag` and `real` break memory view

    from d2o import *
    a = np.array([1,2,3,4], dtype=np.complex)
    obj = distributed_data_object(a)
    obj.imag[0] = 1234
    obj

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/21 Create dedicated object for 'distribution_strategy' 2017-07-06T05:50:22Z Theo Steininger
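For comparison with the `imag`/`real` report (issue 22), this is how a plain NumPy array behaves: `imag` is a view, so writing through it changes the original array. The snippet uses NumPy only and assumes this is the behaviour the distributed object is expected to mirror; the issue itself does not spell that out.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=complex)

# `a.imag` shares memory with `a`, so the assignment writes into `a` itself.
a.imag[0] = 1234

print(a[0])                         # (1+1234j)
print(np.shares_memory(a.imag, a))  # True
```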

Create dedicated object for 'distribution_strategy'

Only global-type distribution strategies are directly comparable by name. To make local-type distribution strategies comparable as well, implement an object that represents the distribution strategy. For 'freeform', this includes the individual slice lengths.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/20 Initialization: prefer global_shape/local_shape over shape of data 2017-07-06T05:50:22Z Theo Steininger
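For the 'distribution_strategy' object proposed in issue 21, a minimal sketch could look as follows. The class name, its fields, and the strategy name `"global_example"` are hypothetical and not part of D2O; the point is only that equality then covers the slice lengths as well as the name.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DistributionStrategy:
    """Hypothetical value object describing a distribution strategy."""
    name: str
    # Only meaningful for local-type strategies such as 'freeform'.
    slice_lengths: Tuple[int, ...] = ()


global_a = DistributionStrategy("global_example")
global_b = DistributionStrategy("global_example")
freeform_a = DistributionStrategy("freeform", (3, 2, 5))
freeform_b = DistributionStrategy("freeform", (4, 4, 2))

print(global_a == global_b)      # True: same name, nothing else to compare
print(freeform_a == freeform_b)  # False: same name, different slice lengths
```

Because the dataclass is frozen, instances are also hashable and could serve as dictionary keys.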

Initialization: prefer global_shape/local_shape over shape of data

Right now, during initialization, if some init data is provided, the shape of this data is preferred over an explicitly given shape.
-> Change the behavior in Distributor-Factory.
-> Make a disperse_data instead of a distribute_data at the end of init.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/19 reshaping -> enfold/defold 2017-07-06T05:50:22Z Theo Steininger

reshaping -> enfold/defold

enfold in the slicing distributor relies on the global axes corresponding to the local array axes, as it operates on local data from `get_local_data()`. In general, this is not true for generic distribution strategies.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/18 Improve obj.searchsorted such that it supports scalar, numpy arrays and distr... 2017-07-06T05:50:22Z Theo Steininger

Improve obj.searchsorted such that it supports scalar, numpy arrays and distributed_data_objects as input...

...and return a distributed_data_object if the input was distributed.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/17 Add axis keyword to obj.vdot 2017-07-06T05:50:22Z Theo Steininger

Add axis keyword to obj.vdot
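The issue has no description. As an illustration of what an axis-aware `vdot` could compute, here is a NumPy-only sketch; the function name `vdot_along` is hypothetical, and the distributed case would additionally need a reduction across processes, which is not shown.

```python
import numpy as np


def vdot_along(x, y, axis=None):
    """Conjugating dot product, optionally reduced along `axis` only.

    Hypothetical helper: with axis=None it falls back to np.vdot,
    which flattens both inputs and conjugates the first one.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if axis is None:
        return np.vdot(x, y)
    return np.sum(np.conj(x) * y, axis=axis)


x = np.array([[1 + 1j, 2], [3, 4 - 2j]])
y = np.array([[1, 1j], [2, 1]])

per_column = vdot_along(x, y, axis=0)  # one value per column
total = vdot_along(x, y)               # single scalar, same as np.vdot
```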

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/1 tests: only make h5py test if h5py is available. 2017-05-15T21:27:26Z Theo Steininger

tests: only make h5py test if h5py is available.
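One common way to do this with the standard library is to probe for the module once and skip the affected tests when it is missing. The sketch below assumes a `unittest`-based suite; the test class and method names are hypothetical and not taken from the D2O test suite.

```python
import importlib.util
import unittest

# True only if an importable `h5py` package can be located.
H5PY_AVAILABLE = importlib.util.find_spec("h5py") is not None


class HDF5Test(unittest.TestCase):
    @unittest.skipUnless(H5PY_AVAILABLE, "h5py is not installed")
    def test_h5py_import(self):
        # Import inside the test so the module itself loads without h5py.
        import h5py
        self.assertTrue(hasattr(h5py, "File"))
```

Without h5py the test is reported as skipped rather than failed, so the rest of the suite still runs.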

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/2 Add tensor-/outer-dot to d2o 2016-05-26T11:09:02Z Theo Steininger

Add tensor-/outer-dot to d2o

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/3 Move obj.bincount and obj.unique into distributor and make them more efficient. 2016-05-26T11:09:09Z Theo Steininger

Move obj.bincount and obj.unique into distributor and make them more efficient.

For efficiency, use Allreduce instead of allgather in bincount.
Use `fast-summation` in obj.unique <http://materials.jeremybejarano.com/MPIwithPython/collectiveCom.html#fastsum>

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/4 Add unit tests for `copy=True/False` functionality 2017-07-06T05:50:23Z Theo Steininger
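To illustrate the Allreduce idea from issue 3 without requiring MPI, the following sketch simulates the ranks with a list of array chunks. Each simulated rank builds a fixed-length local histogram, and the element-wise sum stands in for what an Allreduce with a sum operation would leave on every rank. The function name is hypothetical, and the common `minlength` is assumed to be known (in a real MPI setting it would need a max-reduction first).

```python
import numpy as np


def simulated_allreduce_bincount(chunks, minlength):
    """Sum fixed-length local histograms, one per simulated rank."""
    local_counts = [np.bincount(chunk, minlength=minlength) for chunk in chunks]
    # Element-wise sum: the result an Allreduce(SUM) would produce.
    return np.sum(local_counts, axis=0)


data = np.array([0, 3, 1, 3, 3, 2, 0, 5, 1, 1])
chunks = np.array_split(data, 3)   # pretend these live on 3 ranks
minlength = int(data.max()) + 1    # common histogram length

counts = simulated_allreduce_bincount(chunks, minlength)
print(counts)  # [2 3 1 3 0 1]
```

Only arrays of length `minlength` are exchanged in this scheme, independent of how many data points each rank holds.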

Add unit tests for `copy=True/False` functionality

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/5 Add function d2o.arange 2017-07-06T05:50:23Z Theo Steininger

Add function d2o.arange

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/6 d2o cumsum and flatten rely on certain features of distribution strategy 2017-07-06T05:50:23Z Theo Steininger

d2o cumsum and flatten rely on certain features of distribution strategy

cumsum and flatten assume: if the shape of the d2o changes through flattening, the distribution strategy was "slicing".

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/7 The d2o_librarian will fail when mixing different MPI comms 2017-07-06T05:50:23Z Theo Steininger

The d2o_librarian will fail when mixing different MPI comms

Every local librarian instance on a node of an MPI cluster just increments its internal counter by one when a new d2o is registered. This gets out of sync when only a part of the full cluster is covered by a special comm.

Possible solution: The individual librarians store the id of 'their' d2o and communicate a common id for their dictionary.
Con: Involves MPI communication.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/8 Add support for `from array` indexing 2017-07-06T05:50:23Z Theo Steininger
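The counter drift described in issue 7 can be reproduced without MPI by simulating one librarian per rank. The `Librarian` class below is a hypothetical stand-in that only models the "increment a local counter on registration" behaviour from the description; it is not the D2O implementation.

```python
class Librarian:
    """Hypothetical stand-in: hands out ids from a purely local counter."""

    def __init__(self):
        self.counter = 0
        self.library = {}

    def register(self, obj):
        self.counter += 1
        self.library[self.counter] = obj
        return self.counter


# One librarian per rank in a simulated world of 4 ranks.
librarians = [Librarian() for _ in range(4)]

# A d2o living on a sub-communicator that covers ranks 0 and 1 only.
for rank in (0, 1):
    librarians[rank].register("d2o on sub-comm")

# A d2o on the full communicator now gets different ids on different ranks.
ids = [lib.register("d2o on world comm") for lib in librarians]
print(ids)  # [2, 2, 1, 1]
```

The same object ends up under id 2 on ranks 0 and 1 but under id 1 on ranks 2 and 3, which is the out-of-sync state the issue describes.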

Add support for `from array` indexing

When building the kdict from pindex and kindex, something of the following form must be done (a==kindex, b==pindex):

    a = np.arange(16)*2
    b = np.array([[3,2],[1,0]])

    In [1]: a[b]
    Out[1]:
    array([[6, 4],
           [2, 0]])

Currently, this is solved using a hack:

    p.apply_scalar_function(lambda z: obj[z])

This functionality could easily be added to the get_data interface.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/9 Semi-advanced indexing is not recognized 2017-07-06T05:50:23Z Theo Steininger
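For the `from array` indexing in issue 8, the equivalence between direct array indexing and the element-wise workaround can be checked in plain NumPy. Here `np.vectorize` merely plays the role of `apply_scalar_function`; that mapping is an assumption made for illustration, not a statement about how D2O implements it.

```python
import numpy as np

a = np.arange(16) * 2           # plays the role of kindex
b = np.array([[3, 2], [1, 0]])  # plays the role of pindex

# Direct "from array" indexing.
direct = a[b]

# Element-wise lookup, analogous to the hack with a scalar function.
elementwise = np.vectorize(lambda z: a[z])(b)

print(direct)
print(np.array_equal(direct, elementwise))  # True
```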

Semi-advanced indexing is not recognized

    a = np.arange(24).reshape((3, 4,2))
    obj = distributed_data_object(a)

Semi-advanced indexing

    a[(2,1,1),1]

yields

    array([[18, 19],
           [10, 11],
           [10, 11]])

The "indexinglist" scheme in d2o expects either scalars or numpy arrays as tuple elements and therefore:

    obj[(2,1,1),1] -> AttributeError

However,

    obj[np.array((2,1,1)), 1]

works.

Solution: Parse the elements and, if in doubt, cast them to numpy arrays.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/10 Profile the d2o.bincount method 2017-07-06T05:50:23Z Theo Steininger
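The solution proposed in issue 9 (parse the key elements and cast them to numpy arrays) could be sketched like this. The helper name `normalize_key` is hypothetical, and the check is done against a plain NumPy array rather than a `distributed_data_object`.

```python
import numpy as np


def normalize_key(key):
    """Cast list/tuple elements of an index key to numpy arrays.

    Hypothetical helper: scalars, slices and existing arrays pass through.
    """
    if not isinstance(key, tuple):
        key = (key,)
    return tuple(
        np.asarray(element) if isinstance(element, (list, tuple)) else element
        for element in key
    )


a = np.arange(24).reshape((3, 4, 2))

key = ((2, 1, 1), 1)
normalized = normalize_key(key)  # (array([2, 1, 1]), 1)

print(a[normalized])
```

After normalization every tuple element is either a scalar, a slice, or a numpy array, which is the form the description says the indexing scheme expects.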

Profile the d2o.bincount method

The d2o.bincount method scales well with MPI parallelization, but compared to single-core np.bincount it has a rather big overhead.

Theo Steininger

https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/11 Move `flatten` method into the distributor. 2017-07-06T05:50:23Z Theo Steininger

Move `flatten` method into the distributor.

At the moment, `flatten` is performed by the distributed_data_object itself. It thereby assumes that flattening the local arrays produces the right result. In general, with arbitrary distribution strategies, this is wrong.

Theo Steininger
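The problem described in issue 11 can be seen in a small NumPy-only simulation. Splitting along the first axis stands in for a slicing-type distribution and splitting along the second axis for some other strategy; both are illustrative assumptions, and the helper name is hypothetical.

```python
import numpy as np


def flatten_locally(global_array, n_parts, axis):
    """Flatten each simulated local array, then concatenate in rank order."""
    parts = np.array_split(global_array, n_parts, axis=axis)
    return np.concatenate([part.flatten() for part in parts])


a = np.arange(12).reshape(3, 4)

# Distribution along axis 0: local flattening reproduces the global result.
rows_ok = np.array_equal(flatten_locally(a, 2, axis=0), a.flatten())

# Distribution along axis 1: local flattening gives a different ordering.
cols_ok = np.array_equal(flatten_locally(a, 2, axis=1), a.flatten())

print(rows_ok, cols_ok)  # True False
```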