D2O issues
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues
Updated: 2018-03-20T14:57:13Z

Meta-issue: how to deal with open D2O issues?
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/25
Updated: 2018-03-20T14:57:13Z
Author: Martin Reinecke

@theos, @ensslint:
There are currently 22 open issues in D2O, and I don't expect that Theo will have the time to work on them. Unfortunately, no one else has the necessary knowledge.
Any suggestions how to proceed here?

Missing file in last commit?
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/24
Updated: 2018-03-18T10:55:45Z
Author: Martin Reinecke

Is it possible that you forgot to add a file `random.py` in your last commit?
As things are, D2O now imports Python's default `random` module, but this probably doesn't address the problems with MPI-parallel seeding you mentioned.

"float128"-error on Windows 10 64bit
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/23
Updated: 2017-06-22T13:16:50Z
Author: Christoph Lienhard

I get an error when using D2O-related functions in NIFTy on my Windows machine:

    site-packages\d2o-1.1.0-py2.7.egg\d2o\dtype_converter.py", line 54, in __init__
        [np.dtype('float128'), MPI.LONG_DOUBLE],
    TypeError: data type "float128" not understood

As far as I understand, `numpy.float128` does not exist on every system (for some reason).
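
One way to avoid the hard failure might be to look the extended-precision types up by name and skip those the platform lacks. This is only a sketch, not the d2o fix: `available_dtypes` is a hypothetical helper, and the only thing it relies on is that a missing attribute on the `numpy` module is reported as absent.

```python
import numpy as np

def available_dtypes(names):
    """Return {name: np.dtype} for the names this NumPy build actually provides."""
    found = {}
    for name in names:
        scalar_type = getattr(np, name, None)  # None when the platform lacks the type
        if scalar_type is not None:
            found[name] = np.dtype(scalar_type)
    return found

# Hypothetical usage: build the optional part of a dtype table.
extended = available_dtypes(["float128", "complex256"])
# Where these types are missing the dict is simply empty,
# so nothing built from it references a missing name.
print(sorted(extended))
```

A dtype table assembled this way would contain the long-double entries only where they exist, instead of raising at import time.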

Edit:
Same problem with "complex256":

    \site-packages\d2o-1.1.0-py2.7.egg\d2o\distributed_data_object.py", line 1898, in _to_hdf5
        if self.dtype is np.dtype(np.complex256):
    AttributeError: 'module' object has no attribute 'complex256'

`imag` and `real` break memory view
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/22
Updated: 2017-07-06T05:50:22Z
Author: Theo Steininger

    from d2o import *
    a = np.array([1,2,3,4], dtype=np.complex)
    obj = distributed_data_object(a)
    obj.imag[0] = 1234
    obj

Assignee: Theo Steininger

Create dedicated object for 'distribution_strategy'
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/21
Updated: 2017-07-06T05:50:22Z
Author: Theo Steininger

Only global-type distribution strategies are directly comparable by their name. To compare local-type distribution strategies as well, implement an object that represents the distribution strategy. For 'freeform' this includes the individual slice lengths.

Assignee: Theo Steininger

Initialization: prefer global_shape/local_shape over shape of data
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/20
Updated: 2017-07-06T05:50:22Z
Author: Theo Steininger

Right now, during initialization, if some init-data is provided, the shape of this data is preferred over an explicitly given shape.
-> Change the behavior in Distributor-Factory.
-> Make a disperse_data instead of a distribute_data at the end of init.
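
The requested precedence could be sketched as follows. `resolve_shape` and its parameter names are hypothetical and do not correspond to the actual Distributor-Factory code; the sketch only shows an explicitly given shape winning over the shape of the init data.

```python
import numpy as np

def resolve_shape(explicit_shape=None, data=None):
    """Pick the shape to use: an explicitly given shape wins over the data's shape."""
    if explicit_shape is not None:
        return tuple(explicit_shape)
    if data is not None:
        return np.shape(data)
    raise ValueError("neither a shape nor init data was given")

data = np.zeros((4, 4))
print(resolve_shape(explicit_shape=(2, 8), data=data))  # the explicit shape is used
print(resolve_shape(data=data))                         # falls back to the data's shape
```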

Assignee: Theo Steininger

reshaping -> enfold/defold
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/19
Updated: 2017-07-06T05:50:22Z
Author: Theo Steininger

`enfold` in the slicing distributor relies on the global axes corresponding to the local array axes, because it operates on local data from `get_local_data()`.
In general, this is not true for generic distribution strategies.

Assignee: Theo Steininger

Improve obj.searchsorted such that it supports scalar, numpy arrays and distributed_data_objects as input...
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/18
Updated: 2017-07-06T05:50:22Z
Author: Theo Steininger

...and return a distributed_data_object if the input was distributed.
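
The input/output contract asked for here might look like the sketch below. `FakeDistributed` is a hypothetical stand-in for `distributed_data_object` that only wraps local data, so no MPI is involved; the real method would also have to deal with the sorted array itself being distributed.

```python
import numpy as np

class FakeDistributed:
    """Hypothetical stand-in for distributed_data_object; it only wraps local data."""
    def __init__(self, data):
        self.data = np.asarray(data)

def searchsorted(sorted_values, v, side="left"):
    """Return insertion indices for v; distributed input gives distributed output."""
    if isinstance(v, FakeDistributed):
        return FakeDistributed(np.searchsorted(sorted_values, v.data, side=side))
    # Scalars and numpy arrays are handled by np.searchsorted directly.
    return np.searchsorted(sorted_values, v, side=side)

bins = np.array([0, 10, 20, 30])
scalar_result = searchsorted(bins, 15)
array_result = searchsorted(bins, np.array([5, 25]))
distributed_result = searchsorted(bins, FakeDistributed([5, 25]))
print(scalar_result, array_result, type(distributed_result).__name__)
```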

Assignee: Theo Steininger

Add axis keyword to obj.vdot
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/17
Updated: 2017-07-06T05:50:22Z
Author: Theo Steininger
Assignee: Theo Steininger

tests: only make h5py test if h5py is available.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/1
Updated: 2017-05-15T21:27:26Z
Author: Theo Steininger
Assignee: Theo Steininger

Add tensor-/outer-dot to d2o
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/2
Updated: 2016-05-26T11:09:02Z
Author: Theo Steininger
Assignee: Theo Steininger

Move obj.bincount and obj.unique into distributor and make them more efficient.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/3
Updated: 2016-05-26T11:09:09Z
Author: Theo Steininger

For efficiency, use Allreduce instead of allgather in bincount.
Use `fast-summation` in obj.unique <http://materials.jeremybejarano.com/MPIwithPython/collectiveCom.html#fastsum>

Assignee: Theo Steininger

Add unit tests for `copy=True/False` functionality
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/4
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger
Assignee: Theo Steininger

Add function d2o.arange
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/5
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger
Assignee: Theo Steininger

d2o cumsum and flatten rely on certain features of distribution strategy
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/6
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger

`cumsum` and `flatten` assume that if the shape of the d2o changes through flattening, the distribution strategy was "slicing".

Assignee: Theo Steininger

The d2o_librarian will fail when mixing different MPI comms
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/7
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger

Every local librarian instance on a node of an MPI cluster just increments its internal counter by one when a new d2o is registered. This gets out of sync when only a part of the full cluster is covered by a special comm.
Possible solution: The individual librarians store the id of 'their' d2o and communicate a common id for their dictionary.
Con: Involves MPI communication.

Assignee: Theo Steininger

Add support for `from array` indexing
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/8
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger

When building the kdict from pindex and kindex, something of the following form must be done (a==kindex, b==pindex):

    a = np.arange(16)*2
    b = np.array([[3,2],[1,0]])

    In [1]: a[b]
    Out[1]:
    array([[6, 4],
           [2, 0]])

Currently, this is solved using a hack:

    p.apply_scalar_function(lambda z: obj[z])

This functionality could easily be added to the get_data interface.
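
In plain NumPy terms, the requested lookup and the element-by-element hack give the same result, as the sketch below shows. `get_from_array` is a hypothetical name and not part of the d2o `get_data` interface; a distributed version would additionally need to fetch values that live on other nodes.

```python
import numpy as np

def get_from_array(values, index_array):
    """Look up values[i] for every i in index_array; the result has index_array's shape."""
    index_array = np.asarray(index_array)
    return np.take(values, index_array)

kindex = np.arange(16) * 2
pindex = np.array([[3, 2], [1, 0]])

direct = get_from_array(kindex, pindex)
# Element-by-element lookup, analogous to the apply_scalar_function hack.
hack = np.vectorize(lambda z: kindex[z])(pindex)
print(direct)
print(np.array_equal(direct, hack))
```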

Assignee: Theo Steininger

Semi-advanced indexing is not recognized
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/9
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger

    a = np.arange(24).reshape((3, 4, 2))
    obj = distributed_data_object(a)

Semi-advanced indexing

    a[(2,1,1),1]

yields

    array([[18, 19],
           [10, 11],
           [10, 11]])

The "indexinglist" scheme in d2o expects either scalars or numpy arrays as tuple elements, and therefore:

    obj[(2,1,1),1] -> AttributeError

However,

    obj[np.array((2,1,1)), 1]

works.
Solution: Parse the elements and, if in doubt, cast them to numpy arrays.
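
The proposed parsing step could look roughly like this. `normalize_index` is a hypothetical helper, shown here against a plain NumPy array rather than a `distributed_data_object`; it casts list and tuple elements of the key to arrays and passes everything else through unchanged.

```python
import numpy as np

def normalize_index(key):
    """Cast list/tuple elements of an index key to numpy arrays; leave the rest alone."""
    if not isinstance(key, tuple):
        key = (key,)
    normalized = []
    for element in key:
        if isinstance(element, (list, tuple)):
            element = np.asarray(element)  # e.g. (2, 1, 1) becomes array([2, 1, 1])
        normalized.append(element)         # ints, slices, arrays, None, Ellipsis pass through
    return tuple(normalized)

a = np.arange(24).reshape((3, 4, 2))
key = normalize_index(((2, 1, 1), 1))
print(a[key])
```

After normalization every tuple element is a scalar, a slice, or an array, which is what the indexing scheme is described as expecting.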

Assignee: Theo Steininger

Profile the d2o.bincount method
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/10
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger

The d2o.bincount method scales well with MPI parallelization, but compared to single-core np.bincount it has a rather large overhead.

Assignee: Theo Steininger

Move `flatten` method into the distributor.
https://gitlab.mpcdf.mpg.de/ift/D2O/-/issues/11
Updated: 2017-07-06T05:50:23Z
Author: Theo Steininger

At the moment, `flatten` is performed by the distributed_data_object itself. It thereby assumes that flattening the local arrays produces the right result. In general, with arbitrary distribution strategies, this is wrong.

Assignee: Theo Steininger
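
The `flatten` assumption can be illustrated without MPI by splitting a small array into per-node pieces. The split along axis 1 is an assumed example of a non-slicing distribution and not one of d2o's named strategies; `flatten_locally` is a hypothetical helper.

```python
import numpy as np

def flatten_locally(global_array, axis, nodes=2):
    """Split along `axis`, flatten each local piece, then concatenate in node order."""
    pieces = np.array_split(global_array, nodes, axis=axis)
    return np.concatenate([piece.flatten() for piece in pieces])

a = np.arange(12).reshape((3, 4))

# Slabs along the first axis: local flattening reproduces the global result.
rows_ok = np.array_equal(flatten_locally(a, axis=0), a.flatten())
# Blocks along the second axis: the concatenated local results are out of order.
cols_ok = np.array_equal(flatten_locally(a, axis=1), a.flatten())
print(rows_ok, cols_ok)
```

Only the first case agrees with the global `flatten`, which is why the operation would need knowledge that lives in the distributor.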