Markus Scheidgen (f4fce04c) at 22 Mar 16:45
Hotfixed h5grove to work on subdirectories again.
Markus Scheidgen (19c408bf) at 22 Mar 16:42
Markus Scheidgen (99cf267c) at 22 Mar 16:42
Merge branch 'download-docs' into 'develop'
Changelog: Added
Markus Scheidgen (19c408bf) at 22 Mar 16:08
Added documentation for downloading files and data with curl.
I removed it everywhere.
I think it is "one at a time", but I still need to correct it.
Markus Scheidgen (91896cd3) at 22 Mar 10:20
Markus Scheidgen (91896cd3) at 22 Mar 09:46
Fixed an issue with URL encoding in the h5grove app.
Closes #1952
Markus Scheidgen (8611cba8) at 22 Mar 09:44
This started as an issue on Discord: https://discord.com/channels/1201445470485106719/1218105117975380028/1218105117975380028
Our h5grove app does not seem to work with file names that have a "+" character in them, e.g. data_2024-03-15T08-51-17_698104+01-00.nxs.
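For context, a minimal illustration of the underlying problem (plain `urllib`, not the actual h5grove/NOMAD code): in form/query decoding, a literal "+" turns into a space, so file names containing "+" must be percent-encoded as `%2B`:

```python
from urllib.parse import quote, unquote_plus

filename = 'data_2024-03-15T08-51-17_698104+01-00.nxs'

# Decoding the name as a form/query value turns '+' into a space,
# so the file can no longer be found on disk:
print(unquote_plus(filename))  # data_2024-03-15T08-51-17_698104 01-00.nxs

# Percent-encoding the name before building the URL survives the round trip:
encoded = quote(filename, safe='')
print(encoded)                 # data_2024-03-15T08-51-17_698104%2B01-00.nxs
print(unquote_plus(encoded))   # data_2024-03-15T08-51-17_698104+01-00.nxs
```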
@thchang this is what I tried yesterday, before we had the discussion on !1706 today. The thoughts in #1947 are a bit newer. I think I could correctly extract a `NumpyArray` type, but I got stuck with usages, because it was not working with the `MTypes` lists that only contain the `np.float64`, etc. style types. I tried to make `NumpyArray` behave the same as `np.float64`, etc. by overwriting `__eq__` and `__hash__`. But this only seems to work for normal classes, not for types like `np.float64`. Anyhow, I think `MTypes` should be implemented differently (see my explanation in #1947).
Feel free to discard my implementation. I just thought, since I made it, I might as well show it.
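For reference, here is a minimal sketch of what such an attempt looks like; the names are hypothetical (the real `NumpyArray` and `MTypes` in the MR differ), and the last lines show checks that bypass `__eq__` entirely, which is one way this kind of masquerade can break:

```python
import numpy as np

class NumpyArray:
    """Hypothetical stand-in type that masquerades as a numpy scalar type."""

    def __init__(self, dtype):
        self.dtype = np.dtype(dtype)

    def __eq__(self, other):
        # compare equal to the wrapped scalar type, e.g. np.float64
        return other is self or other is self.dtype.type

    def __hash__(self):
        return hash(self.dtype.type)


mtypes_numpy = [np.float64, np.int64]  # stand-in for an MTypes list

t = NumpyArray(np.float64)
print(t in mtypes_numpy)    # True: list membership goes through __eq__
print(t == np.float64)      # True

# Checks that bypass __eq__ still fail: identity checks, and any code
# that expects quantity.type to be an actual class.
print(t is np.float64)      # False
print(isinstance(t, type))  # False: t is an instance, not a class
```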
Markus Scheidgen (7d565b82) at 20 Mar 16:01
Extracted numpy values into metainfo DataType. [skip ci]
Closes #1947
Markus Scheidgen (b80797bc) at 20 Mar 15:59
Move all types into their own `DataType`s and remove the "if then else" logic from set, get, and (de)serialize operations. E.g. `NPArray(dtype)` and `PythonType(type)` should deal with numpy and Python types respectively. This should make it much easier to extend `NPArray(dtype)` with something like `HDF5Dataset(dtype)` in the future. It might also make it easier to add more specialised types. Also, `m_to_dict`, `__set_normalized`, etc. look horrible in their current state.
Potentially, these new types need parameters like `dtype` or `type`. Note that types can have parameters: `Reference` and `SectionReference`, for example, already take the referenced section definition as a type parameter.
But the unit and the shape should stay in the quantity definition. In all `DataType` operations, you will have access to the quantity definition anyway. Maybe `DataType` needs a field `supports_shapes` or something. Maybe the set, get, and (de)serialize operations will need to know whether they should manage the list vs. scalar handling, or whether the data type is actually doing this. For something like `Quantity(type=Datetime, shape=['*'])`, `m_to_dict` would create a list and ask `Datetime` to serialize the elements. For `Quantity(type=np.float64, shape=[1, 2])`, `m_to_dict` would just call `serialize` on `NPArray` and expect it to serialize the whole array, not just an element. For data types that do not support shapes themselves, only scalars and lists would work; for higher shapes we would throw an error.
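To make the scalar-vs-array split concrete, a rough sketch of how such a dispatch could look; `supports_shapes`, `serialize_value`, and the stripped-down types here are hypothetical, not the actual metainfo code:

```python
from datetime import datetime
import numpy as np

class Datetime:
    # scalar data type: shaped values are serialized element by element
    supports_shapes = False

    def serialize(self, value):
        return value.isoformat()


class NPArray:
    # array data type: serializes the whole (possibly shaped) value itself
    supports_shapes = True

    def serialize(self, value):
        return np.asarray(value).tolist()


def serialize_value(data_type, value, shape):
    # shaped values are either delegated as a whole (supports_shapes) or
    # serialized element-wise; higher shapes without array support fail
    if not shape or data_type.supports_shapes:
        return data_type.serialize(value)
    if len(shape) == 1:
        return [data_type.serialize(item) for item in value]
    raise TypeError(f'{type(data_type).__name__} does not support shape {shape}')


print(serialize_value(Datetime(), [datetime(2024, 3, 22)], ['*']))  # ['2024-03-22T00:00:00']
print(serialize_value(NPArray(), np.zeros((1, 2)), [1, 2]))         # [[0.0, 0.0]]
```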
The `QuantityType` (the type for types) could duck-type and help with backwards compatibility. E.g. every time you use `type=np.float64`, `QuantityType` replaces it in its `set_normalized` function with `NPArray(np.float64)`. Keep in mind that we also need backwards compatibility in how `QuantityType` serializes types. For example, an `NPArray(np.float64)` should still (de)serialize to `{type_kind: 'numpy', type_data: 'float64'}`, etc.
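A sketch of that duck-typing and the backwards-compatible serialization (hypothetical code; the real `QuantityType` would have more cases):

```python
import numpy as np

class NPArray:
    def __init__(self, dtype):
        self.dtype = np.dtype(dtype)


class QuantityType:
    def set_normalized(self, value):
        # transparently wrap plain numpy scalar types, so type=np.float64
        # keeps working as before
        if isinstance(value, type) and issubclass(value, np.generic):
            return NPArray(value)
        return value

    def serialize(self, value):
        # keep the existing serialized representation for backwards compatibility
        if isinstance(value, NPArray):
            return dict(type_kind='numpy', type_data=value.dtype.name)
        raise NotImplementedError()


qt = QuantityType()
normalized = qt.set_normalized(np.float64)  # user still writes type=np.float64
print(type(normalized).__name__)            # NPArray
print(qt.serialize(normalized))             # {'type_kind': 'numpy', 'type_data': 'float64'}
```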
When we need to map the metainfo to other systems like Pydantic, Optimade, Mongo, etc., we often make use of `MTypes` to figure out if a quantity value is compatible with a respective foreign Pydantic, Optimade, Mongo, etc. type. Here, `MTypes` provides lists of types that are called `number`, `numpy`, etc. Maybe `DataType` can define functions to implement this more explicitly:
```python
import numpy as np
from typing import Type, TypeVar

T = TypeVar('T')


class DataType:
    def compatible_with(self, target_type: Type) -> bool:
        """
        Returns true if the given type is compatible. All compatible types can
        be used in `convert`. Also, values in all compatible types can be
        assigned to quantities with self type.
        """
        return target_type == self

    def convert(self, target_type: Type[T], value) -> T:
        """
        Converts the given value into a value of the given compatible type.
        This will not assert if the given type is actually compatible.
        Use `compatible_with` to check.
        """
        return value


class NPArray(DataType):
    def __init__(self, dtype):
        self.dtype = dtype

    def compatible_with(self, target_type):
        if self.dtype.type in [np.float64, np.float32]:
            return target_type == float
        if self.dtype.type in [np.int64, np.uint64]:
            return target_type == int
        return target_type == self.dtype.type

    def convert(self, target_type, value):
        if target_type == self.dtype.type:
            return value
        return target_type(value)
```
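Usage would then look like this (assuming `dtype` is passed as a `np.dtype` instance):

```python
t = NPArray(np.dtype(np.float64))
assert t.compatible_with(float)
assert t.convert(float, np.float64(1.5)) == 1.5
```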
More specialised types could follow, e.g. `Pydantic(pydantic_model)` or `Dataframe(...)` for table data.

- `nomad.datamodel.metainfo`: Ideally, the `nomad.metainfo` could be reduced to pure Python (no numpy, no pandas, no nomad.config). Ways to possibly inject dependencies are specialisations of `DataType` and `Context` (see the sketch after this list).
- the `label` property
- `benchmarks`, `legacy`, `generate`
- `Category`
- `Environments` completely
- `MSection`, `Definition`, `Datatype`, `Annotation`, `Context` is cyclic by nature.
- `nomad.datamodel.metainfo`, where they are only imported if actually needed and where they might depend on more than basic Python packages.
- `nexus` should definitely move: first into `nomad.datamodel.metainfo`, but eventually into its own plugin.

Not everything has to be in one MR.
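On the dependency injection point above, a rough sketch of how the split could look; the module layout and the registry are my assumption, not existing code:

```python
# nomad/metainfo/data_type.py (hypothetical layout) -- pure Python, defines
# only the extension point, no numpy/pandas/nomad.config imports
class DataType:
    _registry = {}

    @classmethod
    def register(cls, python_type, data_type):
        cls._registry[python_type] = data_type

    @classmethod
    def for_type(cls, python_type):
        return cls._registry.get(python_type)


# nomad/datamodel/metainfo/np_types.py (hypothetical) -- only imported if
# actually needed, may depend on heavy packages like numpy
import numpy as np

class NPArray(DataType):
    def __init__(self, dtype):
        self.dtype = np.dtype(dtype)

DataType.register(np.float64, NPArray(np.float64))

print(type(DataType.for_type(np.float64)).__name__)  # NPArray
```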