# Copyright 2018 Markus Scheidgen
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
The NOMAD meta-info allows you to define physics data quantities. These definitions are
necessary for all computer representations of the respective data (e.g. in Python,
search engines, databases, and files).

This module provides various Python interfaces for

- defining meta-info data
- creating and manipulating data that follows these definitions
- (de-)serializing meta-info data in JSON (i.e. representing data in JSON-formatted files)

Here is a simple example that demonstrates the definition of System related quantities:

.. code-block:: python

    class System(MSection):
        \"\"\"
        A system section includes all quantities that describe a single simulated
        system (a.k.a. geometry).
        \"\"\"

        n_atoms = Quantity(
            type=int, description='''
            Defines the number of atoms in the system.
            ''')

        atom_labels = Quantity(type=Enum(ase.data.chemical_symbols), shape=['n_atoms'])
        atom_positions = Quantity(type=float, shape=['n_atoms', 3], unit=Units.m)
        simulation_cell = Quantity(type=float, shape=[3, 3], unit=Units.m)
        pbc = Quantity(type=bool, shape=[3])

    class Run(MSection):
        systems = SubSection(sub_section=System, repeats=True)

Here, we define a `section` called ``System``. The section mechanism allows you to organize
related data into sections. Sections form containment hierarchies, where containment
is a parent-child (whole-part) relationship. In this example, many ``System`` sections
are part of one ``Run``. Each ``System`` can contain values for the defined quantities:
``n_atoms``, ``atom_labels``, ``atom_positions``, ``simulation_cell``, and ``pbc``.
Quantities state the type, shape, and physical unit that specify their possible
values.
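
The containment relationship described above can be sketched in plain Python. This is
only an illustration under stated assumptions: ``Node``, ``create``, ``parent``, and
``parent_index`` are hypothetical names and not the meta-info implementation.

```python
from typing import List, Optional


class Node:
    # Hypothetical stand-in for a section instance, for illustration only.
    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Optional['Node'] = None
        self.parent_index = -1
        self.children: List['Node'] = []

    def create(self, name: str) -> 'Node':
        # Create a child, remember its parent and its position in the parent.
        child = Node(name)
        child.parent = self
        child.parent_index = len(self.children)
        self.children.append(child)
        return child


run = Node('Run')
first = run.create('System')
second = run.create('System')
```

Each child keeps a reference to its parent and its position among its siblings, which is
roughly what one ``Run`` containing many ``System`` sections amounts to.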

Here is an example where we use the above definitions to create, read, and manipulate
data that follows these definitions:

.. code-block:: python

    run = Run()
    system = run.m_create(System)
    system.n_atoms = 3
    system.atom_labels = ['H', 'H', 'O']

    print(system.atom_labels)
    print(run.m_to_json(indent=2))

This last statement will produce the following JSON:

.. code-block:: JSON

    {
        "m_def": "Run",
        "systems": [
            {
                "m_def": "System",
                "m_parent_index": 0,
                "n_atoms": 3,
                "atom_labels": [
                    "H",
                    "H",
                    "O"
                ]
            }
        ]
    }

This is the JSON representation, a serialized version of the Python representation in
the example above.
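
As a rough sketch of this round trip, the standard ``json`` module can turn such a
dictionary into text and back. The dictionary below is written out by hand and only
mirrors the output above; no meta-info classes are involved, and the key names are
illustrative.

```python
import json

# Hand-written dictionary shaped like the serialized section data.
run_dict = {
    'm_def': 'Run',
    'systems': [
        {
            'm_def': 'System',
            'm_parent_index': 0,
            'n_atoms': 3,
            'atom_labels': ['H', 'H', 'O']
        }
    ]
}

# ``indent`` only affects the layout of the text, not the data.
serialized = json.dumps(run_dict, indent=2)
restored = json.loads(serialized)
```

Because only dictionaries, lists, strings, and numbers are used, the restored data is
equal to the original dictionary.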

Sections can be extended with new quantities outside the original section definition.
This provides the key mechanism to extend commonly defined parts with (code) specific
quantities:

.. code-block:: Python

    class Method(nomad.metainfo.common.Method):
        x_vasp_incar_ALGO = Quantity(
            type=Enum(['Normal', 'VeryFast', ...]),
            links=['https://cms.mpi.univie.ac.at/wiki/index.php/ALGO'])
        \"\"\"
        A convenient option to specify the electronic minimisation algorithm (as of VASP.4.5)
        and/or to select the type of GW calculations.
        \"\"\"


All meta-info definitions and classes for meta-info data objects (i.e. section instances)
inherit from :class:`MSection`. This base-class provides common functions and properties
for all meta-info data objects. Names of these common parts are prefixed with ``m_``
to distinguish them from user-defined quantities. This also constitutes the `reflection`
interface (in addition to Python's built-in ``getattr``, ``setattr``) that allows you to
create and manipulate meta-info data without prior, program-time knowledge of the underlying
definitions.
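
The idea of working with attributes that are only known by name at run time can be
sketched with Python's built-in functions alone. ``Example`` and ``copy_values`` are
hypothetical names used for illustration; they are not part of the meta-info.

```python
from typing import List


class Example:
    # Hypothetical plain class standing in for a section instance.
    n_atoms = 0
    atom_labels = None


def copy_values(source: object, target: object, names: List[str]) -> None:
    # Only the attribute names are known, and only at run time.
    for name in names:
        setattr(target, name, getattr(source, name))


first, second = Example(), Example()
first.n_atoms = 3
first.atom_labels = ['H', 'H', 'O']
copy_values(first, second, ['n_atoms', 'atom_labels'])
```

The function never mentions ``n_atoms`` or ``atom_labels`` in its own code, which is the
property a reflection interface is meant to provide.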

.. autoclass:: MSection

The following classes can be used to define and structure meta-info data:

- sections are defined by sub-classing :class:`MSection` and using :class:`Section` to
  populate the class attribute `m_def`
- quantities are defined by assigning class attributes of a section with :class:`Quantity`
  instances
- references (from one section to another) can be defined with quantities that use
  section definitions as type
- dimensions can be defined by simply using quantity names in shapes
- categories (former `abstract type definitions`) can be given in quantity definitions
  to assign quantities to additional specialization-generalization hierarchies
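
How quantity names in shapes could be turned into concrete dimensions is sketched below.
This is an assumption about the idea, not the actual implementation; ``resolve_shape``
and its parameters are hypothetical.

```python
from typing import Any, Dict, List, Union


def resolve_shape(shape: List[Union[int, str]], values: Dict[str, Any]) -> List[int]:
    # Hypothetical helper: replace quantity names in a shape with their values.
    resolved = []
    for dimension in shape:
        if isinstance(dimension, str):
            resolved.append(int(values[dimension]))
        else:
            resolved.append(dimension)
    return resolved


# A shape like the one of ``atom_positions``, resolved for a system with three atoms.
positions_shape = resolve_shape(['n_atoms', 3], {'n_atoms': 3})
```

Fixed dimensions are passed through unchanged, while named dimensions are looked up in
the values of the same section.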

See the reference of classes :class:`Section` and :class:`Quantity` for details.

.. autoclass:: Section
.. autoclass:: Quantity
"""

# TODO validation

from typing import Type, TypeVar, Union, Tuple, Iterable, List, Any, Dict, Set, cast
import sys
import inspect
import re
import json
import itertools

import numpy as np
from pint.unit import _Unit
from pint import UnitRegistry

is_bootstrapping = True
MSectionBound = TypeVar('MSectionBound', bound='MSection')
T = TypeVar('T')


# Reflection

class Enum(list):
    """ Allows to define str types with values limited to a pre-set list of possible values. """
    pass


class DataType:
    """
    Allows to define custom data types that can be used in the meta-info.

    The metainfo supports most types out of the box. These include the Python built-in
    primitive types (int, bool, str, float, ...), references to sections, and enums.
    However, on some occasions you need to add custom data types.
    """
    def type_check(self, section, value):
        """ Checks the given value before it is set to the given section. Can modify the value. """
        return value

    def to_json_serializable(self, section, value):
        return value

    def from_json_serializable(self, section, value):
        return value


class Dimension(DataType):
    def type_check(self, section, value):
        if isinstance(value, int):
            return value

        if isinstance(value, str):
            if value.isidentifier():
                return value
            if re.match(r'(\d)\.\.(\d|\*)', value):
                return value

        if isinstance(value, Section):
            return value

        if isinstance(value, type) and hasattr(value, 'm_def'):
            return value

        raise TypeError('%s is not a valid dimension' % str(value))


class Reference(DataType):
    """ A datatype class that can be used to define reference types based on section definitions.

    A quantity can be used to define possible references between sections. Instantiate
    this class to create a reference type that specifies that a quantity with this type
    is actually a reference (or references, depending on shape) to a section of the
    given definition.
    """
    def __init__(self, section: 'Section'):
        self.section = section


class Unit(DataType):
    def type_check(self, section, value):
        if isinstance(value, str):
            value = units.parse_units(value)

        elif not isinstance(value, _Unit):
            raise TypeError('Units must be given as str or pint Unit instances.')

        return value

    def to_json_serializable(self, section, value):
        return str(value)

    def from_json_serializable(self, section, value):
        return units.parse_units(value)

# TODO class Datetime(DataType)


class MObjectMeta(type):

    def __new__(self, cls_name, bases, dct):
        cls = super().__new__(self, cls_name, bases, dct)

        init = getattr(cls, '__init_cls__', None)
        if init is not None and not is_bootstrapping:
            init()
        return cls


Content = Tuple[MSectionBound, Union[List[MSectionBound], MSectionBound], str, MSectionBound]

SectionDef = Union[str, 'Section', 'SubSection', Type[MSectionBound]]
""" Type for section definition references.

This can be either:

- the name of the section
- the section definition itself
- the definition of a sub section
- or the section definition Python class
"""


class MSection(metaclass=MObjectMeta):
    """Base class for all section instances on all meta-info levels.

    All metainfo objects instantiate classes that inherit from ``MSection``. Each
    section or quantity definition is an ``MSection``, each actual (meta-)data carrying
    section is an ``MSection``. This class constitutes the reflection interface of the
    meta-info, since it allows you to manipulate sections (and therefore all meta-info data)
    without having to know the specific sub-class.

    It also carries all the data for each section. All sub-classes only define specific
    sections in terms of possible sub-sections and quantities. The data is managed here.

    The reflection interface for reading and manipulating quantity values consists of
    Python's built-in ``getattr``, ``setattr``, and ``del``, as well as member functions
    :func:`m_add`, and :func:`m_add_values`.

    Sub-sections and parent sections can be read and manipulated with :data:`m_parent`,
    :func:`m_sub_section`, :func:`m_create`.

    .. code-block:: python

        system = run.m_create(System)
        assert system.m_parent == run
        assert run.m_sub_section(System, system.m_parent_index) == system

    Attributes:
        m_def: The section definition that defines this section, its possible
            sub-sections and quantities.
        m_parent: The parent section instance that this section is a sub-section of.
        m_parent_index: For repeatable sections, parents keep a list of sub-sections for
            each section definition. This is the index of this section in the respective
            parent sub-section list.
        m_data: The dictionary that holds all data of this section. It keeps the quantity
            values and sub-sections. It should only be read directly (and never manipulated)
            if you know what you are doing. You should always use the reflection interface
            if possible.
    """

    m_def: 'Section' = None

    def __init__(self, m_def: 'Section' = None, m_parent: 'MSection' = None, **kwargs):
        self.m_def: 'Section' = m_def
        self.m_parent: 'MSection' = m_parent
        self.m_parent_index = -1

        cls = self.__class__
        if self.m_def is None:
            self.m_def = cls.m_def

        if cls.m_def is not None:
            assert self.m_def == cls.m_def, \
                'Section class and section definition must match'

        self.m_annotations: Dict[str, Any] = {}
        rest = {}
        for key, value in kwargs.items():
            if key.startswith('a_'):
                self.m_annotations[key[2:]] = value
            else:
                rest[key] = value

        if is_bootstrapping:
            self.m_data: Dict[str, Any] = {}
            for key, value in rest.items():
                self.m_data[key] = value

        else:
            self.m_data = {}
            self.m_update(**rest)
            # self.m_data = {}
            # for key, value in rest.items():
            #     self.m_data[key] = value

    @classmethod
    def __init_cls__(cls):
        # ensure that the m_def is defined
        m_def = cls.m_def
        if m_def is None:
            m_def = Section()
            setattr(cls, 'm_def', m_def)

        # transfer name and description to m_def
        m_def.name = cls.__name__
        if cls.__doc__ is not None:
            m_def.description = inspect.cleandoc(cls.__doc__).strip()
        m_def.section_cls = cls

        for name, attr in cls.__dict__.items():
            # transfer names and descriptions for properties
            if isinstance(attr, Property):
                attr.name = name
                if attr.description is not None:
                    attr.description = inspect.cleandoc(attr.description).strip()
                    attr.__doc__ = attr.description

                # manual manipulation of m_data due to bootstrapping
                if isinstance(attr, Quantity):
                    properties = m_def.m_data.setdefault('quantities', [])
                elif isinstance(attr, SubSection):
                    properties = m_def.m_data.setdefault('sub_sections', [])
                else:
                    raise NotImplementedError('Unknown property kind.')
                properties.append(attr)
                attr.m_parent = m_def
                attr.m_parent_index = len(properties) - 1

        # add base sections
        for base_cls in cls.__bases__:
            if base_cls != MSection:
                section = getattr(base_cls, 'm_def')
                if section is None:
                    raise TypeError(
                        'Section defining classes must have MSection or a descendant as '
                        'base classes.')

                # manual manipulation of m_data due to bootstrapping
                m_def.m_data.setdefault('base_sections', []).append(section)

        # add section cls' section to the module's package
        module_name = cls.__module__
        pkg = Package.from_module(module_name)
        pkg.m_add_sub_section(cls.m_def)

    def m_type_check(self, definition: 'Quantity', value: Any, check_item: bool = False):
        """ Checks and normalizes the given value according to the quantity type. """

        if value is None and not check_item and definition.default is None:
            # Allow the default None value even if it would violate the type
            return value

        def check_value(value):
            if isinstance(definition.type, Enum):
                if value not in definition.type:
                    raise TypeError('Not one of the enum values.')

            elif isinstance(definition.type, type):
                if not isinstance(value, definition.type):
                    raise TypeError(
                        'Value %s is not of type %s, required by quantity %s.' %
                        (value, definition.type, definition))

            elif isinstance(definition.type, Section):
                if not isinstance(value, MSection) or not value.m_follows(definition.type):
                    raise TypeError(
                        'The section %s is not of section definition %s, required by quantity %s.' %
                        (value, definition.type, definition))

            elif isinstance(definition.type, DataType):
                value = definition.type.type_check(self, value)

            else:
                # TODO
                # raise Exception('Invalid quantity type: %s' % str(definition.type))
                pass

            return value

        shape = None
        try:
            shape = definition.shape
        except KeyError:
            pass

        if shape is None or len(shape) == 0 or check_item:
            value = check_value(value)

        else:
            if type(definition.type) == np.dtype:
                if len(shape) != len(value.shape):
                    raise TypeError('Wrong shape')
            else:
                if len(shape) == 1:
                    if not isinstance(value, list):
                        raise TypeError('Wrong shape')

                    value = [check_value(item) for item in value]

                else:
                    raise NotImplementedError('Checking types is not available for complex shapes.')

        # TODO check dimension

        return value

    def _resolve_sub_section(self, definition: SectionDef) -> 'SubSection':
        """ Resolves and checks the given section definition. """

        if isinstance(definition, type):
            # keep the class for the error message, since definition is overwritten below
            section_cls = definition
            definition = getattr(section_cls, 'm_def', None)
            if definition is None:
                raise TypeError(
                    'The type/class %s is not defining a section, i.e. not derived from '
                    'MSection.' % str(section_cls))

        if isinstance(definition, Section):
            sub_section = self.m_def.all_sub_sections_by_section.get(definition, None)
            if sub_section is None:
                raise KeyError(
                    'The section %s is not a sub section of %s.' %
                    (definition.name, self.m_def.name))

        elif isinstance(definition, str):
            sub_section = self.m_def.all_sub_sections[definition]

        elif isinstance(definition, SubSection):
            sub_section = definition

        else:
            raise TypeError(
                '%s does not refer to a section definition. Either use the section '
                'definition, sub section definition, section class, or name.' %
                str(definition))

        if sub_section is None:
            raise KeyError(
                'The section %s is not a sub section of %s.' %
                (cast(Definition, definition).name, self.m_def.name))

        if sub_section.m_parent is not self.m_def:
            raise KeyError(
                'The section %s is not a sub section of %s.' %
                (cast(Definition, definition).name, self.m_def.name))

        return sub_section

    def m_sub_sections(self, definition: SectionDef) -> List[MSectionBound]:
        """Returns all sub sections for the given section definition

        Args:
            definition: The definition of the section.

        Raises:
            KeyError: If the definition is not for a sub section
        """
        sub_section = self._resolve_sub_section(definition)
        return getattr(self, sub_section.name)

    def m_sub_section(self, definition: SectionDef, parent_index: int = -1) -> MSectionBound:
        """Returns the sub section for the given section definition and possible
           parent_index (for repeatable sections).

        Args:
            definition: The definition of the section.
            parent_index: The index of the desired section. This can be omitted for non
                repeatable sections. If omitted for repeatable sections an exception
                will be raised, if more than one sub-section exists. Likewise, if the given
                index is out of range.
        Raises:
            KeyError: If the definition is not for a sub section
            IndexError: If the given index is wrong, or if an index is given for a non
                repeatable section
        """
        sub_section = self._resolve_sub_section(definition)

        m_data_value = getattr(self, sub_section.name)

        if m_data_value is None:
            if sub_section.repeats:
                m_data_value = []
            else:
                m_data_value = None

        if isinstance(m_data_value, list):
            m_data_values = m_data_value
            if parent_index == -1:
                if len(m_data_values) == 1:
                    return m_data_values[0]
                else:
                    raise IndexError()
            else:
                return m_data_values[parent_index]
        else:
            if parent_index != -1:
                raise IndexError('Not a repeatable sub section.')

            return m_data_value

    def m_add_sub_section(self, sub_section: MSectionBound) -> MSectionBound:
        """Adds the given section instance as a sub section to this section."""

        sub_section_def = self._resolve_sub_section(sub_section.m_def.section_cls)
        sub_section.m_parent = self
        if sub_section_def.repeats:
            values = getattr(self, sub_section_def.name)
            sub_section.m_parent_index = len(values)
            values.append(sub_section)

        else:
            self.m_data[sub_section_def.name] = sub_section
            sub_section.m_parent_index = -1

        return sub_section

    def m_create(self, definition: Type[MSectionBound], **kwargs) -> MSectionBound:
        """Creates a subsection and adds it to this section

        Args:
            definition: The section definition of the subsection. It is either the
                definition itself, or the python class representing the section definition.
            **kwargs: Are used to initialize the subsection.

        Returns:
            The created subsection

        Raises:
            KeyError: If the given section is not a subsection of this section.
        """
        sub_section: 'SubSection' = self._resolve_sub_section(definition)

        section_cls = sub_section.sub_section.section_cls
        section_instance = section_cls(m_def=section_cls.m_def, m_parent=self, **kwargs)

        return cast(MSectionBound, self.m_add_sub_section(section_instance))

    def __resolve_quantity(self, definition: Union[str, 'Quantity']) -> 'Quantity':
        """Resolves and checks the given quantity definition. """
        if isinstance(definition, str):
            quantity = self.m_def.all_quantities[definition]

        else:
            if definition.m_parent != self.m_def:
                raise KeyError('Quantity is not a quantity of this section.')
            quantity = definition

        return quantity

    def m_add(self, definition: Union[str, 'Quantity'], value: Any):
        """Adds the given value to the given quantity."""

        quantity = self.__resolve_quantity(definition)

        value = self.m_type_check(quantity, value, check_item=True)

        m_data_values = self.m_data.setdefault(quantity.name, [])
        m_data_values.append(value)

    def m_add_values(self, definition: Union[str, 'Quantity'], values: Iterable[Any]):
        """Adds the given values to the given quantity."""

        quantity = self.__resolve_quantity(definition)

        values = [self.m_type_check(quantity, value, check_item=True) for value in values]

        m_data_values = self.m_data.setdefault(quantity.name, [])
        for value in values:
            m_data_values.append(value)

    def m_update(self, **kwargs):
        """ Updates all quantities and sub-sections with the given arguments. """
        for name, value in kwargs.items():
            prop = self.m_def.all_properties.get(name, None)
            if prop is None:
                raise KeyError('%s is not an attribute of this section %s' % (name, self))

            if isinstance(prop, SubSection):
                if prop.repeats:
                    if isinstance(value, list):
                        for item in value:
                            self.m_add_sub_section(item)
                    else:
                        raise TypeError('Sub section %s repeats, but no list was given' % prop.name)
                else:
                    # a non repeating sub section is given directly as the value
                    self.m_add_sub_section(value)

            else:
                setattr(self, name, value)

    def m_follows(self, definition: 'Section') -> bool:
        """ Determines if this section's definition is or is derived from the given definition. """
        return self.m_def == definition or definition in self.m_def.all_base_sections

    def m_to_dict(self) -> Dict[str, Any]:
        """Returns the data of this section as a json serializable dictionary. """

        def items() -> Iterable[Tuple[str, Any]]:
            yield 'm_def', self.m_def.name
            if self.m_parent_index != -1:
                yield 'm_parent_index', self.m_parent_index

            for name, sub_section in self.m_def.all_sub_sections.items():
                if name not in self.m_data:
                    continue

                if sub_section.repeats:
                    yield name, [item.m_to_dict() for item in self.m_data[name]]
                else:
                    yield name, self.m_data[name].m_to_dict()

            for name, quantity in self.m_def.all_quantities.items():
                if name in self.m_data:
                    to_json_serializable = str
                    if isinstance(quantity.type, DataType):
                        to_json_serializable = lambda v: quantity.type.to_json_serializable(self, v)

                    elif isinstance(quantity.type, Section):
                        # TODO
                        to_json_serializable = str
                    else:
                        # TODO
                        pass

                    value = getattr(self, name)

                    if hasattr(value, 'tolist'):
                        serializable_value = value.tolist()

                    else:
                        if len(quantity.shape) == 0:
                            serializable_value = to_json_serializable(value)
                        elif len(quantity.shape) == 1:
                            serializable_value = [to_json_serializable(i) for i in value]
                        else:
                            raise NotImplementedError('Higher shapes (%s) not supported: %s' % (quantity.shape, quantity))

                    yield name, serializable_value

        return {key: value for key, value in items()}

    @classmethod
    def m_from_dict(cls: Type[MSectionBound], dct: Dict[str, Any]) -> MSectionBound:
        """ Creates a section from the given data dictionary.

        This is the 'opposite' of :func:`m_to_dict`. It takes a deserialized dict, e.g.
        loaded from JSON, and turns it into a proper section, i.e. instance of the given
        section class.
        """

        section_def = cls.m_def

        # remove m_def and m_parent_index, they set themselves automatically
        assert section_def.name == dct.pop('m_def', None)
        dct.pop('m_parent_index', -1)

        def items():
            for name, sub_section_def in section_def.all_sub_sections.items():
                if name in dct:
                    sub_section_value = dct.pop(name)
                    if sub_section_def.repeats:
                        yield name, [
                            sub_section_def.sub_section.section_cls.m_from_dict(sub_section_dct)
                            for sub_section_dct in sub_section_value]
                    else:
                        yield name, sub_section_def.sub_section.section_cls.m_from_dict(sub_section_value)

            for key, value in dct.items():
                yield key, value

        dct = {key: value for key, value in items()}
        section_instance = cast(MSectionBound, section_def.section_cls())
        section_instance.m_update(**dct)
        return section_instance

709
    def m_to_json(self, **kwargs):
710
        """Returns the data of this section as a json string. """
711
        return json.dumps(self.m_to_dict(), **kwargs)
712

    def m_all_contents(self) -> Iterable[Content]:
        """Returns an iterable over all direct and indirect sub-sections. """
        for content in self.m_contents():
            for sub_content in content[0].m_all_contents():
                yield sub_content

            yield content

    def m_contents(self) -> Iterable[Content]:
        """Returns an iterable over all direct sub-sections. """
        for name, attr in self.m_data.items():
            if isinstance(attr, list):
                for value in attr:
                    if isinstance(value, MSection):
                        yield value, attr, name, self

            elif isinstance(attr, MSection):
                # non-repeating sub-section: the attribute itself is the section
                yield attr, attr, name, self

    def __repr__(self):
        m_section_name = self.m_def.name
        name = ''
        if 'name' in self.m_data:
            name = self.m_data['name']

        return '%s:%s' % (name, m_section_name)
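
# The recursive idea behind m_from_dict can be illustrated outside of the meta-info
# classes. The following is a minimal sketch under stated assumptions: Node, its
# sub_sections table, and the data attribute are hypothetical stand-ins and not part
# of this module. Sub-section entries are turned into objects first; whatever
# remains in the dict is treated as plain quantity values.

```python
from typing import Any, Dict, Tuple


class Node:
    """Hypothetical stand-in for a section class; not part of the metainfo API."""

    # maps a sub-section name to (section class, repeats); an assumption of this sketch
    sub_sections: Dict[str, Tuple[type, bool]] = {}

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}

    @classmethod
    def from_dict(cls, dct: Dict[str, Any]) -> 'Node':
        dct = dict(dct)  # work on a copy, so the caller's dict stays untouched
        name = dct.pop('m_def', None)
        if name != cls.__name__:
            raise ValueError('expected %s, got %s' % (cls.__name__, name))

        node = cls()
        for key, (sub_cls, repeats) in cls.sub_sections.items():
            if key in dct:
                value = dct.pop(key)
                if repeats:
                    node.data[key] = [sub_cls.from_dict(item) for item in value]
                else:
                    node.data[key] = sub_cls.from_dict(value)

        node.data.update(dct)  # the remaining keys are plain values
        return node


class System(Node):
    pass


class Run(Node):
    sub_sections = {'systems': (System, True)}


run = Run.from_dict({
    'm_def': 'Run',
    'systems': [{'m_def': 'System', 'n_atoms': 3}, {'m_def': 'System', 'n_atoms': 5}]})
```

# Unlike m_from_dict above, the sketch copies the input dict instead of popping
# keys from the caller's dict; this is a design choice of the sketch only.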


class MCategory(metaclass=MObjectMeta):

    m_def: 'Category' = None

    @classmethod
    def __init_cls__(cls):
        # ensure that the m_def is defined
        m_def = cls.m_def
        if m_def is None:
            m_def = Category()
            setattr(cls, 'm_def', m_def)

        # transfer name and description to m_def
        m_def.name = cls.__name__
        if cls.__doc__ is not None:
            m_def.description = inspect.cleandoc(cls.__doc__).strip()

        # add the category definition of this class to the module's package
        module_name = cls.__module__
        pkg = Package.from_module(module_name)
        pkg.m_add_sub_section(cls.m_def)


# M3, the definitions that are used to write definitions. These are the section definitions
# for sections Section and Quantity. They define themselves; i.e. the section definition
# for Section is described by that same section definition.
# Due to this circular nature (a chicken-and-egg problem), the classes for sections Section
# and Quantity only contain placeholders for their own section and quantity definitions.
# These placeholders are replaced once the necessary classes are defined. This process
# is referred to as 'bootstrapping'.

_definition_change_counter = 0


class cached_property:
    """ A property that caches its value.

    The cache will be invalidated whenever a new definition is added. Once all definitions
    are loaded, the cache becomes stable and complex derived results become available
    instantaneously.
    """
    def __init__(self, f):
        self.__doc__ = getattr(f, "__doc__")
        self.f = f
        self.change = -1
        self.values: Dict[Any, Any] = {}

    def __get__(self, obj, cls):
        if obj is None:
            return self

        global _definition_change_counter
        if self.change != _definition_change_counter:
            # a definition was added since the last access; drop all cached values
            # and remember the counter, so that the cache can become stable
            self.values = {}
            self.change = _definition_change_counter

        value = self.values.get(obj, None)
        if value is None:
            value = self.f(obj)
            self.values[obj] = value

        return value
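
# The invalidation scheme described in the cached_property docstring can be tried in
# isolation. This is a minimal sketch, not the implementation above: the names
# counter_cached_property, _changes, and Registry are hypothetical. A mutable holder
# stands in for the module-level counter, and the sketch also caches None results,
# which the class above does not.

```python
from typing import Any, Callable, Dict

# Hypothetical change tracker; a mutable holder stands in for a module-level counter.
_changes = {'counter': 0}


class counter_cached_property:
    """Caches per-object results until the change counter moves on."""

    def __init__(self, f: Callable[[Any], Any]) -> None:
        self.f = f
        self.seen = -1
        self.values: Dict[Any, Any] = {}

    def __get__(self, obj, cls=None):
        if obj is None:
            return self

        if self.seen != _changes['counter']:
            # something changed since the last access: drop all cached values
            self.values = {}
            self.seen = _changes['counter']

        if obj not in self.values:
            self.values[obj] = self.f(obj)
        return self.values[obj]


class Registry:
    def __init__(self) -> None:
        self.items = []
        self.computations = 0

    def add(self, item: int) -> None:
        self.items.append(item)
        _changes['counter'] += 1  # every change invalidates all caches

    @counter_cached_property
    def total(self) -> int:
        self.computations += 1
        return sum(self.items)


registry = Registry()
registry.add(1)
registry.add(2)
first, second = registry.total, registry.total  # computed once, then served from cache
registry.add(3)  # invalidates the cache
third = registry.total
```

# The cache is only effective because the descriptor remembers the counter value
# it has last seen; without that step every access would recompute the value.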


class Definition(MSection):

    __all_definitions: Dict[Type[MSection], List[MSection]] = {}

    name: 'Quantity' = None
    description: 'Quantity' = None
    links: 'Quantity' = None
    categories: 'Quantity' = None

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        global _definition_change_counter
        _definition_change_counter += 1

        # register this definition under its own class and all of its base classes;
        # mro() already includes the class itself
        for cls in self.__class__.mro():
            definitions = Definition.__all_definitions.setdefault(cls, [])
            definitions.append(self)

    @classmethod
    def all_definitions(cls: Type[MSectionBound]) -> Iterable[MSectionBound]:
        """ Returns all definitions of this definition class. """
        return cast(Iterable[MSectionBound], Definition.__all_definitions.get(cls, []))

    @cached_property
    def all_categories(self):
        """ All categories of this definition, including the categories of those categories. """
        all_categories = list(self.categories)
        for category in self.categories:  # pylint: disable=not-an-iterable
            for super_category in category.all_categories:
                all_categories.append(super_category)

        return all_categories


class Property(Definition):
    pass


class Quantity(Property):
    """Used to define quantities that store a certain piece of (meta-)data.

    Quantities are the basic building blocks of meta-info data. The Quantity class is
    used to define quantities within sections. A quantity definition
    is a (physics) quantity with name, type, shape, and potentially a unit.

    In Python terms, quantities are descriptors. Descriptors define how to get, set, and
    delete values for an object attribute. Meta-info descriptors ensure that
    the set values fit the defined type and shape.
    """

    type: 'Quantity' = None
    shape: 'Quantity' = None
    unit: 'Quantity' = None
    default: 'Quantity' = None
    synonym_for: 'Quantity' = None

    # TODO derived_from = Quantity(type=Quantity, shape=['0..*'])
    # TODO categories = Quantity(type=Category, shape=['0..*'])
    # TODO converter = Quantity(type=Converter), a class with set of functions for
    #      normalizing, (de-)serializing values.

    # Some quantities of Quantity cannot be read as normal quantities due to bootstrapping.
    # Those can be accessed internally through the following replacement properties that
    # read directly from m_data.
    __name = property(lambda self: self.m_data['name'])
    __synonym_for = property(lambda self: self.m_data.get('synonym_for', None))
    __default = property(lambda self: self.m_data.get('default', None))

    def __get__(self, obj, cls):
        if obj is None:
            # class (def) attribute case
            return self

        # object (instance) attribute case
        if self.__synonym_for is not None:
            return getattr(obj, self.__synonym_for)

        try:
            return obj.m_data[self.__name]
        except KeyError:
            return self.__default

    def __set__(self, obj, value):
        if obj is None:
            # class (def) case
            raise KeyError('Cannot overwrite quantity definition. Only values can be set.')

        # object (instance) case
        if self.__synonym_for is not None:
            return setattr(obj, self.__synonym_for, value)

        if isinstance(self.type, np.dtype):
            # numpy typed quantity: make sure the value is an array of that dtype
            if not isinstance(value, np.ndarray) or value.dtype != self.type:
                value = np.array(value, dtype=self.type)

        elif isinstance(value, np.ndarray):
            # non-numpy typed quantity: store plain Python lists
            value = value.tolist()

        value = obj.m_type_check(self, value)

        obj.m_data[self.__name] = value

    def __delete__(self, obj):
        if obj is None:
            # class (def) case
            raise KeyError('Cannot delete quantity definition. Only values can be deleted.')

        # object (instance) case
        if self.__synonym_for is not None:
            # synonym_for holds the name of the aliased quantity (see __get__ and __set__)
            return delattr(obj, self.__synonym_for)

        del obj.m_data[self.__name]
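
# The descriptor protocol that Quantity relies on can be shown with a much smaller
# class. This is a minimal sketch under stated assumptions: TypedField, Atoms, and
# the data attribute are hypothetical and only mirror the get/set/delete behavior
# with a default value and a simple isinstance check; shapes, units, synonyms, and
# numpy conversion are left out.

```python
from typing import Any, Dict


class TypedField:
    """Hypothetical descriptor; a simplified illustration, not the Quantity class."""

    def __init__(self, type_: type, default: Any = None) -> None:
        self.type_ = type_
        self.default = default
        self.name = ''

    def __set_name__(self, owner: type, name: str) -> None:
        # called by Python when the owning class is created
        self.name = name

    def __get__(self, obj, cls=None):
        if obj is None:
            # class attribute case: return the definition itself
            return self
        # instance attribute case: stored value or default
        return obj.data.get(self.name, self.default)

    def __set__(self, obj, value) -> None:
        if not isinstance(value, self.type_):
            raise TypeError('%s must be of type %s' % (self.name, self.type_.__name__))
        obj.data[self.name] = value

    def __delete__(self, obj) -> None:
        del obj.data[self.name]


class Atoms:
    n_atoms = TypedField(int, default=0)

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}


atoms = Atoms()
initial = atoms.n_atoms       # nothing stored yet, the default is returned
atoms.n_atoms = 3             # goes through TypedField.__set__
stored = atoms.n_atoms
del atoms.n_atoms             # goes through TypedField.__delete__
after_delete = atoms.n_atoms  # back to the default
```

# As with Quantity, accessing the attribute on the class (Atoms.n_atoms) returns
# the definition, while accessing it on an instance returns the value.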


class SubSection(Property):
    """ Assigns a section class as a sub-section to another section class. """

    sub_section: 'Quantity' = None
    repeats: 'Quantity' = None

    def __get__(self, obj, type=None):
        if obj is None:
            # the class attribute case
            return self

        else:
            # the object attribute case
            m_data_value = obj.m_data.get(self.name, None)
            if m_data_value is None:
                if self.repeats:
                    m_data_value = []
                    obj.m_data[self.name] = m_data_value

            return m_data_value

    def __set__(self, obj, value):
        raise NotImplementedError('Sub sections cannot be set directly. Use m_create.')

    def __delete__(self, obj):
        raise NotImplementedError('Sub sections cannot be deleted directly.')


class Section(Definition):
    """Used to define sections that organize meta-info data into containment hierarchies.

    Section definitions determine what quantities and sub-sections can appear in a section
    instance.

    In Python terms, sections are classes. Sub-sections and quantities are attributes of
    the respective section instances. For each section class there is a corresponding
    :class:`Section` instance that describes this class as a section. This instance
    is referred to as 'section definition' in contrast to the Python class that we call
    'section class'.
    """

    section_cls: Type[MSection] = None
    """ The section class that corresponds to this section definition. """

    quantities: 'SubSection' = None
    sub_sections: 'SubSection' = None

    base_sections: 'Quantity' = None
    # TODO extends = Quantity(type=bool), denotes this section as a container for
    #      new quantities that belong to the base-class section definitions
    @cached_property
    def all_base_sections(self) -> Set['Section']:
        all_base_sections: Set['Section'] = set()
        for base_section in self.base_sections:  # pylint: disable=not-an-iterable
            for base_base_section in base_section.all_base_sections:
                all_base_sections.add(base_base_section)

            all_base_sections.add(base_section)
        return all_base_sections

    @cached_property
    def all_properties(self) -> Dict[str, Union['SubSection', Quantity]]:
        """ All attribute (sub section and quantity) definitions. """

        properties: Dict[str, Union[SubSection, Quantity]] = dict(**self.all_quantities)
        properties.update(**self.all_sub_sections)
        return properties

    @cached_property
    def all_quantities(self) -> Dict[str, Quantity]:
        """ All quantity definitions, including those of base sections, by name. """

        all_quantities: Dict[str, Quantity] = {}
        for section in itertools.chain(self.all_base_sections, [self]):
            for quantity in section.m_data.get('quantities', []):
                all_quantities[quantity.name] = quantity

        return all_quantities

    @cached_property
    def all_sub_sections(self) -> Dict[str, 'SubSection']:
        """ All sub section definitions for this section definition by name. """

        return {
            sub_section.name: sub_section
            for sub_section in self.m_data.get('sub_sections', [])}

    @cached_property
    def all_sub_sections_by_section(self) -> Dict['Section', 'SubSection']:
        """ All sub section definitions for this section definition by their section definition. """
        return {
            sub_section.sub_section: sub_section
            for sub_section in self.m_data.get('sub_sections', [])}


class Package(Definition):

    section_definitions: 'SubSection'
    category_definitions: 'SubSection'

    @staticmethod
    def from_module(module_name: str):
        module = sys.modules[module_name]

        pkg: 'Package' = getattr(module, 'm_package', None)
        if pkg is None:
            pkg = Package()
            setattr(module, 'm_package', pkg)

        pkg.name = module_name
        if pkg.description is None and module.__doc__ is not None:
            pkg.description = inspect.cleandoc(module.__doc__).strip()

        return pkg


class Category(Definition):
    """Can be used to define categories for definitions.

    Each definition, including categories themselves, can belong to a set of categories.
    Categories therefore form a hierarchy of concepts that definitions can belong to, i.e.
    they form an `is a` relationship.

    In the old meta-info this was known as `abstract types`.
    """

    @cached_property
    def definitions(self) -> Iterable[Definition]:
        """ All definitions that are directly or indirectly in this category. """
        return list([
            definition for definition in Definition.all_definitions()
            if self in definition.all_categories])
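
# The `is a` hierarchy formed by categories boils down to a transitive closure over
# parent links. This is a minimal sketch with hypothetical names (Tag, parents,
# all_parents, members); it mirrors the idea of all_categories and of
# Category.definitions without using any of the classes in this module.

```python
from typing import List, Optional


class Tag:
    """Hypothetical stand-in for a definition that can belong to other tags."""

    def __init__(self, name: str, parents: Optional[List['Tag']] = None) -> None:
        self.name = name
        self.parents: List['Tag'] = parents or []

    def all_parents(self) -> List['Tag']:
        # direct parents first, followed by the parents of each parent
        result = list(self.parents)
        for parent in self.parents:
            result.extend(parent.all_parents())
        return result


def members(tag: Tag, candidates: List[Tag]) -> List[Tag]:
    """All candidates that directly or indirectly belong to the given tag."""
    return [candidate for candidate in candidates if tag in candidate.all_parents()]


property_tag = Tag('property')
energy = Tag('energy', [property_tag])
total_energy = Tag('total_energy', [energy])
unrelated = Tag('unrelated')

in_property = members(property_tag, [property_tag, energy, total_energy, unrelated])
```

# The sketch assumes an acyclic hierarchy; a cycle in the parent links would lead
# to infinite recursion.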


Section.m_def = Section(name='Section')
Section.m_def.m_def = Section.m_def
Section.m_def.section_cls = Section

Definition.m_def = Section(name='Definition')
Property.m_def = Section(name='Property')
Quantity.m_def = Section(name='Quantity')
SubSection.m_def = Section(name='SubSection')
Category.m_def = Section(name='Category')
Package.m_def = Section(name='Package')

Definition.name = Quantity(
    type=str, name='name', description='''
    The name of the definition. Must be unique within a section.
    ''')
Definition.description = Quantity(
    type=str, name='description', description='''
    An optional human readable description.
    ''')
Definition.links = Quantity(
    type=str, shape=['0..*'], name='links', description='''
    A list of URLs to external resources that describe this definition.
    ''')
Definition.categories = Quantity(