Last edited by Fawzi Mohamed Mar 21, 2016

nomad meta info

NomadMetaInfo is the main mechanism used to describe the information stored in NomadLab. An interactive visualization of the NomadMetaInfo is available online.

The thinking behind meta data

The NOMAD Archive contains the result of the parsing and normalisation of the data contained in the Repository. Ideally, the Archive contains all the (parsable) data contained in the Repository, but in an easy-to-access format, so that the Database for the Encyclopedia can be constructed starting from the data structure of the Archive. The first step is the definition of a metadata structure for the unambiguous archiving of the data.

When classifying data, identifying the type of a value, and describing it, is of crucial importance. Metadata is information about the data, so its meaning depends on what one considers to be data. Here, by data we mean the properties, values, etc. contained in the files stored in the NOMAD Repository. Metadata is then the information used to identify, describe, and classify these values. For example, the name of the program used to perform a given calculation (e.g., “VASP”) is considered data, while the related metadata is the string “program used to perform the calculation”. If one thinks of storing data as ‘key’-‘value’ pairs (as in a dictionary), the ‘key’ is the metadata. To avoid conflicts (the same metadata name defined twice with different meanings), all names would need to be registered centrally. Since one would typically like to avoid very long strings as metadata names, the “good short” names quickly run out.

These problems can be solved by introducing hierarchically structured metadata. Adding a longer description to the short name makes the meaning clearer. Note that the name of a metadata is the label used to refer to it directly when writing a parser, so long strings would be inconvenient in the code. The description, on the other hand, can be looked up via the metadata name when needed. The metadata itself can be described by introducing multiple inheritance.

In this context, inheritance among metadata means that one metadata, called the child, has the same features as another metadata, called the parent. The child may have more features than the parent, but it inherits all of the parent’s features. Multiple refers to the fact that one child may have several parents. Parents can in turn have (super)parents, and children can have children of their own. This creates a hierarchical structure.

In our case, for instance, the metadata energy_total_scf_converged (containing final, converged, total – electronic + ionic – energy) inherits from both energy_total_potential (an abstract-type metadata – see below for metadata types in detail – that contains final – i.e., scf converged or result of a perturbative method – energy quantities) and section_single_configuration_calculation (a section-type metadata that groups all metadata related to a calculation performed on a given configuration and with a given method, see below for more details on sections).

To allow the reuse of short and descriptive names, a string called unique identifier (gid) is assigned to each metadata. The gid depends not just on the name, but also on the description and on the identifiers of all its dependencies. An identifier clash is therefore very unlikely, as it would mean that two metadata with the same name, description, and parents mean two different things.
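The idea of a gid that depends on the name, the description, and the gids of the parents can be sketched as follows. This is only an illustration: the text does not specify the actual checksum algorithm, so the use of SHA-256 over a canonical JSON string, the function name compute_gid, and the shortened descriptions are all assumptions.

```python
import hashlib
import json


def compute_gid(meta, gids_by_name):
    """Hypothetical gid: a hash of name, description, and the parents' gids."""
    payload = {
        "name": meta["name"],
        "description": meta["description"],
        # Parent gids are sorted so the result does not depend on list order.
        "superGids": sorted(gids_by_name[p] for p in meta.get("superNames", [])),
    }
    # Canonical serialization: sorted keys, no insignificant whitespace.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Shortened, illustrative descriptions (not the official ones).
parent = {"name": "energy_total_potential", "description": "Final energy quantities"}
gids = {parent["name"]: compute_gid(parent, {})}

child = {
    "name": "energy_total_scf_converged",
    "description": "Converged total energy",
    "superNames": ["energy_total_potential"],
}
# Same name and parents, different description: a different definition.
other = dict(child, description="A different meaning")

gid_child = compute_gid(child, gids)
gid_other = compute_gid(other, gids)
```

With such a scheme, two definitions that share a name but differ in description or ancestry receive different identifiers, while recomputing the gid of an unchanged definition always gives the same result.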

Conceptual model for calculations

Most data values do not make sense in isolation from their context, as they are connected to each other. For example, an energy_total_scf_converged value is not independent of the system (atomic configuration, etc.) it refers to. Thus we have to define which values are grouped together. This is done by using metadata objects of type section. The value associated with a section metadata is a list of groups of keys and values that are connected. For example, a typical calculation has the following sections:

  • section_run: represents a single "run" of the program,
  • section_method: contains the information defining the theory level, and convergence parameters,
  • section_system_description: each instance of this section corresponds to a different, specific system configuration,
  • section_single_configuration_calculation: contains the results for a system as defined in a single section_method and a single section_system_description,
  • section_scf_iteration: each entry is a single self-consistency iteration.

A metadata x_index (of type integer) and a metadata x_identifier (of type string) are implicitly defined for each section x. Sections can be nested, meaning that each outer one can contain one or more inner sections. By using sections, it is possible to put less information in each single metadata. For example, the total energy can be identified simply as energy_total within a section_single_configuration_calculation, and the actual xc functional and computation parameters can be found in its associated section_method.
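As a rough illustration of how sections group connected values, the structure above could be pictured as nested Python data. This is a sketch only: the exact nesting, the method_ref and system_ref keys, and all values are hypothetical placeholders, not the Archive's storage format.

```python
# Placeholder values only; the nesting and the *_ref keys are assumptions.
section_run = {
    "section_method": [
        {"section_method_index": 0, "section_method_identifier": "m0"},
    ],
    "section_system_description": [
        {"section_system_description_index": 0},
    ],
    "section_single_configuration_calculation": [
        {
            "section_single_configuration_calculation_index": 0,
            "method_ref": 0,  # hypothetical link to a section_method_index
            "system_ref": 0,  # hypothetical link to a section_system_description_index
            "energy_total_scf_converged": -1.0,  # placeholder value, in J
            "section_scf_iteration": [
                {"section_scf_iteration_index": 0},
                {"section_scf_iteration_index": 1},
            ],
        },
    ],
}


def method_of(run, calculation):
    """Return the section_method entry that a calculation refers to."""
    for method in run["section_method"]:
        if method["section_method_index"] == calculation["method_ref"]:
            return method
    raise KeyError("no matching section_method")


calc = section_run["section_single_configuration_calculation"][0]
method = method_of(section_run, calc)
```

The energy itself carries no information about the functional or the convergence parameters; those are reached through the section_method associated with the calculation.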

Practical implementation

NomadMetaInfo, i.e., the practical implementation of the NOMAD metadata, uses a dictionary in JSON format to describe a metadata:

{
  "name": "energy_total_scf_converged",
  "description": "A total (final, converged) energy calculated with XC_method_scf",
  "kindStr": "type_document_content",
  "dtypeStr": "f",
  "repeats": false,
  "shape": [],
  "superNames": [
    "energy_total_potential"
  ],
  "units": "J"
}

and stores a list of such dictionaries in the metadata key of a dictionary within files ending with ".nomadmetainfo.json".
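A minimal sketch of reading such a file and indexing the definitions by name might look like the following. The top-level key name "metadata" is taken from the sentence above and should be checked against real files; the helper index_by_name is hypothetical. The JSON is given inline here so that the example is self-contained.

```python
import json

# Inline stand-in for the content of a ".nomadmetainfo.json" file.
text = """
{
  "metadata": [
    {
      "name": "energy_total_scf_converged",
      "description": "A total (final, converged) energy calculated with XC_method_scf",
      "kindStr": "type_document_content",
      "dtypeStr": "f",
      "repeats": false,
      "shape": [],
      "superNames": ["energy_total_potential"],
      "units": "J"
    }
  ]
}
"""


def index_by_name(document, key="metadata"):
    """Map each metadata name to its definition; reject duplicate names."""
    index = {}
    for meta in document[key]:
        name = meta["name"]
        if name in index:
            raise ValueError(f"duplicate definition for {name!r}")
        index[name] = meta
    return index


metas = index_by_name(json.loads(text))
```

Rejecting duplicate names reflects the requirement that a single document uses one definition per name.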

The git repository nomad-meta-info contains the current version of the metadata, along with several tools that help with handling NOMAD metadata, verifying correctness, managing versions, and so on. The goal of the NOMAD metadata is not just to describe the data of a calculation and the various properties calculated in a run, but also derived quantities that are not necessarily parsed, such as basis-set-superposition-error (BSSE) corrected energies. As said, it gives a unique way to identify a given property and allows one to easily treat similar properties in the same way.

The metadata type is declared in kindStr and can be:

  • type_document_content: has an associated value, but cannot be inherited from further (the default type),
  • type_section: describes a section that groups related quantities,
  • type_abstract_document_content: used only to classify other types.
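The three kinds can be handled in code roughly as follows. This is a sketch under the assumption that "the default type" means a definition without a kindStr key is treated as type_document_content; the helper names kind_of and group_by_kind are hypothetical.

```python
from collections import defaultdict

DEFAULT_KIND = "type_document_content"
KNOWN_KINDS = {
    "type_document_content",
    "type_section",
    "type_abstract_document_content",
}


def kind_of(meta):
    """Return the kind of a metadata, falling back to the default kind."""
    kind = meta.get("kindStr", DEFAULT_KIND)
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown kindStr: {kind!r}")
    return kind


def group_by_kind(metas):
    """Group metadata names by their kind."""
    groups = defaultdict(list)
    for meta in metas:
        groups[kind_of(meta)].append(meta["name"])
    return dict(groups)


metas = [
    {"name": "energy_total_scf_converged"},  # kindStr omitted: default kind assumed
    {"name": "section_run", "kindStr": "type_section"},
    {"name": "energy_total_potential", "kindStr": "type_abstract_document_content"},
]
groups = group_by_kind(metas)
```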

Web Interface

A web REST interface offers an HTML description of single metadata at

https://nomad-dev.rz-berlin.mpg.de/nmi/v/common/n/<metadata name>/info.html

e.g., energy_total_scf_converged,

as well as JSON values of a complete version and of single metadata at

https://nomad-dev.rz-berlin.mpg.de/nmi/v/common/n/<metadata name>/info.json

e.g., energy_total_scf_converged.
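The two URL patterns above can be assembled programmatically. The helper info_url below is hypothetical and only builds the address; it does not contact the server, whose availability is not guaranteed.

```python
from urllib.parse import quote

# Base address copied from the URL patterns given in the text.
BASE_URL = "https://nomad-dev.rz-berlin.mpg.de/nmi/v/common/n"


def info_url(name, fmt="json"):
    """Build the info URL of a metadata in HTML or JSON form."""
    if fmt not in ("html", "json"):
        raise ValueError(f"unsupported format: {fmt!r}")
    # Percent-encode the name so that it is safe as a single path segment.
    return f"{BASE_URL}/{quote(name, safe='')}/info.{fmt}"


url = info_url("energy_total_scf_converged")
html_url = info_url("energy_total_scf_converged", fmt="html")
```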

Extensibility

NomadMetaInfo are defined so that anybody (in principle; in practice, at the moment, only the NOMAD Database developers) can extend them and introduce new types without consulting a central authority, and clashes are practically impossible. This works because, while it is entirely possible to use the same name in different ways, internally NomadMetaInfo are always identified by their gid, a checksum that depends on the whole definition of the metadata and on the gids of all its dependencies. Thus, if a person defines a different "energy_total_scf_converged" metadata (different in the description string, the inheritance, or some other property), it will have a different gid. A single document or piece of information has to use a unique definition for each name, but different documents may use different ones without problems. This is very useful for new or experimental properties, which can be stored and used before being standardized.

Standard NomadMetaInfo

Still, one should strive to register the type used and to use a "standard" version of the metadata, so that one can search across all documents for, e.g., “energy_total_scf_converged” using a unique key that has a clear meaning. The goal of the repository at https://gitlab.rzg.mpg.de/nomad-lab/nomad-meta-info is exactly to define this "standard" version of the NomadMetaInfo. The definitions are stored in the "meta_info/nomad_meta_info" directory. The python-common repository also contains scripts in common/python/nomadscripts that help ensure that the definitions do not contain errors and that generate overrides.

Concrete MetaInfo

There is a linked visualization of the latest version, a more interactive version, and a (large!) SVG plot of the inheritance of the common meta infos.

Special info on some concrete meta info is in the following pages:

  • atomic-multipole-kind
  • basis-set-atom-centered-short-name
  • basis-set-atom-centered-unique-name
  • basis-set-cell-associated-kind
  • basis-set-cell-associated-name
  • basis-set-kind
  • basis-set-name
  • calculation-method-current
  • calculation-to-calculation-external-url
  • calculation-to-calculation-kind
  • constraint-kind
  • eigenvalues-kind
  • energy-comparable
  • energy-current
  • ensemble-type
  • frame-sequence-external-url
  • interaction-kind
  • m-kind
  • method-to-method-external-url
  • method-to-method-kind
  • relativity-method
  • sampling-method
  • self-interaction-correction-method
  • smearing-kind
  • topology-force-field-name
  • van-der-Waals-method
  • XC-functional

Overrides

The standard version of the metadata can change, and a list of how to map old versions to new ones (if the new metadata is for all purposes equivalent to the old one) can be specified with override files.

These files describe the new version of a NomadMetaInfo by listing the old gid and the new gid, and can thus introduce versioning for NomadMetaInfo. The name or other keys can be given, but they are only informative and can be omitted.
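How a list of old gid and new gid pairs can provide versioning is sketched below. The layout of the override entries is not given in the text, so the key names oldGid and newGid, the placeholder gid strings, and the function resolve_gid are all assumptions made for illustration.

```python
def resolve_gid(gid, overrides):
    """Follow old-to-new gid overrides until the most recent gid is reached."""
    # Hypothetical entry layout: {"oldGid": ..., "newGid": ...}.
    mapping = {entry["oldGid"]: entry["newGid"] for entry in overrides}
    seen = set()
    while gid in mapping:
        if gid in seen:
            raise ValueError("cyclic overrides")
        seen.add(gid)
        gid = mapping[gid]
    return gid


# Placeholder gids; real gids are checksums.
overrides = [
    {"name": "energy_total_scf_converged", "oldGid": "gid-v1", "newGid": "gid-v2"},
    {"oldGid": "gid-v2", "newGid": "gid-v3"},  # name omitted: it is only informative
]
latest = resolve_gid("gid-v1", overrides)
```

A gid that appears in no override is returned unchanged, so data written against the current standard needs no special handling.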

File Naming convention

Normally overrides are given between two tagged versions, or between the last checked-in version and the current state. So, override files are by default named

<oldVersion>_<newVersion>_YYYY-MM-DD.nomadmetainfo_overrides.json

where oldVersion can be the first 10 characters of the git sha, a tag name, or even empty; the same holds for newVersion. YYYY-MM-DD is the current date, and if required a suffix "_n", with a suitable number n that does not clash with existing files, can be added.

The extension .nomadmetainfo_overrides.json is mandatory.
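The naming convention above can be written down as a small helper. The function names and the example versions (a made-up sha and the tag v1.0) are hypothetical; placing the optional "_n" counter directly after the date is an assumption.

```python
import datetime

SUFFIX = ".nomadmetainfo_overrides.json"


def override_filename(old_version, new_version, date, n=None):
    """Build an override file name from two versions, a date, and a counter."""
    counter = "" if n is None else f"_{n}"  # assumed position of the "_n" part
    return f"{old_version}_{new_version}_{date.isoformat()}{counter}{SUFFIX}"


def has_override_extension(filename):
    """Check the mandatory extension of an override file."""
    return filename.endswith(SUFFIX)


sha = "0123456789abcdef0123456789abcdef01234567"  # made-up git sha
name = override_filename(sha[:10], "v1.0", datetime.date(2016, 3, 21))
name_2 = override_filename(sha[:10], "v1.0", datetime.date(2016, 3, 21), n=2)
```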

Automatic Generation

Normally you can generate these files automatically with scripts/nomadscripts/calculate_meta_info_overrides.py. The script works if the name of a KindInfo is the same but its gid has changed.

Complex Cases

In cases in which you have renamed a NomadMetaInfo, or in which there is a NomadMetaInfo outside the standard that you want to replace with the standard one, you have to create (or complete) the override file by hand. In these cases the --verbose flag can be useful.

It is also possible to use

scripts/nomadscripts/normalize_local_kinds.py --add-gid

to update each KindInfo with its gid, which you can then use to create manual override files.

Please do not check the generated .nomadmetainfo.json files with gid into the repository.
