Commit 86a2cd73 authored by Lauri Himanen's avatar Lauri Himanen

First merge attempt with v0.8.0. Encyclopedia tests pass.

parents 1302ea85 47b112c3
Subproject commit d60013e1597493972237210a36549bfcf0a2706f
Subproject commit bc0e599c58252f791c3cb809b3c8cfbf652d3b73
Subproject commit 0feea3bb5aeb2847cde41c4d642c6b5af8e38cd5
Subproject commit 9965ca2f9c2d7815be3b1f9d8ad3e97862d48ef4
Subproject commit c0e936246972c0fb98b9b9c96a67167f1725156f
Subproject commit 3054fcd44013c78b55a784328b99c16ffd32710c
Subproject commit f2b7f39ca62438d25a21cdbaf267269fbc4f62ac
Subproject commit 901bd3fcf50ce54e9b947c11476ee1a8371b3bae
Subproject commit 5f07d80f9d1838b3f6b95e39266221002061e0d1
Subproject commit cc782ec0acec9cbbc38c0c717e2439ed2aaae92d
Subproject commit 628e4b333004ffa7636b70a99b13efea606119e1
Subproject commit 16f7a7f6909dbe16908d1be2e1fa03d3bddd17b5
Subproject commit b3159851a99a4ff6a5ee13dbe204cc0a72aff81e
Subproject commit 75a663a7e1ba8ff13c49bcdc62bca8bdb2f2d108
# Archive API tutorial
This contains tutorials for the new archive query functionality.
It uses the new metainfo definition for the archive data. In addition, the archive data
can now be filtered through the new API. The archives are now also stored in a new binary
format, msgpack, which in principle makes querying faster.
## Archive API
First, we look at how to use the new archive query API. Here we use the python requests library:
```python
import requests

data = {
    'atoms': 'Fe', 'scroll': True, 'per_page': 10,
    'results': [{'section_run': {'section_single_configuration_calculation[-1]': {'energy_total': None}}}]}
# send the query as JSON so that the nested schema survives transport
response ='', json=data)
data = response.json()
results = data.get('results')
```
To query the archive, we use the POST method, where we provide the usual query parameters
in a dictionary. In addition, we provide a schema for the archive data à la GraphQL, i.e.
a hierarchical dictionary with null values for each property we would like to query.
In the example, we would like to return only the total energy of the last single-configuration
calculation. It is important to point out that this schema uses the key 'results' and is a list,
since it will be filled with a list of archive data matching this schema.
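The effect of such a null-valued schema can be illustrated with a small stand-alone sketch. The function below is hypothetical and not part of the nomad API, and it ignores server-side index selectors such as `[-1]`; it only shows the filtering idea:

```python
# Hypothetical illustration of how a GraphQL-style schema with null
# values selects properties from a nested archive dictionary.
def filter_by_schema(data, schema):
    """Return only the parts of `data` that appear in `schema`.

    A value of None in the schema means "return this property as-is";
    a nested dict means "recurse into this section".
    """
    result = {}
    for key, sub_schema in schema.items():
        if key not in data:
            continue
        if sub_schema is None:
            result[key] = data[key]
        else:
            result[key] = filter_by_schema(data[key], sub_schema)
    return result

archive = {
    'section_run': {
        'program_name': 'VASP',
        'energy_total': -1.5,
    }
}
schema = {'section_run': {'energy_total': None}}
print(filter_by_schema(archive, schema))  # {'section_run': {'energy_total': -1.5}}
```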
## Archive and the new metainfo
A wrapper for the archive query API is implemented in `ArchiveQuery`:
```python
from nomad.archive_library.filedb import ArchiveQuery

q = ArchiveQuery(
    atoms='Fe', scroll=True, per_page=10, archive_data={
        'section_run': {'section_single_configuration_calculation[-1]': {'energy_total': None}}})
metainfo = q.query()
for calc in metainfo:
    # each result is a metainfo object for one calculation
    print(calc)
```
Similarly, we provide query parameters, as well as the schema, which in this case is given
as 'archive_data'. When we invoke `query`, recursive API requests are made until all the data
matching our parameters are downloaded. The results are then expressed in the new metainfo
scheme, which offers auto-completion, among other features.
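The recursive request pattern behind `query` can be sketched as follows. This is a hypothetical stand-in, not the actual nomad implementation: a `fetch_page` callable replaces real HTTP requests, and pages are chained via a scroll id:

```python
# Hypothetical sketch of scroll-based pagination: keep requesting
# pages until the server reports no further scroll id or no results.
def fetch_all(fetch_page):
    results, scroll_id = [], None
    while True:
        page = fetch_page(scroll_id)
        results.extend(page['results'])
        scroll_id = page.get('scroll_id')
        if not scroll_id or not page['results']:
            break
    return results

# Fake server returning two pages:
pages = {None: {'results': [1, 2], 'scroll_id': 'a'},
         'a': {'results': [3], 'scroll_id': None}}
print(fetch_all(lambda sid: pages[sid]))  # [1, 2, 3]
```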
## Msgpack container
The archive data are now stored in a binary format called msgpack. To create a msgpack database
from the archive data and query it, one uses ArchiveFileDB.
```python
from nomad.archive_library.filedb import ArchiveFileDB

db = ArchiveFileDB('archive.msg', mode='w', entry_toc_depth=2)
db.add_data({'calc1': {'secA': {'subsecA': {'propA': 1.0}}, 'secB': {'propB': 'X'}}})
db.add_data({'calc2': {'secA': {'subsecA': {'propA': 2.0}}, 'secB': {'propB': 'Y'}}})

db = ArchiveFileDB('archive.msg')
db.query({'calc1': {'secA': None}})
```
In the example, we first create a database in 'archive.msg'; data that are added
are fragmented down to subsections. We then reload it for reading and query all entries
under 'secA' of 'calc1'.
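The fragmentation controlled by `entry_toc_depth` can be illustrated with a small stand-alone sketch. The function below is hypothetical, not the actual `ArchiveFileDB` implementation; it only shows how an entry could be split into independently addressable fragments at a given depth:

```python
# Hypothetical sketch of fragmenting an entry down to a given depth,
# so that sub-sections can be located and decoded individually.
def fragment(data, depth, prefix=''):
    """Flatten a nested dict into {path: subtree} fragments at `depth`."""
    fragments = {}
    if depth == 0 or not isinstance(data, dict):
        fragments[prefix] = data
        return fragments
    for key, value in data.items():
        path = f'{prefix}/{key}' if prefix else key
        fragments.update(fragment(value, depth - 1, path))
    return fragments

entry = {'calc1': {'secA': {'subsecA': {'propA': 1.0}}, 'secB': {'propB': 'X'}}}
print(fragment(entry, 2))
# {'calc1/secA': {'subsecA': {'propA': 1.0}}, 'calc1/secB': {'propB': 'X'}}
```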
.. mdinclude:: ../ops/docker-compose/infrastructure/
Those sets of metadata along with the actual raw and archive data are often transformed,
passed, stored, etc. by the various nomad modules.
.. figure:: datamodel_metadataflow.png
:alt: nomad's metadata flow
### Implementation
The different entities often have multiple implementations for different storage systems.
For example, aspects of calculations are stored in files (raw files, calc metadata, archive data),
must run to make use of nomad. We use *docker* and *docker-compose* to create a
unified environment that is easy to build and to run.
You can use *docker* to run all necessary 3rd-party components and run all nomad
services manually from your python environment. You can also run nomad in docker,
but using Python is often preferred during development, since it allows
you to change things, debug, and re-run things quickly. The latter brings you
closer to the environment that will be used to run nomad in production. For
development we recommend skipping the next step.
### Docker images for nomad
Nomad currently comprises two services:
the *worker* (does the actual processing) and the *app*. These services can be
run from one image that has the nomad python code and all dependencies installed. This
is covered by the `Dockerfile` in the root directory
of the nomad sources. The gui is also served from the *app*, which entails the react-js frontend code.
Before building the image, make sure to execute

```
cd gui
cd ..
```
This allows the app to present some information about the current git revision without
having to copy the git itself to the docker build context.
The images are built via *docker-compose* and don't have to be created manually.
### Run necessary 3rd-party services with docker-compose
You can run all containers with:

```
cd ops/docker-compose/infrastructure
docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d mongo elastic rabbitmq
```
```python
from nomad import datamodel

print(datamodel.EntryMetadata(domain='DFT', calc_id='test').__class__.__name__)
print(datamodel.EntryMetadata(domain='EMS', calc_id='test').__class__.__name__)
```
```
# python/backend
echo log, ref, version, commit = \"$(git log -1 --oneline)\", \"$(git describe --all)\", \"$(git describe --tags)\", \"$(git rev-parse --verify --short HEAD)\" > nomad/

# gui
commit=`git rev-parse --short --verify HEAD`
sed -i -e "s/nomad-gui-commit-placeholder/$commit/g" gui/package.json
rm -f gui/package.json-e
```