Commit d548c92e authored by Markus Scheidgen

updated developer documentation

parent 495c0d33
Pipeline #88466 canceled with stages
in 10 minutes and 22 seconds
...@@ -38,7 +38,7 @@ Let's say you want to see the repository metadata (i.e. the information that you ...@@ -38,7 +38,7 @@ Let's say you want to see the repository metadata (i.e. the information that you
our gui) for entries that fit search criteria, like compounds having atoms *Si* and *O* in our gui) for entries that fit search criteria, like compounds having atoms *Si* and *O* in
it: it:
``` ```sh
curl -X GET "http://nomad-lab.eu/prod/rae/api/repo/?atoms=Si&atoms=O" curl -X GET "http://nomad-lab.eu/prod/rae/api/repo/?atoms=Si&atoms=O"
``` ```
...@@ -46,7 +46,7 @@ Here we used curl to send an HTTP GET request to return the resource located by ...@@ -46,7 +46,7 @@ Here we used curl to send an HTTP GET request to return the resource located by
In practice you can omit the `-X GET` (which is the default) and you might want to format In practice you can omit the `-X GET` (which is the default) and you might want to format
the output: the output:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/repo/?atoms=Si&atoms=O" | python -m json.tool curl "http://nomad-lab.eu/prod/rae/api/repo/?atoms=Si&atoms=O" | python -m json.tool
``` ```
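The repeated `atoms` parameter in the query above can also be generated programmatically. This is a minimal sketch using only the Python standard library; the helper name `repo_query_url` is hypothetical and not part of NOMAD, and the base URL is simply copied from the curl examples.

```python
from urllib.parse import urlencode

# Base URL copied from the curl examples; adjust it for other deployments.
API_BASE = "http://nomad-lab.eu/prod/rae/api"


def repo_query_url(params):
    """Build a repo search URL (hypothetical helper, not part of NOMAD).

    List values are expanded into repeated parameters, e.g. atoms=Si&atoms=O.
    """
    return f"{API_BASE}/repo/?{urlencode(params, doseq=True)}"


url = repo_query_url({"atoms": ["Si", "O"]})
print(url)
```

The resulting string can be passed to curl or to any HTTP client.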
...@@ -68,21 +68,21 @@ Similar functionality is offered to download archive or raw data. Let's say you ...@@ -68,21 +68,21 @@ Similar functionality is offered to download archive or raw data. Let's say you
identified an entry (given via a `upload_id`/`calc_id`, see the query output), and identified an entry (given via a `upload_id`/`calc_id`, see the query output), and
you want to download it: you want to download it:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/raw/calc/JvdvikbhQp673R4ucwQgiA/k-ckeQ73sflE6GDA80L132VCWp1z/*" -o download.zip curl "http://nomad-lab.eu/prod/rae/api/raw/calc/JvdvikbhQp673R4ucwQgiA/k-ckeQ73sflE6GDA80L132VCWp1z/*" -o download.zip
``` ```
With `*` you request all the files under an entry or path. With `*` you request all the files under an entry or path.
If you need a specific file (that you already know) of that calculation: If you need a specific file (that you already know) of that calculation:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/raw/calc/JvdvikbhQp673R4ucwQgiA/k-ckeQ73sflE6GDA80L132VCWp1z/INFO.OUT" curl "http://nomad-lab.eu/prod/rae/api/raw/calc/JvdvikbhQp673R4ucwQgiA/k-ckeQ73sflE6GDA80L132VCWp1z/INFO.OUT"
``` ```
You can also download a specific file from the upload (given a `upload_id`), if you know You can also download a specific file from the upload (given a `upload_id`), if you know
the path of that file: the path of that file:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/raw/JvdvikbhQp673R4ucwQgiA/exciting_basis_set_error_study/monomers_expanded_k8_rgkmax_080_PBE/72_Hf/INFO.OUT" curl "http://nomad-lab.eu/prod/rae/api/raw/JvdvikbhQp673R4ucwQgiA/exciting_basis_set_error_study/monomers_expanded_k8_rgkmax_080_PBE/72_Hf/INFO.OUT"
``` ```
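The raw-file URLs above follow two patterns: `raw/calc/<upload_id>/<calc_id>/<path>` and `raw/<upload_id>/<path>`. As a rough sketch, they could be assembled like this; the helper names `raw_calc_url` and `raw_upload_url` are hypothetical and only mirror the URL layout shown in the curl examples.

```python
from urllib.parse import quote

# Base URL copied from the curl examples; adjust it for other deployments.
API_BASE = "http://nomad-lab.eu/prod/rae/api"


def raw_calc_url(upload_id, calc_id, path="*"):
    """URL for raw files of one entry (hypothetical helper).

    The default path "*" requests all files of the entry.
    """
    parts = [upload_id, calc_id, path]
    # Keep "/" and "*" unescaped, percent-encode everything else unsafe.
    return f"{API_BASE}/raw/calc/" + "/".join(quote(p, safe="/*") for p in parts)


def raw_upload_url(upload_id, path):
    """URL for one file of an upload, given its path (hypothetical helper)."""
    return f"{API_BASE}/raw/{quote(upload_id)}/{quote(path, safe='/*')}"


print(raw_calc_url("JvdvikbhQp673R4ucwQgiA", "k-ckeQ73sflE6GDA80L132VCWp1z"))
print(raw_calc_url("JvdvikbhQp673R4ucwQgiA", "k-ckeQ73sflE6GDA80L132VCWp1z", "INFO.OUT"))
```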
...@@ -90,27 +90,27 @@ If you have a query ...@@ -90,27 +90,27 @@ If you have a query
that is more selective, you can also download all results. Here all compounds that only that is more selective, you can also download all results. Here all compounds that only
consist of Si, O, bulk material simulations of cubic systems (currently ~100 entries): consist of Si, O, bulk material simulations of cubic systems (currently ~100 entries):
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/raw/query?only_atoms=Si&only_atoms=O&system=bulk&crystal_system=cubic" -o download.zip curl "http://nomad-lab.eu/prod/rae/api/raw/query?only_atoms=Si&only_atoms=O&system=bulk&crystal_system=cubic" -o download.zip
``` ```
Here are a few more examples for downloading the raw data based on DOI or dataset. Here are a few more examples for downloading the raw data based on DOI or dataset.
You will have to encode non-URL-safe characters in potential dataset names (e.g. with a service like [www.urlencoder.org](https://www.urlencoder.org/)): You will have to encode non-URL-safe characters in potential dataset names (e.g. with a service like [www.urlencoder.org](https://www.urlencoder.org/)):
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/raw/query?datasets.doi=10.17172/NOMAD/2020.03.18-1" -o download.zip curl "http://nomad-lab.eu/prod/rae/api/raw/query?datasets.doi=10.17172/NOMAD/2020.03.18-1" -o download.zip
curl "http://nomad-lab.eu/prod/rae/api/raw/query?dataset=Full%20ahnarmonic%20stAViC%20approach%3A%20Silicon%20and%20SrTiO3" -o download.zip curl "http://nomad-lab.eu/prod/rae/api/raw/query?dataset=Full%20ahnarmonic%20stAViC%20approach%3A%20Silicon%20and%20SrTiO3" -o download.zip
``` ```
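Instead of an online service, the dataset name can be encoded locally. The following sketch uses `urllib.parse.quote` from the Python standard library; the plain-text dataset name is the decoded form of the name in the curl example above, and the variable names are only illustrative.

```python
from urllib.parse import quote

# Base URL copied from the curl examples; adjust it for other deployments.
API_BASE = "http://nomad-lab.eu/prod/rae/api"

# Decoded form of the dataset name used in the curl example.
dataset = "Full ahnarmonic stAViC approach: Silicon and SrTiO3"

# safe="" percent-encodes every reserved character, including "/" and ":".
encoded = quote(dataset, safe="")
url = f"{API_BASE}/raw/query?dataset={encoded}"
print(url)
```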
In a similar way you can see the archive of an entry: In a similar way you can see the archive of an entry:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/archive/f0KQE2aiSz2KRE47QtoZtw/6xe9fZ9xoxBYZOq5lTt8JMgPa3gX" | python -m json.tool curl "http://nomad-lab.eu/prod/rae/api/archive/f0KQE2aiSz2KRE47QtoZtw/6xe9fZ9xoxBYZOq5lTt8JMgPa3gX" | python -m json.tool
``` ```
Or query and display the first page of 10 archives: Or query and display the first page of 10 archives:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/archive/query?only_atoms=Si&only_atoms=O" | python -m json.tool curl "http://nomad-lab.eu/prod/rae/api/archive/query?only_atoms=Si&only_atoms=O" | python -m json.tool
``` ```
...@@ -164,7 +164,7 @@ Optionally, if you need to access your private data, the package *python-keycloa ...@@ -164,7 +164,7 @@ Optionally, if you need to access your private data, the package *python-keycloa
required to conveniently acquire the necessary tokens to authenticate yourself towards required to conveniently acquire the necessary tokens to authenticate yourself towards
NOMAD. NOMAD.
``` ```sh
pip install bravado pip install bravado
pip install python-keycloak pip install python-keycloak
``` ```
...@@ -386,20 +386,20 @@ The shell tool *curl* can be used to call most API endpoints. Most endpoints for ...@@ -386,20 +386,20 @@ The shell tool *curl* can be used to call most API endpoints. Most endpoints for
or downloading data are only **GET** operations controlled by URL parameters. For example: or downloading data are only **GET** operations controlled by URL parameters. For example:
Downloading data: Downloading data:
``` ```sh
curl "http://nomad-lab.eu/prod/rae/api/raw/query?upload_id=<your_upload_id>" -o download.zip curl "http://nomad-lab.eu/prod/rae/api/raw/query?upload_id=<your_upload_id>" -o download.zip
``` ```
It is a little bit trickier if you need to authenticate yourself, e.g. to download It is a little bit trickier if you need to authenticate yourself, e.g. to download
not yet published or embargoed data. All endpoints support and most require the use of not yet published or embargoed data. All endpoints support and most require the use of
an access token. To acquire an access token from our user management system with curl: an access token. To acquire an access token from our user management system with curl:
``` ```sh
curl --data 'grant_type=password&client_id=nomad_public&username=<your_username>&password=<your password>' \ curl --data 'grant_type=password&client_id=nomad_public&username=<your_username>&password=<your password>' \
https://nomad-lab.eu/fairdi/keycloak/auth/realms/fairdi_nomad_prod/protocol/openid-connect/token https://nomad-lab.eu/fairdi/keycloak/auth/realms/fairdi_nomad_prod/protocol/openid-connect/token
``` ```
You can use the access-token with: You can use the access-token with:
``` ```sh
curl -H 'Authorization: Bearer <your_access_token>' \ curl -H 'Authorization: Bearer <your_access_token>' \
"http://nomad-lab.eu/prod/rae/api/raw/query?upload_id=<your_upload_id>" -o download.zip "http://nomad-lab.eu/prod/rae/api/raw/query?upload_id=<your_upload_id>" -o download.zip
``` ```
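The two steps above (acquire a token, then send it as a bearer token) can be chained in a script. The sketch below only covers the offline part: it assumes the token endpoint answers with a JSON object that carries an `access_token` field, as is standard for OAuth 2.0 token responses; the response shown here is a made-up example, and `bearer_headers` is a hypothetical helper.

```python
import json


def bearer_headers(token_response_text):
    """Turn a token response body into request headers (hypothetical helper).

    Assumes a JSON body with an "access_token" field, as in a standard
    OAuth 2.0 token response; the actual response is not shown in this text.
    """
    token = json.loads(token_response_text)["access_token"]
    return {"Authorization": f"Bearer {token}"}


# Made-up response body for illustration only.
example_response = '{"access_token": "abc123", "token_type": "bearer"}'
headers = bearer_headers(example_response)
print(headers)
```

The resulting dictionary corresponds to the `-H 'Authorization: Bearer ...'` option of the curl call.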
......
...@@ -11,7 +11,7 @@ are up and running and both have access to the underlying file storage, part of ...@@ -11,7 +11,7 @@ are up and running and both have access to the underlying file storage, part of
which is mounted inside each container under :code:`.volumes/fs`. which is mounted inside each container under :code:`.volumes/fs`.
With both the source and target deployment running, you can use the With both the source and target deployment running, you can use the
:code::ref:`cli_ref:mirror` command to transfer the data from source to target. The :ref:`cli_ref:mirror` command to transfer the data from source to target. The
mirror will copy everything: i.e. the raw data, archive data and associated mirror will copy everything: i.e. the raw data, archive data and associated
metadata in the database. metadata in the database.
......
.. _install-client:
Install the NOMAD client library Install the NOMAD client library
================================ ================================
......
# Developing NOMAD # Developing NOMAD
## Introduction
The nomad infrastructure consists of a series of nomad and 3rd party services:
- nomad worker (python): task worker that will do the processing
- nomad app (python): the nomad app and its REST APIs
- nomad gui: a small server serving the web-based react gui
- proxy: an nginx server that reverse proxies all services under one port
- elastic search: nomad's search and analytics engine
- mongodb: used to store processing state
- rabbitmq: a task queue used to distribute work in a cluster
All 3rd party services should be run via *docker-compose* (see below). The
nomad python services can be run with python to develop them.
The gui can be run with a development server via yarn.
Below you will find information on how to install all python dependencies and code
manually, how to use *docker*/*docker-compose*, and how to run 3rd-party services with *docker-compose*.
Keep in mind that *docker-compose* configures all services in a way that mirrors
the configuration of the python code in `nomad/config.py` and the gui config in
`gui/.env.development`.
To learn about how to run everything in docker, e.g. to operate a NOMAD OASIS in
production, go [here](/app/docs/ops.html).
## Getting started ## Getting started
### Cloning and development tools ### Clone the sources
If not already done, you should clone nomad and create a python virtual environment. If not already done, you should clone nomad. To clone the main NOMAD repository:
To clone the repository:
``` ```
git clone git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-FAIR.git git clone git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-FAIR.git nomad
cd nomad-FAIR cd nomad
``` ```
### C libs ### Prepare your Python environment
Even though the NOMAD infrastructure is written in python, there is a C library
required by one of our python dependencies.
#### libmagic
Libmagic is used to determine the MIME type of files. It should be installed on most
unix/linux systems. It can be installed on MacOS with homebrew:
```
brew install libmagic
```
### Virtual environment You work in a Python virtual environment.
#### pyenv #### pyenv
The nomad code currently targets python 3.7. If your host machine has an older version installed, The nomad code currently targets python 3.7. If your host machine has an older version installed,
...@@ -67,43 +29,50 @@ virtualenv -p `which python3` .pyenv ...@@ -67,43 +29,50 @@ virtualenv -p `which python3` .pyenv
source .pyenv/bin/activate source .pyenv/bin/activate
``` ```
#### Conda #### conda
If you are a conda user, there is an equivalent, but you have to install pip and the If you are a conda user, there is an equivalent, but you have to install pip and the
right python version while creating the environment. right python version while creating the environment.
``` ```sh
conda create --name nomad_env pip python=3.7 conda create --name nomad_env pip python=3.7
conda activate nomad_env conda activate nomad_env
``` ```
To install libmagic for conda, you can use (other channels might also work): To install libmagic for conda, you can use (other channels might also work):
``` ```sh
conda install -c conda-forge --name nomad_env libmagic conda install -c conda-forge --name nomad_env libmagic
``` ```
#### pip #### pip
Make sure you have the most recent version of pip: Make sure you have the most recent version of pip:
``` ```sh
pip install --upgrade pip pip install --upgrade pip
``` ```
The next steps can be done using the `setup.sh` script. If you prefer to understand all #### Missing system libraries (e.g. on MacOS)
the steps and run them manually, read on:
Even though the NOMAD infrastructure is written in python, there is a C library
required by one of our python dependencies. Libmagic is missing on some systems.
Libmagic is used to determine the MIME type of files. It should be installed on most
unix/linux systems. It can be installed on MacOS with homebrew:
```sh
brew install libmagic
```
### Install NOMAD-coe dependencies. ### Install sub-modules.
Nomad is based on python modules from the NOMAD-coe project. Nomad is based on python modules from the NOMAD-coe project.
This includes parsers, python-common and the meta-info. These modules are maintained as This includes parsers, python-common and the meta-info. These modules are maintained as
their own GitLab/git repositories. To clone and initialize them run: their own GitLab/git repositories. To clone and initialize them run:
``` ```sh
git submodule update --init git submodule update --init
``` ```
All requirements for these submodules need to be installed and they need to be installed All requirements for these submodules need to be installed and they need to be installed
themselves as python modules. Run the `dependencies.sh` script that will install themselves as python modules. Run the `dependencies.sh` script that will install
everything into your virtual environment: everything into your virtual environment:
``` ```sh
./dependencies.sh -e ./dependencies.sh -e
``` ```
...@@ -112,19 +81,19 @@ to change the downloaded dependency code without having to reinstall after. ...@@ -112,19 +81,19 @@ to change the downloaded dependency code without having to reinstall after.
### Install nomad ### Install nomad
Finally, you can add nomad to the environment itself (including all extras): Finally, you can add nomad to the environment itself (including all extras):
``` ```sh
pip install -e .[all] pip install -e .[all]
``` ```
If pip tries to use and compile sources and this creates errors, it can be told to prefer binary versions: If pip tries to use and compile sources and this creates errors, it can be told to prefer binary versions:
``` ```sh
pip install -e .[all] --prefer-binary pip install -e .[all] --prefer-binary
``` ```
### Generate GUI artifacts ### Generate GUI artifacts
The NOMAD GUI requires static artifacts that are generated from the NOMAD Python codes. The NOMAD GUI requires static artifacts that are generated from the NOMAD Python codes.
``` ```sh
nomad dev metainfo > gui/src/metainfo.json nomad dev metainfo > gui/src/metainfo.json
nomad dev searchQuantities > gui/src/searchQuantities.json nomad dev searchQuantities > gui/src/searchQuantities.json
nomad dev units > gui/src/units.js nomad dev units > gui/src/units.js
...@@ -136,92 +105,53 @@ the tests. See below. ...@@ -136,92 +105,53 @@ the tests. See below.
## Running the infrastructure ## Running the infrastructure
### Docker and nomad To run NOMAD, some 3rd party services are needed:
Nomad depends on a set of databases, search engines, and other services. Those - elastic search: nomad's search and analytics engine
must run to make use of nomad. We use *docker* and *docker-compose* to create a - mongodb: used to store processing state
unified environment that is easy to build and to run. - rabbitmq: a task queue used to distribute work in a cluster
You can use *docker* to run all necessary 3rd-party components and run all nomad
services manually from your python environment. You can also run nomad in docker,
but using Python is often preferred during development, since it allows
you to change things, debug, and re-run things quickly. Running nomad in docker brings you
closer to the environment that will be used to run nomad in production. For
development we recommend to skip the next step.
### Docker images for nomad
Nomad currently comprises two services,
the *worker* (does the actual processing), and the *app*. Those services can be
run from one image that has the nomad python code and all dependencies installed. This
is covered by the `Dockerfile` in the root directory
of the nomad sources. The gui is also served from the *app*, which includes the react-js frontend code.
Before building the image, make sure to execute
```
./gitinfo.sh
```
This allows the app to present some information about the current git revision without
having to copy the git itself to the docker build context.
### Run necessary 3-rd party services with docker-compose All 3rd party services should be run via *docker-compose* (see below).
Keep in mind that *docker-compose* configures all services in a way that mirrors
the configuration of the python code in `nomad/config.py` and the gui config in
`gui/.env.development`.
You can run all containers with: You can run all services with:
``` ```sh
cd ops/docker-compose/infrastructure cd ops/docker-compose/infrastructure
docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d mongo elastic rabbitmq docker-compose up -d mongo elastic rabbitmq
``` ```
To shut down everything, just `ctrl-c` the running output. If you started everything To shut down everything, just `ctrl-c` the running output. If you started everything
in *daemon* mode (`-d`) use: in *daemon* mode (`-d`) use:
``` ```sh
docker-compose down docker-compose down
``` ```
Usually these services are only used by the nomad containers, but sometimes you also Usually these services are only used by NOMAD, but sometimes you also
need to check something or do some manual steps. need to check something or do some manual steps. You can access mongodb and elastic search
via your preferred tools. Just make sure to use the right ports.
The *docker-compose* configuration can be overridden with additional settings. See the documentation section on
operating NOMAD for more details. The override `docker-compose.override.yml` will
expose all database ports to the host machine and should be used in development. To use
it, run docker-compose with `-f docker-compose.yml -f docker-compose.override.yml`.
### ELK (elastic stack)
If you run the ELK stack (and enable logstash in nomad/config.py),
you can reach the Kibana with [localhost:5601](http://localhost:5601).
The index prefix for logs is `logstash-`. The ELK is only available with the
`docker-compose.dev-elk.yml` override.
### mongodb and elastic search
You can access mongodb and elastic search via your preferred tools. Just make sure
to use the right ports (see above).
## Running NOMAD ## Running NOMAD
### API and worker NOMAD consists of the NOMAD app/api, a worker, and the GUI. You can run app and worker with
the NOMAD cli:
To simply run a worker with the installed nomad cli, do (from the root) ```sh
``` nomad admin run app
nomad admin run worker nomad admin run worker
nomad admin run appworker
``` ```
To run it directly with celery, do (from the root) The app will run at port 8000 by default.
```
celery -A nomad.processing worker -l info
```
You can also run worker and app together: To run the worker directly with celery, do (from the root)
``` ```sh
nomad admin run appworker celery -A nomad.processing worker -l info
``` ```
### GUI
When you run the gui on its own (e.g. with react dev server below), you have to have When you run the gui on its own (e.g. with react dev server below), you have to have
the API running manually also. This *inside docker* API is configured for nginx paths the app running manually also.
and proxies, which are run by the gui container. But you can run the *production* gui ```sh
in docker and the dev server gui in parallel with an API in docker.
Either with docker, or:
```
cd gui cd gui
yarn yarn
yarn start yarn start
...@@ -229,13 +159,12 @@ yarn start ...@@ -229,13 +159,12 @@ yarn start
## Running tests ## Running tests
### additional settings and artifacts
To run the tests some additional settings and files are necessary that are not part To run the tests some additional settings and files are necessary that are not part
of the code base. of the code base.
First you need to create a `nomad.yaml` with the admin password for the user management First you need to create a `nomad.yaml` with the admin password for the user management
system: system:
``` ```yaml
keycloak: keycloak:
password: <the-password> password: <the-password>
``` ```
...@@ -245,7 +174,7 @@ be copied from `/nomad/fairdi/db/data/springer.msg` on our servers and should ...@@ -245,7 +174,7 @@ be copied from `/nomad/fairdi/db/data/springer.msg` on our servers and should
be placed at `nomad/normalizing/data/springer.msg`. be placed at `nomad/normalizing/data/springer.msg`.
Thirdly, you have to provide static files to serve the docs and NOMAD distribution: Thirdly, you have to provide static files to serve the docs and NOMAD distribution:
``` ```sh
cd docs cd docs
make html make html
cd .. cd ..
...@@ -254,24 +183,23 @@ python setup.py sdist ...@@ -254,24 +183,23 @@ python setup.py sdist
cp dist/nomad-lab-*.tar.gz dist/nomad-lab.tar.gz cp dist/nomad-lab-*.tar.gz dist/nomad-lab.tar.gz
``` ```
### run the necessary infrastructure
You need to have the infrastructure partially running: elastic, rabbitmq. You need to have the infrastructure partially running: elastic, rabbitmq.
The rest should be mocked or provided by the tests. Make sure that you do not run any The rest should be mocked or provided by the tests. Make sure that you do not run any
workers, as they will fight for tasks in the queue. workers, as they will fight for tasks in the queue.
``` ```sh
cd ops/docker-compose cd ops/docker-compose/infrastructure
docker-compose up -d elastic rabbitmq docker-compose up -d elastic rabbitmq
cd ../.. cd ../..
pytest -svx tests pytest -svx tests
``` ```
We use pylint, pycodestyle, and mypy to ensure code quality. To run those: We use pylint, pycodestyle, and mypy to ensure code quality. To run those:
``` ```sh
nomad dev qa --skip-test nomad dev qa --skip-test
``` ```
To run all tests and code qa: To run all tests and code qa:
``` ```sh
nomad dev qa nomad dev qa
``` ```
...@@ -608,7 +536,7 @@ The lifecycle of a *feature* branch should look like this: ...@@ -608,7 +536,7 @@ The lifecycle of a *feature* branch should look like this:
While working on a feature, there are certain practices that will help us to create While working on a feature, there are certain practices that will help us to create
a clean history with coherent commits, where each commit stands on its own. a clean history with coherent commits, where each commit stands on its own.
``` ```sh
git commit --amend git commit --amend
``` ```
...@@ -619,7 +547,7 @@ you are basically adding changes to the last commit, i.e. editing the last commi ...@@ -619,7 +547,7 @@ you are basically adding changes to the last commit, i.e. editing the last commi
you push, you need to force it `git push origin feature-branch --force-with-lease`. So be careful, and you push, you need to force it `git push origin feature-branch --force-with-lease`. So be careful, and
only use this on your own branches. only use this on your own branches.
``` ```sh
git rebase <version-branch> git rebase <version-branch>
``` ```
...@@ -633,7 +561,7 @@ more consistent history. You can also rebase before create a merge request, bas ...@@ -633,7 +561,7 @@ more consistent history. You can also rebase before create a merge request, bas
allowing for no-op merges. Ideally the only real merges that we ever have, are between allowing for no-op merges. Ideally the only real merges that we ever have, are between
version branches. version branches.
``` ```sh
git merge --squash <other-branch> git merge --squash <other-branch>
``` ```
...@@ -659,7 +587,7 @@ you have to make sure that the modules are updated to not accidentally commit ol ...@@ -659,7 +587,7 @@ you have to make sure that the modules are updated to not accidentally commit ol
submodule commits again. Usually you do the following to check if you really have a submodule commits again. Usually you do the following to check if you really have a
clean working directory. clean working directory.
``` ```sh
git checkout something-with-changes git checkout something-with-changes
git submodule update git submodule update
git status git status
......
GUI React Components
====================
This is the API reference for NOMAD's GUI React components.
.. contents:: Table of Contents
.. reactdocgen:: react-docgen.out