diff --git a/docs/api_tutorial.md b/docs/api.md
similarity index 99%
rename from docs/api_tutorial.md
rename to docs/api.md
index 0a8371b478ea4d5adc6ff37cbedf83a0ccadd2ef..29b8b83b108a719eca439f49d51a5613526cfdea 100644
--- a/docs/api_tutorial.md
+++ b/docs/api.md
@@ -1,4 +1,4 @@
-# API Tutorials
+# How to use the APIs
 
 The NOMAD Repository and Archive offers all its functionality through an application
 programming interface (API). More specifically a [RESTful HTTP API](https://en.wikipedia.org/wiki/Representational_state_transfer) that allows you
diff --git a/docs/api.rst b/docs/api_reference.rst
similarity index 100%
rename from docs/api.rst
rename to docs/api_reference.rst
diff --git a/docs/archive.rst b/docs/archive.rst
index 59c4e7f0b16825cbf110a2dbbef84200afdcc26e..2e70bda34593f4dab2e4010a07762e83b2842956 100644
--- a/docs/archive.rst
+++ b/docs/archive.rst
@@ -1,7 +1,7 @@
 .. _access-the-archive-label:
 
-Using the NOMAD Archive
-=======================
+Data Access (Archive)
+=====================
 
 Of course, you can access the NOMAD Archive directly via the NOMAD API (see the `API tutorial <api_tutorial.html>`_
 and `API reference <api.html>`_). But, it is more effective and convenient to use NOMAD's Python client
diff --git a/docs/client/client.rst b/docs/client/client.rst
index 26d689f0a290815984f468ce674a6e1684ac4322..81be26bc5bfa307bfd9348dd3a88b78da8a8cde4 100644
--- a/docs/client/client.rst
+++ b/docs/client/client.rst
@@ -1,5 +1,5 @@
-NOMAD Python package and CLI
------------------------------------------------------
+NOMAD's Python library
+----------------------
 The :code:`nomad` python package comes with a command line interface (CLI) that
 can be accessed after installation by simply running the :code:`nomad` command
 in your terminal. The CLI provides a hierarchy of commands by using the `click
diff --git a/docs/dev/dev_guidelines.rst b/docs/dev/dev_guidelines.rst
deleted file mode 100644
index a57e96864c51f07e81c37e534742d416cd565eef..0000000000000000000000000000000000000000
--- a/docs/dev/dev_guidelines.rst
+++ /dev/null
@@ -1,331 +0,0 @@
-Development guidelines
-======================
-
-Design principles
------------------
-
-- simple first, complicated only when necessary
-- adopting generic established 3rd party solutions before implementing specific solutions
-- only uni directional dependencies between components/modules, no circles
-- only one language: Python (except, GUI of course)
-
-
-Source code & Git repository
-----------------------------
-
-Code Rules
-^^^^^^^^^^
-
-There are some *rules*, or better, strong *guidelines* for writing code. The following
-applies to all python code (and, where applicable, also to JS and other code):
-
-- Use an IDE (e.g. `vscode <https://code.visualstudio.com/>`_ or otherwise automatically
-  enforce code (`formatting and linting <https://code.visualstudio.com/docs/python/linting>`_).
-  Use ``nomad qa`` before committing. This will run all tests, static type checks, linting, etc.
-
-- There is a style guide to python. Write `pep-8 <https://www.python.org/dev/peps/pep-0008/>`_
-  compliant python code. An exception is the line cap at 79, which can be broken but keep it 90-ish.
-
-- Test the public API of each sub-module (i.e. python file)
-
-- Be `pythonic <https://docs.python-guide.org/writing/style/>`_ and watch
-  `this <https://www.youtube.com/watch?v=wf-BqAjZb8M>`_.
-
-- Document any *public* API of each sub-module (e.g. python file). Public meaning API that
-  is exposed to other sub-modules (i.e. other python files).
-
-- Use google `docstrings <http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`_.
-
-- Add your doc-strings to the sphinx documentation in ``docs``. Use .md, follow the example.
-  Markdown in sphinx is supported via `recommonmark
-  <https://recommonmark.readthedocs.io/en/latest/index.html#autostructify>`_
-  and `AutoStructify <http://recommonmark.readthedocs.io/en/latest/auto_structify.html>`_
-
-- The project structure is according to `this guide <https://docs.python-guide.org/writing/structure/>`_.
-  Keep it!
-
-
-CI/CD
-^^^^^
-
-These *guidelines* are partially enforced by CI/CD. As part of CI all tests are run on all
-branches; further we run a *linter*, *pep8* checker, and *mypy* (static type checker). You can
-run ``nomad qa`` to run all these tests and checks before committing.
-
-The CI/CD will run on all refs that do not start with ``dev-``. The CI/CD will
-not release or deploy anything automatically, but it can be manually triggered after the
-build and test stage completed successfully.
-
-
-Git/GitLab
-^^^^^^^^^^
-
-The ``master`` branch of our repository is *protected*. You must not (even if you have
-the rights) commit to it directly. The ``master`` branch references the latest official
-release (i.e. what the current NOMAD runs on). The current development is represented by
-*version* branches, named ``vx.x.x``. Usually there are two or more of these branches,
-representing the development on *minor/bugfix* versions and the next *major* version(s).
-Ideally these *version* branches are also not manually pushed to.
-
-Instead you develop on *feature* branches: short-lived branches that are dedicated
-to implementing a single feature.
-
-The lifecycle of a *feature* branch should look like this:
-
-- create the *feature* branch from the last commit on the respective *version* branch that passes CI
-
-- do your work and push until you are satisfied and the CI passes
-
-- create a merge request on GitLab
-
-- discuss the merge request on GitLab
-
-- continue to work (with the open merge request) until all issues from the discussion are resolved
-
-- the maintainer performs the merge and the *feature* branch gets deleted
-
-While working on a feature, there are certain practices that will help us to create
-a clean history with coherent commits, where each commit stands on its own.
-
-.. code-block:: sh
-
-  git commit --amend
-
-If you committed something to your own feature branch and then realize by CI that you have
-some tiny error in it that you need to fix, try to amend this fix to the last commit.
-This will avoid unnecessary tiny commits and foster more coherent single commits. With `amend`
-you are basically adding changes to the last commit, i.e. editing the last commit. If
-you push, you need to force it ``git push origin feature-branch -f``. So be careful, and
-only use this on your own branches.
-
-.. code-block:: sh
-
-  git rebase <version-branch>
-
-Let's assume you work on a bigger feature that takes more time. You might want to merge
-the version branch into your feature branch from time to time to get the recent changes.
-In these cases, use rebase and not merge. Rebase puts your branch commits in front of the
-merged commits instead of creating a new commit with two ancestors. It basically moves the
-point where you initially branched away from the version branch to the current position in
-the version branch. This will avoid merges, merge commits, and generally leave us with a
-more consistent history. You can also rebase before creating a merge request, basically
-allowing for no-op merges. Ideally the only real merges that we ever have are between
-version branches.
-
-.. code-block:: sh
-
-  git merge --squash <other-branch>
-
-When you need multiple branches to implement a feature and merge between them, try to
-use `squash`. Squashing puts all commits of the merged branch into a single commit.
-It allows you to have many commits and then squash them into one. This is useful
-if these commits were just made for synchronization between workstations, due to
-unexpected errors in CI/CD, because you needed a save point, etc. Again, the goal is to have
-coherent commits, where each commit makes sense on its own.
-
-Often a feature is also represented by an *issue* on GitLab. Please mention the respective
-issues in your commits by adding the issue id at the end of the commit message: `My message. #123`.
-
-We tag releases with ``vX.X.X`` according to the regular semantic versioning practices.
-After releasing and tagging the *version* branch is removed. Do not confuse tags with *version* branches.
-Remember that tags and branches are both Git references and you can accidentally pull/push/checkout a tag.
-
-The main NOMAD GitLab-project (``nomad-fair``) uses Git-submodules to maintain its
-parsers and other dependencies. All these submodules are placed in the `/dependencies`
-directory. There are helper scripts to install (`dependencies.sh`, see :ref:`setup </setup.html>`) and
-commit changes to all submodules (`dependencies-git.sh`). After merging or checking out,
-you have to make sure that the modules are updated to not accidentally commit old
-submodule commits again. Usually you do the following to check if you really have a
-clean working directory.
-
-.. code-block:: sh
-
-  git checkout something-with-changes
-  git submodule update
-  git status
-
-
-Terms and Identifiers
----------------------
-
-There is some terminology consistently used in this documentation and the source
-code. Use this terminology for identifiers.
-
-Do not use abbreviations. There are (few) exceptions: ``proc`` (processing); ``exc``, ``e`` (exception);
-``calc`` (calculation), ``repo`` (repository), ``utils`` (utilities), and ``aux`` (auxiliary).
-Other exceptions are ``f`` for file-like streams and ``i`` for index running variables.
-Btw., the latter is almost never necessary in python.
-
-Terms:
-
-- upload: A logical unit that comprises one (.zip) file uploaded by a user.
-- calculation: A computation in the sense that it was created by an individual run of a CMS code.
-- raw file: User uploaded files (e.g. part of the uploaded .zip), usually code input or output.
-- upload file/uploaded file: The actual (.zip) file a user uploaded
-- mainfile: The mainfile output file of a CMS code run.
-- aux file: Additional files the user uploaded within an upload.
-- repo entry: Some quantities of a calculation that are used to represent that calculation in the repository.
-- archive data: The normalized data of one calculation in nomad's meta-info-based format.
-
-
-.. _id-reference-label:
-
-Ids
----
-
-Throughout nomad, we use different ids. If something
-is called *id*, it is usually a random uuid and has no semantic connection to the entity
-it identifies. If something is called a *hash* then it is a hash built based on the
-entity it identifies. This means either the whole thing or just some properties of
-said entities.
-
-- The most common hash is the ``calc_hash``, based on mainfile and auxfile contents.
-- The ``upload_id`` is a UUID assigned at upload time and never changed afterwards.
-- The ``mainfile`` is a path within an upload that points to a main code output file.
-  Since the upload directory structure does not change, this uniquely identifies a calc within the upload.
-- The ``calc_id`` (internal calculation id) is a hash over the ``mainfile`` and respective
-  ``upload_id``. Therefore, each `calc_id` ids a calc on its own.
-- We often use pairs of `upload_id/calc_id`, which in many contexts allow us to resolve a calc
-  related file on the filesystem without having to ask a database about it.
-- The ``pid`` (or ``coe_calc_id``) is a sequential integer id.
-- Calculation ``handle`` or ``handle_id`` are created based on those ``pid``.
-  To create hashes we use :py:func:`nomad.utils.hash`.
-
-
-NOMAD-coe Dependencies
-----------------------
-
-We currently use git submodules to maintain references to NOMAD-coe dependencies.
-All dependencies are python packages and installed via pip to your python environment.
-
-This allows us to target (e.g. install) individual commits. More importantly, we can use
-commit hashes to identify exact parser/normalizer versions. On the downside, common functions
-for all dependencies (e.g. the python-common package, or nomad_meta_info) cannot be part
-of the nomad-FAIRDI project. In general, it is hard to simultaneously develop nomad-FAIRDI
-and NOMAD-coe dependencies.
-
-Another approach is to integrate the NOMAD-coe sources with nomad-FAIRDI. The lacking
-availability of individual commit hashes could then be replaced with hashes of source-code
-files.
-
-We use the branch ``nomad-fair`` on all dependencies for nomad-FAIRDI specific changes.
-
-
-Parsers
-^^^^^^^
-
-There are several steps to take, to wrap a NOMAD-coe parser into a nomad@FAIRDI parser:
-
-- Implement ``nomadcore.baseclasses.ParserInterface`` or a class with a similar constructor
-  and `parse` method interface.
-- Make sure that the meta-info is
-  only loaded once per parser instance, not on each parser run.
-- Have a root package that bears the parser name, e.g. ``vaspparser``
-- The important classes (e.g. the parser interface implementation) in the root module
-  (e.g. ``vaspparser/__init__.py``)
-- Only use sub-modules where necessary. Try to avoid sub-directories.
-- Have a test module. Don't go overboard with the test data.
-- Make it a pypi-style package, i.e. create ``setup.py`` script.
-- The package name should be the parser name, e.g. ``vaspparser``.
-- Leave the parser logging as it is. We will catch it with a handler installed on the root logger.
-  This handler will redirect all legacy log events and put them through the nomad@FAIRDI
-  treatment described below.
-- Remove all scala code.
-
-
-Normalizers
-^^^^^^^^^^^
-
-We are rewriting all NOMAD-coe normalizers, see :py:mod:`nomad.normalizing`.
-
-
-Logging
--------
-
-There are three important prerequisites to understand about nomad-FAIRDI's logging:
-
-- All log entries are recorded in a central elastic search database. To make this database
-  useful, log entries must be sensible in size, frequency, meaning, level, and logger name.
-  Therefore, we need to follow some rules when it comes to logging.
-- We use a *structured* logging approach. Instead of encoding all kinds of information
-  in log messages, we use key-value pairs that provide context to a log *event*. In the
-  end all entries are stored as JSON dictionaries with ``@timestamp``, ``level``,
-  ``logger_name``, ``event`` plus custom context data. Keep events very short, most
-  information goes into the context.
-- We use logging to inform about the state of nomad-FAIRDI, not about user
-  behavior, input, or data. Do not confuse this when determining the log-level for an event.
-  For example, a user providing an invalid upload file should never be an error.
-
-Please follow the following rules when logging:
-
-- If a logger is not already provided, only use
-  :py:func:`nomad.utils.get_logger` to acquire a new logger. Never use the
-  built-in logging directly. These loggers work like the system loggers, but
-  allow you to pass keyword arguments with additional context data. See also
-  the `structlog docs <https://structlog.readthedocs.io/en/stable/>`_.
-- In many contexts, a logger is already provided (e.g. api, processing, parser, normalizer).
-  This provided logger already has context information bound to it. So it is important to
-  use those instead of acquiring your own loggers. Have a look for methods called
-  ``get_logger`` or attributes called ``logger``.
-- Keep events (what usually is called *message*) very short. Examples are: *file uploaded*,
-  *extraction failed*, etc.
-- Structure the keys for context information. When you analyse logs in ELK, you will
-  see that the set of all keys over all log entries can be quite large. Structure your
-  keys to make navigation easier. Use keys like ``nomad.proc.parser_version`` instead of
-  ``parser_version``. Use module names as prefixes.
-- Don't log everything. Try to anticipate, how you would use the logs in case of bugs,
-  error scenarios, etc.
-- Don't log sensitive data.
-- Think before logging data (especially dicts, lists, numpy arrays, etc.).
-- Logs should not be abused as a *printf*-style debugging tool.
-
-Used log keys
-^^^^^^^^^^^^^
-The following keys are used in the final logs that are piped to Logstash.
-Notice that the key name is automatically formed by a separate formatter and
-may differ from the one used in the actual log call.
-
-Keys that are autogenerated for all logs:
-
- - ``@timestamp``: Timestamp for the log
- - ``@version``: Version of the logger
- - ``host``: The host name from which the log originated
- - ``path``: Path of the module from which the log was created
- - ``tags``: Tags for this log
- - ``type``: The `message_type` as set in the LogstashFormatter
- - ``level``: The log level: ``DEBUG``, ``INFO``, ``WARNING``, ``ERROR``
- - ``logger_name``: Name of the logger
- - ``nomad.service``: The service name as configured in ``config.py``
- - ``nomad.release``: The release name as configured in ``config.py``
-
-Keys that are present for events related to processing an entry:
-
- - ``nomad.upload_id``: The id of the currently processed upload
- - ``nomad.calc_id``: The id of the currently processed entry
- - ``nomad.mainfile``: The mainfile of the currently processed entry
-
-Keys that are present for events related to exceptions:
-
- - ``exc_info``: Stores the full python exception that was encountered. All
-   uncaught exceptions will be stored automatically here.
- - ``digest``: If an exception was raised, the last 256 characters of the message
-   are stored automatically into this key. If you wish to search for exceptions
-   in Kibana, you will want to use this value as it will be indexed unlike the
-   full exception object.
-
-
-Copyright Notices
------------------
-
-We follow this `recommendation <https://www.linuxfoundation.org/blog/2020/01/copyright-notices-in-open-source-software-projects/>`_
-of the Linux Foundation for the copyright notice that is placed on top of each source
-code file.
-
-It is intended to provide a broad generic statement that allows all authors/contributors
-of the NOMAD project to claim their copyright, independent of their organization or
-individual ownership.
-
-You can simply copy the notice from another file. From time to time we can use a tool
-like `licenseheaders <https://pypi.org/project/licenseheaders/>`_ to ensure correct
-notices. In addition we keep a purely informative AUTHORS file.
\ No newline at end of file
diff --git a/docs/dev/setup.md b/docs/dev/setup.md
deleted file mode 100644
index f54ec746723d456078ebf51cafb8ca015d7e55af..0000000000000000000000000000000000000000
--- a/docs/dev/setup.md
+++ /dev/null
@@ -1,397 +0,0 @@
-# Development Setup
-
-## Introduction
-The nomad infrastructure consists of a series of nomad and 3rd party services:
-- nomad worker (python): task worker that will do the processing
-- nomad app (python): the nomad app and its REST APIs
-- nomad gui: a small server serving the web-based react gui
-- proxy: an nginx server that reverse proxies all services under one port
-- elastic search: nomad's search and analytics engine
-- mongodb: used to store processing state
-- rabbitmq: a task queue used to distribute work in a cluster
-
-All 3rd party services should be run via *docker-compose* (see below). The
-nomad python services can be run with python to develop them.
-The gui can be run with a development server via yarn.
-
-Below you will find information on how to install all python dependencies and code
-manually, how to use *docker*/*docker-compose*, and how to run 3rd-party services with *docker-compose*.
-
-Keep in mind that *docker-compose* configures all services in a way that mirrors
-the configuration of the python code in `nomad/config.py` and the gui config in
-`gui/.env.development`.
-
-To learn about how to run everything in docker, e.g. to operate a NOMAD OASIS in
-production, go [here](/app/docs/ops.html).
-
-## Install python code and dependencies
-
-### Cloning and development tools
-If not already done, you should clone nomad and create a python virtual environment.
-
-To clone the repository:
-```
-git clone git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-FAIR.git
-cd nomad-FAIR
-```
-
-### C libs
-
-Even though the NOMAD infrastructure is written in python, there is a C library
-required by one of our python dependencies.
-
-#### libmagic
-
-Libmagic allows determining the MIME type of files. It should be installed on most
-unix/linux systems. It can be installed on MacOS with homebrew:
-
-```
-brew install libmagic
-```
-
-### Virtual environment
-
-#### pyenv
-The nomad code currently targets python 3.7. If your host machine has an older version installed,
-you can use [pyenv](https://github.com/pyenv/pyenv) to use python 3.7 in parallel to your
-system's python.
-
-#### virtualenv
-We strongly recommend using *virtualenv* to create a virtual environment. It will allow you
-to keep nomad and its dependencies separate from your system's python installation.
-Make sure to base the virtual environment on Python 3.
-To install *virtualenv*, create an environment, and activate it, use:
-```
-pip install virtualenv
-virtualenv -p `which python3` .pyenv
-source .pyenv/bin/activate
-```
-
-#### Conda
-If you are a conda user, there is an equivalent, but you have to install pip and the
-right python version while creating the environment.
-```
-conda create --name nomad_env pip python=3.7
-conda activate nomad_env
-```
-
-To install libmagic for conda, you can use (other channels might also work):
-```
-conda install -c conda-forge --name nomad_env libmagic
-```
-
-#### pip
-Make sure you have the most recent version of pip:
-```
-pip install --upgrade pip
-```
-
-
-The next steps can be done using the `setup.sh` script. If you prefer to understand all
-the steps and run them manually, read on:
-
-
-### Install NOMAD-coe dependencies.
-Nomad is based on python modules from the NOMAD-coe project.
-This includes parsers, python-common and the meta-info. These modules are maintained as
-their own GitLab/git repositories. To clone and initialize them run:
-
-```
-git submodule update --init
-```
-
-All requirements for these submodules need to be installed and they need to be installed
-themselves as python modules. Run the `dependencies.sh` script that will install
-everything into your virtual environment:
-```
-./dependencies.sh -e
-```
-
-The `-e` option will install the NOMAD-coe dependencies with symbolic links allowing you
-to change the downloaded dependency code without having to reinstall afterwards.
-
-### Install nomad
-Finally, you can add nomad to the environment itself (including all extras)
-```
-pip install -e .[all]
-```
-
-If pip tries to use and compile sources and this creates errors, it can be told to prefer the binary version:
-
-```
-pip install -e .[all] --prefer-binary
-```
-
-### Generate GUI artifacts
-The NOMAD GUI requires static artifacts that are generated from the NOMAD Python code.
-```
-nomad dev metainfo > gui/src/metainfo.json
-nomad dev searchQuantities > gui/src/searchQuantities.json
-nomad dev units > gui/src/units.js
-./gitinfo.sh
-```
-
-In addition, you have to do some more steps to prepare your working copy to run all
-the tests. See below.
-
-## Build and run the infrastructure with docker
-
-### Docker and nomad
-Nomad depends on a set of databases, search engines, and other services. Those
-must run to make use of nomad. We use *docker* and *docker-compose* to create a
-unified environment that is easy to build and to run.
-
-You can use *docker* to run all necessary 3rd-party components and run all nomad
-services manually from your python environment. You can also run nomad in docker,
-but using Python is often preferred during development, since it allows you to
-change things, debug, and re-run quickly. Running in docker brings you
-closer to the environment that will be used to run nomad in production. For
-development we recommend skipping the next step.
-
-### Docker images for nomad
-Nomad currently comprises two services:
-the *worker* (does the actual processing) and the *app*. Those services can be
-run from one image that has the nomad python code and all dependencies installed. This
-is covered by the `Dockerfile` in the root directory
-of the nomad sources. The gui is also served from the *app*, which contains the react-js frontend code.
-
-Before building the image, make sure to execute
-```
-./gitinfo.sh
-```
-This allows the app to present some information about the current git revision without
-having to copy the git repository itself to the docker build context.
-
-### Run necessary 3-rd party services with docker-compose
-
-You can run all containers with:
-```
-cd ops/docker-compose/infrastructure
-docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d mongo elastic rabbitmq
-```
-
-To shut down everything, just `ctrl-c` the running output. If you started everything
-in *daemon* mode (`-d`) use:
-```
-docker-compose down
-```
-
-Usually these services are only used by the nomad containers, but sometimes you also
-need to check something or do some manual steps.
-
-The *docker-compose* can be overridden with additional settings. See the documentation section on
-operating NOMAD for more details. The override `docker-compose.override.yml` will
-expose all database ports to the host machine and should be used in development. To use
-it, run docker-compose with `-f docker-compose.yml -f docker-compose.override.yml`.
-
-### ELK (elastic stack)
-
-If you run the ELK stack (and enable logstash in nomad/config.py),
-you can reach Kibana at [localhost:5601](http://localhost:5601).
-The index prefix for logs is `logstash-`. The ELK stack is only available with the
-`docker-compose.dev-elk.yml` override.
-
-### mongodb and elastic search
-
-You can access mongodb and elastic search via your preferred tools. Just make sure
-to use the right ports (see above).
-
-## Run nomad services
-
-### API and worker
-
-To simply run a worker with the installed nomad cli, do (from the root)
-```
-nomad admin run worker
-```
-
-To run it directly with celery, do (from the root)
-```
-celery -A nomad.processing worker -l info
-```
-
-You can also run worker and app together:
-```
-nomad admin run appworker
-```
-
-### GUI
-When you run the gui on its own (e.g. with the react dev server below), you also have
-to run the API manually. The *inside docker* API is configured for nginx paths
-and proxies, which are run by the gui container. But you can run the *production* gui
-in docker and the dev server gui in parallel with an API in docker.
-Either with docker, or:
-```
-cd gui
-yarn
-yarn start
-```
-
-## Run the tests
-
-### additional settings and artifacts
-To run the tests some additional settings and files are necessary that are not part
-of the code base.
-
-First you need to create a `nomad.yaml` with the admin password for the user management
-system:
-```
-keycloak:
-  password: <the-password>
-```
-
-Secondly, you need to provide the `springer.msg` Springer materials database. It can
-be copied from `/nomad/fairdi/db/data/springer.msg` on our servers and should
-be placed at `nomad/normalizing/data/springer.msg`.
-
-Thirdly, you have to provide static files to serve the docs and NOMAD distribution:
-```
-cd docs
-make html
-cd ..
-python setup.py compile
-python setup.py sdist
-cp dist/nomad-lab-*.tar.gz dist/nomad-lab.tar.gz
-```
-
-### run the necessary infrastructure
-You need to have the infrastructure partially running: elastic, rabbitmq.
-The rest should be mocked or provided by the tests. Make sure that you do not run any
-worker, as they will fight for tasks in the queue.
-```
-cd ops/docker-compose
-docker-compose up -d elastic rabbitmq
-cd ../..
-pytest -svx tests
-```
-
-We use pylint, pycodestyle, and mypy to ensure code quality. To run those:
-```
-nomad dev qa --skip-test
-```
-
-To run all tests and code qa:
-```
-nomad dev qa
-```
-
-This mimics the tests and checks that the GitLab CI/CD will perform.
-
-
-## Setup your (I)DE
-
-The documentation section on development guidelines details how the code is organized,
-tested, formatted, and documented. To help you meet these guidelines, we recommend using
-a proper IDE for development and ditching any VIM/Emacs (mal-)practices.
-
-### Visual Studio Code
-
-Here are some VSCode settings that will enable features for linting, some auto formatting,
-line size ruler, etc.
-```json
-{
-    "python.venvPath": "${workspaceFolder}/.pyenv",
-    "python.pythonPath": "${workspaceFolder}/.pyenv/bin/python",
-    "git.ignoreLimitWarning": true,
-    "editor.rulers": [90],
-    "editor.renderWhitespace": "all",
-    "editor.tabSize": 4,
-    "[javascript]": {
-        "editor.tabSize": 2
-    },
-    "files.trimTrailingWhitespace": true,
-    "git.enableSmartCommit": true,
-    "eslint.autoFixOnSave": true,
-    "python.linting.pylintArgs": [
-        "--load-plugins=pylint_mongoengine,nomad/metainfo/pylint_plugin",
-    ],
-    "python.linting.pep8Path": "pycodestyle",
-    "python.linting.pep8Enabled": true,
-    "python.linting.pep8Args": ["--ignore=E501,E701"],
-    "python.linting.mypyEnabled": true,
-    "python.linting.mypyArgs": [
-        "--ignore-missing-imports",
-        "--follow-imports=silent",
-        "--no-strict-optional"
-    ],
-    "workbench.colorCustomizations": {
-        "editorError.foreground": "#FF2222",
-        "editorOverviewRuler.errorForeground": "#FF2222",
-        "editorWarning.foreground": "#FF5500",
-        "editorOverviewRuler.warningForeground": "#FF5500",
-        "activityBar.background": "#4D2111",
-        "titleBar.activeBackground": "#6B2E18",
-        "titleBar.activeForeground": "#FDF9F7"
-    },
-    "files.watcherExclude": {
-        "**/.git/objects/**": true,
-        "**/.git/subtree-cache/**": true,
-        "**/node_modules/*/**": true,
-        "**/.pyenv/*/**": true,
-        "**/__pycache__/*/**": true,
-        "**/.mypy_cache/*/**": true,
-        "**/.volumes/*/**": true,
-        "**/docs/.build/*/**": true
-    }
-}
-```
-
-Here are some example launch configs for VSCode:
-
-```json
-{
-  "version": "0.2.0",
-  "configurations": [
-    {
-      "type": "chrome",
-      "request": "launch",
-      "name": "Launch Chrome against localhost",
-      "url": "http://localhost:3000",
-      "webRoot": "${workspaceFolder}/gui"
-    },
-    {
-      "name": "Python: API Flask (0.11.x or later)",
-      "type": "python",
-      "request": "launch",
-      "module": "flask",
-      "env": {
-        "FLASK_APP": "nomad/app/__init__.py"
-      },
-      "args": [
-        "run",
-        "--port",
-        "8000",
-        "--no-debugger",
-        "--no-reload"
-      ]
-    },
-    {
-      "name": "Python: some test",
-      "type": "python",
-      "request": "launch",
-      "cwd": "${workspaceFolder}",
-      "program": "${workspaceFolder}/.pyenv/bin/pytest",
-      "args": [
-        "-sv",
-        "tests/test_cli.py::TestClient::test_mirror"
-      ]
-    },
-    {
-      "name": "Python: Current File",
-      "type": "python",
-      "request": "launch",
-      "program": "${file}"
-    },
-    {
-      "name": "Python: Attach",
-      "type": "python",
-      "request": "attach",
-      "localRoot": "${workspaceFolder}",
-      "remoteRoot": "${workspaceFolder}",
-      "port": 3000,
-      "secret": "my_secret",
-      "host": "localhost"
-    }
-  ]
-}
-```
diff --git a/docs/developers.md b/docs/developers.md
new file mode 100644
index 0000000000000000000000000000000000000000..c9d0806c479d2307b1adb30a64e7ffeb934a2839
--- /dev/null
+++ b/docs/developers.md
@@ -0,0 +1,684 @@
+# Developing NOMAD
+
+## Introduction
+The nomad infrastructure consists of a series of nomad and 3rd party services:
+- nomad worker (python): task worker that will do the processing
+- nomad app (python): the nomad app and its REST APIs
+- nomad gui: a small server serving the web-based react gui
+- proxy: an nginx server that reverse proxies all services under one port
+- elastic search: nomad's search and analytics engine
+- mongodb: used to store processing state
+- rabbitmq: a task queue used to distribute work in a cluster
+
+All 3rd party services should be run via *docker-compose* (see below). The
+nomad python services can be run directly with python for development.
+The gui can be run with a development server via yarn.
+
+Below you will find information on how to install all python dependencies and code
+manually, how to use *docker*/*docker-compose*, and how to run the 3rd-party services
+with *docker-compose*.
+
+Keep in mind that *docker-compose* configures all services in a way that mirrors
+the configuration of the python code in `nomad/config.py` and the gui config in
+`gui/.env.development`.
+
+To learn how to run everything in docker, e.g. to operate a NOMAD OASIS in
+production, go [here](/app/docs/ops.html).
+
+## Getting started
+
+### Cloning and development tools
+If not already done, you should clone nomad and create a python virtual environment.
+
+To clone the repository:
+```
+git clone git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-FAIR.git
+cd nomad-FAIR
+```
+
+### C libs
+
+Even though the NOMAD infrastructure is written in python, there is a C library
+required by one of our python dependencies.
+
+#### libmagic
+
+Libmagic allows you to determine the MIME type of files. It is already installed on most
+unix/linux systems. On MacOS, it can be installed with homebrew:
+
+```
+brew install libmagic
+```
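Since libmagic also powers the common `file` command-line utility, you can quickly check what it detects for a given file. This is just a small illustration, not a required setup step:

```shell
# Create a small text file and ask libmagic (via the `file` tool) for its MIME type.
echo 'hello world' > /tmp/example.txt
file --mime-type /tmp/example.txt
# Typically prints something like: /tmp/example.txt: text/plain
```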
+
+### Virtual environment
+
+#### pyenv
+The nomad code currently targets python 3.7. If your host machine has an older version installed,
+you can use [pyenv](https://github.com/pyenv/pyenv) to use python 3.7 in parallel to your
+system's python.
+
+#### virtualenv
+We strongly recommend using *virtualenv* to create a virtual environment. It will allow you
+to keep nomad and its dependencies separate from your system's python installation.
+Make sure to base the virtual environment on Python 3.
+To install *virtualenv*, create an environment, and activate it, use:
+```
+pip install virtualenv
+virtualenv -p `which python3` .pyenv
+source .pyenv/bin/activate
+```
+
+#### Conda
+If you are a conda user, there is an equivalent, but you have to install pip and the
+right python version while creating the environment.
+```
+conda create --name nomad_env pip python=3.7
+conda activate nomad_env
+```
+
+To install libmagic for conda, you can use (other channels might also work):
+```
+conda install -c conda-forge --name nomad_env libmagic
+```
+
+#### pip
+Make sure you have the most recent version of pip:
+```
+pip install --upgrade pip
+```
+
+
+The next steps can be done using the `setup.sh` script. If you prefer to understand all
+the steps and run them manually, read on:
+
+
+### Install NOMAD-coe dependencies
+Nomad is based on python modules from the NOMAD-coe project.
+This includes parsers, python-common, and the meta-info. These modules are maintained in
+their own GitLab/git repositories. To clone and initialize them, run:
+
+```
+git submodule update --init
+```
+
+All requirements for these submodules need to be installed and they need to be installed
+themselves as python modules. Run the `dependencies.sh` script that will install
+everything into your virtual environment:
+```
+./dependencies.sh -e
+```
+
+The `-e` option will install the NOMAD-coe dependencies with symbolic links, allowing you
+to change the downloaded dependency code without having to reinstall afterwards.
+
+### Install nomad
+Finally, you can add nomad to the environment itself (including all extras):
+```
+pip install -e .[all]
+```
+
+If pip tries to compile from source and this creates errors, it can be told to prefer binary versions:
+
+```
+pip install -e .[all] --prefer-binary
+```
+
+### Generate GUI artifacts
+The NOMAD GUI requires static artifacts that are generated from the NOMAD Python code.
+```
+nomad dev metainfo > gui/src/metainfo.json
+nomad dev searchQuantities > gui/src/searchQuantities.json
+nomad dev units > gui/src/units.js
+./gitinfo.sh
+```
+
+In addition, you have to do some more steps to prepare your working copy to run all
+the tests. See below.
+
+## Running the infrastructure
+
+### Docker and nomad
+Nomad depends on a set of databases, search engines, and other services. Those
+must run to make use of nomad. We use *docker* and *docker-compose* to create a
+unified environment that is easy to build and to run.
+
+You can use *docker* to run all necessary 3rd-party components and run all nomad
+services manually from your python environment. You can also run nomad itself in docker,
+which brings you closer to the environment that will be used to run nomad in production.
+But using Python directly is often preferred during development, since it allows you
+to change things, debug, and re-run quickly. For development we therefore
+recommend to skip the next step.
+
+### Docker images for nomad
+Nomad currently comprises two services,
+the *worker* (does the actual processing) and the *app*. Both services can be
+run from one image that has the nomad python code and all dependencies installed. This
+is covered by the `Dockerfile` in the root directory
+of the nomad sources. The gui is also served from the *app*, which contains the react-js frontend code.
+
+Before building the image, make sure to execute
+```
+./gitinfo.sh
+```
+This allows the app to present some information about the current git revision without
+having to copy the git repository itself into the docker build context.
+
+### Run necessary 3-rd party services with docker-compose
+
+You can run all containers with:
+```
+cd ops/docker-compose/infrastructure
+docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d mongo elastic rabbitmq
+```
+
+To shut down everything, just `ctrl-c` the running output. If you started everything
+in *daemon* mode (`-d`), use:
+```
+docker-compose down
+```
+
+Usually these services are only used by the nomad containers, but sometimes you also
+need to check something or do some manual steps.
+
+The *docker-compose* setup can be overridden with additional settings. See the documentation section on
+operating NOMAD for more details. The override `docker-compose.override.yml` will
+expose all database ports on the host machine and should be used in development. To use
+it, run docker-compose with `-f docker-compose.yml -f docker-compose.override.yml`.
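For illustration, such an override could look roughly like this sketch. The service names and port numbers here are assumptions based on the usual defaults; the authoritative values are in the actual `docker-compose.override.yml`:

```yaml
# Illustrative override: expose the 3rd-party service ports to the host.
# Service names and ports are examples, not the authoritative configuration.
version: '3'
services:
  mongo:
    ports:
      - 27017:27017
  elastic:
    ports:
      - 9200:9200
  rabbitmq:
    ports:
      - 5672:5672
```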
+
+### ELK (elastic stack)
+
+If you run the ELK stack (and enable logstash in nomad/config.py),
+you can reach Kibana at [localhost:5601](http://localhost:5601).
+The index prefix for logs is `logstash-`. The ELK is only available with the
+`docker-compose.dev-elk.yml` override.
+
+### mongodb and elastic search
+
+You can access mongodb and elastic search via your preferred tools. Just make sure
+to use the right ports (see above).
+
+## Running NOMAD
+
+### API and worker
+
+To simply run a worker with the installed nomad cli, do (from the root)
+```
+nomad admin run worker
+```
+
+To run it directly with celery, do (from the root)
+```
+celery -A nomad.processing worker -l info
+```
+
+You can also run worker and app together:
+```
+nomad admin run appworker
+```
+
+### GUI
+When you run the gui on its own (e.g. with the react dev server below), you also have
+to run the API manually. The *inside docker* API is configured for nginx paths
+and proxies, which are run by the gui container. But you can run the *production* gui
+in docker and the dev server gui in parallel with an API in docker.
+Either use docker, or run:
+```
+cd gui
+yarn
+yarn start
+```
+
+## Running tests
+
+### additional settings and artifacts
+To run the tests, some additional settings and files are necessary that are not part
+of the code base.
+
+First you need to create a `nomad.yaml` with the admin password for the user management
+system:
+```
+keycloak:
+  password: <the-password>
+```
+
+Secondly, you need to provide the `springer.msg` Springer materials database. It can
+be copied from `/nomad/fairdi/db/data/springer.msg` on our servers and should
+be placed at `nomad/normalizing/data/springer.msg`.
+
+Thirdly, you have to provide static files to serve the docs and NOMAD distribution:
+```
+cd docs
+make html
+cd ..
+python setup.py compile
+python setup.py sdist
+cp dist/nomad-lab-*.tar.gz dist/nomad-lab.tar.gz
+```
+
+### run the necessary infrastructure
+You need to have the infrastructure partially running: elastic, rabbitmq.
+The rest should be mocked or provided by the tests. Make sure that you do not run any
+worker, as workers will fight for tasks in the queue.
+```
+cd ops/docker-compose
+docker-compose up -d elastic rabbitmq
+cd ../..
+pytest -svx tests
+```
+
+We use pylint, pycodestyle, and mypy to ensure code quality. To run those:
+```
+nomad dev qa --skip-test
+```
+
+To run all tests and code qa:
+```
+nomad dev qa
+```
+
+This mimics the tests and checks that the GitLab CI/CD will perform.
+
+
+## Set up your IDE
+
+The documentation section on development guidelines details how the code is organized,
+tested, formatted, and documented. To help you meet these guidelines, we recommend to
+use a proper IDE for development and ditch any VIM/Emacs (mal-)practices.
+
+### Visual Studio Code
+
+Here are some VSCode settings that will enable features for linting, some auto formatting,
+a line-length ruler, etc.
+```json
+{
+    "python.venvPath": "${workspaceFolder}/.pyenv",
+    "python.pythonPath": "${workspaceFolder}/.pyenv/bin/python",
+    "git.ignoreLimitWarning": true,
+    "editor.rulers": [90],
+    "editor.renderWhitespace": "all",
+    "editor.tabSize": 4,
+    "[javascript]": {
+        "editor.tabSize": 2
+    },
+    "files.trimTrailingWhitespace": true,
+    "git.enableSmartCommit": true,
+    "eslint.autoFixOnSave": true,
+    "python.linting.pylintArgs": [
+        "--load-plugins=pylint_mongoengine,nomad/metainfo/pylint_plugin",
+    ],
+    "python.linting.pep8Path": "pycodestyle",
+    "python.linting.pep8Enabled": true,
+    "python.linting.pep8Args": ["--ignore=E501,E701"],
+    "python.linting.mypyEnabled": true,
+    "python.linting.mypyArgs": [
+        "--ignore-missing-imports",
+        "--follow-imports=silent",
+        "--no-strict-optional"
+    ],
+    "workbench.colorCustomizations": {
+        "editorError.foreground": "#FF2222",
+        "editorOverviewRuler.errorForeground": "#FF2222",
+        "editorWarning.foreground": "#FF5500",
+        "editorOverviewRuler.warningForeground": "#FF5500",
+        "activityBar.background": "#4D2111",
+        "titleBar.activeBackground": "#6B2E18",
+        "titleBar.activeForeground": "#FDF9F7"
+    },
+    "files.watcherExclude": {
+        "**/.git/objects/**": true,
+        "**/.git/subtree-cache/**": true,
+        "**/node_modules/*/**": true,
+        "**/.pyenv/*/**": true,
+        "**/__pycache__/*/**": true,
+        "**/.mypy_cache/*/**": true,
+        "**/.volumes/*/**": true,
+        "**/docs/.build/*/**": true
+    }
+}
+```
+
+Here are some example launch configs for VSCode:
+
+```json
+{
+  "version": "0.2.0",
+  "configurations": [
+    {
+      "type": "chrome",
+      "request": "launch",
+      "name": "Launch Chrome against localhost",
+      "url": "http://localhost:3000",
+      "webRoot": "${workspaceFolder}/gui"
+    },
+    {
+      "name": "Python: API Flask (0.11.x or later)",
+      "type": "python",
+      "request": "launch",
+      "module": "flask",
+      "env": {
+        "FLASK_APP": "nomad/app/__init__.py"
+      },
+      "args": [
+        "run",
+        "--port",
+        "8000",
+        "--no-debugger",
+        "--no-reload"
+      ]
+    },
+    {
+      "name": "Python: some test",
+      "type": "python",
+      "request": "launch",
+      "cwd": "${workspaceFolder}",
+      "program": "${workspaceFolder}/.pyenv/bin/pytest",
+      "args": [
+        "-sv",
+        "tests/test_cli.py::TestClient::test_mirror"
+      ]
+    },
+    {
+      "name": "Python: Current File",
+      "type": "python",
+      "request": "launch",
+      "program": "${file}"
+    },
+    {
+      "name": "Python: Attach",
+      "type": "python",
+      "request": "attach",
+      "localRoot": "${workspaceFolder}",
+      "remoteRoot": "${workspaceFolder}",
+      "port": 3000,
+      "secret": "my_secret",
+      "host": "localhost"
+    }
+  ]
+}
+```
+
+## Code guidelines
+
+### Design principles
+
+- simple first, complicated only when necessary
+- adopting generic established 3rd party solutions before implementing specific solutions
+- only uni directional dependencies between components/modules, no circles
+- only one language: Python (except, GUI of course)
+
+### Rules
+
+There are some *rules*, or better, strong *guidelines* for writing code. The following
+applies to all python code (and, where applicable, also to JS and other code):
+
+- Use an IDE (e.g. [vscode](https://code.visualstudio.com/)) or otherwise automatically
+  enforce code [formatting and linting](https://code.visualstudio.com/docs/python/linting).
+  Use `nomad qa` before committing. This will run all tests, static type checks, linting, etc.
+
+- There is a style guide to python. Write [pep-8](https://www.python.org/dev/peps/pep-0008/)
+  compliant python code. An exception is the line cap at 79 characters, which can be broken, but keep it around 90.
+
+- Test the public API of each sub-module (i.e. python file).
+
+- Be [pythonic](https://docs.python-guide.org/writing/style/) and watch
+  [this](https://www.youtube.com/watch?v=wf-BqAjZb8M).
+
+- Document any *public* API of each sub-module (e.g. python file). Public meaning API that
+  is exposed to other sub-modules (i.e. other python files).
+
+- Use google [docstrings](http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html).
+
+- Add your doc-strings to the sphinx documentation in `docs`. Use .md, follow the example.
+  Markdown in sphinx is supported via
+  [recommonmark](https://recommonmark.readthedocs.io/en/latest/index.html#autostructify)
+  and [AutoStructify](http://recommonmark.readthedocs.io/en/latest/auto_structify.html).
+
+- The project structure is according to [this guide](https://docs.python-guide.org/writing/structure/).
+  Keep it!
+
+- Write tests for all contributions.
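As a small illustration of the Google docstring style mentioned above (the function itself is made up and not part of the NOMAD API):

```python
def scale(values, factor=2):
    """Scales a list of numbers by a constant factor.

    Args:
        values: A list of numbers to scale.
        factor: The factor to multiply each value with.

    Returns:
        A new list with each input value multiplied by ``factor``.
    """
    return [v * factor for v in values]


print(scale([1, 2, 3]))  # [2, 4, 6]
```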
+
+
+### Enforcing Rules: CI/CD
+
+
+These *guidelines* are partially enforced by CI/CD. As part of CI, all tests are run on all
+branches; furthermore, we run a *linter*, *pep8* checker, and *mypy* (static type checker). You can
+run `nomad qa` to run all these tests and checks before committing.
+
+The CI/CD will run on all refs that do not start with `dev-`. The CI/CD will
+not release or deploy anything automatically, but it can be manually triggered after the
+build and test stage completed successfully.
+
+### Names and identifiers
+
+There is some terminology consistently used in this documentation and the source
+code. Use this terminology for identifiers.
+
+Do not use abbreviations. There are (few) exceptions: `proc` (processing); `exc`, `e` (exception);
+`calc` (calculation), `repo` (repository), `utils` (utilities), and `aux` (auxiliary).
+Other exceptions are `f` for file-like streams and `i` for index running variables.
+Btw., the latter is almost never necessary in python.
+
+Terms:
+
+- upload: A logical unit that comprises one (.zip) file uploaded by a user.
+- calculation: A computation in the sense that it was created by an individual run of a CMS code.
+- raw file: User uploaded files (e.g. part of the uploaded .zip), usually code input or output.
+- upload file/uploaded file: The actual (.zip) file a user uploaded
+- mainfile: The mainfile output file of a CMS code run.
+- aux file: Additional files the user uploaded within an upload.
+- repo entry: Some quantities of a calculation that are used to represent that calculation in the repository.
+- archive data: The normalized data of one calculation in nomad's meta-info-based format.
+
+Throughout nomad, we use different ids. If something
+is called *id*, it is usually a random uuid and has no semantic connection to the entity
+it identifies. If something is called a *hash*, then it is a hash built from the
+entity it identifies. This means either the whole thing or just some properties of
+said entity.
+
+- The most common hash is the `calc_hash`, based on mainfile and auxfile contents.
+- The `upload_id` is a UUID assigned at upload time and never changed afterwards.
+- The `mainfile` is a path within an upload that points to a main code output file.
+  Since the upload directory structure does not change, this uniquely identifies a calc within the upload.
+- The `calc_id` (internal calculation id) is a hash over the `mainfile` and respective
+  `upload_id`. Therefore, each `calc_id` identifies a calc on its own.
+- We often use pairs of `upload_id/calc_id`, which in many contexts allow resolving a calc
+  related file on the filesystem without having to ask a database about it.
+- The `pid` (or `coe_calc_id`) is a sequential integer id.
+- Calculation `handle` or `handle_id` are created based on the `pid`.
+  To create hashes we use :py:func:`nomad.utils.hash`.
+
+
+### Logging
+
+There are three important prerequisites to understand about nomad-FAIRDI's logging:
+
+- All log entries are recorded in a central elastic search database. To make this database
+  useful, log entries must be sensible in size, frequency, meaning, level, and logger name.
+  Therefore, we need to follow some rules when it comes to logging.
+- We use a *structured* logging approach. Instead of encoding all kinds of information
+  in log messages, we use key-value pairs that provide context to a log *event*. In the
+  end all entries are stored as JSON dictionaries with `@timestamp`, `level`,
+  `logger_name`, `event` plus custom context data. Keep events very short, most
+  information goes into the context.
+- We use logging to inform about the state of nomad-FAIRDI, not about user
+  behavior, input, or data. Do not confuse this when determining the log-level for an event.
+  For example, a user providing an invalid upload file should never be an error.
+
+Please follow the following rules when logging:
+
+- If a logger is not already provided, only use
+  :py:func:`nomad.utils.get_logger` to acquire a new logger. Never use the
+  built-in logging directly. These loggers work like the system loggers, but
+  allow you to pass keyword arguments with additional context data. See also
+  the [structlog docs](https://structlog.readthedocs.io/en/stable/).
+- In many contexts, a logger is already provided (e.g. api, processing, parser, normalizer).
+  Such a logger already has context information bound to it. So it is important to
+  use those instead of acquiring your own loggers. Have a look for methods called
+  `get_logger` or attributes called `logger`.
+- Keep events (what usually is called *message*) very short. Examples are: *file uploaded*,
+  *extraction failed*, etc.
+- Structure the keys for context information. When you analyse logs in ELK, you will
+  see that the set of all keys over all log entries can be quite large. Structure your
+  keys to make navigation easier. Use keys like `nomad.proc.parser_version` instead of
+  `parser_version`. Use module names as prefixes.
+- Don't log everything. Try to anticipate how you would use the logs in case of bugs,
+  error scenarios, etc.
+- Don't log sensitive data.
+- Think before logging data (especially dicts, lists, numpy arrays, etc.).
+- Logs should not be abused as a *printf*-style debugging tool.
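Inside NOMAD you should always use the provided loggers or :py:func:`nomad.utils.get_logger`. Purely to illustrate the structured-logging idea (a short event plus key-value context, rendered as one JSON dictionary), here is a self-contained stdlib sketch; it is not NOMAD's actual implementation:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    # Render the short event plus bound context as a single JSON dictionary.
    def format(self, record):
        entry = {
            'level': record.levelname,
            'logger_name': record.name,
            'event': record.getMessage(),
        }
        entry.update(getattr(record, 'context', {}))
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger('nomad.processing')
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Keep the event short; details go into structured, prefixed context keys.
logger.info('file uploaded', extra={'context': {
    'nomad.upload_id': 'some-upload-id',
    'nomad.proc.parser_version': '1.0',
}})
```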
+
+The following keys are used in the final logs that are piped to Logstash.
+Notice that the key name is automatically formed by a separate formatter and
+may differ from the one used in the actual log call.
+
+Keys that are autogenerated for all logs:
+
+ - `@timestamp`: Timestamp for the log
+ - `@version`: Version of the logger
+ - `host`: The host name from which the log originated
+ - `path`: Path of the module from which the log was created
+ - `tags`: Tags for this log
+ - `type`: The *message_type* as set in the LogstashFormatter
+ - `level`: The log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`
+ - `logger_name`: Name of the logger
+ - `nomad.service`: The service name as configured in `config.py`
+ - `nomad.release`: The release name as configured in `config.py`
+
+Keys that are present for events related to processing an entry:
+
+ - `nomad.upload_id`: The id of the currently processed upload
+ - `nomad.calc_id`: The id of the currently processed entry
+ - `nomad.mainfile`: The mainfile of the currently processed entry
+
+Keys that are present for events related to exceptions:
+
+ - `exc_info`: Stores the full python exception that was encountered. All
+   uncaught exceptions will be stored automatically here.
+ - `digest`: If an exception was raised, the last 256 characters of the message
+   are stored automatically into this key. If you wish to search for exceptions
+   in Kibana, you will want to use this value as it will be indexed unlike the
+   full exception object.
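Putting these keys together, a processing-related log entry might look roughly like this (all values are made up):

```json
{
  "@timestamp": "2020-12-02T11:00:52.000Z",
  "level": "INFO",
  "logger_name": "nomad.processing",
  "event": "parsing completed",
  "nomad.service": "unknown nomad service",
  "nomad.release": "devel",
  "nomad.upload_id": "some-upload-id",
  "nomad.calc_id": "some-calc-id",
  "nomad.mainfile": "some/dir/code.out"
}
```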
+
+
+### Copyright Notices
+
+We follow this [recommendation](https://www.linuxfoundation.org/blog/2020/01/copyright-notices-in-open-source-software-projects/)
+of the Linux Foundation for the copyright notice that is placed on top of each source
+code file.
+
+It is intended to provide a broad generic statement that allows all authors/contributors
+of the NOMAD project to claim their copyright, independent of their organization or
+individual ownership.
+
+You can simply copy the notice from another file. From time to time we can use a tool
+like [licenseheaders](https://pypi.org/project/licenseheaders/) to ensure correct
+notices. In addition, we keep a purely informative AUTHORS file.
+
+
+## Git/GitLab
+
+### Branches and clean version history
+
+The `master` branch of our repository is *protected*. You must not (even if you have
+the rights) commit to it directly. The `master` branch references the latest official
+release (i.e. what the current NOMAD runs on). The current development is represented by
+*version* branches, named `vx.x.x`. Usually there are two or more of these branches,
+representing the development on *minor/bugfix* versions and the next *major* version(s).
+Ideally these *version* branches are also not manually pushed to.
+
+Instead, you develop
+on *feature* branches. These are branches dedicated to implementing a single feature.
+They are short lived and only exist while that feature is being worked on.
+
+The lifecycle of a *feature* branch should look like this:
+
+- create the *feature* branch from the last commit on the respective *version* branch that passes CI
+
+- do your work and push until you are satisfied and the CI passes
+
+- create a merge request on GitLab
+
+- discuss the merge request on GitLab
+
+- continue to work (with the open merge request) until all issues from the discussion are resolved
+
+- the maintainer performs the merge and the *feature* branch gets deleted
+
+While working on a feature, there are certain practices that will help us to create
+a clean history with coherent commits, where each commit stands on its own.
+
+```
+  git commit --amend
+```
+
+If you committed something to your own feature branch and then realize via CI that you have
+some tiny error in it that you need to fix, try to amend the fix to the last commit.
+This avoids unnecessary tiny commits and fosters more coherent single commits. With *amend*
+you are basically adding changes to the last commit, i.e. editing the last commit. If
+you push, you need to force it: `git push origin feature-branch --force-with-lease`. So be careful, and
+only use this on your own branches.
+
+```
+  git rebase <version-branch>
+```
+
+Let's assume you work on a bigger feature that takes more time. You might want to merge
+the version branch into your feature branch from time to time to get the recent changes.
+In these cases, use rebase and not merge. Rebase puts your branch commits in front of the
+merged commits instead of creating a new commit with two ancestors. It basically moves the
+point where you initially branched away from the version branch to the current position in
+the version branch. This will avoid merges, merge commits, and generally leave us with a
+more consistent history. You can also rebase before creating a merge request, basically
+allowing for no-op merges. Ideally the only real merges that we ever have are between
+version branches.
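You can try the rebase workflow safely in a throw-away repository; the branch names below are just examples:

```shell
set -e
dir=$(mktemp -d) && cd "$dir" && git init -q .
git config user.email dev@example.com && git config user.name Dev
git checkout -qb v0.9.1                       # stand-in for a version branch
echo base > base.txt && git add . && git commit -qm 'base work'
git checkout -qb my-feature                   # feature branch off the version branch
echo feat > feat.txt && git add . && git commit -qm 'feature work'
git checkout -q v0.9.1                        # version branch moves on meanwhile
echo more > more.txt && git add . && git commit -qm 'more version work'
git checkout -q my-feature
git rebase -q v0.9.1                          # replay feature commits on top
git log --oneline                             # linear history, no merge commit
```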
+
+```
+  git merge --squash <other-branch>
+```
+
+When you need multiple branches to implement a feature and merge between them, try to
+use *squash*. Squashing puts all commits of the merged branch into a single commit.
+It allows you to have many commits and then squash them into one. This is useful
+if these commits were just made for synchronization between workstations, or because
+you needed a save point due to unexpected errors in CI/CD, etc. Again, the goal is to have
+coherent commits, where each commit makes sense on its own.
+
+Often a feature is also represented by an *issue* on GitLab. Please mention the respective
+issues in your commits by adding the issue id at the end of the commit message: *My message. #123*.
+
+We tag releases with `vX.X.X` according to the regular semantic versioning practices.
+After releasing and tagging the *version* branch is removed. Do not confuse tags with *version* branches.
+Remember that tags and branches are both Git references and you can accidentally pull/push/checkout a tag.
+
+The main NOMAD GitLab project (`nomad-fair`) uses Git submodules to maintain its
+parsers and other dependencies. All these submodules are placed in the `/dependencies`
+directory. There are helper scripts to install (`./dependencies.sh`) and to
+commit changes to all submodules (`./dependencies-git.sh`). After merging or checking out,
+you have to make sure that the modules are updated to not accidentally commit old
+submodule commits again. Usually you do the following to check if you really have a
+clean working directory.
+
+```
+  git checkout something-with-changes
+  git submodule update
+  git status
+```
+
+### Submodules
+
+We currently use git submodules to manage NOMAD internal dependencies (e.g. parsers).
+All dependencies are python packages and installed via pip into your python environment.
+
+This allows us to target (e.g. install) individual commits. More importantly, we can address
+commit hashes to identify exact parser/normalizer versions. On the downside, common functions
+for all dependencies (e.g. the python-common package, or nomad_meta_info) cannot be part
+of the nomad-FAIRDI project. In general, it is hard to simultaneously develop nomad-FAIRDI
+and NOMAD-coe dependencies.
+
+Another approach is to integrate the NOMAD-coe sources with nomad-FAIRDI. The lacking
+availability of individual commit hashes could be replaced with hashes of source-code
+files.
+
+We use the `master` branch on all dependencies. Of course feature branches can be used on
+dependencies to manage work in progress.
diff --git a/docs/index.rst b/docs/index.rst
index 910deca675c689954ced9d65b978a0ea00d005fa..96c39c3b9b91194323670c19f5ddc0501de42260 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -5,18 +5,64 @@ This project is a prototype for the continuation of the original NOMAD-coe softw
 and infrastructure with a simplyfied architecture and consolidated code base.
 
 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1
 
    introduction.md
    upload.rst
-   api_tutorial.md
+   api.md
    client/client.rst
    metainfo.rst
    archive.rst
+   developers.md
+   parser.md
+   normalizer.rst
+   oasis.rst
    ops/ops.rst
-   dev/setup.md
-   dev/dev_guidelines.rst
-   dev/parser_tutorial.md
-   dev/normalizer.rst
-   api.rst
-   reference.rst
+   api_reference.rst
+   python_reference.rst
+
+.. # Introduction
+..    # Repository, Encyclopedia, etc.
+..    # Interfaces: GUIs and APIs
+..    # Technical architecture
+..    # History
+.. # How to upload data
+..    # Prepare your files
+..    # Upload your files
+..    # Publish your files
+.. # How to use NOMAD APIs
+..    # The different APIs
+..    # curl
+..    # requests
+..    # bravado
+.. # NOMAD's Python library
+..    # Getting started
+..    # Command line interface (CLI)
+..    # Run parsers and normalizers
+.. # Data format (Metainfo)
+..    # Introduction
+..    # Browsing the Metainfo
+..    # Getting started
+..    # Quantities
+..    # Sections
+..    # Reference
+.. # Data access (Archive)
+..    # Introduction
+..    # Browsing the Archive
+..    # Getting started
+.. # Develop NOMAD
+..    # Getting started
+..    # Running NOMAD
+..    # Running tests
+..    # Style guide
+..    # VS code
+..    # GIT
+.. # How to write a parser
+..    # Getting started
+..    # Text files
+..    # XML parser
+.. # How to write a normalizer
+.. # Operating a NOMAD OASIS
+.. # Operating NOMAD (with k8s)
+.. # API Reference
+.. # Python Reference
diff --git a/docs/metainfo.rst b/docs/metainfo.rst
index 4cc884d3723de14a3d71ebe6c858fadf91af2f6c..14542b38c5c1747d9e3f756b5187d104786fcb75 100644
--- a/docs/metainfo.rst
+++ b/docs/metainfo.rst
@@ -1,7 +1,7 @@
 .. _metainfo-label:
 
-NOMAD Metainfo
-==============
+Data schema (Metainfo)
+======================
 
 Introduction
 ------------
diff --git a/docs/dev/normalizer.rst b/docs/normalizer.rst
similarity index 100%
rename from docs/dev/normalizer.rst
rename to docs/normalizer.rst
diff --git a/docs/oasis.rst b/docs/oasis.rst
new file mode 100644
index 0000000000000000000000000000000000000000..3e31876359d638c475a58973dbfbd27a3c7dd32b
--- /dev/null
+++ b/docs/oasis.rst
@@ -0,0 +1 @@
+.. mdinclude:: ../ops/docker-compose/nomad-oasis/README.md
diff --git a/docs/ops/oasis.rst b/docs/ops/oasis.rst
deleted file mode 100644
index 217ce1a9f2ef3eb5c20e22e8d85e596431bedda4..0000000000000000000000000000000000000000
--- a/docs/ops/oasis.rst
+++ /dev/null
@@ -1 +0,0 @@
-.. mdinclude:: ../../ops/docker-compose/nomad-oasis/README.md
diff --git a/docs/ops/ops.rst b/docs/ops/ops.rst
index 9f3025fbf1d75cadb672c1c49d87d40957b3ea5c..01612ab624f1dcb06ad933b33ba99bf0c4e0c693 100644
--- a/docs/ops/ops.rst
+++ b/docs/ops/ops.rst
@@ -1,5 +1,5 @@
-Operating NOMAD/OASIS
-#####################
+Operating NOMAD
+###############
 
 .. toctree::
    :maxdepth: 2
@@ -7,4 +7,3 @@ Operating NOMAD/OASIS
    depl_docker.rst
    depl_helm.rst
    depl_images.rst
-   oasis.rst
diff --git a/docs/dev/parser_tutorial.md b/docs/parser.md
similarity index 61%
rename from docs/dev/parser_tutorial.md
rename to docs/parser.md
index 48ff903db5872836346625ffa617ef7afa011a8c..8611018765e695f22245be3be80535219fbc8152 100644
--- a/docs/dev/parser_tutorial.md
+++ b/docs/parser.md
@@ -1,25 +1,351 @@
 # How to write a parser
 
-## The parser project
+NOMAD uses parsers to convert raw code input and output files into NOMAD's common
+Archive format. This is documentation on how to develop such a parser.
 
-First copy an existing parser project as a template. E.g. the vasp parser. Change
-the parser metadata in ``setup.py``. You can already install it with ``-e``: from
-the main dir of the new parser:
+## Getting started
 
+Let's assume we need to write a new parser from scratch.
+
+First, we need to install the *nomad-lab* Python package to get the necessary libraries:
 ```
-pip install -e .
+pip install nomad-lab
+```
+
+We prepared an example parser project that you can work with.
 ```
+git clone ... --branch hello-word
+```
+
+Alternatively, you can fork the example project on GitHub to create your own parser. Clone
+your fork accordingly.
 
 The project structure should be
 ```
-myparser/myparser/__init__.py
-myparser/test/example_file.out
-myparser/LICENSE.txt
-myparser/README.txt
-myparser/setup.py
+example/exampleparser/__init__.py
+example/exampleparser/__main__.py
+example/exampleparser/metainfo.py
+example/exampleparser/parser.py
+example/LICENSE.txt
+example/README.md
+example/setup.py
+```
+
+Next, you should install your new parser with pip. The `-e` parameter installs the parser
+in *development mode*. This means you can change the sources without the need to re-install.
+```
+cd example
+pip install -e .
+```
+
+The main code file `exampleparser/parser.py` should look like this:
+```python
+class ExampleParser(FairdiParser):
+    def __init__(self):
+        super().__init__(name='parsers/example', code_name='EXAMPLE')
+
+    def run(self, mainfile: str, archive: EntryArchive, logger):
+        # Log a hello world, just to get us started. TODO remove from an actual parser.
+        logger.info('Hello World')
+
+        run = archive.m_create(Run)
+        run.program_name = 'EXAMPLE'
+```
+
+A parser is a simple program with a single class in it. The base class `FairdiParser`
+provides the necessary interface to NOMAD. We provide some basic information
+about our parser in the constructor. The *main* function `run` takes a filepath and an
+empty archive as input. Now it's up to you to open the given file and populate the
+given archive accordingly. In this plain *hello world*, we simply create a log entry,
+populate the archive with a *root section* `Run`, and set the program name to `EXAMPLE`.
+
+You can run the parser with the included `__main__.py`, which takes the path of the file
+to parse as its argument:
+```
+python -m exampleparser tests/data/example.out
+```
+
+The output should show the log entry and the minimal archive with one `section_run` and
+the respective `program_name`.
+```
+INFO     root                 2020-12-02T11:00:52 Hello World
+  - nomad.release: devel
+  - nomad.service: unknown nomad service
+{
+  "section_run": [
+    {
+      "program_name": "EXAMPLE"
+    }
+  ]
+}
+```
+
+## Parsing test files
+
+Let's do some actual parsing. Here we demonstrate how to parse ASCII files with some
+structure information in them, as typically produced by materials science codes.
+
+On the `master` branch of the example project, we have a more 'realistic' example:
+```
+git checkout master
+```
+
+This example imagines a potential code output that looks like this (`tests/data/example.out`):
+```
+2020/05/15
+               *** super_code v2 ***
+
+system 1
+--------
+sites: H(1.23, 0, 0), H(-1.23, 0, 0), O(0, 0.33, 0)
+latice: (0, 0, 0), (1, 0, 0), (1, 1, 0)
+energy: 1.29372
+
+*** This was done with magic source                                ***
+***                                x°42                            ***
+
+
+system 2
+--------
+sites: H(1.23, 0, 0), H(-1.23, 0, 0), O(0, 0.33, 0)
+cell: (0, 0, 0), (1, 0, 0), (1, 1, 0)
+energy: 1.29372
+```
+
+There is some general information at the top, followed by a list of simulated systems with
+sites and a lattice describing crystal structures, a computed energy value, and an example
+of a code-specific quantity from a 'magic source'.
+
+In order to convert the information from this file into the archive, we first have to
+parse the necessary quantities: the date, system, energy, etc. The *nomad-lab* Python
+package provides a `text_parser` module for declarative text file parsing. You can
+define text file parsers like this:
+```python
+def str_to_sites(string):
+    sym, pos = string.split('(')
+    pos = np.array(pos.split(')')[0].split(',')[:3], dtype=float)
+    return sym, pos
+
+
+calculation_parser = UnstructuredTextFileParser(quantities=[
+    Quantity('sites', r'([A-Z]\([\d\.\, \-]+\))', str_operation=str_to_sites),
+    Quantity(
+        System.lattice_vectors,
+        r'(?:latice|cell): \((\d)\, (\d), (\d)\)\,?\s*\((\d)\, (\d), (\d)\)\,?\s*\((\d)\, (\d), (\d)\)\,?\s*',
+        repeats=False),
+    Quantity('energy', r'energy: (\d\.\d+)'),
+    Quantity('magic_source', r'done with magic source\s*\*{3}\s*\*{3}\s*[^\d]*(\d+)', repeats=False)])
+
+mainfile_parser = UnstructuredTextFileParser(quantities=[
+    Quantity('date', r'(\d\d\d\d\/\d\d\/\d\d)', repeats=False),
+    Quantity('program_version', r'super\_code\s*v(\d+)\s*', repeats=False),
+    Quantity(
+        'calculation', r'\s*system \d+([\s\S]+?energy: [\d\.]+)([\s\S]+\*\*\*)*',
+        sub_parser=calculation_parser,
+        repeats=True)
+])
+```
+
+The quantities to be parsed can be specified as a list of `Quantity` objects, each with a
+name and a *regular expression* (re) pattern. The matched value should be enclosed in one
+or more groups denoted by `(...)`.
+By default, the parser uses the findall method of `re`, hence overlap
+between matches is not tolerated. If overlap cannot be avoided, one should switch to the
+finditer method by passing *findall=False* to the parser. Multiple
+matches for the quantity are returned if *repeats=True* (default). The name, data type,
+shape, and unit for the quantity can also be initialized by passing a metainfo Quantity.
+An external function *str_operation* can also be passed to perform more specific
+string operations on the matched value. A local parsing of a matched block can be carried
+out by nesting a *sub_parser*. This is also an instance of `UnstructuredTextFileParser`
+with a list of quantities to parse. To access a parsed quantity, one can use the *get*
+method.
+
+We can apply these parser definitions like this:
+```python
+mainfile_parser.mainfile = mainfile
+mainfile_parser.parse()
+```
+
+This will populate the `mainfile_parser` object with the parsed data, which can be accessed
+like a Python dict with quantity names as keys:
+```python
+run = archive.m_create(Run)
+run.program_name = 'super_code'
+run.program_version = str(mainfile_parser.get('program_version'))
+date = datetime.datetime.strptime(
+    mainfile_parser.get('date'),
+    '%Y/%m/%d') - datetime.datetime(1970, 1, 1)
+run.program_compilation_datetime = date.total_seconds()
+
+for calculation in mainfile_parser.get('calculation'):
+    system = run.m_create(System)
+
+    system.lattice_vectors = calculation.get('lattice_vectors')
+    sites = calculation.get('sites')
+    system.atom_labels = [site[0] for site in sites]
+    system.atom_positions = [site[1] for site in sites]
+
+    scc = run.m_create(SCC)
+    scc.single_configuration_calculation_to_system_ref = system
+    scc.energy_total = calculation.get('energy') * units.eV
+    magic_source = calculation.get('magic_source')
+    if magic_source is not None:
+        scc.x_example_magic_value = magic_source
+```
+
+You can run the parser on the given example file as before:
+```
+python -m exampleparser tests/data/example.out
+```
+
+Now you should get a more comprehensive archive with all the provided information from
+the `example.out` file.
+
+**TODO: more examples and explanations for unit conversion, logging, types, scalars,
+vectors, multi-line matrices.**
+
+## Extending the Metainfo
+The NOMAD Metainfo defines the schema of each archive. There are pre-defined schemas for
+all domains (e.g. `common_dft.py` for electronic-structure codes; `common_ems.py` for
+experiment data, etc.). The sections `Run`, `System`, and single configuration calculation (`SCC`)
+in the example are taken from `common_dft.py`. While this covers most data that is usually
+provided in code input/output files, some data is typically format specific and only applies
+to a certain code or method. For these cases, the Metainfo can be extended like
+this (`exampleparser/metainfo.py`):
+```python
+# We extend the existing common definition of a section "single configuration calculation"
+class ExampleSCC(SCC):
+    # We alter the default base class behavior to add all definitions to the existing
+    # base class instead of inheriting from the base class
+    m_def = Section(extends_base_section=True)
+
+    # We define an additional example quantity. Use the prefix x_<parsername>_ to denote
+    # non common quantities.
+    x_example_magic_value = Quantity(type=int, description='The magic value from a magic source.')
+```
+
+## Testing a parser
+
+Until now, we have simply run our parser on some example data and manually inspected the
+output. To improve parser quality and ease further development, you should get into the
+habit of testing the parser.
+
+We use the Python unit test framework *pytest*:
+```
+pip install pytest
+```
+
+A typical test would take one example file, parse it, and make assertions about
+the output.
+```python
+def test_example():
+    parser = ExampleParser()
+    archive = EntryArchive()
+    parser.run('tests/data/example.out', archive, logging)
+
+    run = archive.section_run[0]
+    assert len(run.section_system) == 2
+    assert len(run.section_single_configuration_calculation) == 2
+    assert run.section_single_configuration_calculation[0].x_example_magic_value == 42
+```
+
+You can run all tests in the `tests` directory like this:
+```
+pytest -svx tests
+```
+
+You should define individual test cases with example files that demonstrate certain
+features of the underlying code/format.
+
+## Structured data files with numpy
+**TODO: examples**
+
+The `DataTextFileParser` uses the numpy.loadtxt function to load a structured data file.
+The loaded data can be accessed through the *data* property.
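
A minimal sketch of the underlying mechanism in plain numpy (the file content and column
layout here are made up for illustration; `DataTextFileParser` wraps this call and exposes
the resulting array):

```python
import io

import numpy as np

# Made-up whitespace-separated data file; numpy.loadtxt skips the
# comment line and returns a 2x3 float array.
content = io.StringIO(
    '# x    y    z\n'
    '1.0  2.0  3.0\n'
    '4.0  5.0  6.0\n')

data = np.loadtxt(content)
print(data.shape)  # (2, 3)
```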
+
+## XML Parser
+
+**TODO: examples**
+
+The `XMLParser` uses the ElementTree module to parse an XML file. The parser's parse
+method takes an xpath-style key to access individual quantities. By default, automatic
+data type conversion is performed, which can be switched off by setting *convert=False*.
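
As a minimal sketch of the underlying mechanism (the XML content here is made up for
illustration), ElementTree already supports the xpath-style access the parser builds on:

```python
import xml.etree.ElementTree as ET

# Made-up XML snippet resembling a code output file.
root = ET.fromstring('<run><system><energy>1.29372</energy></system></run>')

# An xpath-style key selects an individual quantity; XMLParser
# additionally converts the matched string to a numeric type.
energy = float(root.findtext('./system/energy'))
print(energy)  # 1.29372
```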
+
+## Add the parser to NOMAD
+
+NOMAD has to manage multiple parsers and during processing needs to decide which parsers
+to run on which files. To manage parsers, a bit more information about each parser is necessary.
+
+Consider the example, where we use the `FairdiParser` constructor to add additional
+attributes that determine which files the parser is intended for:
+```python
+class ExampleParser(FairdiParser):
+    def __init__(self):
+        super().__init__(
+            name='parsers/example', code_name='EXAMPLE', code_homepage='https://www.example.eu/',
+            mainfile_mime_re=r'(application/.*)|(text/.*)',
+            mainfile_contents_re=(r'^\s*#\s*This is example output'))
+```
+
+- `mainfile_mime_re`: A regular expression on the mime type of files. The parser is only
+run on files with matching mime type. The mime-type is *guessed* with libmagic.
+- `mainfile_contents_re`: A regular expression that is applied to the first 4k of a file.
+The parser is only run on files where this matches.
+
+The NOMAD infrastructure keeps a list of parser objects (in `nomad/parsing/parsers.py::parsers`).
+These parsers are considered in the order they appear in the list. The first matching parser
+is used to parse a given file.
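
The selection logic amounts to a simple first-match loop. A hypothetical sketch (the
`matches` callables stand in for the real mime-type and file-content checks):

```python
# Hypothetical first-match selection; `matches` stands in for the real
# mime-type and file-content checks performed by NOMAD.
def match_parser(mainfile, parsers):
    for name, matches in parsers:
        if matches(mainfile):
            return name
    return None

parsers = [
    ('parsers/example', lambda f: f.endswith('.out')),
    ('parsers/fallback', lambda f: True),
]

print(match_parser('tests/data/example.out', parsers))  # parsers/example
```

Because the first match wins, more specific parsers must appear before more generic ones in the list.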
+
+While each parser project should provide its own tests, a single example file should be
+added to the infrastructure parser tests (`tests/parsing/test_parsing.py`).
+
+Once the parser is added, it also becomes available through the command line interface, and
+normalizers are applied as well:
+```
+nomad parser tests/data/example.out
+```
+
+## Developing an existing parser
+To develop an existing parser, you should install all parsers:
+```
+pip install nomad-lab[parsing]
+```
+
+Clone the parser project on top:
+```
+git clone <parser-project-url>
+cd <parser-dir>
+```
+
+Either remove the installed parser and pip install the cloned version:
+```
+rm -rf <path-to-your-python-env>/lib/python3.7/site-packages/<parser-module-name>
+pip install -e .
+```
+
+Or use `PYTHONPATH` so that the cloned code takes precedence over the installed code:
+```
+PYTHONPATH=. nomad parser <path-to-example-file>
 ```
 
-## Basic skeleton
+Alternatively, you can also do a full developer setup of the NOMAD infrastructure and
+develop the parser there.
+
+## Legacy NOMAD-CoE parsers
+During the original NOMAD-CoE project, the parsing infrastructure was different. We still
+have old parsers from this time in NOMAD. The key differences are:
+
+- different inconsistent interfaces with NOMAD, mostly via `ParserInterface` or
+  `mainFunction`
+- more complex internal structure with interface, parser, and context classes
+- no direct archive access, instead a *backend* object that takes events like
+  *open-section*, *set-value*, *close-section*
+- *backend* communication follows a *streaming* metaphor; once an *event* is sent, the
+  information cannot be altered anymore, and the order of parsing is tied to the order of
+  information in the archive
+- close coupling between parsing and archive writing declared in `SimpleMatcher` objects
+  that cover parsing and archive population at the same time
 
 The following is an example of a simple, almost empty skeleton for a parser. The
 nomad@FAIRDI infrastructure will use the ``ParserInterface`` implementation to use
@@ -99,7 +425,7 @@ or if you installed your new parser in your environment
 python -m myparser test/example_file.out
 ```
 
-## Simple matcher
+### Simple matcher
 The central thing in a parser using this infrastructure is the SimpleMatcher class.
 It is used to define objects that match some lines of a file and extract some values out of them.
 
@@ -180,7 +506,7 @@ of (metaInfoName: value), unit and type conversion are not applied. Can be also
 * With *startReAction* you can provide a callback function that will get called when the *startReStr* is matched. The callback signature is startReAction(backend, groups). This callback can directly access any matched groups from the regex with argument *groups*. You can use this to e.g. format a line with datetime information into a standard form and push it directly to a corresponding metainfo, or do other more advanced tasks that are not possible e.g. with fixedStartValues.
 * Often the parsing is complex enough that you want to create an object to keep track of it and cache various flags, open files,..., to manage the various parsing that is performed. You can store this object as *superContext* of the parser.
 
-## Nomad meta-info
+### Nomad meta-info
 
 The meta-info has a central role in the parsing.
 It represents the conceptual model of the extracted data, every piece of information
@@ -235,7 +561,7 @@ that one wants to use for interpolation. Currently there are no methods to answe
 and one simply has to build the datasets himself (often with scripts) to be sure that the
 calculations are really consistent.
 
-## What to parse
+### What to parse
 
 You should try to parse all input parameters and all lines of the output.
 
 But the other point, which is just as important, is to be able to detect when a parser
 fails and needs improvement. This is the only way to keep the parser up to date with
 respect to the data that is in the repository.
 
-## Unit conversion
+### Unit conversion
 
 The code-independent meta-info should use SI units; codes typically do not.
 As shown in the examples, matchers can automatically convert the value matched by the
@@ -282,14 +608,14 @@ register_userdefined_quantity("usrMyCodeLength", "angstrom")
 this call *needs* to be done before any use of that unit. The unit can then be used just
 like all others: add __usrMyCodeLength to the group name.
 
-## Backend
+### Backend
 The backend is an object that stores parsed data according to its meta-info. The
 class :py:class:`nomad.parsing.Backend` provides the basic backend interface.
 It allows one to open and close sections, and to add values, arrays, and values to arrays.
 In NOMAD-coe multiple backend implementations existed to facilitate the communication of
 python parsers with the scala infrastructure, including caching and streaming.
 
-## Triggers
+### Triggers
 
 When a section is closed a function can be called with the backend, gIndex and the section
 object (that might have cached values). This is useful to perform transformations on
@@ -305,14 +631,14 @@ def onClose_section_scf_iteration(self, backend, gIndex, section):
 
 defines a trigger called every time an scf iteration section is closed.
 
-## Logging
+### Logging
 You can use the standard python logging module in parsers. Be aware that all logging
 is stored in a database for analysis. Do not abuse the logging.
 
-## Testing and Debugging
+### Testing and Debugging
 You are writing a Python program. You know what to do.
 
-## Adding the parser to nomad@FAIRDI
+### Adding the parser to nomad@FAIRDI
 
 First, you add your parser to the dependencies. Put it into the dependencies folder, then:
 ```
@@ -342,71 +668,3 @@ parser_examples = [
     ('parsers/vaspoutcar', 'tests/data/parsers/vasp_outcar/OUTCAR'),
 ]
 ```
-
-## FAIRDI parsers
-The new fairdi parsers avoid the use of a backend and instead make use of the new metainfo
-sections. The project structure is the same as above with the addition of a `metainfo`
-folder
-
-```
-myparser/myparser/metainfo
-```
-This contains a file containing the definitions and an `__init__.py`. One should refer
-to `nomad.metainfo.example.py` for a guide in writing the metainfo definitions.
-Consequently, the parser class implementation is modified as in the following example.
-
-```python
-import json
-
-from .metainfo import m_env
-from nomad.parsing.parser import MatchingParser
-from nomad.datamodel.metainfo.common_experimental import section_experiment as msection_experiment
-from nomad.datamodel.metainfo.common_experimental import section_data as msection_data
-from nomad.datamodel.metainfo.general_experimental_method import section_method as msection_method
-from nomad.datamodel.metainfo.general_experimental_sample import section_sample as msection_sample
-
-
-class ExampleParser(MatchingParser):
-    def __init__(self):
-        super().__init__(
-            name='parsers/example', code_name='example', code_homepage='https://github.com/example/example',
-            domain='ems', mainfile_mime_re=r'(application/json)|(text/.*)', mainfile_name_re=(r'.*.example')
-        )
-
-    def run(self, filepath, logger=None):
-        self._metainfo_env = m_env
-
-        with open(filepath, 'rt') as f:
-            data = json.load(f)
-
-        section_experiment = msection_experiment()
-
-        # Read general tool environment details
-        section_experiment.experiment_location = data.get('experiment_location')
-        section_experiment.experiment_facility_institution = data.get('experiment_facility_institution')
-
-        # Read data parameters
-        section_data = section_experiment.m_create(msection_data)
-        section_data.data_repository_name = data.get('data_repository_name')
-        section_data.data_preview_url = data.get('data_repository_url')
-
-        # Read parameters related to method
-        section_method = section_experiment.m_create(msection_method)
-        section_method.experiment_method_name = data.get('experiment_method')
-        section_method.probing_method = 'electric pulsing'
-
-        # Read parameters related to sample
-        section_sample = section_experiment.m_create(msection_sample)
-        section_sample.sample_description = data.get('specimen_description')
-        section_sample.sample_microstructure = data.get('specimen_microstructure')
-        section_sample.sample_constituents = data.get('specimen_constitution')
-
-        return section_experiment
-```
-The parser extends the ``MatchingParser`` class which already implements the determination
-of the necessary file for parsing. The main difference to the old framework is the absense
-the opening and closing of sections. One only needs to create a section which can be
-accessed at any point in the code. The ``run`` method should return the root section.
-
-Lastly, one should add an instance of the parser class in the list of parsers at
-``nomad.parsing``.
\ No newline at end of file
diff --git a/docs/reference.rst b/docs/python_reference.rst
similarity index 100%
rename from docs/reference.rst
rename to docs/python_reference.rst
diff --git a/docs/upload.rst b/docs/upload.rst
index abc9cb18f05348eb65435def29efa41650b99476..1b4b931c1326333cefa781baf67232f376ec05e5 100644
--- a/docs/upload.rst
+++ b/docs/upload.rst
@@ -1,6 +1,6 @@
-======================================
-Uploading Data to the NOMAD Repository
-======================================
+==================
+How to upload data
+==================
 
 To contribute your data to the repository, please, login to our `upload page <../gui/uploads>`_
 (you need to register first, if you do not have a NOMAD account yet).
diff --git a/nomad/parsing/file_parser/README.md b/nomad/parsing/file_parser/README.md
deleted file mode 100644
index fb91896971e25d76d4521051695e4fc7c75940b5..0000000000000000000000000000000000000000
--- a/nomad/parsing/file_parser/README.md
+++ /dev/null
@@ -1,146 +0,0 @@
-# NOMAD file parsing module
-
-The parsing module consists of the `UnstructuredTextFileParser`, `DataTextFileParser`
-and `XMLParser` classes to enable the parsing of unstructured text, structured data text,
-and xml files, respectively. These classes are based on the FileParser class which
-provides the common methods for file handling, and querying the parsed results.
-
-## UnstructuredTextFileParser
-
-The most common type of file that are parsed in NOMAD are unstructured text files which
-can be handled using the UnstructuredTextFileParser. The parser uses the `re` module to
-match a given pattern for a quantity in the text file. To illustrate the use of this parser,
-let us consider a file `super_code.out` with the following contents:
-
-```
-2020/05/15
-               *** super_code v2 ***
-
-system 1
---------
-sites: H(1.23, 0, 0), H(-1.23, 0, 0), O(0, 0.33, 0)
-latice: (0, 0, 0), (1, 0, 0), (1, 1, 0)
-energy: 1.29372
-
-*** This was done with magic source                                ***
-***                                x°42                            ***
-
-
-system 2
---------
-sites: H(1.23, 0, 0), H(-1.23, 0, 0), O(0, 0.33, 0)
-cell: (0, 0, 0), (1, 0, 0), (1, 1, 0)
-energy: 1.29372
-```
-
-In order to create a nomad archive from this file, we first have to parse the necessary
-quantities which includes the date, system, energy, etc. The following python code
-illustrates how can this be achieved. Note that we will be using *parser* to refer to the
-file parser and to the code parser that writes the archive.
-
-```python
-import datetime
-import numpy as np
-
-from nomad.parsing.file_parser import UnstructuredTextFileParser, Quantity
-from nomad.datamodel import EntryArchive
-from nomad.datamodel.metainfo.public import section_run, section_system, section_single_configuration_calculation
-
-p = UnstructuredTextFileParser()
-
-def str_to_sites(string):
-    sym, pos = string.split('(')
-    pos = np.array(pos.split(')')[0].split(',')[:3], dtype=float)
-    return sym, pos
-
-q_system = Quantity(
-    'system', r'\s*system \d+([\s\S]+?energy: [\d\.]+)([\s\S]+\*\*\*)*',
-    sub_parser=UnstructuredTextFileParser(quantities=[
-        Quantity(
-            'sites', r'([A-Z]\([\d\.\, \-]+\))',
-            str_operation=str_to_sites),
-        Quantity(
-            section_system.lattice_vectors,
-            r'(?:latice|cell): \((\d)\, (\d), (\d)\)\,?\s*\((\d)\, (\d), (\d)\)\,?\s*\((\d)\, (\d), (\d)\)\,?\s*',
-            repeats=False),
-        Quantity(
-            'energy', r'energy: (\d\.\d+)'),
-        Quantity(
-            'magic_source', r'done with magic source\s*\*{3}\s*\*{3}\s*([\S]+)',
-            repeats=False)]),
-    repeats=True)
-
-quantities = [
-        Quantity('date', r'(\d\d\d\d\/\d\d\/\d\d)', repeats=False),
-        Quantity('program_version', r'super\_code\s*v(\d+)\s*', repeats=False),
-        q_system]
-
-p.quantities = quantities
-# this returns the energy for system 2
-p.system[1].get('energy', unit='hartree')
-```
-
-The quantities to be parsed can be specified as a list of `Quantity` objects with a name
-and a re pattern. The matched value should be enclosed in a group(s). By default,
-the parser uses the findall method of `re`, hence overlap
-between matches is not tolerated. If overlap cannot be avoided, one should switch to the
-finditer method by passing *findall=False* to the parser. Multiple
-matches for the quantity are returned if *repeats=True* (default). The name, data type,
-shape and unit for the quantity can also intialized by passing a metainfo Quantity.
-An external function *str_operation* can be also be passed to perform more specific
-string operations on the matched value. A local parsing on a matched block can be carried
-out by nesting a *sub_parser*. This is also an instance of the `UnstructuredTextFileParser`
-with a list of quantities to parse. To access a parsed quantity, one can use the *get*
-method.
-
-The creation of the archive is implemented in the parse method of the code parser which takes
-the mainfile, archive and logger as arguments. The file parser, *out_parser* is
-created only in the constructor and subsequent parsing on a different *mainfile* can be
-performed by assigning it to the file parser.
-
-```python
-class SupercodeParser:
-    def __init__(self):
-        self.out_parser = UnstructuredTextFileParser()
-        self.out_parser.quantities = quantities
-
-    def parse(self, mainfile, archive, logger):
-        self.out_parser.mainfile = mainfile
-        sec_run = archive.m_create(section_run)
-        sec_run.program_name = 'super_code'
-        sec_run.program_version = str(self.out_parser.get('program_version'))
-        date = datetime.datetime.strptime(
-            self.out_parser.get('date'), '%Y/%m/%d') - datetime.datetime(1970, 1, 1)
-        sec_run.program_compilation_datetime = date.total_seconds()
-        for system in self.out_parser.get('system'):
-            sec_system = sec_run.m_create(section_system)
-            sec_system.lattice_vectors = system.get('lattice_vectors')
-            sites = system.get('sites')
-            sec_system.atom_labels = [site[0]  for site in sites]
-            sec_system.atom_positions = [site[1] for site in sites]
-
-            sec_scc = sec_run.m_create(section_single_configuration_calculation)
-            sec_scc.energy_total = system.get('energy')
-            sec_scc.single_configuration_calculation_to_system_ref = sec_system
-            magic_source = system.get('magic_source')
-            if magic_source is not None:
-                sec_scc.message_info_evaluation = magic_source
-
-archive = EntryArchive()
-
-parser = SupercodeParser()
-parser.parse('temp.dat', archive, None)
-
-print(archive.m_to_json())
-```
-
-## DataTextFileParser
-The `DataTextFileParser` uses the numpy.loadtxt function to load an structured data file.
-The loaded data can be accessed from property *data*.
-
-## XMLParser
-The `XMLParser` uses the ElementTree module to parse an xml file. The parse method of the
-parser takes in an xpath style key to access individual quantities. By default, automatic
-data type conversion is performed, which can be switched off by setting *convert=False*.
-
-
diff --git a/nomad/parsing/file_parser/text_parser.py b/nomad/parsing/file_parser/text_parser.py
index 71bad6788602eb70b53926c308c61b2de30a40bf..bf5fbc0ed72810b1850b2ce0444ec3742698b9d3 100644
--- a/nomad/parsing/file_parser/text_parser.py
+++ b/nomad/parsing/file_parser/text_parser.py
@@ -406,8 +406,8 @@ class UnstructuredTextFileParser(FileParser):
 
                 self._results[quantities[i].name] = value_processed
 
-            except Exception:
-                self.logger.warn('Error setting value for %s ' % quantities[i].name)
+            except Exception as e:
+                self.logger.warn('Error setting value for %s ' % quantities[i].name, exc_info=e)
                 pass
 
     def _parse_quantity(self, quantity):
diff --git a/nomad/units/__init__.py b/nomad/units/__init__.py
index 070cba5626a8b2990ef160d54290c0ec89ec9f3a..239baeb975088453c2cd1de124b64234e6ca2088 100644
--- a/nomad/units/__init__.py
+++ b/nomad/units/__init__.py
@@ -19,4 +19,4 @@
 import os
 from pint import UnitRegistry
 
-ureg = UnitRegistry(os.path.join(os.path.dirname(__file__), "default_en.txt"))
+ureg = UnitRegistry(os.path.join(os.path.dirname(__file__), 'default_en.txt'))
diff --git a/nomad/utils/__init__.py b/nomad/utils/__init__.py
index 274faa3a5774da2e0d51846d013ceb47ed5b199d..e3d464adb4e13ab16669b15031645bad79edcb3a 100644
--- a/nomad/utils/__init__.py
+++ b/nomad/utils/__init__.py
@@ -59,6 +59,7 @@ default_hash_len = 28
 try:
     from . import structlogging
     from .structlogging import legacy_logger
+    from .structlogging import configure_logging
 
     def get_logger(name, **kwargs):
         '''
@@ -71,6 +72,10 @@ except ImportError:
     def get_logger(name, **kwargs):
         return ClassicLogger(name, **kwargs)
 
+    def configure_logging(console_log_level=config.console_log_level):
+        import logging
+        logging.basicConfig(level=console_log_level)
+
 
 class ClassicLogger:
     '''
diff --git a/nomad/utils/structlogging.py b/nomad/utils/structlogging.py
index d86f1fd5cb20e4f26da37c686597b7e0921a96d4..3d9ff90168a70545189f6f01cdf6fc325c72ae8c 100644
--- a/nomad/utils/structlogging.py
+++ b/nomad/utils/structlogging.py
@@ -281,13 +281,21 @@ structlog.configure(
     logger_factory=logger_factory,
     wrapper_class=structlog.stdlib.BoundLogger)
 
-# configure logging in general
-logging.basicConfig(level=logging.DEBUG)
+
 root = logging.getLogger()
-for handler in root.handlers:
-    if not isinstance(handler, LogstashHandler):
-        handler.setLevel(config.console_log_level)
-        handler.setFormatter(ConsoleFormatter())
+
+
+# configure logging in general
+def configure_logging(console_log_level=config.console_log_level):
+    logging.basicConfig(level=logging.DEBUG)
+    for handler in root.handlers:
+        if not isinstance(handler, LogstashHandler):
+            handler.setLevel(console_log_level)
+            handler.setFormatter(ConsoleFormatter())
+
+
+configure_logging()
+
 
 # configure logstash
 if config.logstash.enabled:
diff --git a/ops/docker-compose/nomad-oasis/README.md b/ops/docker-compose/nomad-oasis/README.md
index e89e5904f0aa08f79f8abcab539c31f115faddf7..ba2a3579047a061260bbd336c76f93d12a1ddacb 100644
--- a/ops/docker-compose/nomad-oasis/README.md
+++ b/ops/docker-compose/nomad-oasis/README.md
@@ -1,4 +1,4 @@
-# Operating a NOMAD OASIS
+# Operating an OASIS
 
 The following describes the simplest way to run your own NOMAD.
 
@@ -9,7 +9,7 @@ However, the NOMAD software is Open-Source, and everybody can run it. Any servic
 uses NOMAD software independently is called a *NOMAD OASIS*.
 
 While several use cases require different setups, this documentation
-describes the simplest setup of a NOMAD OASIS. It would allow a group to use NOMAD for 
+describes the simplest setup of a NOMAD OASIS. It would allow a group to use NOMAD for
 local research data management, while using NOMAD's central user-management and its users.
 
 ## Pre-requisites