How to install an Oasis

The original NOMAD Central Repository is a service that runs at the Max Planck computing facility in Garching, Germany. However, the NOMAD software is open source, and anybody can run it. Any service that uses the NOMAD software independently is called a NOMAD Oasis. A NOMAD Oasis does not need to be fully isolated. For example, you can publish uploads from your NOMAD Oasis to the central NOMAD installation.

!!! note

    **Register your Oasis**
    If you installed (or even just plan to install) a NOMAD Oasis, please
    [register your Oasis with FAIRmat](https://www.fairmat-nfdi.eu/fairmat/oasis_registration)
    and help us to assist you in the future.

Quick-start

  • Find a Linux computer.
  • Make sure you have docker installed. Docker nowadays comes with docker compose built in. Previously, you needed to install the standalone docker-compose.
  • Download our basic configuration files: nomad-oasis.zip
  • Run the following commands (skip chown on macOS and Windows computers):
```sh
unzip nomad-oasis.zip
cd nomad-oasis
sudo chown -R 1000 .volumes
docker compose pull
docker compose up -d
curl localhost/nomad-oasis/alive
```
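
Right after `docker compose up -d`, the services may still be starting, so a single `curl` call can fail even though the installation is fine. The following is a minimal sketch of a retry loop, assuming `curl` is installed. The helper name `wait_for` and the retry counts are illustrative and not part of NOMAD.

```shell
#!/bin/sh
# wait_for TRIES DELAY COMMAND...: run COMMAND until it succeeds,
# at most TRIES times, sleeping DELAY seconds after each failed attempt.
# (Hypothetical helper, not shipped with NOMAD.)
wait_for() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Usage: poll the alive endpoint from the quick-start, 30 attempts, 2 s apart.
# wait_for 30 2 curl --fail --silent --output /dev/null localhost/nomad-oasis/alive
```

If the loop gives up, `docker compose logs` is usually the next place to look.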

To run NORTH (the NOMAD Remote Tools Hub), the north container needs to run docker, and the container has to run under the docker group. Replace the default group id 991 in the north section of docker-compose.yaml with your system's docker group id. Run id if you are a docker user, or getent group | grep docker, to find out your system's docker gid. The user id 1000 is used as the nomad user inside all containers.
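
As a small sketch of how the docker group id could be extracted, the helper below takes one line of the group database and prints its numeric id. The function name `gid_from_group_entry` is made up for this example, and the sketch assumes the usual `name:password:gid:members` layout that `getent group` prints on Linux.

```shell
#!/bin/sh
# gid_from_group_entry ENTRY: print the numeric group id, i.e. the third
# colon-separated field of a group line such as "docker:x:991:alice".
# (Hypothetical helper for illustration.)
gid_from_group_entry() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Usage on a Linux host where a "docker" group exists:
# gid_from_group_entry "$(getent group docker)"
```

The printed number is the value that replaces 991 in docker-compose.yaml.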

This is good as a quick test. We strongly recommend reading the following instructions carefully and adapting the configuration files accordingly. The following sections might also help if you run into problems.

Before you start

Hardware considerations

The hardware you need depends on how much data you want to manage and process. Data storage is the obvious aspect here. NOMAD keeps all files that it manages as they are. The additional files that NOMAD generates during processing (e.g. through parsing) are typically smaller than the original raw files. Therefore, you can base your storage requirements on the size of the data files that you expect to manage. The additional mongo database and elasticsearch index are comparatively small.

Storage speed is another consideration. You can work with NAS systems. All that NOMAD needs is a "regular" POSIX filesystem as an interface, so anything you can mount (e.g. on the docker host) should be fine. Processing obviously relies on read/write speed, but this is just a matter of convenience, because processing is designed to run as managed asynchronous tasks. Local storage might be favorable for mongodb and elasticsearch operation, but it is not a must.

The amount of compute resources (e.g. processor cores) is also a matter of convenience (and of the expected number of users). Four CPU cores are typically enough to support a research group and run the application, processing, and databases in parallel. Smaller systems still work, e.g. for testing.

There should be enough RAM to run the databases, application, and processing at the same time. The minimum requirements here can be quite low, but during processing the metadata for individual files is kept in memory. For large DFT geometry optimizations this can add up quickly, especially if many CPU cores are available for processing entries in parallel. We recommend at least 2 GB per core and a minimum of 8 GB. You also need to consider RAM and CPU for running tools like Jupyter, if you opt to use NOMAD NORTH.
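
The RAM rule of thumb above (2 GB per core, but never less than 8 GB) can be written down as a small helper. This is only a sketch: the function name `recommended_ram_gb` is invented for this example, and the result does not include the extra memory that NORTH tools would need.

```shell
#!/bin/sh
# recommended_ram_gb CORES: apply the rule of thumb from the text,
# 2 GB per core with a floor of 8 GB, and print the result in GB.
# (Hypothetical helper for illustration.)
recommended_ram_gb() {
    cores=$1
    ram=$((cores * 2))
    if [ "$ram" -lt 8 ]; then
        ram=8
    fi
    echo "$ram"
}

# Usage: on Linux, nproc (GNU coreutils) prints the number of available cores.
# recommended_ram_gb "$(nproc)"
```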

Sharing data through log transfer and data privacy notice

NOMAD includes a log transfer function. When enabled, it automatically collects and transfers non-personalized logging data to us. Currently, this functionality is experimental and requires opt-in. However, in upcoming versions of NOMAD Oasis, we might change this to opt-out.

To enable this functionality, add logtransfer.enabled: true to your nomad.yaml.

The service collects log data and aggregated statistics, such as the number of users or the number of uploaded datasets. In any case, this data does not personally identify any users or contain any uploaded data. All data is in an aggregated and anonymized form.

The data is solely used by the NOMAD developers and FAIRmat, including but not limited to:

  • Analyzing and monitoring system performance to identify and resolve issues.
  • Improving our NOMAD software based on usage patterns.
  • Generating aggregated and anonymized reports.

We do not share any collected data with any third parties.

We may update this data privacy notice from time to time to reflect changes in our data practices. We encourage you to review this notice periodically for any updates.

Using the central user management

Our recommendation is to use the central user management provided by nomad-lab.eu. We simplified its use, and it works out of the box. You can even run your system from localhost (e.g. for initial testing). The central user management system does not communicate with your OASIS directly. Therefore, you can run your OASIS without exposing it to the public internet.

There are two requirements. First, your users must be able to reach the OASIS. When a user logs in, they are redirected to the central user management server and, after login, redirected back to the OASIS. These redirects are executed by the user's browser and do not require direct communication between the two systems.

Second, your OASIS must be able to send HTTP requests to the central user management and the central NOMAD installation. This is necessary for non-JWT-based authentication methods and to retrieve existing users for data-sharing features.

The central user management will make synchronizing data between NOMAD installations easier in the future, and we generally recommend using the central system. In principle, however, you can also run your own user management. See the section on your own user management.

Docker and docker compose

We recommend the installation via docker and docker-compose. It is the best documented, simplest, and easiest to update option, and generally the one chosen most frequently.

Pre-requisites

NOMAD software is distributed as a set of docker containers, and the other required services can also be run with docker. Further, we use docker-compose to set up all necessary containers in the simplest way possible.

You will need a single computer with docker and docker-compose installed. Refer to the official docker (and docker-compose) documentation for installation instructions. Newer versions of docker have a re-implementation of docker-compose integrated as the docker compose sub-command. This should be fully compatible, and you can replace docker compose with docker-compose in this tutorial if you prefer.
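
To find out which of the two variants a machine offers, a script can probe for both. The sketch below prefers the integrated sub-command and falls back to the standalone binary. The function name `compose_cmd` is an assumption made for this example.

```shell
#!/bin/sh
# compose_cmd: print the compose invocation available on this machine.
# Prefers the integrated "docker compose" sub-command and falls back to
# the standalone "docker-compose" binary; returns 1 if neither is found.
# (Hypothetical helper for illustration.)
compose_cmd() {
    if docker compose version >/dev/null 2>&1; then
        echo "docker compose"
    elif command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"
    else
        return 1
    fi
}

# Usage (word splitting of $COMPOSE is intentional):
# COMPOSE=$(compose_cmd) || { echo "no compose command found" >&2; exit 1; }
# $COMPOSE pull
```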

The following will run all necessary services with docker. These comprise a mongo database, an elasticsearch instance, a rabbitmq distributed task queue, the NOMAD app, the NOMAD worker, and the NOMAD GUI. The architecture documentation explains what each service does and why it is necessary.

Configuration

All docker containers are configured via docker-compose and the respective docker-compose.yaml file. Further, we will need to mount some configuration files to configure the NOMAD services within their respective containers.

There are three files to configure:

  • docker-compose.yaml
  • configs/nomad.yaml
  • configs/nginx.conf

In this example, we have all files in the same directory (the directory we are also working in). You can download minimal example files here.

docker-compose.yaml

The most basic docker-compose.yaml to run an OASIS looks like this:

--8<-- "ops/docker-compose/nomad-oasis/docker-compose.yaml"

Changes necessary:

  • The group in the value of the hub's user parameter needs to match the docker group on the host. This ensures that the user who runs the hub has the rights to access the host's docker.
  • On Windows or macOS computers, you have to run the app and worker containers without user: '1000:1000' and the north container with user: root.

A few things to notice:

  • The app, worker, and north services use the NOMAD docker image. Here we use the latest tag, which gives you the latest beta version of NOMAD. You might want to change this to stable, a version tag (format is vX.X.X, you find all releases here), or a specific branch tag.
  • All services use docker volumes for storage. This could be changed to host mounts.
  • It mounts two configuration files that need to be provided (see below): nomad.yaml, nginx.conf.
  • The only exposed port is 80 (proxy service). This could be changed to a desired port if necessary.
  • The NOMAD images are pulled from our GitLab at MPCDF; the other services use images from a public registry (Docker Hub).
  • All containers will be named nomad_oasis_*. These names can be used later to reference the containers with the docker command.
  • The services are set up to restart always. You might want to change this to no while debugging errors to prevent indefinite restarts.
  • Make sure that the PWD environment variable is set. NORTH needs to create bind mounts that require absolute paths and we need to pass the current working directory to the configuration from the PWD variable (see hub service in the docker-compose.yaml).
  • The north service needs to run docker containers, so it has to run with the system's docker group as its group. You might need to replace 991 with your system's docker group id.
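
Because NORTH relies on PWD holding an absolute path, it can be worth checking this before starting the containers. The following is a minimal sketch. The helper name `is_absolute_path` is invented for this example, and exporting PWD explicitly is only a precaution for environments where the shell does not already export it.

```shell
#!/bin/sh
# is_absolute_path PATH: succeed if PATH is non-empty and starts with "/".
# (Hypothetical helper for illustration.)
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Usage: run from the directory that holds docker-compose.yaml.
# if is_absolute_path "${PWD:-}"; then
#     export PWD   # make sure child processes such as docker compose see it
# else
#     echo "PWD is not set to an absolute path" >&2
# fi
```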

nomad.yaml

NOMAD app and worker read a nomad.yaml for configuration.

--8<-- "ops/docker-compose/nomad-oasis/configs/nomad.yaml"