Commit 7b535c36 authored by Markus Scheidgen

Merge branch 'v0.9.9' into 'master'

Merge for release

See merge request !246
parents e486b276 ddc0b19b
Pipeline #91943 passed with stage
in 12 seconds
......@@ -120,6 +120,7 @@ install tests:
image: python:3.7
before_script:
- git submodule sync
- sleep 5
- git submodule update --init --jobs=4
script:
- pip install --upgrade pip
......@@ -221,7 +222,7 @@ pypi package:
script:
- docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN gitlab-registry.mpcdf.mpg.de
- docker pull $TEST_IMAGE
- docker run --rm $TEST_IMAGE python -m twine upload -u $CI_TWINE_USER -p $CI_TWINE_PASSWORD dist/nomad-lab.tar.gz
- docker run --rm $TEST_IMAGE bash -c "python -m twine upload -u $CI_TWINE_USER -p $CI_TWINE_PASSWORD dist/nomad-lab-*.tar.gz"
when: manual
only:
- tags
......@@ -46,8 +46,15 @@ contributing, and API reference.
Omitted versions are plain bugfix releases with only minor changes and fixes.
### v0.9.9
- An RDF API that provides DCAT datasets and a catalog for NOMAD entries.
- Support to directly publish upon upload via API.
### v0.9.8
- A new library for parsing text-based raw files.
- A new main menu in the GUI.
- Upload OASIS uploads to central NOMAD.
- Updated documentation.
### v0.9.3
- Encyclopedia with dedicated materials search index.
......
***Note:** This is a general README file for NOMAD parsers, consult the README of specific parser projects for more detailed information!*
$preamble$
This is a NOMAD parser for [$codeLabel$]($codeUrl$). It will read $codeLabel$ input and
output files and provide all information in NOMAD's unified Metainfo based Archive format.
......@@ -64,7 +66,15 @@ python_dict = section_run.m_to_dict()
## Developing the parser
Also install NOMAD's pypi package:
Create a virtual environment to install the parser in development mode:
```
pip install virtualenv
virtualenv -p `which python3` .pyenv
source .pyenv/bin/activate
```
Install NOMAD's pypi package:
```
pip install nomad-lab
......
Subproject commit 11af5d0b67b53abf2d6593b40aa9b4a4735012e2
Subproject commit b5731ab61f5ef0d019426523b8b21ad4c82596a2
Subproject commit cf4d76ad7b1b51388936e227d52a868a97a5d1cc
Subproject commit 199a7322b7a00c3b658b0b328219b58ecc227c84
Subproject commit c378ce3c667fa0b6bd93ecdfd5699b7b9e165fa5
Subproject commit aff72472303ff1da53fe88388e6f7edc76f28535
......@@ -8,7 +8,7 @@ and infrastructure with a simplyfied architecture and consolidated code base.
:maxdepth: 1
introduction.md
upload.rst
upload.md
api.md
client/client.rst
metainfo.rst
......
==================
How to upload data
==================
# How to upload data
To contribute your data to the repository, please, login to our `upload page <../gui/uploads>`_
To contribute your data to the repository, please log in to our [upload page](https://nomad-lab.eu/prod/rae/gui/uploads)
(you need to register first, if you do not have a NOMAD account yet).
*A note for returning NOMAD users!* We revised the upload process with browser based upload
alongside new shell commands. The new Upload page allows you to monitor upload processing
and verify processing results before publishing your data to the Repository.
The `upload page <../gui/uploads>`_ acts as a staging area for your data. It allows you to
The [upload page](https://nomad-lab.eu/prod/rae/gui/uploads) acts as a staging area for your data. It allows you to
upload data, to supervise the processing of your data, and to examine all metadata that
NOMAD extracts from your uploads. The data on the upload page will be private and can be
deleted again. If you are satisfied with our processing, you can publish the data.
......@@ -22,14 +20,6 @@ You should upload many files at the same time by creating .zip or .tar files of
Ideally, input and output files are accompanied by relevant auxiliary files. NOMAD will
consider everything within a single directory as related.
**A note for VASP users** on the handling of **POTCAR** files: NOMAD takes care of it; you don't
need to worry about it. We understand that according to your VASP license, POTCAR files are
not supposed to be visible to the public. Thus, in agreement with Georg Kresse, NOMAD will
extract the most important information of POTCAR files and store it in the files named
``POTCAR.stripped``. These files can be assessed and downloaded by anyone, while the original
POTCAR files are only available to the uploader and assigned co-authors.
This is done automatically; you don't need to do anything.
Once published, data cannot be erased. Linking a corrected version to a corresponding older
one ("erratum") will be possible soon. Files from an improved calculation, even for the
same material, will be handled as a new entry.
......@@ -50,4 +40,127 @@ By uploading you confirm authorship of the uploaded calculations. Co-authors mus
after the upload process. This procedure is very much analogous to the submission of a
publication to a scientific journal.
Upload of data is free of charge.
\ No newline at end of file
Upload of data is free of charge.
## Limits
The following limitations apply to uploading:
- One upload cannot exceed 32 GB in size
- Only 10 non published uploads are allowed per user
## On the supported codes
NOMAD interprets your files. It checks each file and recognizes whether it is the
main output file of one of the supported codes. NOMAD will create an entry for this *mainfile*
that represents the respective data of this code run, experiment, etc. NOMAD only
shows data for such recognized entries. If your upload does not contain any files that
NOMAD recognizes, your upload will be shown as empty and no data can be published.
However, all files that are associated with a recognized *mainfile* by residing in the
same directory will be presented as *auxiliary* files alongside the entry represented
by the *mainfile*.
### A note for VASP users
On the handling of **POTCAR** files: NOMAD takes care of it; you don't
need to worry about it. We understand that according to your VASP license, POTCAR files are
not supposed to be visible to the public. Thus, in agreement with Georg Kresse, NOMAD will
extract the most important information of POTCAR files and store it in the files named
`POTCAR.stripped`. These files can be accessed and downloaded by anyone, while the original
POTCAR files are only available to the uploader and assigned co-authors.
This is done automatically; you don't need to do anything.
## Preparing an upload file
You can upload .zip and .tar.gz files to NOMAD. The directory structure within can
be arbitrary. Keep in mind that files in a single directory are all associated (see above).
Ideally, keep only the files of a single code run or experiment (or of closely related ones)
in one directory.
You should not place files in additional archives within the upload file. NOMAD will not
extract nested archives, such as zips within zips.
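Since nested archives are not extracted, it can help to check an upload file before sending it. The following is a minimal sketch using only Python's standard library. The function name `find_nested_archives` and the list of suffixes are assumptions made for illustration; they are not part of NOMAD.

```python
import io
import zipfile

# Suffixes treated as archives in this sketch; the list is an assumption,
# not NOMAD's own definition.
ARCHIVE_SUFFIXES = ('.zip', '.tar', '.tar.gz', '.tgz', '.tar.bz2')


def find_nested_archives(upload_file):
    """Return the names of archive files contained in the given .zip file."""
    with zipfile.ZipFile(upload_file) as zf:
        return [
            name for name in zf.namelist()
            if name.lower().endswith(ARCHIVE_SUFFIXES)
        ]


# Build a small in-memory example upload that contains one nested archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('run1/vasp.xml', '<modeling/>')
    zf.writestr('run2/results.tar.gz', b'placeholder content')

buffer.seek(0)
nested = find_nested_archives(buffer)
print(nested)
```

Any name reported by such a check points to files that should be unpacked into plain directories before the upload file is created.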
## Uploading large amounts of data
This problem is manifold. The remainder of this section discusses the following topics:
- NOMAD restrictions about upload size and number of unpublished simultaneous uploads
- Managing metadata (comments, references, co-authors, datasets) for a large number of entries
- Safely transferring the data to NOMAD
### General strategy
Before you attempt to upload large amounts of data, do some experiments with a representative
and small subset of your data. Use this to simulate a larger upload,
checking and editing it the normal way. You do not have to publish this test upload;
simply delete it before publishing, once you are satisfied with the results.
Ask for assistance. [Contact us](https://nomad-lab.eu/about/contact) in advance. This will
allow us to react to your specific situation and, if necessary, prepare additional measures.
Allow enough time before you need your data to be published. Adding multiple hundreds of
GBs to NOMAD isn't a trivial feat and will take some time and effort from all sides.
### Upload restrictions
The upload restrictions are necessary to keep NOMAD data in manageable chunks and we cannot
simply grant exceptions to these rules.
This means you have to split your data into uploads of at most 32 GB each. Uploading these files,
observing the processing, and publishing the data can be automated through NOMAD APIs.
When splitting your data, it is important to not split sub-directories if they represent
all files of a single entry. NOMAD can only bundle those related files to an entry if
they are part of the same upload (and directory). Therefore, there is no single recipe to
follow and a script to split your data will depend on how your data is organized.
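As a starting point for such a script, the sketch below groups whole directories into uploads without ever splitting a directory. It assumes you have already measured the size of each entry directory. The function name `plan_uploads`, the first-fit-decreasing heuristic, and reading 32 GB as 32 * 1024^3 bytes are assumptions for illustration. The limit applies to the upload file, so compressed sizes may differ from these directory sizes.

```python
MAX_UPLOAD_BYTES = 32 * 1024 ** 3  # 32 GB; the binary interpretation is an assumption


def plan_uploads(dir_sizes, limit=MAX_UPLOAD_BYTES):
    """Group directories into uploads without splitting any directory.

    dir_sizes maps a directory path to its total size in bytes.
    Uses a first-fit-decreasing heuristic, which is simple but not
    guaranteed to produce the minimal number of uploads.
    """
    uploads = []  # each item is [remaining_bytes, [directories]]
    # Largest directories first; ties are broken by path for reproducibility.
    for path, size in sorted(dir_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > limit:
            raise ValueError('%s alone exceeds the upload limit' % path)
        for upload in uploads:
            if upload[0] >= size:
                upload[0] -= size
                upload[1].append(path)
                break
        else:
            # No existing upload has room, so start a new one.
            uploads.append([limit - size, [path]])
    return [dirs for _, dirs in uploads]


# Toy example with a limit of 10 bytes instead of 32 GB.
example = {'a': 6, 'b': 5, 'c': 4, 'd': 3, 'e': 2}
plan = plan_uploads(example, limit=10)
print(plan)  # [['a', 'c'], ['b', 'd', 'e']]
```

Each inner list can then be packed into one .zip or .tar.gz file. A directory that exceeds the limit on its own raises an error, because it cannot be placed without splitting an entry.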
### Avoid additional operations on your data
Changing the metadata of a large number of entries can be expensive and will also mean
more work with our APIs. A simpler solution is to add the metadata directly to your uploads.
This way NOMAD can pick it up automatically, with no further action required.
Each NOMAD upload can contain a `nomad.json` file at the root. This file can contain
metadata that you want to apply to all your entries. Here is an example:
```
{
    "comment": "Data from a cool research project",
    "references": ["http://archivex.org/mypaper"],
    "co_authors": [
        "<co-author-ids>",
        "<co-author-ids>"
    ],
    "datasets": [
        "<dataset-id>"
    ],
    "entries": {
        "path/to/calcs/vasp.xml": {
            "comment": "An entry specific comment."
        }
    }
}
```
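Writing this file by hand is error-prone for many uploads, so it may be easier to generate it. The sketch below writes a `nomad.json` with Python's `json` module, which guarantees syntactically valid JSON. The helper name `write_nomad_json` is hypothetical, and only a few top-level keys from the example above are used. A temporary directory stands in for the root of a real upload.

```python
import json
import os
import tempfile


def write_nomad_json(upload_root, metadata):
    """Write metadata as nomad.json into the root directory of an upload."""
    path = os.path.join(upload_root, 'nomad.json')
    with open(path, 'w') as f:
        json.dump(metadata, f, indent=2)
    return path


# Placeholder values taken from the example above; replace them with your own.
metadata = {
    'comment': 'Data from a cool research project',
    'references': ['http://archivex.org/mypaper'],
    'datasets': ['<dataset-id>'],
}

with tempfile.TemporaryDirectory() as upload_root:
    path = write_nomad_json(upload_root, metadata)
    with open(path) as f:
        loaded = json.load(f)  # raises an error if the file is not valid JSON

print(loaded == metadata)  # True
```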
Another measure is to publish your data directly upon upload. After performing some
smaller test uploads, you should consider skipping our staging area and publishing the upload
right away. This can save you some time and additional API calls. The upload endpoint
has a parameter `publish_directly`. You can modify the upload command
that you get from the upload page like this:
```
curl "http://nomad-lab.eu/prod/rae/api/uploads/?token=<your-token>&publish_directly=true" -T <local_file>
```
### Safe transfer of files
HTTP makes it easy for you to upload files via browser and curl, but it is not an
ideal protocol for the stable transfer of many large files. Alternatively, we can organize
a separate manual file transfer to our servers. We will put your prepared upload
files (.zip or .tar.gz) on a predefined path on the NOMAD servers. NOMAD allows you to *"upload"*
files directly from its servers via an additional `local_path` parameter:
```
curl -X PUT "http://nomad-lab.eu/prod/rae/api/uploads/?token=<your-token>&local_path=<path-to-upload-file>"
```
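When many uploads are scripted, the URLs for the curl commands above can be assembled programmatically. The following sketch only builds the URL strings and does not contact NOMAD. The function name `upload_url` and the example path are hypothetical, and whether `publish_directly` and `local_path` can be combined in one request is not stated here, so treat that combination as an assumption.

```python
from urllib.parse import urlencode

UPLOADS_URL = 'http://nomad-lab.eu/prod/rae/api/uploads/'


def upload_url(token, publish_directly=False, local_path=None):
    """Build the uploads URL with the query parameters described above."""
    params = {'token': token}
    if publish_directly:
        params['publish_directly'] = 'true'
    if local_path is not None:
        params['local_path'] = local_path
    # urlencode percent-encodes special characters such as '/' in paths.
    return UPLOADS_URL + '?' + urlencode(params)


direct = upload_url('TOKEN', publish_directly=True)
local = upload_url('TOKEN', local_path='/data/upload.zip')
print(direct)
print(local)
```

The first URL would be used with `curl ... -T <local_file>`, the second with `curl -X PUT ...`, as in the commands shown above.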
......@@ -3,29 +3,59 @@ This is a brief example demonstrating the public nomad@FAIRDI API for doing oper
that might be necessary to integrate external project data.
"""
from bravado.requests_client import RequestsClient
from bravado.requests_client import RequestsClient, Authenticator
from bravado.client import SwaggerClient
from keycloak import KeycloakOpenID
from urllib.parse import urlparse
import time
import os.path
import sys
nomad_url = 'http://nomad-lab.eu/prod/rae/api'
user = 'leonard.hofstadter@nomad-fairdi.tests.de'
password = 'password'
user = 'youruser'
password = 'yourpassword'
upload_file = os.path.join(os.path.dirname(__file__), 'example.zip')
# an authenticator for NOMAD's keycloak user management
class KeycloakAuthenticator(Authenticator):
    def __init__(self, user, password):
        super().__init__(host=urlparse(nomad_url).netloc)
        self.user = user
        self.password = password
        self.token = None
        self.__oidc = KeycloakOpenID(
            server_url='https://nomad-lab.eu/fairdi/keycloak/auth/',
            realm_name='fairdi_nomad_test',
            client_id='nomad_public')

    def apply(self, request):
        if self.token is None:
            self.token = self.__oidc.token(username=self.user, password=self.password)
            self.token['time'] = time.time()
        elif self.token['expires_in'] < int(time.time()) - self.token['time'] + 10:
            try:
                self.token = self.__oidc.refresh_token(self.token['refresh_token'])
                self.token['time'] = time.time()
            except Exception:
                self.token = self.__oidc.token(username=self.user, password=self.password)
                self.token['time'] = time.time()
        request.headers.setdefault('Authorization', 'Bearer %s' % self.token['access_token'])
        return request
upload_file = os.path.join(os.path.dirname(__file__), 'external_project_example.zip')
# create the bravado client
host = urlparse(nomad_url).netloc.split(':')[0]
http_client = RequestsClient()
http_client.set_basic_auth(host, user, password)
http_client.authenticator = KeycloakAuthenticator(user=user, password=password)
client = SwaggerClient.from_url('%s/swagger.json' % nomad_url, http_client=http_client)
# upload data
print('uploading a file with "external_id/AcAg/vasp.xml" inside ...')
with open(upload_file, 'rb') as f:
upload = client.uploads.upload(file=f).response().result
upload = client.uploads.upload(file=f, publish_directly=True).response().result
print('processing ...')
while upload.tasks_running:
......@@ -37,70 +67,6 @@ while upload.tasks_running:
if upload.tasks_status != 'SUCCESS':
print('something went wrong')
print('errors: %s' % str(upload.errors))
# delete the unsuccessful upload
# try to delete the unsuccessful upload
client.uploads.delete_upload(upload_id=upload.upload_id).response().result
sys.exit(1)
# publish data
print('publishing ...')
client.uploads.exec_upload_operation(upload_id=upload.upload_id, payload={
'operation': 'publish',
'metadata': {
# these metadata are applied to all calcs in the upload
'comment': 'Data from a cool external project',
'references': ['http://external.project.eu'],
'calculations': [
{
# these metadata are only applied to the calc identified by its 'mainfile'
'mainfile': 'external_id/AcAg/vasp.xml',
# 'coauthors': ['sheldon.cooper@ucla.edu'], this does not YET work with emails,
# Currently you have to use user_ids: leonard (the uploader, who is automatically an author) is 2 and sheldon is 1.
# Ask NOMAD developers about how to find out about user_ids.
'coauthors': [1],
# If users demand, we can implement a specific metadata keys (e.g. 'external_id', 'external_url') for external projects.
# This could allow to directly search for, or even have API endpoints that work with external_ids
# 'external_id': 'external_id',
# 'external_url': 'http://external.project.eu/data/calc/external_id/'
}
]
}
}).response().result
while upload.process_running:
upload = client.uploads.get_upload(upload_id=upload.upload_id).response().result
time.sleep(1)
if upload.tasks_status != 'SUCCESS' or len(upload.errors) > 0:
print('something went wrong')
print('errors: %s' % str(upload.errors))
# delete the unsuccessful upload
client.uploads.delete_upload(upload_id=upload.upload_id).response().result
sys.exit(1)
# search for data
result = client.repo.search(paths=['external_id']).response().result
if result.pagination.total == 0:
print('not found')
sys.exit(1)
elif result.pagination.total > 1:
print('my ids are not specific enough, bummer ... or did I uploaded stuff multiple times?')
# The results key holds an array with the current page data
print('Found the following calcs for my "external_id".')
print(', '.join(calc['calc_id'] for calc in result.results))
# download data
calc = result.results[0]
client.raw.get(upload_id=calc['upload_id'], path=calc['mainfile']).response()
print('Download of first calc works.')
# download urls, e.g. for curl
print('Possible download URLs are:')
print('%s/raw/%s/%s' % (nomad_url, calc['upload_id'], calc['mainfile']))
print('%s/raw/%s/%s/*' % (nomad_url, calc['upload_id'], os.path.dirname(calc['mainfile'])))
# direct download urls without having to search before
print('%s/raw/query?paths=external_id' % nomad_url)
{
"name": "nomad-fair-gui",
"version": "0.9.8",
"version": "0.9.9",
"commit": "e98694e",
"private": true,
"dependencies": {
......
......@@ -9,13 +9,13 @@ window.nomadEnv = {
'matomoUrl': 'https://nomad-lab.eu/fairdi/stat',
'matomoSiteId': '2',
'version': {
'label': '0.9.8',
'label': '0.9.9',
'isBeta': false,
'isTest': true,
'usesBetaData': true,
'officialUrl': 'https://nomad-lab.eu/prod/rae/gui'
},
'encyclopediaEnabled': true,
'aitoolkitEnabled': true,
'aitoolkitEnabled': false,
'oasis': false
}
/*
* Copyright The NOMAD Authors.
*
* This file is part of NOMAD. See https://nomad-lab.eu for further info.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { makeStyles } from '@material-ui/core'
import React from 'react'
import { apiBase, appBase, optimadeBase } from '../config'
import Markdown from './Markdown'
const useStyles = makeStyles(theme => ({
root: {
padding: theme.spacing(3),
maxWidth: 1024,
margin: 'auto',
width: '100%'
}
}))
export default function About() {
const classes = useStyles()
return <div className={classes.root}>
<Markdown>{`
# APIs
NOMAD's Application Programming Interface (API) allows you to access NOMAD data
and functions programmatically.
## NOMAD's main API
- [API dashboard](${apiBase}/)
This is NOMAD's main REST API. It is the main interface to NOMAD and is also used
by this web page to provide all functions. Therefore, everything you do here can
also be done by using this API.
There is a [tutorial on how to use the API with plain Python](${appBase}/docs/api_tutorial.html).
Another [tutorial covers how to install and use NOMAD's Python client library](${appBase}/docs/archive_tutorial.html).
The [NOMAD Analytics Toolkit](https://nomad-lab.eu/AIToolkit) allows you to use
this without installation and directly on NOMAD servers.
## OPTIMADE
- [OPTIMADE API dashboard](${optimadeBase}/)
[OPTIMADE](https://www.optimade.org/) is an
open API standard for materials science databases. This API can be used to search
and access NOMAD metadata in a standardized way that can also be applied to many
[other materials science databases](https://providers.optimade.org/).
## DCAT
- [DCAT API dashboard](${appBase}/dcat/)
[DCAT](https://www.w3.org/TR/vocab-dcat-2/) is an RDF vocabulary designed to facilitate
interoperability between data catalogs published on the Web. This API gives you
access to NOMAD via RDF documents following DCAT. You can access individual NOMAD entries as
DCAT Datasets or all NOMAD entries as a DCAT Catalog.
`}</Markdown>
</div>
}
......@@ -19,7 +19,7 @@ import React, { useContext, useLayoutEffect, useRef, useCallback, useEffect, use
import {ReactComponent as AboutSvg} from './about.svg'
import PropTypes from 'prop-types'
import Markdown from './Markdown'
import { appBase, optimadeBase, apiBase, debug, consent, aitoolkitEnabled, encyclopediaEnabled } from '../config'
import { appBase, debug, consent, aitoolkitEnabled, encyclopediaEnabled } from '../config'
import { apiContext } from './api'
import packageJson from '../../package.json'
import { domains } from './domains'
......@@ -318,20 +318,16 @@ export default function About() {
</InfoCard>
<InfoCard xs={4} title="APIs" bottom><Markdown>{`
The NOMAD can also be accessed programmatically via ReST APIs.
There is the proprietary NOMAD API and an implementation of the
standardized [OPTiMaDe API (0.10.0)](https://github.com/Materials-Consortia/OPTiMaDe/tree/master)
materials science database API.
There is the proprietary NOMAD API, an implementation of the
standardized [OPTiMaDe API](https://github.com/Materials-Consortia/OPTiMaDe/tree/master)
materials science database API, and more.
Both APIs are described via [swagger/OpenAPI spec.](https://swagger.io/),
therefore you can use your favorite swagger client library
(e.g. [bravado](https://github.com/Yelp/bravado) for Python):
- [NOMAD API](${apiBase}/)
- [OPTiMaDe API](${optimadeBase}/)
There is a [tutorial on how to use the API with plain Python](${appBase}/docs/api_tutorial.html).
We offer a [tutorial on how to use the API with plain Python](${appBase}/docs/api_tutorial.html).
Another [tutorial covers how to install and use NOMAD's Python client library](${appBase}/docs/archive_tutorial.html).
The [NOMAD Analytics Toolkit](https://nomad-lab.eu/AIToolkit) allows to use
this without installation and directly on NOMAD servers.
Visit our [API page](/apis).
`}</Markdown></InfoCard>
<Grid item xs={12}>
<Markdown>{`
......
......@@ -75,7 +75,6 @@ export default function RepoEntryView({uploadId, calcId}) {
const loading = !state.calcData
const quantityProps = {data: calcData, loading: loading}
const authors = loading ? null : calcData.authors
const domain = calcData.domain && domains[calcData.domain]
let entryHeader = 'Entry metadata'
......@@ -117,7 +116,7 @@ export default function RepoEntryView({uploadId, calcId}) {
</Quantity>
<Quantity quantity='authors' {...quantityProps}>
<Typography>
{authorList(authors || [])}
{authorList(loading ? null : calcData)}
</Typography>
</Quantity>
<Quantity quantity='datasets' placeholder='no datasets' {...quantityProps}>
......
......@@ -164,14 +164,24 @@ export default function MainMenu() {
/>}
</MenuBarMenu>
<MenuBarMenu name="analyze" route="/metainfo" icon={<AnalyticsIcon/>}>
{!oasis && aitoolkitEnabled && <MenuBarItem
label="AI Toolkit" name="aitoolkit" route="/aitoolkit"
tooltip="NOMAD's Artificial Intelligence Toolkit tutorial Jupyter notebooks"
icon={<MetainfoIcon />}
/>}
{(!oasis && aitoolkitEnabled)
? <MenuBarItem
label="AI Toolkit" name="aitoolkit" route="/aitoolkit"
tooltip="NOMAD's Artificial Intelligence Toolkit tutorial Jupyter notebooks"
icon={<MetainfoIcon />}
/>
: <MenuBarItem
label="AI Toolkit" name="aitoolkit"
href="https://nomad-lab.eu/AIToolkit"
tooltip="Visit the NOMAD Artificial Intelligence Analytics Toolkit"
/>
}
<MenuBarItem
name="metainfo" route="/metainfo" tooltip="Browse the NOMAD Archive schema"
/>
<MenuBarItem
name="apis" label="APIs" route="/apis" tooltip="The list of APIs offered by NOMAD"
/>
</MenuBarMenu>
<MenuBarMenu name="about" route="/" icon={<AboutIcon/>}>
<MenuBarItem
......
......@@ -19,6 +19,7 @@
import React from 'react'
import { Route } from 'react-router-dom'
import About from '../About'
import APIs from '../APIs'
import AIToolkitPage from '../aitoolkit/AIToolkitPage'
import { MetainfoPage, help as metainfoHelp } from '../archive/MetainfoBrowser'
import ResolveDOI from '../dataset/ResolveDOI'
......@@ -126,6 +127,13 @@ export const routes = {
navPath: 'analyze/aitoolkit',
component: AIToolkitPage
},
'apis': {
exact: true,
path: '/apis',
appBarTitle: 'APIs',
navPath: 'analyze/apis',
component: APIs
},
'about': {
exact: true,
path: '/',
......
......@@ -256,14 +256,7 @@ class DatasetListUnstyled extends React.Component {
authors: {
label: 'Authors',
description: 'Authors including the uploader and the co-authors',
render: (dataset) => {
const authors = dataset.example.authors
if (authors.length > 3) {
return authorList(authors.filter((_, index) => index < 2)) + ' et al'
} else {
return authorList(authors)
}
}
render: (dataset) => authorList(dataset.example)
}
}
......
......@@ -32,7 +32,7 @@ import SharedIcon from '@material-ui/icons/SupervisedUserCircle'
import PrivateIcon from '@material-ui/icons/VisibilityOff'
import { domains } from '../domains'
import { apiContext, withApi } from '../api'
import { authorList } from '../../utils'
import { authorList, nameList } from '../../utils'
export function Published(props) {
const api = useContext(apiContext)
......@@ -147,19 +147,19 @@ export class EntryListUnstyled extends React.Component {