nomad-lab / nomad-FAIR / Commits

Commit a8513c50, authored Aug 21, 2020 by Markus Scheidgen

Backend refactor

parent 07effaf8
Changes: 37
README.md

````diff
@@ -32,7 +32,7 @@ pip install nomad-lab
 To **use the NOMAD parsers for example**, install the `parsing` extra:
 
 ```
 pip install nomad-lab[parsing]
-nomad parse --show-backend <your-file-to-parse>
+nomad parse --show-archive <your-file-to-parse>
 ```
 
 ### For NOMAD developer
````
README.parsers.md

````diff
@@ -42,7 +42,7 @@ To parse code input/output from the command line, you can use NOMAD's command line
 interface (CLI) and print the processing results output to stdout:
 
 ```
-nomad parse --show-backend <path-to-file>
+nomad parse --show-archive <path-to-file>
 ```
 
 To parse a file in Python, you can program something like this:
````
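The Python example that the README sentence above introduces is elided in this excerpt. As a rough stand-in, the parse-then-serialize flow can be sketched in plain Python; the names `EntryArchive` and `m_to_dict` mirror this commit, but the toy parser below is entirely illustrative and not the real nomad-lab API:

```python
import json


class EntryArchive:
    """Toy stand-in for nomad.datamodel.EntryArchive: a dict-backed section tree."""

    def __init__(self):
        self.data = {}

    def m_to_dict(self):
        # Serialize to plain dicts, as `--show-archive` does before JSON output.
        return self.data


def parse(mainfile_path):
    """Toy parse(): a real parser would match and read the code output file;
    here we only fill an archive with placeholder sections."""
    entry_archive = EntryArchive()
    entry_archive.data['section_metadata'] = {'mainfile': mainfile_path}
    entry_archive.data['section_run'] = [{'program_name': 'VASP'}]
    return entry_archive


if __name__ == '__main__':
    archive = parse('vasprun.xml')
    print(json.dumps(archive.m_to_dict(), indent=2))
```

The real CLI prints the same kind of JSON tree for the parsed archive.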
elastic @ f41d95aa

```diff
-Subproject commit afc55d917505d8a882611ca27ef91f0cc3ac11f6
+Subproject commit f41d95aa9bf238dbcf2258eba82c87ecdb491cd1
```

fleur @ 41bc37d7

```diff
-Subproject commit 5b1305cb24a7ec2806a94b2b5192ab6ec9a7d0d2
+Subproject commit 41bc37d7d165f671de427ab25bda17d110e22e38
```

vasp @ bd9c04e2

```diff
-Subproject commit 0a9bb17150428c5c86115091aed58a1ae502d96b
+Subproject commit bd9c04e281aa42010d3e57310f1106680a332763
```
docs/client/parsers.rst

```diff
@@ -4,11 +4,11 @@ Using the NOMAD parsers
 To use the NOMAD parsers from the command line, you can use the ``parse`` command. The
 parse command will automatically *match* the right parser to your code output file and
 run the parser. There are two output formats, ``--show-metadata`` (a JSON representation
-of the repository metadata), ``--show-backend`` (a JSON representation of the archive data).
+of the repository metadata), ``--show-archive`` (a JSON representation of the archive data).
 
 .. code-block:: sh
 
-    nomad parser --show-backend <path-to-your-mainfile-code-output-file>
+    nomad parser --show-archive <path-to-your-mainfile-code-output-file>
 
 You can also use the NOMAD parsers from within Python. This will give you the parse
 results as metainfo objects to conveniently analyse the results in Python. See :ref:`metainfo <metainfo-label>`
```
docs/dev/parser_tutorial.md

```diff
@@ -284,10 +284,9 @@ like all others: add __usrMyCodeLength to the group name.
 ## Backend
 
 The backend is an object can stores parsed data according to its meta-info. The
-class :py:class:`nomad.parsing.AbstractParserBackend` provides the basic backend interface.
+class :py:class:`nomad.parsing.Backend` provides the basic backend interface.
 It allows to open and close sections, add values, arrays, and values to arrays.
-In nomad@FAIRDI, we practically only use the :py:class:`nomad.parsing.LocalBackend`. In
-NOMAD-coe multiple backend implementations existed to facilitate the communication of
+In NOMAD-coe multiple backend implementations existed to facilitate the communication of
 python parsers with the scala infrastructure, including caching and streaming.
 
 ## Triggers
```
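The backend interface described in the tutorial text (open and close sections, add values and array values) can be illustrated with a minimal toy sketch. This is an assumption-laden illustration of the idea, not the actual `nomad.parsing.Backend` implementation:

```python
class ToyBackend:
    """Minimal sketch of the backend idea: sections are opened, filled
    with values, and closed; closed sections accumulate in `self.data`."""

    def __init__(self):
        self.data = {}   # section name -> list of closed section dicts
        self._open = []  # stack of (name, values) for nested open sections

    def open_section(self, name):
        self._open.append((name, {}))

    def add_value(self, key, value):
        # Set a scalar value on the innermost open section.
        self._open[-1][1][key] = value

    def add_array_value(self, key, value):
        # Append a value to an array quantity of the innermost open section.
        self._open[-1][1].setdefault(key, []).append(value)

    def close_section(self, name):
        open_name, values = self._open.pop()
        assert open_name == name, 'sections must be closed in reverse order'
        self.data.setdefault(name, []).append(values)


backend = ToyBackend()
backend.open_section('section_run')
backend.add_value('program_name', 'VASP')
backend.add_array_value('energies', -1.5)
backend.add_array_value('energies', -1.7)
backend.close_section('section_run')
```

After `close_section`, the filled section is available under `backend.data['section_run']`.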
docs/dev/setup.md

````diff
@@ -124,6 +124,9 @@ nomad dev searchQuantities > gui/src/searchQuantities.json
 ./gitinfo.sh
 ```
 
+In additional, you have to do some more steps to prepare your working copy to run all
+the tests. See below.
+
 ## Build and run the infrastructure with docker
 
 ### Docker and nomad
@@ -218,6 +221,33 @@ yarn start
 ```
 
 ## Run the tests
 
+### additional settings and artifacts
+
+To run the tests some additional settings and files are necessary that are not part
+of the code base.
+
+First you need to create a `nomad.yaml` with the admin password for the user management
+system:
+
+```
+keycloak:
+  password: <the-password>
+```
+
+Secondly, you need to provide the `springer.msg` Springer materials database. It can
+be copied from `/nomad/fairdi/db/data/springer.msg` on our servers and should
+be placed at `nomad/normalizing/data/springer.msg`.
+
+Thirdly, you have to provide static files to serve the docs and NOMAD distribution:
+
+```
+cd docs
+make html
+cd ..
+python setup.py compile
+python setup.py sdist
+cp dist/nomad-lab-*.tar.gz dist/nomad-lab.tar.gz
+```
+
 ### run the necessary infrastructure
 
 You need to have the infrastructure partially running: elastic, rabbitmq.
 The rest should be mocked or provided by the tests. Make sure that you do no run any
 worker, as they will fight for tasks in the queue.
````
nomad-ems.yaml (deleted, mode 100644 → 0)

```diff
-elastic:
-    index_name: fairdi_nomad_ems
-mongo:
-    db_name: fairdi_nomad_ems
-domain: EMS
```
nomad/cli/client/local.py

```diff
@@ -23,7 +23,6 @@ import json
 from nomad import config, utils
 from nomad import files
 from nomad import datamodel
 from nomad.cli import parse as cli_parse
 from .client import client
@@ -131,31 +130,31 @@ class CalcProcReproduction:
             self.upload_files.raw_file_object(self.mainfile).os_path,
             parser_name=parser_name, logger=self.logger, **kwargs)
 
-    def normalize(self, normalizer: typing.Union[str, typing.Callable], parser_backend=None):
+    def normalize(self, normalizer: typing.Union[str, typing.Callable], entry_archive=None):
         '''
         Parse the downloaded calculation and run the given normalizer.
         '''
-        if parser_backend is None:
-            parser_backend = self.parse()
+        if entry_archive is None:
+            entry_archive = self.parse()
 
-        return cli_parse.normalize(parser_backend=parser_backend, normalizer=normalizer, logger=self.logger)
+        return cli_parse.normalize(entry_archive=entry_archive, normalizer=normalizer, logger=self.logger)
 
-    def normalize_all(self, parser_backend=None):
+    def normalize_all(self, entry_archive=None):
         '''
         Parse the downloaded calculation and run the whole normalizer chain.
         '''
-        return cli_parse.normalize_all(parser_backend=parser_backend, logger=self.logger)
+        return cli_parse.normalize_all(entry_archive=entry_archive, logger=self.logger)
 
 @client.command(help='Run processing locally.')
 @click.argument('CALC_ID', nargs=1, required=True, type=str)
 @click.option('--override', is_flag=True, help='Override existing local calculation data.')
-@click.option('--show-backend', is_flag=True, help='Print the backend data.')
+@click.option('--show-archive', is_flag=True, help='Print the archive data.')
 @click.option('--show-metadata', is_flag=True, help='Print the extracted repo metadata.')
 @click.option('--mainfile', default=None, type=str, help='Use this mainfile (in case mainfile cannot be retrived via API.')
 @click.option('--skip-normalizers', is_flag=True, help='Do not normalize.')
 @click.option('--not-strict', is_flag=True, help='Also match artificial parsers.')
-def local(calc_id, show_backend, show_metadata, skip_normalizers, not_strict, **kwargs):
+def local(calc_id, show_archive, show_metadata, skip_normalizers, not_strict, **kwargs):
     utils.get_logger(__name__).info('Using %s' % config.client.url)
 
     with CalcProcReproduction(calc_id, **kwargs) as local:
@@ -163,15 +162,15 @@ def local(calc_id, show_backend, show_metadata, skip_normalizers, not_strict, **
         print(
             'Data being saved to .volumes/fs/tmp/repro_'
             '%s if not already there' % local.upload_id)
-        backend = local.parse(strict=not not_strict)
+        entry_archive = local.parse(strict=not not_strict)
 
         if not skip_normalizers:
-            local.normalize_all(parser_backend=backend)
+            local.normalize_all(entry_archive=entry_archive)
 
-        if show_backend:
-            json.dump(backend.resource.m_to_dict(), sys.stdout, indent=2)
+        if show_archive:
+            json.dump(entry_archive.m_to_dict(), sys.stdout, indent=2)
 
         if show_metadata:
-            metadata = datamodel.EntryMetadata(domain='dft')  # TODO take domain from matched parser
-            metadata.apply_domain_metadata(backend)
+            metadata = entry_archive.section_metadata
+            metadata.apply_domain_metadata(entry_archive)
             json.dump(metadata.m_to_dict(), sys.stdout, indent=4)
```
nomad/cli/parse.py

```diff
@@ -41,18 +41,20 @@ def parse(
     if hasattr(parser, 'backend_factory'):
         setattr(parser, 'backend_factory', backend_factory)
 
-    parser_backend = parser.run(mainfile_path, logger=logger)
-
-    if not parser_backend.status[0] == 'ParseSuccess':
-        logger.error('parsing was not successful', status=parser_backend.status)
+    entry_archive = datamodel.EntryArchive()
+    metadata = entry_archive.m_create(datamodel.EntryMetadata)
+    metadata.domain = parser.domain
+    try:
+        parser.parse(mainfile_path, entry_archive, logger=logger)
+    except Exception as e:
+        logger.error('parsing was not successful', exc_info=e)
 
     logger.info('ran parser')
-    return parser_backend
+    return entry_archive
 
 def normalize(
-        normalizer: typing.Union[str, typing.Callable], parser_backend=None, logger=None):
+        normalizer: typing.Union[str, typing.Callable], entry_archive, logger=None):
     if logger is None:
         logger = utils.get_logger(__name__)
@@ -63,50 +65,46 @@ def normalize(
         if normalizer_instance.__class__.__name__ == normalizer)
 
     assert normalizer is not None, 'there is no normalizer %s' % str(normalizer)
-    normalizer_instance = typing.cast(typing.Callable, normalizer)(parser_backend.entry_archive)
+    normalizer_instance = typing.cast(typing.Callable, normalizer)(entry_archive)
     logger = logger.bind(normalizer=normalizer_instance.__class__.__name__)
     logger.info('identified normalizer')
 
     normalizer_instance.normalize(logger=logger)
     logger.info('ran normalizer')
-    return parser_backend
 
-def normalize_all(parser_backend=None, logger=None):
+def normalize_all(entry_archive, logger=None):
     '''
     Parse the downloaded calculation and run the whole normalizer chain.
     '''
     for normalizer in normalizing.normalizers:
-        if normalizer.domain == parser_backend.domain:
-            parser_backend = normalize(normalizer, parser_backend=parser_backend, logger=logger)
-    return parser_backend
+        if normalizer.domain == entry_archive.section_metadata.domain:
+            normalize(normalizer, entry_archive, logger=logger)
 
 @cli.command(help='Run parsing and normalizing locally.', name='parse')
 @click.argument('MAINFILE', nargs=1, required=True, type=str)
-@click.option('--show-backend', is_flag=True, default=False, help='Print the backend data.')
+@click.option('--show-archive', is_flag=True, default=False, help='Print the archive data.')
 @click.option('--show-metadata', is_flag=True, default=False, help='Print the extracted repo metadata.')
 @click.option('--skip-normalizers', is_flag=True, default=False, help='Do not run the normalizer.')
 @click.option('--not-strict', is_flag=True, help='Do also match artificial parsers.')
 @click.option('--parser', help='Skip matching and use the provided parser')
 @click.option('--annotate', is_flag=True, help='Sub-matcher based parsers will create a .annotate file.')
-def _parse(mainfile, show_backend, show_metadata, skip_normalizers, not_strict, parser, annotate):
+def _parse(mainfile, show_archive, show_metadata, skip_normalizers, not_strict, parser, annotate):
     nomadcore.simple_parser.annotate = annotate
     kwargs = dict(strict=not not_strict, parser_name=parser)
 
-    backend = parse(mainfile, **kwargs)
+    entry_archive = parse(mainfile, **kwargs)
 
     if not skip_normalizers:
-        normalize_all(backend)
+        normalize_all(entry_archive)
 
-    if show_backend:
-        json.dump(backend.resource.m_to_dict(), sys.stdout, indent=2)
+    if show_archive:
+        json.dump(entry_archive.m_to_dict(), sys.stdout, indent=2)
 
     if show_metadata:
-        metadata = datamodel.EntryMetadata(domain='dft')  # TODO take domain from matched parser
-        metadata.apply_domain_metadata(backend)
+        metadata = entry_archive.section_metadata
+        metadata.apply_domain_metadata(entry_archive)
         json.dump(metadata.m_to_dict(), sys.stdout, indent=4)
```
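The refactored `normalize_all` above selects normalizers by comparing each normalizer's `domain` against `entry_archive.section_metadata.domain`, instead of querying a backend. That dispatch pattern can be sketched with toy classes (all names and fields here are illustrative stand-ins, not the real nomad API):

```python
class ToyArchive:
    """Toy entry archive: carries a domain in its metadata section and
    records which normalizers were applied."""
    def __init__(self, domain):
        self.section_metadata = type('Metadata', (), {'domain': domain})()
        self.applied = []


class DftNormalizer:
    domain = 'dft'

    def __init__(self, archive):
        self.archive = archive

    def normalize(self):
        self.archive.applied.append(self.__class__.__name__)


class EmsNormalizer(DftNormalizer):
    domain = 'ems'


NORMALIZERS = [DftNormalizer, EmsNormalizer]


def normalize_all(entry_archive):
    # Run only the normalizers whose domain matches the entry's domain.
    for normalizer in NORMALIZERS:
        if normalizer.domain == entry_archive.section_metadata.domain:
            normalizer(entry_archive).normalize()


archive = ToyArchive('dft')
normalize_all(archive)
# archive.applied == ['DftNormalizer']
```

The key design point mirrored here: the archive itself carries enough metadata (its domain) to drive the normalizer chain, so no separate backend object is needed.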
nomad/datamodel/common.py (deleted, mode 100644 → 0)

```python
# Copyright 2018 Markus Scheidgen
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import numpy as np

from nomad import config


def get_optional_backend_value(backend, key, section, unavailable_value=None, logger=None):
    # Section is section_system, section_symmetry, etc...
    val = None  # Initialize to None, so we can compare section values.
    # Loop over the sections with the name section in the backend.
    for section_index in backend.get_sections(section):
        if section == 'section_system':
            try:
                if not backend.get_value('is_representative', section_index):
                    continue
            except (KeyError, IndexError):
                continue
        try:
            new_val = backend.get_value(key, section_index)
        except (KeyError, IndexError):
            new_val = None

        # Compare values from iterations.
        if val is not None and new_val is not None:
            if val.__repr__() != new_val.__repr__() and logger:
                logger.warning(
                    'The values for %s differ between different %s: %s vs %s'
                    % (key, section, str(val), str(new_val)))

        val = new_val if new_val is not None else val

    if val is None and logger:
        logger.warning('The values for %s where not available in any %s' % (key, section))
        return unavailable_value if unavailable_value is not None else config.services.unavailable_value
    else:
        if isinstance(val, np.generic):
            return val.item()
        return val
```
nomad/datamodel/datamodel.py

```diff
@@ -460,7 +460,7 @@ class EntryMetadata(metainfo.MSection):
         ''' Applies a user provided metadata dict to this calc. '''
         self.m_update(**metadata)
 
-    def apply_domain_metadata(self, backend):
+    def apply_domain_metadata(self, archive):
         """Used to apply metadata that is related to the domain.
         """
         assert self.domain is not None, 'all entries must have a domain'
@@ -473,7 +473,7 @@ class EntryMetadata(metainfo.MSection):
         if domain_section is None:
             domain_section = self.m_create(domain_section_def.section_cls)
 
-        domain_section.apply_domain_metadata(backend)
+        domain_section.apply_domain_metadata(archive)
 
 class EntryArchive(metainfo.MSection):
```
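The change above threads an archive, rather than a backend, through `apply_domain_metadata`, with `EntryMetadata` delegating to its domain-specific section. A toy sketch of that delegation (hypothetical names, with a plain dict standing in for a real archive):

```python
class DFTMetadata:
    """Toy domain section: pulls domain-specific metadata from the archive."""
    def apply_domain_metadata(self, archive):
        # A plain dict stands in for the archive here.
        self.code_name = archive.get('program_name', 'unknown')


class EntryMetadata:
    """Toy entry metadata: delegates to its domain section, as in the commit."""
    def __init__(self, domain_section):
        self.domain_section = domain_section

    def apply_domain_metadata(self, archive):
        # Pass the archive straight through to the domain section.
        self.domain_section.apply_domain_metadata(archive)


metadata = EntryMetadata(DFTMetadata())
metadata.apply_domain_metadata({'program_name': 'VASP'})
```

Each domain (DFT, EMS, ...) only has to know how to read its own data from the archive; the entry-level metadata stays domain-agnostic.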
nomad/datamodel/dft.py

```diff
@@ -289,18 +289,17 @@ class DFTMetadata(MSection):
             self.m_parent.with_embargo, user_id)
 
-    def apply_domain_metadata(self, backend):
+    def apply_domain_metadata(self, entry_archive):
         from nomad.normalizing.system import normalized_atom_labels
 
         entry = self.m_parent
         logger = utils.get_logger(__name__).bind(
             upload_id=entry.upload_id, calc_id=entry.calc_id, mainfile=entry.mainfile)
 
-        if backend is None:
+        if entry_archive is None:
             self.code_name = self.code_name_from_parser()
             return
 
-        entry_archive = backend.entry_archive
         section_run = entry_archive.section_run
         if not section_run:
             logger.warn('no section_run found')
@@ -321,7 +320,7 @@ class DFTMetadata(MSection):
             else:
                 raise KeyError
         except KeyError as e:
-            logger.warn('backend after parsing without program_name', exc_info=e)
+            logger.warn('archive without program_name', exc_info=e)
             self.code_name = self.code_name_from_parser()
 
         try:
```
nomad/datamodel/ems.py

```diff
@@ -48,15 +48,15 @@ class EMSMetadata(MSection):
     quantities = Quantity(type=str, shape=['0..*'], default=[], a_search=Search())
     group_hash = Quantity(type=str, a_search=Search())
 
-    def apply_domain_metadata(self, backend):
+    def apply_domain_metadata(self, entry_archive):
         from nomad import utils
 
-        if backend is None:
+        if entry_archive is None:
             return
 
         entry = self.m_parent
 
-        root_section = backend.entry_archive.section_experiment
+        root_section = entry_archive.section_experiment
 
         entry.formula = root_section.section_sample[0].sample_chemical_formula
         atoms = root_section.section_sample[0].sample_atom_labels
         if hasattr(atoms, 'tolist'):
```
nomad/normalizing/encyclopedia/basisset.py

```diff
@@ -50,7 +50,7 @@ class BasisSet(ABC):
     @abstractmethod
     def to_dict(self) -> RestrictedDict:
-        """Used to extract basis set settings from the backend and returning
+        """Used to extract basis set settings from the archive and returning
         them as a RestrictedDict.
         """
         pass
```
nomad/normalizing/encyclopedia/encyclopedia.py

```diff
@@ -106,7 +106,7 @@ class EncyclopediaNormalizer(Normalizer):
         except (AttributeError, KeyError):
             pass
         else:
-            # Try to find system type information from backend for the selected system.
+            # Try to find system type information from archive for the selected system.
             try:
                 system = self.section_run.section_system[system_idx]
                 stype = system.system_type
@@ -278,7 +278,7 @@ class EncyclopediaNormalizer(Normalizer):
             representative_scc_idx=representative_scc_idx,
         )
 
-        # Put the encyclopedia section into backend
+        # Put the encyclopedia section into archive
         self.fill(context)
 
         # Check that the necessary information is in place
```
nomad/normalizing/optimade.py

```diff
@@ -33,8 +33,8 @@ class OptimadeNormalizer(SystemBasedNormalizer):
     This normalizer performs all produces a section all data necessary for the Optimade API.
     It assumes that the :class:`SystemNormalizer` was run before.
     '''
-    def __init__(self, backend):
-        super().__init__(backend, only_representatives=True)
+    def __init__(self, archive):
+        super().__init__(archive, only_representatives=True)
 
     def add_optimade_data(self, index) -> OptimadeEntry:
         '''
```
nomad/normalizing/workflow.py

```diff
@@ -23,9 +23,6 @@ class WorkflowNormalizer(Normalizer):
     This normalizer performs all produces a section all data necessary for the Optimade API.
     It assumes that the :class:`SystemNormalizer` was run before.
     '''
-    def __init__(self, backend):
-        super().__init__(backend)
-
     def _get_relaxation_type(self):
         sec_system = self.section_run.section_system
         if not sec_system:
```
nomad/parsing/__init__.py

```diff
@@ -64,12 +64,10 @@ basends. In nomad@FAIRDI, we only currently only use a single backed. The follow
 classes provide a interface definition for *backends* as an ABC and a concrete implementation
 based on nomad@fairdi's metainfo:
 
-.. autoclass:: nomad.parsing.AbstractParserBackend
-    :members:
 .. autoclass:: nomad.parsing.Backend
     :members:
 '''
 
-from nomad.parsing.legacy import AbstractParserBackend, Backend, BackendError, LegacyParser
+from nomad.parsing.legacy import Backend, BackendError, LegacyParser
 from nomad.parsing.parser import Parser, BrokenParser, MissingParser, MatchingParser
 from nomad.parsing.artificial import TemplateParser, GenerateRandomParser, ChaosParser, EmptyParser
```