Commit 3156aa6f authored by Lauri Himanen's avatar Lauri Himanen

Restructured the folders and files to use a common project structure, updated readme.

parent fbdb8625
This is the main repository of the [NOMAD](https://www.nomad-coe.eu/) parser for
[BigDFT](http://bigdft.org/).

# Example
The parser is designed to be usable as a separate python package. Here is an
example of the call syntax:
```python
from bigdftparser import BigDFTParser
import matplotlib.pyplot as mpl

# 1. Initialize a parser with a set of default units.
default_units = ["eV"]
parser = BigDFTParser(default_units=default_units)

# 2. Parse a file.
path = "path/to/main.file"
results = parser.parse(path)

# 3. Query the results using the IDs created specifically for NOMAD.
scf_energies = results["energy_total_scf_iteration"]
mpl.plot(scf_energies)
mpl.show()
```
# Installation
The code is Python 2 and Python 3 compatible. First download and install
the nomadcore package:

```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/python-common.git
cd python-common
pip install -r requirements.txt
pip install -e .
```
Then download the metainfo definitions to the same folder where the
'python-common' repository was cloned:

```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-meta-info.git
```

Finally download and install the parser:

```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/parser-big-dft.git
cd parser-big-dft
pip install -e .
```

# Scala access
The Scala layer in the NOMAD infrastructure can access the parser functionality
through the scalainterface.py file, by calling the following command:

```sh
python scalainterface.py path/to/main/file
```

This Scala interface is in its own file to separate it from the rest of the
code.
# Support of different versions
The parser is designed to support multiple versions of BigDFT with a
[DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) approach: the
initial parser class is based on BigDFT 1.8.0, and other versions will be
subclassed from it. By subclassing, all the previous functionality is
preserved, new functionality can be easily added, and old functionality
overridden only where necessary.
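The subclassing scheme can be sketched as follows (the class and method names
here are hypothetical illustrations, not the real parser API):

```python
# Minimal sketch of version support via subclassing. Names are hypothetical.
class BigDFTParser180(object):
    """Base implementation, written against BigDFT 1.8.0 output."""

    def parse_energy(self, line):
        # Suppose 1.8.0 prints the energy as the last token on the line
        return float(line.split()[-1])


class BigDFTParser190(BigDFTParser180):
    """Hypothetical newer version: inherits everything from the 1.8.0
    implementation and overrides only the part of the format that changed."""

    def parse_energy(self, line):
        # Suppose the newer format appends a unit after the value
        return float(line.split()[-2])


print(BigDFTParser180().parse_energy("Energy: -1.25"))          # -1.25
print(BigDFTParser190().parse_energy("Energy: -1.25 Hartree"))  # -1.25
```

Only the changed method is redefined; everything else is inherited unchanged.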
# Developer Info
This section describes some of the guidelines that are used in the development
of this parser.

## Documentation
This parser tries to follow the [Google style
guide](https://google.github.io/styleguide/pyguide.html?showone=Comments#Comments)
for documenting Python code. Documenting makes it much easier to follow the
logic behind the parser.
## Testing
The parsers can become quite complicated, and maintaining them without
systematic testing is impossible. There are general tests that are performed
automatically in the Scala layer for all parsers. This is essential, but these
tests can only verify that the data is output in the correct format and
according to some general rules; they cannot verify that the contents are
correct.
In order to truly test the parser output, regression testing is needed. The
tests for this parser are located in the **regtest** folder. Tests provide one
way to verify each parseable quantity, and Python has a very good [library for
unit testing](https://docs.python.org/2/library/unittest.html). When the parser
supports a new quantity, it is quite fast to create unit tests for it. These
tests will validate the parsing, and also easily detect bugs that may arise
when the code is modified in the future.
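A regression test along these lines might look like the sketch below; the
results dict is faked here, whereas a real test would run the parser on a
stored BigDFT output file:

```python
import unittest

class TestScfEnergies(unittest.TestCase):
    """Regression-test sketch: the results dict is faked, where a real test
    would call something like parser.parse(path_to_stored_output)."""

    def setUp(self):
        # Stand-in for actually running the parser on a reference file
        self.results = {"energy_total_scf_iteration": [-1.0, -1.4, -1.45]}

    def test_scf_energies(self):
        energies = self.results["energy_total_scf_iteration"]
        self.assertEqual(len(energies), 3)
        # In this fabricated example the SCF energy decreases monotonically
        self.assertTrue(all(b <= a for a, b in zip(energies, energies[1:])))

# Run the test case programmatically
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestScfEnergies)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```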
## Profiling
The parsers have to be reasonably fast. For some codes there is already a
significant amount of data in the NOMAD repository, and the time taken to parse
it will depend on the performance of the parser. Also, each time the parser
evolves after system deployment, the existing data may have to be reparsed at
least partially.
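Profiling a parser run with the standard library's cProfile module can look
like this (`parse_dummy` is a stand-in for a real parsing call):

```python
import cProfile
import io
import pstats

def parse_dummy():
    # Stand-in for a real parsing run; does some busywork worth profiling
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
parse_dummy()
profiler.disable()

# Print the five most expensive calls, sorted by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```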
By profiling which functions take the most computational time and memory during
parsing you can identify the bottlenecks in the parser. There are already
existing profiling tools such as
[cProfile](https://docs.python.org/2/library/profile.html#module-cProfile)
which you can plug into your scripts very easily.

# Notes
The parser is based on BigDFT 1.8.
```
@@ -6,7 +6,6 @@ from nomadcore.baseclasses import ParserInterface

logger = logging.getLogger("nomad")


class BigDFTParser(ParserInterface):
    """This class handles the initial setup before any parsing can happen. It
    determines which version of BigDFT was used to generate the output and then
@@ -15,8 +14,8 @@ class BigDFTParser(ParserInterface):
    After the implementation has been setup, you can parse the files with
    parse().
    """
    def __init__(self, metainfo_to_keep=None, backend=None, default_units=None, metainfo_units=None, debug=True, log_level=logging.ERROR, store=True):
        super(BigDFTParser, self).__init__(metainfo_to_keep, backend, default_units, metainfo_units, debug, log_level, store)

    def setup_version(self):
        """Setups the version by looking at the output file and the version
@@ -78,4 +77,4 @@ class BigDFTParser(ParserInterface):
        except AttributeError:
            logger.exception("A parser class '{}' could not be found in the module '{}'.".format(class_name, parser_module))
            raise
        self.main_parser = parser_class(self.parser_context)
```

```
@@ -13,5 +13,5 @@ if __name__ == "__main__":
    # Initialise the parser with the main filename and a JSON backend
    main_file = sys.argv[1]
    parser = BigDFTParser(backend=JsonParseEventsWriterBackend)
    parser.parse(main_file)
```

```
@@ -7,15 +7,14 @@ from bigdftparser.generic.libxc_codes import LIB_XC_MAPPING

LOGGER = logging.getLogger("nomad")


class BigDFTMainParser(AbstractBaseParser):
    """The main parser class that is called for all run types. Parses the
    BigDFT output file.
    """
    def __init__(self, parser_context):
        """
        """
        super(BigDFTMainParser, self).__init__(parser_context)

        # Map keys in the output to functions that handle the values
        self.key_to_funct_map = {
@@ -30,7 +29,7 @@ class BigDFTMainParser(AbstractBaseParser):
            "Energy (Hartree)": lambda x: self.backend.addRealValue("energy_total", float(x), unit="hartree"),
        }

    def parse(self, filepath):
        """The output file of a BigDFT run is a YAML document. Here we directly
        parse this document with an existing YAML library, and push its
        contents into the backend. This function will read the document in
@@ -39,7 +38,7 @@ class BigDFTMainParser(AbstractBaseParser):
        """
        self.prepare()
        self.print_json_header()

        with open(filepath, "r") as fin:
            try:
                # Open default sections and output default information
                section_run_id = self.backend.openSection("section_run")
...
```
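The key-to-function dispatch used by the main parser above can be sketched in
isolation; the document contents, the "Walltime" key, and the fake dict-based
backend below are illustrative stand-ins, not the real NOMAD API:

```python
# Minimal sketch of key-to-handler dispatch over a parsed YAML-like mapping.
# The keys (apart from "Energy (Hartree)") and the fake backend are made up.
parsed_document = {
    "Energy (Hartree)": "-1.2345",
    "Walltime": "12.5",
}

backend = {}  # stands in for the NOMAD backend object

key_to_funct_map = {
    "Energy (Hartree)": lambda x: backend.update(energy_total=float(x)),
    "Walltime": lambda x: backend.update(time_run_wall=float(x)),
}

# Dispatch each known top-level key to its handler; unknown keys are skipped
for key, value in parsed_document.items():
    handler = key_to_funct_map.get(key)
    if handler is not None:
        handler(value)

print(backend)  # {'energy_total': -1.2345, 'time_run_wall': 12.5}
```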