Commit d54602d6 authored by Lauri Himanen

Fixed naming in readme, started using the new way of giving the input file to the parser.

parent ea52d86d
This is the main repository of the NOMAD parser for NWChem.
# Installation
This parser is a submodule of the nomad-lab-base repository. Developers within
the NoMaD project will automatically get a copy of this repository when they
download and install the base repository.
# Structure
The Scala layer can access the parser functionality through the file, by calling the following command:
python path/to/main/file
This Scala interface is kept in its own file to separate it from the
rest of the code. Some parsers have the interface in the same file as the
parsing code, but I feel that this is a cleaner approach.
The parser is designed to support multiple versions of NWChem with a
DRY (don't repeat yourself) approach: the
initial parser class is based on NWChem 6.6, and other versions will be
subclassed from it. By subclassing, all the previous functionality will be
preserved, new functionality can be easily created, and old functionality
overridden only where necessary.
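The subclassing idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the version label "9.9", the line formats, and the `get_parser` helper are all hypothetical and are not taken from the actual parser code.

```python
class BaseMainParser(object):
    """Hypothetical stand-in for the parser written against NWChem 6.6."""

    version = "6.6"

    def parse_energy(self, line):
        # Baseline behaviour: the value is the last whitespace-separated token.
        return float(line.split()[-1])

    def parse_charge(self, line):
        # Shared by all versions unless a subclass overrides it.
        return int(float(line.split()[-1]))


class NewerMainParser(BaseMainParser):
    """Hypothetical parser for a made-up later version "9.9"."""

    version = "9.9"

    def parse_energy(self, line):
        # Invented format change: a unit label now follows the value.
        return float(line.split()[-2])


# Illustrative registry that maps version strings to implementations.
PARSERS = {cls.version: cls for cls in (BaseMainParser, NewerMainParser)}


def get_parser(version):
    """Return a parser for the version, falling back to the 6.6 baseline."""
    return PARSERS.get(version, BaseMainParser)()
```

Only `parse_energy` is overridden in the subclass, so `parse_charge` is inherited unchanged and does not need to be duplicated.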
# Standalone Mode
The parser is designed to also be usable outside the NoMaD project as a
separate python package. This standalone python-only mode is primarily for
people who want to easily access the parser without the need to set up the whole
"NOMAD Stack". It is also used when running custom unit tests found in the
folder *nwchem/test/unittests*. Here is an example of the call syntax:
# Example
from nwchemparser import NWChemParser
import matplotlib.pyplot as mpl
# 1. Initialize a parser by giving a path to the NWChem output file and a list of
# default units
path = "path/to/main.file"
# 1. Initialize a parser with a set of default units.
default_units = ["eV"]
parser = NWChemParser(path, default_units=default_units)
parser = NWChemParser(default_units=default_units)
# 2. Parse
results = parser.parse()
# 2. Parse a file
path = "path/to/main.file"
results = parser.parse(path)
# 3. Query the results using the IDs created specifically for NOMAD.
scf_energies = results["energy_total_scf_iteration"]
@@ -52,53 +20,31 @@ folder *nwchem/test/unittests*. Here is an example of the call syntax:
To install this standalone version, you need to clone the repositories
"python-common", "nomad-meta-info", and "parser-nwchem" into the same folder.
Then install the python-common according to the instructions found in the
README. After that, you can install this package by running either of the
following two commands depending on your python version:
# Installation
The code is python 2 and python 3 compatible. First download and install
the nomadcore package:
python develop --user
python3 develop --user
git clone
cd python-common
pip install -r requirements.txt
pip install -e .
# Tools and Methods
This section describes some of the guidelines that are used in the development
of this parser.
## Documentation
This parser tries to follow the Google style
for documenting python code. Documenting makes it much easier to follow the
logic behind your parser.
Then download the metainfo definitions to the same folder where the
'python-common' repository was cloned:
## Testing
The parsers can become quite complicated, and maintaining them without
systematic testing is impractical. There are general tests that are
performed automatically in the Scala layer for all parsers. This is essential,
but can only test that the data is output in the correct format and
according to some general rules. These tests cannot verify that the contents
are correct.
git clone
In order to truly test the parser output, regression testing is needed. The
tests for this parser are located in
**/nwchem/parser/parser-nwchem/nwchemparser/regtest**. Tests provide one way to test
each parseable quantity, and python has a very good library for unit
testing. When the parser
supports a new quantity, it is quite fast to create unit tests for it. These
tests will validate the parsing and also easily detect bugs that may arise when
the code is modified in the future.
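A test for a single parsed quantity could look roughly like the sketch below, written with the standard-library `unittest` module. Everything specific here is an assumption made for illustration: `parse_scf_energies` is a hypothetical stand-in for the real parser, and the sample text and its numbers are made up rather than taken from a real NWChem run.

```python
import re
import unittest


def parse_scf_energies(text):
    """Hypothetical stand-in for the parser: collect SCF energies from text."""
    pattern = re.compile(r"Total SCF energy\s*=\s*(-?\d+\.\d+)")
    return [float(value) for value in pattern.findall(text)]


# Made-up sample output used only for this illustration.
SAMPLE_OUTPUT = """
 Total SCF energy =    -76.010496
 Total SCF energy =    -76.010538
"""


class TestScfEnergy(unittest.TestCase):
    def test_values(self):
        # Each expected value is checked individually.
        energies = parse_scf_energies(SAMPLE_OUTPUT)
        self.assertEqual(len(energies), 2)
        self.assertAlmostEqual(energies[0], -76.010496)
        self.assertAlmostEqual(energies[1], -76.010538)

    def test_missing_quantity(self):
        # Text without the quantity yields an empty result.
        self.assertEqual(parse_scf_energies("no energies here"), [])


def run_scf_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestScfEnergy)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_scf_tests()` executes both tests. In a real test module the same class would normally be picked up by `python -m unittest`.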
Finally download and install the parser:
## Profiling
The parsers have to be reasonably fast. For some codes there is already a
significant amount of data in the NoMaD repository, and the time taken to parse
it will depend on the performance of the parser. Also, each time the parser
evolves after system deployment, the existing data may have to be reparsed at
least partially.
git clone
cd parser-nwchem
pip install -e .
By profiling which functions take the most computational time and memory during
parsing, you can identify the bottlenecks in the parser. Existing profiling
tools can be plugged into your scripts very easily.
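One readily available option is `cProfile` from the Python standard library, which may or may not be the tool meant above. The sketch below measures run time only, not memory, and is written for Python 3. The `parse_lines` function and the `profile_call` helper are hypothetical stand-ins, not part of this parser.

```python
import cProfile
import io
import pstats


def parse_lines(lines):
    """Hypothetical stand-in for a parsing routine."""
    return [float(line.split()[-1]) for line in lines]


def profile_call(func, *args):
    """Run func under cProfile and return its result plus a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Made-up input lines standing in for a large output file.
lines = ["energy = {0}".format(i * 0.5) for i in range(10000)]
values, report = profile_call(parse_lines, lines)
print(report)
```

The report lists call counts and cumulative times per function, which is usually enough to spot the slowest part of a parsing run.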
# Notes
The parser is based on NWChem 6.6.
@@ -8,7 +8,6 @@ from nomadcore.baseclasses import ParserInterface
logger = logging.getLogger("nomad")
class NWChemParser(ParserInterface):
"""This class handles the initial setup before any parsing can happen. It
determines which version of NWChem was used to generate the output and then
@@ -17,8 +16,8 @@ class NWChemParser(ParserInterface):
After the implementation has been setup, you can parse the files with
def __init__(self, main_file, metainfo_to_keep=None, backend=None, default_units=None, metainfo_units=None, debug=True, log_level=logging.ERROR, store=True):
super(NWChemParser, self).__init__(main_file, metainfo_to_keep, backend, default_units, metainfo_units, debug, log_level, store)
def __init__(self, metainfo_to_keep=None, backend=None, default_units=None, metainfo_units=None, debug=True, log_level=logging.ERROR, store=True):
super(NWChemParser, self).__init__(metainfo_to_keep, backend, default_units, metainfo_units, debug, log_level, store)
def setup_version(self):
"""Sets up the version by looking at the output file and the version
@@ -77,4 +76,4 @@ class NWChemParser(ParserInterface):
except AttributeError:
logger.exception("A parser class 'NWChemMainParser' could not be found in the module '{}'.".format(parser_module))
self.main_parser = parser_class(self.parser_context.main_file, self.parser_context)
self.main_parser = parser_class(self.parser_context)
# Unit tests
This directory contains unit tests to evaluate the correctness of the parser in
a systematic way. Ideally, each parsed metainfo should have at least one unit
test, and if the resulting values are predetermined, the available values
should all be tested individually. Scenarios that should produce a
parsing error should also be tested.
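As a rough illustration of testing both predetermined values and an error scenario, the sketch below uses `unittest` with `assertRaises`. The `ParsingError` type, the `parse_total_charge` function, and the `Charge : <value>` line format are hypothetical and are not taken from the parser itself.

```python
import unittest


class ParsingError(Exception):
    """Hypothetical error type raised for unparseable input."""


def parse_total_charge(line):
    """Hypothetical stand-in: read the charge from a 'Charge : <value>' line."""
    try:
        return int(float(line.split(":")[1]))
    except (IndexError, ValueError):
        raise ParsingError("Could not read charge from: {0!r}".format(line))


class TestTotalCharge(unittest.TestCase):
    def test_each_value(self):
        # Predetermined values are each checked individually.
        cases = [("Charge : 0", 0), ("Charge : 1.0", 1), ("Charge : -2", -2)]
        for text, expected in cases:
            self.assertEqual(parse_total_charge(text), expected)

    def test_malformed_line_raises(self):
        # A scenario that is expected to produce a parsing error.
        with self.assertRaises(ParsingError):
            parse_total_charge("Charge : n/a")


def run_charge_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTotalCharge)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```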
@@ -13,5 +13,5 @@ if __name__ == "__main__":
# Initialise the parser with the main filename and a JSON backend
main_file = sys.argv[1]
parser = NWChemParser(main_file, backend=JsonParseEventsWriterBackend)
parser = NWChemParser(backend=JsonParseEventsWriterBackend)
@@ -8,15 +8,14 @@ import numpy as np
LOGGER = logging.getLogger("nomad")
class NWChemMainParser(MainHierarchicalParser):
"""The main parser class that is called for all run types. Parses the NWChem
output file.
def __init__(self, file_path, parser_context):
def __init__(self, parser_context):
super(NWChemMainParser, self).__init__(file_path, parser_context)
super(NWChemMainParser, self).__init__(parser_context)
# Cache for storing current method settings
self.method_cache = CacheService(self.parser_context)
@@ -763,7 +762,7 @@ class NWChemMainParser(MainHierarchicalParser):
def transform_total_charge(self, backend, groups):
charge = groups[0]
self.backend.addValue("total_charge", round(float(charge)))
self.backend.addValue("total_charge", int(float(charge)))
def set_gaussian_basis(self, backend, groups):
self.method_cache["program_basis_set_type"] = "gaussians"
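The two conversions that appear in `transform_total_charge` above, `round(float(charge))` and `int(float(charge))`, agree for integer-valued strings but differ otherwise. The sketch below shows the difference under Python 3 semantics, where `round` returns an `int` and rounds halves to the nearest even number. The helper names are hypothetical, and the sample strings are illustrative rather than taken from NWChem output.

```python
def charge_with_round(text):
    """Round to the nearest integer (Python 3: returns int, halves go to even)."""
    return round(float(text))


def charge_with_int(text):
    """Truncate toward zero."""
    return int(float(text))


# Illustrative inputs; only the first one is integer-valued.
for sample in ["1.00000", "0.99999", "-0.6", "2.5"]:
    print(sample, charge_with_round(sample), charge_with_int(sample))
```

For a string such as "0.99999" the rounded result is 1 while the truncated result is 0, so the two forms are interchangeable only if the parsed charge is always written as a whole number.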