This is an archived project. Repository and other project resources are read-only.
nomad-lab
parser-cp2k
Commit
892d047c
authored
8 years ago
by
Lauri Himanen
Cleaned up the readme file.
parent
5881bcd6
README.md
+35
-85
35 additions, 85 deletions
This is the main repository of the [NOMAD](http://nomad-lab.eu) parser for
[CP2K](https://www.cp2k.org/).
# Installation
This parser is a submodule of the nomad-lab-base repository. Developers within
the NoMaD project will automatically get a copy of this repository when they
download and install the base repository.
# Structure
The scala layer can access the parser functionality through the
scalainterface.py file, by calling the following command:
```sh
python scalainterface.py path/to/main/file
```
This scala interface is placed in its own file to keep it separate from the
rest of the code. Some parsers will have the interface in the same file as the
parsing code, but I feel that this is a cleaner approach.
The parser is designed to support multiple versions of CP2K with a
[DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)
approach: The initial parser class is based on CP2K 2.6.2, and other versions
will be subclassed from it. By subclassing, all the previous functionality will
be preserved, new functionality can be easily created, and old functionality
overridden only where necessary.
# Upload Folder Structure, File Naming and CP2K Settings
## Notes for Uploaders
The CP2K input setting
[PRINT_LEVEL](https://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#PRINT_LEVEL)
controls the amount of detail that is output during the calculation. The
higher this setting is, the more can be parsed from the upload.
The parser will try to find the paths to all the input and output files, but if
they are located very deep inside some folder structure or outside the folder
where the output file is, the parser will not be able to locate them. For this
reason it is recommended to keep the upload structure as flat as possible.
# Standalone Mode
The parser is designed to be usable also outside the NoMaD project as a
separate python package. This standalone python-only mode is primarily for
people who want to easily access the parser without the need to set up the whole
"NOMAD Stack". It is also used when running custom unit tests found in the
folder *cp2k/test/unittests*. Here is an example of the call syntax:
# Example
```python
from cp2kparser import CP2KParser
import matplotlib.pyplot as mpl
...
mpl.show()
```
To install this standalone version, you need to clone the repositories
"python-common", "nomad-meta-info", and "parser-cp2k" into the same folder.
Then install the python-common according to the instructions found in the
README. After that, you can install this package by running the following (if
using python3, use the python3 executable):
# Installation
The code is compatible with python>=2.7 and python>=3.4. First download and
install the nomadcore package:
```sh
python setup.py develop --user
#python3 setup.py develop --user
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/python-common.git
cd python-common
pip install -r requirements.txt
pip install -e .
```
# Tools and Methods
This section describes some of the guidelines that are used in the development
of this parser.
Download and install the parser:
## Documentation
This parser tries to follow the
[google style guide](https://google.github.io/styleguide/pyguide.html?showone=Comments#Comments)
for documenting python code. Documenting makes it much easier to follow the
logic behind your parser.
```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k.git
cd parser-cp2k
pip install -e .
```
## Testing
The parsers can become quite complicated, and maintaining them without
systematic testing is impractical. There are general tests that are
performed automatically in the scala layer for all parsers. This is essential,
but these tests can only check that the data is output in the correct format
and according to some general rules. They cannot verify that the contents
are correct.
# Advanced
In order to truly test the parser output, regression testing is needed. The
tests for this parser are located in
**/cp2k/parser/parser-cp2k/cp2kparser/regtest**. Tests provide one way to test
each parseable quantity, and python has a very good
[library for unit testing](https://docs.python.org/2/library/unittest.html).
When the parser supports a new quantity, it is quite fast to create unit tests
for it. These tests will validate the parsing, and also easily detect bugs that
may arise when the code is modified in the future.
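As a rough illustration of what a per-quantity unit test can look like, here is a minimal sketch using the standard `unittest` module. The function `parse_total_energy` and the output line format are hypothetical stand-ins invented for this example; they are not part of this parser's actual API.

```python
import unittest


def parse_total_energy(text):
    """Hypothetical stand-in for a parser: return the total energy in text."""
    for line in text.splitlines():
        if line.strip().startswith("Total energy:"):
            return float(line.split(":")[1])
    return None


class TestTotalEnergy(unittest.TestCase):
    # Made-up output snippet, used only for this illustration.
    OUTPUT = "  SCF run converged\n  Total energy:   -17.125\n"

    def test_value(self):
        # The parsed quantity should match the value written in the output.
        self.assertAlmostEqual(parse_total_energy(self.OUTPUT), -17.125)

    def test_missing(self):
        # A file without the quantity should yield None instead of raising.
        self.assertIsNone(parse_total_energy("no energy here"))


def run_suite():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTotalEnergy)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` runs both tests; one small test case per quantity keeps a regression easy to trace back to the quantity that broke.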
The parser is designed to support multiple versions of CP2K with a
[DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)
approach: The initial parser class is based on CP2K 2.6.2, and other versions
will be subclassed from it. By subclassing, all the previous functionality will
be preserved, new functionality can be easily created, and old functionality
overridden only where necessary.
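The subclassing idea can be sketched roughly as follows. All class names, method names, the version number "3.0" and the line formats below are hypothetical and chosen only for illustration; the real parser classes in this repository may be organized differently.

```python
class CP2KParserBase(object):
    """Hypothetical base parser, standing in for the CP2K 2.6.2 parser."""

    version = "2.6.2"

    def parse_energy(self, line):
        # Assumed line format for illustration: "ENERGY| Total:   -17.1"
        return float(line.split(":")[-1])

    def parse_program_name(self, line):
        # Assumed line format for illustration: "PROGRAM CP2K"
        return line.split()[-1]


class CP2KParser300(CP2KParserBase):
    """Hypothetical parser for a later version with a changed energy line."""

    version = "3.0"

    def parse_energy(self, line):
        # Override only what changed: assume a unit now follows the value.
        return float(line.split(":")[-1].split()[0])


def select_parser(version):
    """Return a parser instance for the version, falling back to the base."""
    parsers = {cls.version: cls for cls in (CP2KParserBase, CP2KParser300)}
    return parsers.get(version, CP2KParserBase)()
```

In this sketch `select_parser("3.0").parse_program_name(...)` is inherited unchanged from the base class, while `parse_energy` is the only method that had to be rewritten.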
## Profiling
The parsers have to be reasonably fast. For some codes there is already a
significant amount of data in the NoMaD repository, and the time taken to parse
it will depend on the performance of the parser. Also, each time the parser
evolves after system deployment, the existing data may have to be reparsed at
least partially.
# Upload Folder Structure, File Naming and CP2K Settings
The CP2K input setting
[PRINT_LEVEL](https://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#PRINT_LEVEL)
controls the amount of detail that is output during the calculation. The
higher this setting is, the more can be parsed from the upload.
By profiling which functions take the most computational time and memory during
parsing, you can identify the bottlenecks in the parser. There are already
existing profiling tools such as
[cProfile](https://docs.python.org/2/library/profile.html#module-cProfile)
which you can plug into your scripts very easily.
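A minimal sketch of plugging cProfile into a script is shown below, written for Python 3. The function `parse_lines` is a hypothetical stand-in for a parsing routine; only the `cProfile`, `pstats` and `io` calls are standard library API.

```python
import cProfile
import io
import pstats


def parse_lines(lines):
    """Hypothetical stand-in for a parsing routine."""
    return [float(line.split()[-1]) for line in lines]


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report in a string instead of printing it to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


values, report = profile_call(parse_lines, ["E %d" % i for i in range(1000)])
```

Printing `report` lists the functions with the largest cumulative time, which is usually enough to spot the first bottleneck.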
The parser will try to find the paths to all the input and output files, but if
they are located very deep inside some folder structure or outside the folder
where the output file is, the parser will not be able to locate them. For this
reason it is recommended to keep the upload structure as flat as possible.
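To show why a flat upload structure helps, here is a sketch of a depth-limited file search. This is not the parser's actual search code: the function `find_files`, the `max_depth` value and the file extensions are assumptions made only for this example.

```python
import os
import tempfile


def find_files(root, extensions, max_depth=2):
    """Return relative paths of matching files at most max_depth levels below root."""
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            depth = 0
        else:
            depth = os.path.relpath(dirpath, root).count(os.sep) + 1
        if depth >= max_depth:
            # Clearing dirnames in place stops os.walk from descending further.
            dirnames[:] = []
        for name in filenames:
            if os.path.splitext(name)[1] in extensions:
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


# Build a small throwaway upload tree to demonstrate the behaviour.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "a", "b", "c"))
deep_file = os.path.join("a", "b", "c", "deep.xyz")
for rel in ["run.inp", os.path.join("a", "run.out"), deep_file]:
    open(os.path.join(base, rel), "w").close()

shallow = find_files(base, {".inp", ".out", ".xyz"})
deeper = find_files(base, {".xyz"}, max_depth=3)
```

With the assumed default depth, `shallow` contains `run.inp` and `a/run.out` but not the deeply nested `deep.xyz`, which is only found once the depth limit is raised.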
## Testing
The regression tests for this parser are located in
**/cp2k/parser/parser-cp2k/cp2kparser/regtest**. You can run the tests by
running the run_tests.py file in one of the version directories.
# Notes for CP2K Developers
## Notes for CP2K Developers
Here is a list of features/fixes that would make the parsing of CP2K results
easier:
- The pdb trajectory output doesn't seem to conform to the actual standard as
...
...