Commit a9ea7db5 authored by temok-mx

Updated README.md; added metadata.yml; the lead branch is now master, inactive branches became tags

parent 5742af28
stages:
  - test
testing:
  stage: test
  script:
    - cd .. && rm -rf nomad-lab-base
    - git clone --recursive git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-lab-base.git
    - cd nomad-lab-base
    - git submodule foreach git checkout master
    - git submodule foreach git pull
    - sbt cp2k/test
    - export PYTHONEXE=/labEnv/bin/python
    - sbt cp2k/test
  only:
    - master
  tags:
    - test
    - spec2
\ No newline at end of file
This is the main repository of the [NOMAD](https://www.nomad-coe.eu/) parser for
[CP2K](https://www.cp2k.org/).
This is a NOMAD parser for [CP2K](https://www.cp2k.org/). It will read CP2K input and
output files and provide all information in NOMAD's unified Metainfo based Archive format.
# Example
```python
from cp2kparser import CP2KParser
import matplotlib.pyplot as mpl
## Preparing code input and output file for uploading to NOMAD
NOMAD accepts `.zip` and `.tar.gz` archives as uploads. Each upload can contain arbitrary
files and directories. NOMAD will automatically try to choose the right parser for your files.
For each parser (i.e. for each supported code) there is one type of file that the respective
parser can recognize. We call these files `mainfiles`, as they are typically the main
output file of a code. For each `mainfile` that NOMAD discovers, it will create an entry
in the database that users can search, view, and download. NOMAD will associate all files
in the same directory with that entry. Parsers
might also read information from these auxiliary files. This way you can add more files
to an entry, even if the respective parser/code might not directly support them.
For cp2k please provide at least the files from this table if applicable to your
calculations (remember that you can provide more files if you want):
# 1. Initialize a parser with a set of default units.
default_units = ["eV"]
parser = CP2KParser(default_units=default_units)
# 2. Parse a file
path = "path/to/main.file"
results = parser.parse(path)
# 3. Query the results using the ids created specifically for NOMAD.
scf_energies = results["energy_total_scf_iteration"]
mpl.plot(scf_energies)
mpl.show()
To create an upload with all calculations in a directory structure:
```
zip -r <upload-file>.zip <directory>/*
```
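Where the `zip` command is not available, a similar archive can be built with Python's standard `zipfile` module. The following is a minimal sketch, not part of NOMAD: the function name `create_upload` and the example paths are hypothetical. Unlike the shell glob above, it also includes hidden files at the top level of the directory.

```python
import os
import zipfile


def create_upload(directory, upload_file):
    """Zip every file under `directory`, storing paths relative to its parent.

    `upload_file` should live outside `directory`, otherwise the archive
    would try to include itself.
    """
    base = os.path.dirname(os.path.abspath(directory))
    with zipfile.ZipFile(upload_file, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                # Keep the directory structure so auxiliary files stay
                # next to their mainfile inside the archive.
                archive.write(path, os.path.relpath(path, base))
    return upload_file


# Example call (hypothetical paths):
# create_upload("calculations", "upload.zip")
```

Storing paths relative to the parent directory mirrors what `zip -r` does, so each mainfile keeps its neighbouring files in the same folder.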
# Installation
The code is compatible with both Python 2 and Python 3. First download and install
the nomadcore package:
Go to the [NOMAD upload page](https://nomad-lab.eu/prod/rae/gui/uploads) to upload files
or find instructions about how to upload files from the command line.
## Using the parser
```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/python-common.git
cd python-common
pip install -r requirements.txt
pip install -e .
You can use NOMAD's parsers and normalizers locally on your computer. You need to install
NOMAD's PyPI package:
```
pip install nomad-lab
```
Then download the metainfo definitions to the same folder where the
'python-common' repository was cloned:
To parse code input/output from the command line, you can use NOMAD's command line
interface (CLI) and print the processing results to stdout:
```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-meta-info.git
```
nomad parse --show-archive <path-to-file>
```
To parse a file in Python, you can program something like this:
```python
import sys
from nomad.cli.parse import parse, normalize_all
Finally download and install the parser:
# match and run the parser
backend = parse(sys.argv[1])
# run all normalizers
normalize_all(backend)
```sh
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k.git
cd parser-cp2k
pip install -e .
# get the 'main section' section_run as a metainfo object
section_run = backend.resource.contents[0].section_run[0]
# get the same data as JSON serializable Python dict
python_dict = section_run.m_to_dict()
```
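Because the dict returned by `m_to_dict()` is described as JSON serializable, it can be written to disk with the standard `json` module. The sketch below assumes nothing about NOMAD beyond that: `save_archive` is a hypothetical helper, and the dict is an illustrative stand-in rather than real parser output.

```python
import json

# Hypothetical stand-in for the dict returned by section_run.m_to_dict();
# the keys and values below are illustrative, not real parser output.
python_dict = {
    "program_name": "CP2K",
    "program_version": "2.6.2",
}


def save_archive(data, path):
    """Write a JSON-serializable dict to `path` and return the JSON text."""
    text = json.dumps(data, indent=2, sort_keys=True)
    with open(path, "w") as handle:
        handle.write(text)
    return text


# Example call (hypothetical file name):
# save_archive(python_dict, "section_run.json")
```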
## Developing the parser
Also install NOMAD's PyPI package:
```
pip install nomad-lab
```
Clone the parser project and install it in development mode:
```
git clone https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k parser-cp2k
pip install -e parser-cp2k
```
# Notes
The parser is based on CP2K 2.6.2.
The CP2K input setting
[PRINT_LEVEL](https://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#PRINT_LEVEL)
controls the amount of detail that is output during the calculation. The
higher this setting, the more can be parsed from the upload.
The parser will try to find the paths to all the input and output files, but if
they are located very deep inside some folder structure or outside the folder
where the output file is, the parser will not be able to locate them. For this
reason it is recommended to keep the upload structure as flat as possible.
Here is a list of features/fixes that would make the parsing of CP2K results
easier:
- The PDB trajectory output doesn't seem to conform to the actual standard, as
the different configurations are separated by the END keyword, which is
supposed to be written only once in the file. The [format
specification](http://www.wwpdb.org/documentation/file-format) states that
different configurations should start with MODEL and end with ENDMDL tags.
- The output file should contain the paths/filenames of different input and
output files that are accessed during the program run. This data is already
available for some files (input file, most files produced by MD), but many
are not mentioned.
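To illustrate the PDB issue from the first item, a reader that tolerates both conventions can treat either `END` or `ENDMDL` as a frame terminator. This is a minimal sketch under that assumption, not the parser's actual implementation: `split_pdb_frames` is a hypothetical helper that keeps only `ATOM`/`HETATM` records.

```python
def split_pdb_frames(text):
    """Split PDB trajectory text into frames, each a list of atom lines.

    Both END (as written by CP2K between configurations) and ENDMDL
    (as required by the PDB format) close the current frame.
    """
    frames, atoms = [], []
    for line in text.splitlines():
        record = line[:6].strip()  # PDB record names occupy columns 1-6
        if record in ("ATOM", "HETATM"):
            atoms.append(line)
        elif record in ("END", "ENDMDL") and atoms:
            frames.append(atoms)
            atoms = []
    if atoms:
        # Trailing frame without an explicit terminator
        frames.append(atoms)
    return frames
```

Ignoring terminators that follow an empty frame means a standard file, where a single `END` comes after the last `ENDMDL`, does not produce a spurious extra frame.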
Running the parser now will use the parser's Python code from the cloned project.
code-label: CP2K
code-label-style: all in capitals
code-url: https://www.cp2k.org/
parser-dir-name: dependencies/parsers/cp2k/
parser-git-url: https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k
parser-specific: |
  ## Usage notes
  The parser is based on CP2K 2.6.2.
  The CP2K input setting
  [PRINT_LEVEL](https://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#PRINT_LEVEL)
  controls the amount of detail that is output during the calculation. The
  higher this setting, the more can be parsed from the upload.
  The parser will try to find the paths to all the input and output files, but if
  they are located very deep inside some folder structure or outside the folder
  where the output file is, the parser will not be able to locate them. For this
  reason it is recommended to keep the upload structure as flat as possible.
  Here is a list of features/fixes that would make the parsing of CP2K results
  easier:
  - The PDB trajectory output doesn't seem to conform to the actual standard, as
  the different configurations are separated by the END keyword, which is
  supposed to be written only once in the file. The [format
  specification](http://www.wwpdb.org/documentation/file-format) states that
  different configurations should start with MODEL and end with ENDMDL tags.
  - The output file should contain the paths/filenames of different input and
  output files that are accessed during the program run. This data is already
  available for some files (input file, most files produced by MD), but many
  are not mentioned.
table-of-files: ''