From a9ea7db5a98e473b0295516e93c1e9edfb46012a Mon Sep 17 00:00:00 2001
From: temok-mx <temok.mx@gmail.com>
Date: Thu, 10 Sep 2020 17:23:12 +0200
Subject: [PATCH] Updated README.md; added metadata.yml; the lead branch is now
 master, inactive branches became tags

---
 .gitlab-ci.yml |  19 --------
 README.md      | 120 ++++++++++++++++++++++++++-----------------------
 metadata.yml   |  32 +++++++++++++
 3 files changed, 95 insertions(+), 76 deletions(-)
 delete mode 100644 .gitlab-ci.yml
 create mode 100644 metadata.yml

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
deleted file mode 100644
index e25c269..0000000
--- a/.gitlab-ci.yml
+++ /dev/null
@@ -1,19 +0,0 @@
-stages:
-  - test
-
-testing:
-  stage: test
-  script:
-    - cd .. && rm -rf nomad-lab-base
-    - git clone --recursive git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-lab-base.git
-    - cd nomad-lab-base
-    - git submodule foreach git checkout master
-    - git submodule foreach git pull
-    - sbt cp2k/test
-    - export PYTHONEXE=/labEnv/bin/python
-    - sbt cp2k/test
-  only:
-    - master
-  tags:
-    - test
-    - spec2
\ No newline at end of file
diff --git a/README.md b/README.md
index af9a84a..feaee4a 100644
--- a/README.md
+++ b/README.md
@@ -1,72 +1,78 @@
-This is the main repository of the [NOMAD](https://www.nomad-coe.eu/) parser for
-[CP2K](https://www.cp2k.org/).
+This is a NOMAD parser for [CP2K](https://www.cp2k.org/). It will read CP2K input and
+output files and provide all information in NOMAD's unified Metainfo-based Archive format.
 
-# Example
-```python
-    from cp2kparser import CP2KParser
-    import matplotlib.pyplot as mpl
+## Preparing code input and output files for uploading to NOMAD
+
+NOMAD accepts `.zip` and `.tar.gz` archives as uploads. Each upload can contain arbitrary
+files and directories. NOMAD will automatically try to choose the right parser for your files.
+For each parser (i.e. for each supported code) there is one type of file that the respective
+parser can recognize. We call these files `mainfiles` as they typically are the main
+output file of a code. For each `mainfile` that NOMAD discovers it will create an entry
+in the database that users can search, view, and download. NOMAD will associate all files
+in the same directory as files that also belong to that entry. Parsers
+might also read information from these auxiliary files. This way you can add more files
+to an entry, even if the respective parser/code might not directly support it.
+
+For CP2K, please provide at least the files from this table if applicable to your
+calculations (remember that you can provide more files if you want):
 
-    # 1. Initialize a parser with a set of default units.
-    default_units = ["eV"]
-    parser = CP2KParser(default_units=default_units)
 
-    # 2. Parse a file
-    path = "path/to/main.file"
-    results = parser.parse(path)
 
-    # 3. Query the results with using the id's created specifically for NOMAD.
-    scf_energies = results["energy_total_scf_iteration"]
-    mpl.plot(scf_energies)
-    mpl.show()
+To create an upload with all calculations in a directory structure:
+
+```
+zip -r <upload-file>.zip <directory>/*
 ```
 
-# Installation
-The code is python 2 and python 3 compatible. First download and install
-the nomadcore package:
+Go to the [NOMAD upload page](https://nomad-lab.eu/prod/rae/gui/uploads) to upload files
+or find instructions about how to upload files from the command line.
+
+## Using the parser
 
-```sh
-git clone https://gitlab.mpcdf.mpg.de/nomad-lab/python-common.git
-cd python-common
-pip install -r requirements.txt
-pip install -e .
+You can use NOMAD's parsers and normalizers locally on your computer. You need to install
+NOMAD's PyPI package:
+
+```
+pip install nomad-lab
 ```
 
-Then download the metainfo definitions to the same folder where the
-'python-common' repository was cloned:
+To parse code input/output from the command line, you can use NOMAD's command line
+interface (CLI) and print the processing results to stdout:
 
-```sh
-git clone https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-meta-info.git
 ```
+nomad parse --show-archive <path-to-file>
+```
+
+To parse a file in Python, you can write something like this:
+```python
+import sys
+from nomad.cli.parse import parse, normalize_all
 
-Finally download and install the parser:
+# match and run the parser
+backend = parse(sys.argv[1])
+# run all normalizers
+normalize_all(backend)
 
-```sh
-git clone https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k.git
-cd parser-cp2k
-pip install -e .
+# get the 'main section' section_run as a metainfo object
+section_run = backend.resource.contents[0].section_run[0]
+
+# get the same data as JSON serializable Python dict
+python_dict = section_run.m_to_dict()
+```
+
+## Developing the parser
+
+First, install NOMAD's PyPI package:
+
+```
+pip install nomad-lab
+```
+
+Clone the parser project and install it in development mode:
+
+```
+git clone https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k parser-cp2k
+pip install -e parser-cp2k
 ```
 
-# Notes
-The parser is based on CP2K 2.6.2.
-
-The CP2K input setting
-[PRINT_LEVEL](https://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#PRINT_LEVEL)
-controls the amount of details that are outputted during the calculation. The
-higher this setting is, the more can be parsed from the upload.
-
-The parser will try to find the paths to all the input and output files, but if
-they are located very deep inside some folder structure or outside the folder
-where the output file is, the parser will not be able to locate them. For this
-reason it is recommended to keep the upload structure as flat as possible.
-
-Here is a list of features/fixes that would make the parsing of CP2K results
-easier:
- - The pdb trajectory output doesn't seem to conform to the actual standard as
-   the different configurations are separated by the END keyword which is
-   supposed to be written only once in the file. The [format
-   specification](http://www.wwpdb.org/documentation/file-format) states that
-   different configurations should start with MODEL and end with ENDMDL tags.
- - The output file should contain the paths/filenames of different input and
-   output files that are accessed during the program run. This data is already
-   available for some files (input file, most files produced by MD), but many
-   are not mentioned.
+Running the parser now will use the parser's Python code from the cloned project.
diff --git a/metadata.yml b/metadata.yml
new file mode 100644
index 0000000..d46378a
--- /dev/null
+++ b/metadata.yml
@@ -0,0 +1,32 @@
+code-label: CP2K
+code-label-style: all in capitals
+code-url: https://www.cp2k.org/
+parser-dir-name: dependencies/parsers/cp2k/
+parser-git-url: https://gitlab.mpcdf.mpg.de/nomad-lab/parser-cp2k
+parser-specific: |
+  ## Usage notes
+  The parser is based on CP2K 2.6.2.
+
+  The CP2K input setting
+  [PRINT_LEVEL](https://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#PRINT_LEVEL)
+  controls the amount of detail that is output during the calculation. The
+  higher this setting is, the more can be parsed from the upload.
+
+  The parser will try to find the paths to all the input and output files, but if
+  they are located very deep inside some folder structure or outside the folder
+  where the output file is, the parser will not be able to locate them. For this
+  reason it is recommended to keep the upload structure as flat as possible.
+
+  Here is a list of features/fixes that would make the parsing of CP2K results
+  easier:
+  - The pdb trajectory output doesn't seem to conform to the actual standard as
+    the different configurations are separated by the END keyword which is
+    supposed to be written only once in the file. The [format
+    specification](http://www.wwpdb.org/documentation/file-format) states that
+    different configurations should start with MODEL and end with ENDMDL tags.
+  - The output file should contain the paths/filenames of different input and
+    output files that are accessed during the program run. This data is already
+    available for some files (input file, most files produced by MD), but many
+    are not mentioned.
+
+table-of-files: ''
-- 
GitLab