Commit ea1ca62d in nomad-lab/parser-big-dft
Authored Nov 11, 2016 by Lauri Himanen

Initial push with the correct code structure.

Parent: 4877388a
Pipeline #8373 failed
Changes: 16 files
.gitignore
0 → 100644
# use glob syntax.
syntax: glob
*.ser
*.class
*~
*.bak
#*.off
*.old
*.pyc
*.bk
*.swp
.DS_Store
**/__pycache__
# logging files
detailed.log
# eclipse conf file
.settings
.classpath
.project
.manager
.scala_dependencies
# idea
.idea
*.iml
# building
target
build
null
tmp*
temp*
dist
test-output
build.log
# other scm
.svn
.CVS
.hg*
# switch to regexp syntax.
# syntax: regexp
# ^\.pc/
#SHITTY output not in target directory
build.log
#emacs TAGS
TAGS
lib/
env/
# Egg
parser/parser-big-dft/bigdftparser.egg-info/
.gitlab-ci.yml
0 → 100644
stages:
  - test

testing:
  stage: test
  script:
    - cd .. && rm -rf nomad-lab-base
    - git clone --recursive git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-lab-base.git
    - cd nomad-lab-base
    - git submodule foreach git checkout master
    - git submodule foreach git pull
    - sbt nwchem/test
    - export PYTHONEXE=/labEnv/bin/python
    - sbt bigdft/test
  only:
    - master
  tags:
    - test
    - spec2
README.md
[NOMAD Laboratory CoE](http://nomad-coe.eu) parser for [BigDFT](http://bigdft.org/)

This is the main repository of the [NOMAD](http://nomad-lab.eu) parser for
[BigDFT](http://bigdft.org/).
The original repository lives at
https://gitlab.mpcdf.mpg.de/nomad-lab/parser-big-dft
but you probably want to check out
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-lab-base
to also get all dependencies.
# Standalone Installation
The parser is designed to be usable as a separate python package. Here is an
example of the call syntax:
```python
from bigdftparser import BigDFTParser
import matplotlib.pyplot as mpl

# 0. Initialize a parser by giving a path to the BigDFT output file and a list of
# default units
path = "path/to/main.file"
default_units = ["eV"]
parser = BigDFTParser(path, default_units=default_units)

# 1. Parse
results = parser.parse()

# 2. Query the results using the IDs created specifically for NOMAD.
scf_energies = results["energy_total_scf_iteration"]
mpl.plot(scf_energies)
mpl.show()
```
To install this standalone version, you need to first clone the
*git@gitlab.mpcdf.mpg.de:nomad-lab/python-common.git* repository and the
*git@gitlab.mpcdf.mpg.de:nomad-lab/nomad-meta-info.git* repository into the
same folder. Then install the *python-common* package according to the
instructions found in the README. After that, you can install this package by
running either of the following two commands depending on your python version:

```sh
python setup.py develop --user  # for python2
python3 setup.py develop --user # for python3
```
# Scala access
The scala layer in the Nomad infrastructure can access the parser functionality
through the scalainterface.py file, by calling the following command:
```sh
python scalainterface.py path/to/main/file
```

This scala interface is in its own file to separate it from the rest of the
code.
# Support of different versions
The parser is designed to support multiple versions of BigDFT with a
[DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) approach: The
initial parser class is based on BigDFT 1.8.0, and other versions will be
subclassed from it. By subclassing, all the previous functionality will be
preserved, new functionality can be easily created, and old functionality
overridden only where necessary.
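
The subclassing idea can be sketched as follows. This is only an illustration:
the class names, the `1.9.0` version, and the output line formats are
hypothetical and do not come from the actual BigDFT parser.

```python
class BaseMainParser(object):
    """Hypothetical stand-in for the parser written against BigDFT 1.8.0."""
    version = "1.8.0"

    def parse_energy(self, line):
        # Assumed format: the energy is the last whitespace-separated token.
        return float(line.split()[-1])

    def parse_atom_count(self, line):
        # Assumed format: the atom count is the last token on the line.
        return int(line.split()[-1])


class NewerMainParser(BaseMainParser):
    """Hypothetical parser for a later release with a changed energy line."""
    version = "1.9.0"

    def parse_energy(self, line):
        # Only the changed behaviour is overridden: here we assume the newer
        # output appends a unit after the number.
        return float(line.split()[-2])


base = BaseMainParser()
newer = NewerMainParser()
energy_old = base.parse_energy("Total energy: -12.5")
energy_new = newer.parse_energy("Total energy: -12.5 Ha")
# parse_atom_count is inherited unchanged from the base class.
atoms_new = newer.parse_atom_count("Number of atoms: 3")
```

Everything that is not overridden in the subclass keeps working exactly as in
the base class, which is the point of the approach.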
# Developer Info
This section describes some of the guidelines that are used in the development
of this parser.
## Documentation
This parser tries to follow the
[google style guide](https://google.github.io/styleguide/pyguide.html?showone=Comments#Comments)
for documenting python code. Documenting makes it much easier to follow the
logic behind your parser.
## Testing
The parsers can become quite complicated and maintaining them without
systematic testing is impossible. There are general tests that are
performed automatically in the scala layer for all parsers. This is essential,
but can only test that the data is output in the correct format and
according to some general rules. These tests cannot verify that the contents
are correct.

In order to truly test the parser output, regression testing is needed. The
tests for this parser are located in the **regtest** folder. Tests provide one
way to test each parseable quantity and python has a very good
[library for unit testing](https://docs.python.org/2/library/unittest.html).
When the parser supports a new quantity it is quite fast to create unit tests
for it. These tests will validate the parsing, and also easily detect bugs that
may arise when the code is modified in the future.
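
A unit test for a single quantity could look roughly like the sketch below.
The `parse_scf_energies` function and the `energy:` line format are
hypothetical placeholders for the real parser and its output, used here only
to keep the example self-contained.

```python
import unittest


def parse_scf_energies(text):
    """Hypothetical stand-in for the parser: collect one energy per line."""
    energies = []
    for line in text.splitlines():
        if line.startswith("energy:"):
            energies.append(float(line.split(":", 1)[1]))
    return energies


class TestScfEnergies(unittest.TestCase):
    """Regression tests for a single parsed quantity."""

    def test_values(self):
        output = "energy: -1.0\nnoise\nenergy: -1.5\n"
        self.assertEqual(parse_scf_energies(output), [-1.0, -1.5])

    def test_empty_output(self):
        self.assertEqual(parse_scf_energies(""), [])


def run_suite():
    # Load and run the test case programmatically instead of unittest.main(),
    # so that the result object can be inspected afterwards.
    suite = unittest.TestLoader().loadTestsFromTestCase(TestScfEnergies)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
```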
## Profiling
The parsers have to be reasonably fast. For some codes there is already a
significant amount of data in the NoMaD repository and the time taken to parse
it will depend on the performance of the parser. Also, each time the parser
evolves after system deployment, the existing data may have to be reparsed at
least partially.

By profiling which functions take the most computational time and memory during
parsing you can identify the bottlenecks in the parser. There are already
existing profiling tools such as
[cProfile](https://docs.python.org/2/library/profile.html#module-cProfile)
which you can plug into your scripts very easily.
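
As a minimal sketch of plugging cProfile into a script (written for python3),
the example below profiles a hypothetical `parse_lines` function that stands
in for a real parser run and collects the most expensive calls into a string.

```python
import cProfile
import io
import pstats


def parse_lines(lines):
    """Hypothetical workload standing in for a parser run."""
    return [float(line.split()[-1]) for line in lines]


lines = ["value {}".format(i) for i in range(10000)]

# Profile only the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
values = parse_lines(lines)
profiler.disable()

# Collect the ten most expensive calls by cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
```

Printing `report` shows a table of call counts and timings per function, which
is usually enough to spot the slowest part of a parser.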
parser/parser-big-dft/bigdftparser/__init__.py
0 → 100644
from bigdftparser.parser import BigDFTParser
parser/parser-big-dft/bigdftparser/generic/__init__.py
0 → 100644
parser/parser-big-dft/bigdftparser/parser.py
0 → 100644
import os
import re
import logging
import importlib
from nomadcore.baseclasses import ParserInterface
logger = logging.getLogger("nomad")


#===============================================================================
class BigDFTParser(ParserInterface):
    """This class handles the initial setup before any parsing can happen. It
    determines which version of BigDFT was used to generate the output and then
    sets up a correct main parser.

    After the implementation has been set up, you can parse the files with
    parse().
    """
    def __init__(self, main_file, metainfo_to_keep=None, backend=None, default_units=None, metainfo_units=None, debug=True, log_level=logging.ERROR, store=True):
        super(BigDFTParser, self).__init__(main_file, metainfo_to_keep, backend, default_units, metainfo_units, debug, log_level, store)

    def setup_version(self):
        """Sets up the version by looking at the output file and the version
        specified in it.
        """
        # Search for the BigDFT version specification. The correct parser is
        # initialized based on this information.
        regex_version = re.compile(r" Northwest Computational Chemistry Package \(NWChem\) (\d+\.\d+)")
        version_id = None
        with open(self.parser_context.main_file, 'r') as outputfile:
            for line in outputfile:
                # Look for version
                result_version = regex_version.match(line)
                if result_version:
                    version_id = result_version.group(1).replace('.', '')

        if version_id is None:
            msg = "Could not find a version specification from the given main file."
            # No exception is being handled here, so log a plain error
            logger.error(msg)
            raise RuntimeError(msg)

        # Setup the root folder to the fileservice that is used to access files
        dirpath, filename = os.path.split(self.parser_context.main_file)
        dirpath = os.path.abspath(dirpath)
        self.parser_context.file_service.setup_root_folder(dirpath)
        self.parser_context.file_service.set_file_id(filename, "output")

        # Setup the correct main parser based on the version id. If no match
        # for the version is found, use the main parser for BigDFT 1.8.0
        self.setup_main_parser(version_id)

    def get_metainfo_filename(self):
        return "big_dft.nomadmetainfo.json"

    def get_parser_info(self):
        return {'name': 'big-dft-parser', 'version': '1.0'}

    def setup_main_parser(self, version_id):
        # Currently the version id is a pure integer, so it can directly be mapped
        # into a package name.
        base = "bigdftparser.versions.bigdft{}.mainparser".format(version_id)
        parser_module = None
        parser_class = None
        try:
            parser_module = importlib.import_module(base)
        except ImportError:
            logger.warning("Could not find a parser for version '{}'. Trying to default to the base implementation for BigDFT 1.8.0".format(version_id))
            base = "bigdftparser.versions.bigdft180.mainparser"
            try:
                parser_module = importlib.import_module(base)
            except ImportError:
                logger.exception("Could not find the module '{}'".format(base))
                raise
        try:
            class_name = "BigDFTMainParser"
            parser_class = getattr(parser_module, class_name)
        except AttributeError:
            logger.exception("A parser class '{}' could not be found in the module '{}'.".format(class_name, parser_module))
            raise
        self.main_parser = parser_class(self.parser_context.main_file, self.parser_context)
parser/parser-big-dft/bigdftparser/regtest/bigdft_1.8.0/run_tests.py
0 → 100644
This diff is collapsed.
parser/parser-big-dft/bigdftparser/scalainterface.py
0 → 100644
"""
This is the access point to the parser for the scala layer in the
nomad project.
"""
from __future__ import absolute_import
import sys
import setup_paths
from nomadcore.parser_backend import JsonParseEventsWriterBackend
from bigdftparser import BigDFTParser


if __name__ == "__main__":

    # Initialise the parser with the main filename and a JSON backend
    main_file = sys.argv[1]
    parser = BigDFTParser(main_file, backend=JsonParseEventsWriterBackend)
    parser.parse()
parser/parser-big-dft/bigdftparser/setup_paths.py
0 → 100644
"""
Sets up the python-common library in the PYTHONPATH system variable.
"""
import sys
import os
import os.path

baseDir = os.path.dirname(os.path.abspath(__file__))
commonDir = os.path.normpath(os.path.join(baseDir, "../../../../../python-common/common/python"))
parserDir = os.path.normpath(os.path.join(baseDir, "../../parser-big-dft"))

# Using sys.path.insert(1, ...) instead of sys.path.insert(0, ...) based on
# this discussion:
# http://stackoverflow.com/questions/10095037/why-use-sys-path-appendpath-instead-of-sys-path-insert1-path
if commonDir not in sys.path:
    sys.path.insert(1, commonDir)
    sys.path.insert(1, parserDir)
parser/parser-big-dft/bigdftparser/tools/__init__.py
0 → 100644
parser/parser-big-dft/bigdftparser/versions/__init__.py
0 → 100644
parser/parser-big-dft/bigdftparser/versions/bigdft180/__init__.py
0 → 100644
parser/parser-big-dft/bigdftparser/versions/bigdft180/mainparser.py
0 → 100644
from __future__ import absolute_import
from nomadcore.simple_parser import SimpleMatcher as SM
from nomadcore.caching_backend import CachingLevel
from nomadcore.baseclasses import MainHierarchicalParser, CacheService
import re
import logging
import numpy as np
LOGGER = logging.getLogger("nomad")


#===============================================================================
class BigDFTMainParser(MainHierarchicalParser):
    """The main parser class that is called for all run types. Parses the NWChem
    output file.
    """
    def __init__(self, file_path, parser_context):
        """
        """
        super(BigDFTMainParser, self).__init__(file_path, parser_context)

        # Cache for storing current method settings
        # self.method_cache = CacheService(self.parser_context)
        # self.method_cache.add("single_configuration_to_calculation_method_ref", single=False, update=False)

        #=======================================================================
        # Cache levels
        # self.caching_levels.update({
        #     'x_nwchem_section_geo_opt_module': CachingLevel.Cache,
        #     'x_nwchem_section_geo_opt_step': CachingLevel.Cache,
        #     'x_nwchem_section_xc_functional': CachingLevel.Cache,
        #     'x_nwchem_section_qmd_module': CachingLevel.ForwardAndCache,
        #     'x_nwchem_section_qmd_step': CachingLevel.ForwardAndCache,
        #     'x_nwchem_section_xc_part': CachingLevel.ForwardAndCache,
        # })

        #=======================================================================
        # Main Structure
        self.root_matcher = SM("",
            forwardMatch=True,
            sections=['section_run'],
            subMatchers=[
                self.input(),
                self.header(),
                self.system(),

                # This repeating submatcher supports multiple different tasks
                # within one run
                SM(r"(?:\s+NWChem DFT Module)|(?:\s+NWChem Geometry Optimization)|(?:\s+NWChem QMD Module)|(?:\s+\* NWPW PSPW Calculation \*)",
                    repeats=True,
                    forwardMatch=True,
                    subFlags=SM.SubFlags.Unordered,
                    subMatchers=[
                        self.energy_force_gaussian_task(),
                        self.energy_force_pw_task(),
                        self.geo_opt_module(),
                        self.dft_gaussian_md_task(),
                    ]
                ),
            ]
        )

    #=======================================================================
    # onClose triggers
    def onClose_section_run(self, backend, gIndex, section):
        backend.addValue("program_name", "NWChem")
        backend.addValue("program_basis_set_type", "gaussians+plane_waves")

    #=======================================================================
    # onOpen triggers
    def onOpen_section_method(self, backend, gIndex, section):
        self.method_cache["single_configuration_to_calculation_method_ref"] = gIndex

    #=======================================================================
    # adHoc
    def adHoc_forces(self, save_positions=False):
        def wrapper(parser):
            match = True
            forces = []
            positions = []

            while match:
                line = parser.fIn.readline()
                if line == "" or line.isspace():
                    match = False
                    break
                components = line.split()
                position = np.array([float(x) for x in components[-6:-3]])
                force = np.array([float(x) for x in components[-3:]])
                forces.append(force)
                positions.append(position)

            forces = -np.array(forces)
            positions = np.array(positions)

            # If anything found, push the results to the correct section
            if forces.size != 0:
                self.scc_cache["atom_forces"] = forces
            if save_positions:
                if positions.size != 0:
                    self.system_cache["atom_positions"] = positions
        return wrapper

    #=======================================================================
    # SimpleMatcher specific onClose
    def save_geo_opt_sampling_id(self, backend, gIndex, section):
        backend.addValue("frame_sequence_to_sampling_ref", gIndex)

    #=======================================================================
    # Start match transforms
    def transform_dipole(self, backend, groups):
        dipole = groups[0]
        components = np.array([float(x) for x in dipole.split()])
        backend.addArrayValues("x_nwchem_qmd_step_dipole", components)

    #=======================================================================
    # Misc
    def debug_end(self):
        def wrapper():
            print("DEBUG")
        return wrapper
setup.py
0 → 100644
"""
This is a setup script for installing the parser locally on python path with
all the required dependencies. Used mainly for local testing.
"""
from setuptools import setup, find_packages


#===============================================================================
def main():
    # Start package setup
    setup(
        name="bigdftparser",
        version="0.1",
        description="NoMaD parser implementation for BigDFT.",
        author="Lauri Himanen",
        author_email="lauri.himanen@aalto.fi",
        license="GPL3",
        package_dir={'': 'parser/parser-big-dft'},
        packages=find_packages(),
        install_requires=[
            'pint',
            'numpy',
            'nomadcore',
        ],
    )

# Run main function by default
if __name__ == "__main__":
    main()
src/main/scala/eu/nomad_lab/parsers/NWChemParser.scala
0 → 100644
package eu.nomad_lab.parsers

import eu.{ nomad_lab => lab }
import eu.nomad_lab.DefaultPythonInterpreter
import org.{ json4s => jn }
import scala.collection.breakOut

object BigDFTParser extends SimpleExternalParserGenerator(
  name = "BigDFTParser",
  parserInfo = jn.JObject(
    ("name" -> jn.JString("BigDFTParser")) ::
      ("parserId" -> jn.JString("BigDFTParser" + lab.BigdftVersionInfo.version)) ::
      ("versionInfo" -> jn.JObject(
        ("nomadCoreVersion" -> jn.JObject(lab.NomadCoreVersionInfo.toMap.map {
          case (k, v) => k -> jn.JString(v.toString)
        }(breakOut): List[(String, jn.JString)])) ::
          (lab.BigdftVersionInfo.toMap.map {
            case (key, value) =>
              (key -> jn.JString(value.toString))
          }(breakOut): List[(String, jn.JString)])
      )) :: Nil
  ),
  mainFileTypes = Seq("text/.*"),
  mainFileRe = """ Northwest Computational Chemistry Package \(NWChem\) \d+\.\d+
------------------------------------------------------
Environmental Molecular Sciences Laboratory
Pacific Northwest National Laboratory
Richland, WA 99352""".r,
  cmd = Seq(DefaultPythonInterpreter.pythonExe(), "${envDir}/parsers/big-dft/parser/parser-big-dft/bigdftparser/scalainterface.py",
    "${mainFilePath}"),
  cmdCwd = "${mainFilePath}/..",
  resList = Seq(
    "parser-big-dft/bigdftparser/__init__.py",
    "parser-big-dft/bigdftparser/setup_paths.py",
    "parser-big-dft/bigdftparser/parser.py",
    "parser-big-dft/bigdftparser/scalainterface.py",
    "parser-big-dft/bigdftparser/versions/__init__.py",
    "parser-big-dft/bigdftparser/versions/bigdft180/__init__.py",
    "parser-big-dft/bigdftparser/versions/bigdft180/mainparser.py",
    "nomad_meta_info/public.nomadmetainfo.json",
    "nomad_meta_info/common.nomadmetainfo.json",
    "nomad_meta_info/meta_types.nomadmetainfo.json",
    "nomad_meta_info/big_dft.nomadmetainfo.json"
  ) ++ DefaultPythonInterpreter.commonFiles(),
  dirMap = Map(
    "parser-big-dft" -> "parsers/big-dft/parser/parser-big-dft",
    "nomad_meta_info" -> "nomad-meta-info/meta_info/nomad_meta_info"
  ) ++ DefaultPythonInterpreter.commonDirMapping()
)
src/test/scala/eu/nomad_lab/parsers/BigDFTParserSpec.scala
0 → 100644
package eu.nomad_lab.parsers

import org.specs2.mutable.Specification

object BigDFTParserSpec extends Specification {
  "BigDFTParserTest" >> {
    "test with json-events" >> {
      ParserRun.parse(BigDFTParser, "parsers/big-dft/test/examples/single_point/output.out", "json-events") must_== ParseResult.ParseSuccess
    }
  }

  "test single_point with json" >> {
    ParserRun.parse(BigDFTParser, "parsers/big-dft/test/examples/single_point/output.out", "json") must_== ParseResult.ParseSuccess
  }
}