Metainfo suggestions
Here are a few ideas for the metainfo workshop next week:
- section_metadata refactoring: The idea behind section_metadata is to provide a "summary" of the entry contents that can also be easily serialized for ElasticSearch for performant searches. This section contains duplicate or derived information that is gathered by a normalizer from section_run or other sources. Here are a few key points that could be improved/refactored within section_metadata:
  - As we are expanding from DFT to other domains (experimental, classical potentials, higher-order quantum methods, etc.), we could think about the top-level organization of section_metadata. Instead of a top-level division between these domains (currently section_metadata has the subsections dft and ems), could we use a division that better supports cross-searches across different domains, e.g. by introducing subsections for material, method, and properties? These subsections could share some common metainfo between the different subfields (e.g. information about crystal symmetry can be included within the material subsection), but also support subfield-specific data under corresponding sections (e.g. section_metadata/method/dft vs. section_metadata/method/ems).
  - Could the information contained in section_symmetry, is_representative, and section_prototype be moved into section_metadata?
  - Information that summarizes the calculation type (e.g. geometry optimization vs. molecular dynamics vs. phonon calculation vs. single point calculation) is very hard to find within the current metainfo. section_encyclopedia now contains a relatively simple keyword for this (section_metadata/encyclopedia/calculation/calculation_type), but maybe this should be renamed and placed somewhere else? Maybe this metainfo could also be used for experimental data to summarize the technique?
  - Could we introduce a metainfo that summarizes the used calculation methodology? Currently the method information can be found by combining information from all section_methods, but the referencing mechanism makes it hard to see the overarching methodology. A single summary metainfo would be much easier to understand (and search) than navigating through section_methods. Of course, we still need to keep the section_methods, as they contain all the details.
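To make the material/method/properties idea more concrete, here is a rough sketch of what such a layout could look like and how it might flatten into dotted keys for a search index. It uses plain Python dataclasses rather than the actual metainfo definitions, and every class, field, and value in it (e.g. `summary`, `xc_functional`, `flatten`) is a hypothetical placeholder for discussion, not an existing or agreed-upon name.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Material:
    # Metainfo shared by all domains, e.g. crystal symmetry (hypothetical fields).
    formula: Optional[str] = None
    space_group_number: Optional[int] = None


@dataclass
class Method:
    # One searchable summary keyword plus domain-specific details.
    summary: Optional[str] = None
    dft: Dict[str, Any] = field(default_factory=dict)
    ems: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Properties:
    # Could also describe the technique for experimental data.
    calculation_type: Optional[str] = None


@dataclass
class SectionMetadata:
    material: Material = field(default_factory=Material)
    method: Method = field(default_factory=Method)
    properties: Properties = field(default_factory=Properties)


def flatten(tree: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys, skipping unset (None) values."""
    flat: Dict[str, Any] = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        elif value is not None:
            flat[path] = value
    return flat


# Illustrative entry; the values are made up for the example.
entry = SectionMetadata(
    material=Material(formula="Si2", space_group_number=227),
    method=Method(summary="DFT", dft={"xc_functional": "GGA"}),
    properties=Properties(calculation_type="geometry optimization"),
)
document = flatten(asdict(entry))
```

With this kind of layout, a cross-domain search would only need the shared keys (such as `material.space_group_number`), while domain-specific filters would live under `method.dft.*` or `method.ems.*`.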
- Normalization: If I understood correctly, all extensive properties should be reported per simulation cell? Related to this:
  - The property vibrational_free_energy_at_constant_volume is reported per atom; should this be fixed?
  - It would make sense to also report some of the values in a normalized form that can be directly compared between materials, e.g. heat_capacity/specific_heat_capacity, dos_values/dos_values_normalized, etc. What would be a good mechanism for providing these values, how should they be named, and is there some kind of default normalization we should use (e.g. per atom, per kilogram)?
  - The normalized band structure energies and values should be reported under section_k_band, and section_k_band_normalized should be removed.
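As a starting point for the normalization discussion, the conversions themselves are simple. The sketch below assumes an extensive value reported per simulation cell and a list of atomic masses in atomic mass units; the function names and the returned keys are hypothetical and only meant to illustrate the per-atom and per-kilogram options mentioned above.

```python
from typing import Dict, Sequence

# Atomic mass constant in kilograms (CODATA 2018).
AMU_IN_KG = 1.66053906660e-27


def normalize_extensive(
    value_per_cell: float, atomic_masses_amu: Sequence[float]
) -> Dict[str, float]:
    """Return an extensive per-cell value together with normalized variants."""
    n_atoms = len(atomic_masses_amu)
    if n_atoms == 0:
        raise ValueError("the simulation cell must contain at least one atom")
    cell_mass_kg = sum(atomic_masses_amu) * AMU_IN_KG
    return {
        "per_cell": value_per_cell,
        "per_atom": value_per_cell / n_atoms,
        "per_kg": value_per_cell / cell_mass_kg,
    }


def per_atom_to_per_cell(value_per_atom: float, n_atoms: int) -> float:
    """Convert a value that was reported per atom back to a per-cell value."""
    return value_per_atom * n_atoms


# Illustrative numbers only: a two-atom silicon cell (28.085 amu per atom)
# with a made-up heat capacity of 6.0e-23 J/K for the whole cell.
masses = [28.085, 28.085]
heat_capacity = normalize_extensive(6.0e-23, masses)

# A per-atom free energy (made-up value) converted to a per-cell value.
free_energy_per_cell = per_atom_to_per_cell(-0.05, n_atoms=len(masses))
```

Whichever default we pick, storing the per-cell value alongside the normalized one would keep the original data recoverable.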