nomad-FAIR issueshttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues2021-06-17T16:41:04Zhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/438Add categories for normalized metadata2021-06-17T16:41:04ZLauri HimanenAdd categories for normalized metadatahttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/611Statistics for "marketing"2021-12-20T07:45:07ZMarkus ScheidgenStatistics for "marketing"As a NOMAD PI, I want to generate marketing artefacts that show the current NOMAD data and usage. Statistics on NOMAD data can be generated via the API. Statistics on usage rely on data that is not obtained or stored by NOMAD itself, but that can be acquired depending on how NOMAD is operated.
## data sources
Statistics from NOMAD API:
- how many entries and how much metadata
- the infamous per code/method chart
- histogram of authors, uploads, growth
Statistics from the NOMAD file system (could be integrated into API)
- how much data on disk (e.g. raw files in TB)
Internal statistics, e.g. from web-server logs
- how much "traffic", DAU, visits on certain parts of the app: encyclopedia, aitoolkit, archive, raw-files
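As a rough illustration of the API-side statistics listed above, the following sketch aggregates a list of entry records into per-code counts and a cumulative monthly growth series. The flat record layout (`program_name`, `upload_create_time`) and the function name `summarize_entries` are assumptions made for this example only; real NOMAD entry metadata is nested and would have to be fetched and flattened first.

```python
from collections import Counter
from datetime import datetime


def summarize_entries(entries):
    """Count entries per code and build a cumulative per-month growth series.

    `entries` is a list of flat dicts; the keys used here are hypothetical
    simplifications of the real entry metadata.
    """
    per_code = Counter()
    per_month = Counter()
    for entry in entries:
        per_code[entry.get("program_name", "unknown")] += 1
        created = datetime.fromisoformat(entry["upload_create_time"])
        per_month[created.strftime("%Y-%m")] += 1

    # Cumulative growth: running total of entries over the sorted months.
    growth = {}
    total = 0
    for month in sorted(per_month):
        total += per_month[month]
        growth[month] = total
    return dict(per_code), growth


if __name__ == "__main__":
    sample = [
        {"program_name": "VASP", "upload_create_time": "2021-01-15T10:00:00"},
        {"program_name": "VASP", "upload_create_time": "2021-02-01T08:30:00"},
        {"program_name": "exciting", "upload_create_time": "2021-02-20T12:00:00"},
    ]
    codes, growth = summarize_entries(sample)
    print(codes)
    print(growth)
```

The two returned dictionaries could feed the per-code chart and the growth histogram respectively, independent of how the charts are finally rendered.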
## What kind of artefacts do we want
Depending on the type of statistics/report, we could:
- add a set of statistic components (e.g. to show on the `/about` "dashboard"), also useful to visualize oasis usage
- generated piece of HTML for the web-page (or some iframe stuff)
- generated pdf report
- individual charts as pdfFelix DietrichFelix Dietrichhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/869Meaningful error not raised when an indentation error is made in yaml file2022-11-10T15:33:42ZAndrea AlbinoMeaningful error not raised when an indentation error is made in yaml fileThe error raised when I misindent "base_section" is:
AttributeError: 'str' object has no attribute 'update'
Working:
[INDENTATION_W.archive.yaml](/uploads/fb8c6195f3df2a869c5c2a00ee3eefc0/INDENTATION_W.archive.yaml)
Not Working:
[INDENTATION_NW.archive.yaml](/uploads/c61c721844ab8c946469c778bcca0915/INDENTATION_NW.archive.yaml)https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1159Export data entries as CSV2022-11-17T08:31:41ZJose Marquez PrietoExport data entries as CSVHaving the possibility to download the entries from the Results table as CSV will lower the entry barrier for many new users. After a discussion with @himanel1, my understanding is that we would need to modify the API endpoint in `nomad/app/v1/routers/entries.py` so that the format can be specified. @himanel1, please feel free to modify the issue and correct it.Lauri HimanenLauri Himanenhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1194Graph database prototype2022-11-28T08:15:42ZLauri HimanenGraph database prototypeAfter discussions across the areas, it has become clear that the current `results`-section is not fulfilling our needs anymore. The results section as it stands is only able to properly capture workflows that contain a single system/material, a single method, and several properties. We are now transitioning into a much more complicated scenario with multiple systems, multiple methods, and very complex workflow graphs which can dramatically differ between entries.
To try out a solution, we want to test a graph database that could better capture all of the interesting properties in these more complicated workflows. This first step will only be a POC, which attempts to capture **only the systems** in a workflow, and consists of the following steps:
- [ ] Simple (local) performance test of how Neo4J queries scale with different types of data.
  - [ ] How the query time scales with respect to the number of entries in the database (e.g. range 100-100 000 entries)
  - [ ] How the query time scales with respect to the size of individual graphs in the database (e.g. range 10-10000 nodes and edges)
  - [ ] How the query time scales with respect to the query complexity (e.g. range 1-10 connections queried)
- [ ] Agree on a very simple base class that represents systems; the most minimal definition will do for now. The agreement should be between areas A, B and C, also possibly looking into optimade and already existing ontologies a bit.
- [ ] Adding Neo4J into our docker infrastructure
- [ ] Metainfo annotations for Neo4J. Certain quantities and sections can become nodes, parent/child relationships can become edges.
- [ ] Adding Neo4J ingestion based on the annotations.
- [ ] Adding an entry query API endpoint for Neo4J.
- [ ] Adding a new search menu that builds meaningful queries based on user input.https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1641Neo4j prototype2023-08-16T09:10:26ZLauri HimanenNeo4j prototypeWe should build a small demo to see if Neo4j could be used as a graph db solution within NOMAD.
Tasks:
- [x] Add Neo4j to the dev docker compose
- [ ] Build Annotation class for Graph database
- [ ] Build function for creating the mapping/schema for Neo4j based on annotations in the metainfo
- [ ] Build function for ingesting several archives into Neo4j based on the annotations
Relevant at least for: @hnaesstroem, @g-michaelgoetteHampus NaesstroemHampus Naesstroemhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1600Dispersive Material Database2023-10-13T08:20:55ZJose Marquez PrietoDispersive Material DatabaseWe would like to have an implementation of the [refractive index database](https://refractiveindex.info/) in NOMAD. It will include an Overview card, a section in results to store and index some data, and a dedicated Search app.
We need:
- [x] Write `OpticalProperties` results section in `results.properties`: `nomad/datamodel/results.py`
- [x] Parse/normalize the data to the results section
- `Formula` in `nomad/atomutils.py` could greatly help. Also, the new PubChem-related substance base sections from #1585, could be of great help, particularly for the organic compounds.
- [x] Write an overview card for Optical Properties
- `gui/src/components/visualization`
- `gui/src/components/entry/properties`
- [ ] Define the search app
- For the central service in the `models.py` file
- An example of a `gui` section in `nomad.yaml` for an Oasis use caseFlorian DobenerFlorian Dobenerhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1759Dynamic Quantities and Updating without Reprocessing2023-11-01T09:58:18ZNathan DaelmanDynamic Quantities and Updating without ReprocessingAtm, any quantity that we want to appear in the ES query, has to be stored in the archive and indexed.
However, some quantities are derived from an already stored archive quantity with a many-to-one mapping.
In many of these cases, I would see it possible that one wants to **update these quantities without necessarily reprocessing**.
Real case scenarios include:
- URLs to a git project or code: the project migrated
- summing of an array: we get user requests to add search statistics of an array previously not considered
- metadata from XC functionals / Force Fields: the scientific knowledge here gets updated
- searching groups of elements, e.g. transition metals, rare-earth, actinides, lanthanides, halogens, etc.
Due to the many-to-one relationship, we can derive these quantities from other stored quantities.
Could we devise an additional layer, so:
1. queries for some quantities are transformed into queries for the mapped quantities, returning the corresponding entries? Obviously, the mapped quantities have to be indexed.
2. an end user cannot distinguish between a stored or dynamically derived quantity?
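The query-rewriting layer asked about in point 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `DERIVED_QUANTITIES`, `ELEMENT_GROUPS` and `rewrite_query`, the flat query layout, and the `{"any": [...]}` filter shape are all hypothetical, and the element lists are illustrative rather than exhaustive.

```python
# Hypothetical lookup table: value of the derived quantity -> stored values.
ELEMENT_GROUPS = {
    "halogens": ["F", "Cl", "Br", "I", "At"],
    "noble_gases": ["He", "Ne", "Ar", "Kr", "Xe", "Rn"],
}

# Hypothetical registry: derived quantity -> (stored, indexed quantity, table).
DERIVED_QUANTITIES = {
    "element_group": ("elements", ELEMENT_GROUPS),
}


def rewrite_query(query):
    """Replace derived quantities with "any of" filters on stored quantities."""
    rewritten = {}
    for name, value in query.items():
        if name in DERIVED_QUANTITIES:
            stored_name, table = DERIVED_QUANTITIES[name]
            if value not in table:
                raise KeyError(f"No mapping for {name}={value!r}")
            # The derived quantity never reaches the index; only the
            # stored quantity it maps to is queried.
            rewritten[stored_name] = {"any": list(table[value])}
        else:
            rewritten[name] = value
    return rewritten


if __name__ == "__main__":
    print(rewrite_query({"element_group": "halogens", "program_name": "VASP"}))
```

Because the mapping lives in a table rather than in the archive, updating it changes search results immediately and without reprocessing. The same table could be read in the opposite direction when an entry is displayed, which is what point 2 would require.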
For example: somebody looks for a field describing the dataset used to train an XC functional. We, internally, have a map to `xc_functional_name`. They get all the entries with the matching `xc_functional_name`. When they visit a specific entry, they see the dataset under DATA, even though it is not stored in the archive.https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1601Syntactic sugar for yaml schemas2023-11-01T10:48:45ZMarkus ScheidgenSyntactic sugar for yaml schemasWhen you write Metainfo schemas in `.archive.yaml` files, you have to follow the Metainfo "language" very precisely. On top of that, we also lack semantic checks and good error messages. As a result, it is not very easy to write these schemas. Additionally, our YAML "flavour" is very different from the one used by the Nexus tools. This also does not help.
### Examples
We need to put some "syntactic sugar" around our YAML. Here is a before/after example.
```yaml
section_definitions:
  Values:
    description: Represents a named array for float data
    quantities:
      name:
        type: str
      description:
        type: str
      values:
        type: np.float64
        shape: ['*']
  MySection:
    base_sections: nomad.datamodel.EntryData
    quantities:
      time:
        type: int
        shape: ['*']
        unit: s
    sub_sections:
      values:
        section: Values
        repeats: true
```
```yaml
Values:
  description: Represents a named array for float data
  name: str
  description: str
  values: np.float64[*]
MySection(nomad.datamodel.EntryData):
  time: int[*] in s
  values: Values*
```
However, there are a few problems:
- definition properties like `name` or `description` might collide with properties that the schema wants to define (like `Values.description` is colliding with `Section.description`).
- it might not be clear if we want to define a quantity or a sub_section
A more explicit form without these problems might be this:
```yaml
Values:
  m_def: Section
  m_description: Represents a named array for float data
  name:
    m_type: str
  values:
    m_type: np.float64
    m_shape: ['*']
MySection(nomad.datamodel.EntryData):
  time:
    m_def: Quantity
    m_type: int
    m_shape: ['*']
    m_unit: s
  values:
    m_def: SubSection
    m_section: Values
    m_repeats: true
```
But we lose some convenience again. How much can be implied and how much has to be explicit?
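To make the trade-off concrete, here is a minimal sketch of how the compact strings from the short form (`int[*] in s`, `Values*`) could be expanded into the explicit properties of the long form. The function name `expand`, the regular expression, and the rule that a name counts as a sub-section only if it is a known section are assumptions made for this illustration, not an agreed design.

```python
import re

# type name, optional [shape], optional "in <unit>"
_QUANTITY = re.compile(
    r"^(?P<type>[\w.]+)"
    r"(?:\[(?P<shape>[^\]]*)\])?"
    r"(?:\s+in\s+(?P<unit>.+))?$"
)


def expand(value, section_names):
    """Expand one shorthand string into (kind, explicit properties).

    `kind` is "quantities" or "sub_sections"; `section_names` is the set of
    sections defined in the same schema (hypothetical disambiguation rule).
    """
    text = value.strip()
    repeats = text.endswith("*")
    name = text[:-1] if repeats else text
    if name in section_names:
        return "sub_sections", {"section": name, "repeats": repeats}

    match = _QUANTITY.match(text)
    if match is None:
        raise ValueError(f"Cannot parse shorthand {value!r}")
    props = {"type": match["type"]}
    if match["shape"] is not None:
        # Split "3, 3" or "*" into individual dimensions.
        props["shape"] = [d.strip() for d in match["shape"].split(",") if d.strip()]
    if match["unit"]:
        props["unit"] = match["unit"].strip()
    return "quantities", props


if __name__ == "__main__":
    print(expand("int[*] in s", {"Values"}))
    print(expand("Values*", {"Values"}))
```

In this sketch the quantity versus sub-section ambiguity is resolved by looking up the name among the defined sections. Anything that cannot be resolved raises an error, which is where an explicit `m_def` would still be needed.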
### Implementation
- can we keep a line/col mapping to objects parsed from YAML to include in errors
- errors should include paths, e.g. "MySection.values.m_section: The referenced section Values does not exist."
- the output is dict data that can be put into `Package.m_from_dict`. The validation of the resulting `Package` might produce semantic errors that ideally could also be reported back to a path?
- it's all about giving options: lots of aliases, users decide if they want to have it explicit or not
- documentation is important
- ideally this can be reused for nexus. Their schema files currently look like this: https://github.com/FAIRmat-Experimental/nexus_definitions/tree/3c4cbcbb90640336206b99b75e03735f2353b9c6/applications/nyamlAhmed IlyasAhmed Ilyashttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1549Representing notebooks as custom ELN schema2023-11-13T14:04:04ZAdam FeketeRepresenting notebooks as custom ELN schemaAdam FeketeAdam Feketehttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1765Support Specializing Custom Quantity Types2023-12-13T09:45:57ZHampus NaesstroemSupport Specializing Custom Quantity TypesIt seems like it is currently only supported to specialize native python types and not custom ones. I.e. the following works:
```python
from nomad.metainfo import (
    Quantity,
)
from nomad.datamodel.data import (
    EntryData,
    ArchiveSection,
)


class A(ArchiveSection):
    a = Quantity(type=str)


class B(A):
    a = A.a.m_copy()
    a.type = int


class C(EntryData):
    c = Quantity(type=A)
```
But trying to specialize the type of the property `c` fails. I.e. adding this:
```python
class D(C):
    c = C.c.m_copy()
    c.type = B
```
Will cause the following error message when running the appworker:
```bash
nomad.metainfo.metainfo.MetainfoError: Type <class 'hzb_unold_lab.schema.B'> of nomad.metainfo.metainfo.Quantity.type:Quantity is not a valid metainfo quantity type
```
*The test was performed in a plugin called `hzb_unold_lab`*
Not sure if it is relevant, but VS Code also doesn't seem to pick up the `m_copy()` method for `Quantity` like it does for `SubSection`.https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/823Conditionally adding some processing results to the raw file.2023-12-21T15:55:24ZAndrea AlbinoConditionally adding some processing results to the raw file.We allow schemas to normalize data from `*.archive.json|yaml` raw files. Sometimes it is desired to put these changes or additions back into the raw file. A specific example is ELNs that parse a referenced file and add the contents to the ELN entry. Users here would expect that the data they see is also available from the raw file. If they press the save button twice, it would even show up in the raw file, because the ELN functionality edits the archive and saves it to the raw file.
**This issue was changed. The old issue text**:
The first time I press save from the ELN, the archive is not filled. The second time I press it, the archive gets filled.
The two files ending with "2nd_SAVE.json" are the ones generated after the second time I press save.
[gwzkYaOipjERZgHrzmU1WyvOv4Qy.json](/uploads/b538b012ed93c9ac52f8a5347390afd2/gwzkYaOipjERZgHrzmU1WyvOv4Qy.json)[aaa.archive.json](/uploads/d6dddd11e909a89d95e8bdfd3586823e/aaa.archive.json)[gwzkYaOipjERZgHrzmU1WyvOv4Qy_2nd_SAVE.json](/uploads/7f337884e3975c68d65c0b375a0c2a33/gwzkYaOipjERZgHrzmU1WyvOv4Qy_2nd_SAVE.json)
[aaa.archive_2nd_SAVE.json](/uploads/b7f7a4cd4fe4401f7c4a35a59e4cbde1/aaa.archive_2nd_SAVE.json)https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1713Improved widget axis settings2023-12-21T15:55:43ZLauri HimanenImproved widget axis settingsWe need to add more configuration options for the widget axes used by the histogram and scatter plot widgets:
- [ ] Ability to specify the unit
- [ ] Ability to specify a custom label
- [ ] Ability to specify the label "mode": if no custom label is defined, how many levels of the hierarchy should be displayed?
- [ ] The histogram x-axis title should be moved to the bottom and a y-axis should be added, as reported in #1697.
- [ ] Improve the UX for the statistics scaling:
- Instead of showing the scaling option name, show a 'graph' icon that opens a dropdown.
- Add a tooltip for each option that fully defines the scaling mathematically
- Replace the fairly odd options (1/2, 1/4, 1/8) with a simple `log` option?Lauri HimanenLauri Himanenhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1605PDOS schema and parsing2023-12-21T15:58:57ZJose PizarroPDOS schema and parsing@ndaelman @aalbino @lucamghi
We need a schema that can store the **projected DOS** (PDOS) in NOMAD. This is mainly interesting when doing post-DFT calculations (such as Wannier or Slater-Koster models, or even for DMFT and beyond). I will start working on it (as it pretty much touches Task C3), but can you help me review it, @ndaelman?
Changes separated in two branches:
**1605-pdos-schema-and-parsing**:
* [x] Fix bugs with _run.system_ and _run.calculation_ in CP2K parser.
* [x] SinglePoint parsing fix (https://github.com/nomad-coe/electronic-parsers/pull/141).
* [x] GeomOpt parsing fix (done by @ladinesa).
* [x] For different spin channels, `n_energies` and `energy_fermi` can be different; then:
* [x] Define `spin_channel` in `Calculation.Dos`.
* [x] Populate `run.calculation.dos_electronic.species_projected`, `atom_projected`, and `orbital_projected` for CP2K.
* [x] Generate DOS (in `DosNormalizer`) from lower to upper levels, i.e., from orbital -\> atom -\> species -\> total.
* [x] Debug parsers for this refactoring.
* [x] Define `DOSNew` and `DOSElectronicNew` to allow for future deprecating of all sections.
* [x] Debug normalizers for this refactoring.
* [x] Add / modify pytesting.
**1605-electronic-prop-and-dos-card-layout**:
* [ ] Add plotting for projected DOS: possibly, clickable options inside the kebab menu?
* [ ] Add React spec testing.Jose PizarroJose Pizarrohttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1796allow searching reaction energy and formula from workflow entries2023-12-21T15:59:01ZJulia Schumannallow searching reaction energy and formula from workflow entries@ndaelman and I want to be able to search the chemical reaction workflow results, energies and reaction formula. We think we need a dedicated section in results for that.
Not sure if we want to put it under catalysis -> computational catalysis -> reaction_name, reactants, products, reaction energy, activation energy ? We want to be flexible enough to accommodate redox energies for batteries later.
What are your thoughts, @himanel1 ?
I have shared an upload with one example reaction with @ndaelman @himanel1 https://nomad-lab.eu/prod/v1/staging/gui/user/uploads/upload/id/eGvhu4qwRRCtFQkp3C0ijQ/entry/id/ACkaJJMl_HC-WzwrszAVaWnt-Lw3https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1770Visualize Fermi Surface2023-12-21T15:59:02ZNathan DaelmanVisualize Fermi SurfaceThere's a request to introduce the Fermi surface with a color coding for the band energies, e.g. Figure 6 from the [EPW documentation](https://docs.epw-code.org/doc/MgB2.html).
The main considerations to settle on:
- At which point should the plot be generated? During parsing / normalization, visualization in the Overview card, or in North?
- If we go for the Overview card, do we want a React card or a Plotly graph?
- How do we decide which bands to include? Do we select them all, or use an energy cut-off window? Should it be up to the user?
- Should we later on be able to support an animation of the T-dependence?
These considerations are constrained by the
- data size / complexity
- processing time
- degree of plotting interactivity
- necessity for exporting
@pizarroj, what is your opinion on these specifications?https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1155QMC metainfo and ALF parser2023-12-21T16:18:34ZJose PizarroQMC metainfo and ALF parserIncluding the (auxiliary-field) quantum Monte Carlo package, [ALF](https://gitpages.physik.uni-wuerzburg.de/ALF/ALF_Webpage/page/about/).
- [x] Create QMC metainfo
- [x] Add parser
- [ ] Populate HoppingMatrix:
  - [ ] N-leg-ladder
  - [x] Square and Honeycomb
  - [ ] Bilayer Square and Bilayer Honeycomb
- [x] Populate results:
  - [x] GreensFunctions
  - [x] Energies
  - [x] Correlations
- [ ] PlotsJose PizarroJose Pizarrohttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1854Add bond information to structure files2024-01-18T09:18:48ZLauri HimanenAdd bond information to structure filesSometimes simulations contain explicit bond information. This is currently stored in `system.atoms.bond_list`. Whenever this information is present, we should add it to the structure files produced by the systems API endpoint. Not all formats will support this, but this should be added at least to the .pdb format that currently drives the Overview visualization. With the bond information, NGL can also display the bonds.Lauri HimanenLauri Himanenhttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1824Improved unwrapping of structures2024-01-23T11:44:04ZLauri HimanenImproved unwrapping of structuresDinga WonankeDinga Wonankehttps://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1671FileEditQuantity Option to Not Process File2024-02-01T13:31:23ZHampus NaesstroemFileEditQuantity Option to Not Process FileWe need the option for the ELN FileEditQuantity to not process the file that is being uploaded.
Currently we (@hnaesstroem, @g-michaelgoette, and @sbrueck) write a parser which creates an additional ELN entry where this file is referenced using a FileEditQuantity. The actual parsing is then done in the normalizer of that ELN class. This is needed so that the user can either just upload the file after their process is done OR start taking notes in an ELN and then later add the file. However, currently the file gets processed when it is being uploaded in the FileEditQuantity and we get 2 ELN entries.
See https://github.com/FAIRmat-NFDI/AreaA-data_modeling_and_schemas/tree/53-add-substrate-support-to-ikz-pld-plugin/PVD/thermal_evaporation/hzb_unold_lab_pvdp/hzb_unold_lab_plugin/src/hzb_unold_lab for an example.
I see two ways that this could be solved:
1. The file is simply not processed at all if an annotation is given to the FileEditQuantity. This has the disadvantage that the file does not get any metadata and just looks like a raw file.
2. The parser somehow gets the information if (and in that case from which entry) this was uploaded using a FileEditQuantity.