nomad-FAIR issues
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues
2021-12-20T07:45:07Z
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/611
Statistics for "marketing"
2021-12-20T07:45:07Z
Markus Scheidgen
As a NOMAD PI, I want to generate marketing artefacts that show the current NOMAD data and usage. Statistics on NOMAD data can be generated via API. Statistics on usage rely on data that is not obtained or stored by NOMAD itself, but that can be acquired depending on how NOMAD is operated.
## data sources
Statistics from NOMAD API:
- how many entries and how much metadata
- the infamous per code/method chart
- histogram of authors, uploads, growth
Statistics from the NOMAD file system (could be integrated into API)
- how much data on disk (e.g. raw files in TB)
Internal statistics, e.g. from web-server logs
- how much "traffic", DAU, visits to certain parts of the app: encyclopedia, aitoolkit, archive, raw-files
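As one possible way to get the traffic numbers, DAU per app part could be derived from web-server access logs. The following is only a sketch: it assumes logs in the common/combined log format, treats distinct client IPs as a stand-in for users, and uses hypothetical path fragments (`SECTIONS`) that would have to be adapted to the actual routes.

```python
import re
from collections import defaultdict

# Matches the start of a common/combined log format line:
# client IP, day of the request, HTTP method and requested path.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?:GET|POST) (?P<path>\S+)'
)

# Hypothetical path fragments for the app parts of interest (assumption).
SECTIONS = ("encyclopedia", "aitoolkit", "archive", "raw")


def daily_active_users(lines):
    """Count distinct client IPs per (day, section)."""
    users = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        for section in SECTIONS:
            if f"/{section}" in match["path"]:
                users[(match["day"], section)].add(match["ip"])
    return {key: len(ips) for key, ips in users.items()}
```

The resulting dictionary can be fed directly into whatever chart or report generator is chosen for the artefacts.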
## What kind of artefacts do we want
Depending on the type of statistics/report, we could:
- add a set of statistic components (e.g. to show on the `/about` "dashboard"), also useful to visualize oasis usage
- generated piece of HTML for the web-page (or some iframe stuff)
- generated pdf report
- individual charts as pdf
Felix Dietrich
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1759
Dynamic Quantities and Updating without Reprocessing
2023-11-01T09:58:18Z
Nathan Daelman
Atm, any quantity that we want to appear in the ES query has to be stored in the archive and indexed.
However, some quantities are derived from an already stored archive quantity with a many-to-one mapping.
In many of these cases, I can see someone wanting to **update these quantities without necessarily reprocessing**.
Real-world scenarios include:
- URLs to a git project or code: the project migrated
- summing of an array: we get user requests to add search statistics of an array previously not considered
- metadata from XC functionals / Force Fields: the scientific knowledge here gets updated
- searching groups of elements, e.g. transition metals, rare-earth, actinides, lanthanides, halogens, etc.
Due to the many-to-one relationship, we can derive these quantities from other stored quantities.
Could we devise an additional layer, so:
1. queries for some quantities are transformed into queries for the mapped quantities, returning the corresponding entries? Obviously, the mapped quantities have to be indexed.
2. an end user cannot distinguish between a stored or dynamically derived quantity?
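A minimal sketch of what the query-rewriting part of such a layer could look like, in plain Python. Everything here is hypothetical: the `DERIVED_QUANTITIES` registry, the quantity names, and the `:any` suffix used to express "matches any of these values" are illustrative assumptions, not existing NOMAD code.

```python
# Hypothetical registry: derived quantity -> (stored quantity, value map).
# Each derived value lists the stored values that map onto it (many-to-one).
DERIVED_QUANTITIES = {
    "element_group": (
        "elements",
        {
            "halogens": ["F", "Cl", "Br", "I", "At"],
            "noble_gases": ["He", "Ne", "Ar", "Kr", "Xe", "Rn"],
        },
    ),
}


def rewrite_query(query):
    """Replace derived quantities by 'any of' conditions on stored quantities."""
    rewritten = {}
    for quantity, value in query.items():
        if quantity in DERIVED_QUANTITIES:
            stored_quantity, value_map = DERIVED_QUANTITIES[quantity]
            # Every stored value that maps to the requested derived value
            # satisfies the original query.
            rewritten[f"{stored_quantity}:any"] = list(value_map[value])
        else:
            rewritten[quantity] = value
    return rewritten
```

Because only the registry changes when, say, a URL migrates or a new element group is added, the stored and indexed quantities would stay untouched and no reprocessing would be needed.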
For example: somebody looks for a field describing the dataset used to train an XC functional. We, internally, have a map to `xc_functional_name`. They get all the entries with the matching `xc_functional_name`. When they visit a specific entry, they see the dataset under DATA, even though it is not stored in the archive.
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1932
Theming, profiles, and launch urls for NORTH
2024-03-14T13:32:33Z
Markus Scheidgen
This is a continuation of #1897 (!1677)
- [ ] NOMAD and FAIRmat logos for JupyterHub, Jupyter, and JupyterLab
- [ ] !1677 added code to generate profiles for the k8s JupyterHub, but it does not play well with NOMAD launching containers via service, and it does not work for webtop tools
- [ ] profiles should allow hub urls that launch a tool (without nomad support, e.g. mounted uploads)
- [ ] hub should use the nomad api to determine the user options
Adam Fekete
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1900
Support serialization of references to inner sub sections in foreign Python packages
2024-02-23T13:54:12Z
Markus Scheidgen
If you serialize a Python metainfo package that contains a reference to an inner section definition in another package, you get a `MetainfoReferenceError` stating that this is not yet supported.
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1788
Landing page
2024-02-21T10:15:52Z
Lauri Himanen
The explore menu should be replaced with a dedicated landing page (gui/explore). @wojasadr has already created a template for the layout that should be translated into a proper page.
Tasks:
- [ ] `ExplorePage` component
- [ ] `App` component (Displays app title, description, possible icon)
- [ ] `AppCategory` component (Displays a list of apps with category name)
- [ ] Change links in the nomad-lab.eu webpage once released to production.
Lauri Himanen
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1765
Support Specializing Custom Quantity Types
2023-12-13T09:45:57Z
Hampus Naesstroem
It seems that specializing is currently only supported for native Python types and not custom ones. I.e. the following works:
```python
from nomad.metainfo import (
    Quantity,
)
from nomad.datamodel.data import (
    EntryData,
    ArchiveSection,
)


class A(ArchiveSection):
    a = Quantity(type=str)


class B(A):
    # Specializing a native Python type works.
    a = A.a.m_copy()
    a.type = int


class C(EntryData):
    c = Quantity(type=A)
```
But trying to specialize the type of the property `c` fails. I.e. adding this:
```python
class D(C):
    # Specializing the custom type A to its subclass B fails.
    c = C.c.m_copy()
    c.type = B
```
Will cause the following error message when running the appworker:
```bash
nomad.metainfo.metainfo.MetainfoError: Type <class 'hzb_unold_lab.schema.B'> of nomad.metainfo.metainfo.Quantity.type:Quantity is not a valid metainfo quantity type
```
*The test was performed in a plugin called `hzb_unold_lab`*
Not sure if it is relevant, but VS Code also doesn't seem to pick up the `m_copy()` method for `Quantity` like it does for `SubSection`.
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1671
FileEditQuantity Option to Not Process File
2024-02-01T13:31:23Z
Hampus Naesstroem
We need the option for the ELN FileEditQuantity to not process the file that is being uploaded.
Currently we (@hnaesstroem, @g-michaelgoette, and @sbrueck) write a parser which creates an additional ELN entry where this file is referenced using a FileEditQuantity. The actual parsing is then done in the normalizer of that ELN class. This is needed so that the user can either just upload the file after their process is done OR start taking notes in an ELN and then later add the file. However, currently the file gets processed when it is being uploaded in the FileEditQuantity and we get 2 ELN entries.
See https://github.com/FAIRmat-NFDI/AreaA-data_modeling_and_schemas/tree/53-add-substrate-support-to-ikz-pld-plugin/PVD/thermal_evaporation/hzb_unold_lab_pvdp/hzb_unold_lab_plugin/src/hzb_unold_lab for an example.
I see two ways that this could be solved:
1. The file is simply not processed at all if an annotation is given to the FileEditQuantity. This has the disadvantage that the file does not get any metadata and just looks like a raw file.
2. The parser somehow gets the information whether (and if so, from which entry) the file was uploaded using a FileEditQuantity.
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1601
Syntactic sugar for yaml schemas
2023-11-01T10:48:45Z
Markus Scheidgen
When you write Metainfo schemas in `.archive.yaml` files, you have to follow the Metainfo "language" very precisely. On top of that, we also lack semantic checks and good error messages. As a result, it is not very easy to write these schemas. Additionally, our YAML "flavour" is very different from the one used by the Nexus tools. This also does not help.
### Examples
We need to put some "syntactic sugar" around our YAML. Here is a before/after example.
```yaml
section_definitions:
  Values:
    description: Represents a named array for float data
    quantities:
      name:
        type: str
      description:
        type: str
      values:
        type: np.float64
        shape: ['*']
  MySection:
    base_sections: nomad.datamodel.EntryData
    quantities:
      time:
        type: int
        shape: ['*']
        unit: s
    sub_sections:
      values:
        section: Values
        repeats: true
```
```yaml
Values:
  description: Represents a named array for float data
  name: str
  description: str
  values: np.float64[*]
MySection(nomad.datamodel.EntryData):
  time: int[*] in s
  values: Values*
```
However, there are a few problems:
- definition properties like `name` or `description` might collide with properties that the schema wants to define (like `Values.description` is colliding with `Section.description`).
- it might not be clear if we want to define a quantity or a sub_section
A more explicit form without these problems might be this:
```yaml
Values:
  m_def: Section
  m_description: Represents a named array for float data
  name:
    m_type: str
  values:
    m_type: np.float64
    m_shape: ['*']
MySection(nomad.datamodel.EntryData):
  time:
    m_def: Quantity
    m_type: int
    m_shape: ['*']
    m_unit: s
  values:
    m_def: SubSection
    m_section: Values
    m_repeats: true
```
But we lose some convenience again. How much can be implied and how much has to be explicit?
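To make the trade-off more tangible, here is a rough sketch of how shorthand values such as `np.float64[*]`, `int[*] in s`, or `Values*` could be expanded into the explicit form. The function name, the regular expression, and the rule "a value naming a known section becomes a sub section" are assumptions made for illustration; nothing here is an existing Metainfo API.

```python
import re

# <type>, optional [shape], optional "in <unit>", e.g. "int[*] in s".
QUANTITY_SHORTHAND = re.compile(
    r"^(?P<type>[\w.]+)"
    r"(?:\[(?P<shape>[^\]]*)\])?"
    r"(?:\s+in\s+(?P<unit>.+))?$"
)


def expand_property(value, section_names):
    """Return ("quantities" | "sub_sections", explicit definition dict)."""
    value = value.strip()
    # A trailing "*" on a known section name marks a repeating sub section.
    repeats = value.endswith("*")
    target = value[:-1] if repeats else value
    if target in section_names:
        return "sub_sections", {"section": target, "repeats": repeats}

    match = QUANTITY_SHORTHAND.match(value)
    if match is None:
        raise ValueError(f"Cannot interpret shorthand: {value!r}")
    definition = {"type": match["type"]}
    if match["shape"] is not None:
        dims = [dim.strip() for dim in match["shape"].split(",") if dim.strip()]
        # Keep "*" as a string, turn fixed dimensions into integers.
        definition["shape"] = [int(dim) if dim.isdigit() else dim for dim in dims]
    if match["unit"] is not None:
        definition["unit"] = match["unit"].strip()
    return "quantities", definition
```

With this rule, the quantity/sub-section ambiguity is resolved by looking up the set of section names, which works only if all sections are known before properties are expanded; the `m_def` form avoids that dependency at the cost of verbosity.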
### Implementation
- can we keep a line/col mapping to objects parsed from YAML to include in errors?
- errors should include paths, e.g. "MySection.values.m_section: The referenced section Values does not exist."
- the output is dict data that can be put into `Package.m_from_dict`. The validation of the resulting `Package` might produce semantic errors that ideally could also be reported back to a path?
- it's all about giving options: lots of aliases, users decide if they want to have it explicit or not
- documentation is important
- ideally this can be reused for nexus. Their schema files currently look like this: https://github.com/FAIRmat-Experimental/nexus_definitions/tree/3c4cbcbb90640336206b99b75e03735f2353b9c6/applications/nyaml
Ahmed Ilyas
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1194
Graph database prototype
2022-11-28T08:15:42Z
Lauri Himanen
After discussions across the areas, it has become clear that the current `results`-section is not fulfilling our needs anymore. The results section as it stands is only able to properly capture workflows that contain a single system/material, a single method, and several properties. We are now transitioning into a much more complicated scenario with multiple systems, multiple methods, and very complex workflow graphs which can differ dramatically between entries.
In order to try out a solution, we wanted to test a graph database that could better capture all of the interesting properties in these more complicated workflows. This first step will only be a POC, which attempts to capture **only the systems** in a workflow, and consists of the following steps:
- [ ] Simple (local) performance test of how Neo4J queries scale with different types of data.
  - [ ] How the query time scales with respect to the number of entries in the database (e.g. range 100-100 000 entries)
  - [ ] How the query time scales with respect to the size of individual graphs in the database (e.g. range 10-10000 nodes and edges)
  - [ ] How the query time scales with respect to the query complexity (e.g. range 1-10 connections queried)
- [ ] Agree on a very simple base class that represents systems; the most minimal definition will do for now. The agreement should be between areas A, B and C, also possibly looking into optimade and already existing ontologies a bit.
- [ ] Adding Neo4J into our docker infrastructure
- [ ] Metainfo annotations for Neo4J. Certain quantities and sections can become nodes, parent/child relationships can become edges.
- [ ] Adding Neo4J ingestion based on the annotations.
- [ ] Adding an entry query API endpoint for Neo4J.
- [ ] Adding a new search menu that builds meaningful queries based on user input.
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1159
Export data entries as CSV
2022-11-17T08:31:41Z
Jose Marquez Prieto
Being able to download the entries from the Results table as CSV will lower the entry barrier for many new users. After a discussion with @himanel1, my understanding is that we would need to modify the API endpoint in `nomad/app/v1/routers/entries.py` so that the format can be specified. @himanel1, please feel free to modify the issue and correct it.
Lauri Himanen
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/869
Meaningful error not raised when an indentation error is made in yaml file
2022-11-10T15:33:42Z
Andrea Albino
The error raised when I misindent "base_section" is:
AttributeError: 'str' object has no attribute 'update'
Working:
[INDENTATION_W.archive.yaml](/uploads/fb8c6195f3df2a869c5c2a00ee3eefc0/INDENTATION_W.archive.yaml)
Not Working:
[INDENTATION_NW.archive.yaml](/uploads/c61c721844ab8c946469c778bcca0915/INDENTATION_NW.archive.yaml)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/823
Conditionally adding some processing results to the raw file.
2023-12-21T15:55:24Z
Andrea Albino
We allow schemas to normalize data from `*.archive.json|yaml` raw files. Sometimes it is desirable to put these changes or additions back into the raw file. A specific example is ELNs that parse a referenced file and add its contents to the ELN entry. Users would expect that the data they see is also available from the raw file. If they press the save button twice, it would even show up in the raw file, because the ELN functionality edits the archive and saves it to the raw file.
**This issue was changed. The old issue text**:
The first time I press save from the eln the archive is not filled. The second time I press it, the archive gets filled.
The two files ending with "2nd_SAVE.json" are the ones generated after the second time I press save.
[gwzkYaOipjERZgHrzmU1WyvOv4Qy.json](/uploads/b538b012ed93c9ac52f8a5347390afd2/gwzkYaOipjERZgHrzmU1WyvOv4Qy.json)[aaa.archive.json](/uploads/d6dddd11e909a89d95e8bdfd3586823e/aaa.archive.json)[gwzkYaOipjERZgHrzmU1WyvOv4Qy_2nd_SAVE.json](/uploads/7f337884e3975c68d65c0b375a0c2a33/gwzkYaOipjERZgHrzmU1WyvOv4Qy_2nd_SAVE.json)
[aaa.archive_2nd_SAVE.json](/uploads/b7f7a4cd4fe4401f7c4a35a59e4cbde1/aaa.archive_2nd_SAVE.json)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/438
Add categories for normalized metadata
2021-06-17T16:41:04Z
Lauri Himanen
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1945
Add unit support for percentages
2024-03-21T08:23:23Z
Lauri Himanen
Related to https://github.com/nomad-coe/nomad/issues/100. Starting from 0.21, Pint has built-in support for percentages. Unfortunately, also starting from that exact version, [this Pint issue](https://github.com/hgrecco/pint/issues/1809) is preventing an easy migration as it breaks our parser code.
Tasks:
- [ ] Upgrade Pint, once multiplication issue is solved.
- [x] Ensure that new unit definitions get migrated to the front-end.
- [x] Update JS unit code to support percentages.
- [x] Add tests for the JS unit code.
Lauri Himanen
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1936
Gallery search results
2024-03-15T09:07:33Z
Markus Scheidgen
Some pseudo code
```js
// Generic card; specific cards pass their visualization as children.
const GalleryCard = ({title, entryId, children}) => {
  return <Paper>
    {/* ... */}
    {children}
  </Paper>
}

// The remaining GalleryCard props were elided ({...}) in the original sketch.
const H5WebGalleryCard = ({result, archive_path}) => {
  return <GalleryCard entryId={result.entryId}>
    <H5Web path={result.mainfile + archive_path}></H5Web>
  </GalleryCard>
}

const DosGalleryCard = ({result}) => {
  return <GalleryCard entryId={result.entryId}>
    <DOS entryId={result.entryId}/>
  </GalleryCard>
}

// Maps the component name used in the app config to a card component.
const galleryComponents = {
  'h5web': H5WebGalleryCard,
  'dos': DosGalleryCard
}

// App config: selects the gallery component and its props.
const app = {
  search: {
    gallery: {
      component: 'h5web',
      props: {
        archive_path: '/nexus/NXmpes/entry/data/data'
      }
    }
  }
}

// A new alternative for SearchResults
const SearchGallery = () => {
  const {results, app /* , callbacks... */} = useSearchContext() // ask lauri
  const [pinnedResults, setPinnedResults] = useState([])
  const gallery = app.search.gallery
  return <>
    <PinnedGallery>
      <MUIGrid>
        {/* pinnedResults.map(...) */}
      </MUIGrid>
    </PinnedGallery>
    <MainGallery>
      <MUIGrid>
        {results.map(result => React.createElement(
          galleryComponents[gallery.component],
          {key: result.entryId, result, ...gallery.props}
        ))}
      </MUIGrid>
    </MainGallery>
  </>
}
```
Sherjeel Shabih
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1872
pyiron integration
2024-03-22T09:26:26Z
Amir Golparvar
Pyiron IDE is a Jupyter-based code for high-performance computing in materials science.
Amir Golparvar
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1854
Add bond information to structure files
2024-01-18T09:18:48Z
Lauri Himanen
Sometimes simulations contain explicit bond information. This is currently stored in `system.atoms.bond_list`. Whenever this information is present, we should add it to the structure files produced by the systems API endpoint. Not all formats will support this, but this should be added at least to the .pdb format that currently drives the Overview visualization. With the bond information, NGL can also display the bonds.
Lauri Himanen
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1713
Improved widget axis settings
2023-12-21T15:55:43Z
Lauri Himanen
We need to add more configuration options for the widget axes used by the histogram and scatter plot widgets:
- [ ] Ability to specify the unit
- [ ] Ability to specify a custom label
- [ ] Ability to specify the label "mode": if no custom label is defined, how many levels of the hierarchy should be displayed?
- [ ] The histogram x-axis title should be moved to the bottom and a y-axis should be added, as reported in #1697.
- [ ] Improve the UX for the statistics scaling:
  - Instead of showing the scaling option name, show a 'graph' icon that opens a dropdown.
  - Add a tooltip for each option that fully defines the scaling mathematically
  - Replace the fairly odd options (1/2, 1/4, 1/8) with a simple `log` option?
Lauri Himanen
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1641
Neo4j prototype
2023-08-16T09:10:26Z
Lauri Himanen
We should build a small demo to see if Neo4j could be used as a graph db solution within NOMAD.
Tasks:
- [x] Add Neo4j to the dev docker compose
- [ ] Build Annotation class for Graph database
- [ ] Build function for creating the mapping/schema for Neo4j based on annotations in the metainfo
- [ ] Build function for ingesting several archives into Neo4j based on the annotations
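As a starting point for the mapping and ingestion tasks, the step from an archive to nodes and edges could be prototyped without any database. The sketch below is an assumption-laden illustration: archives are treated as plain nested dicts, the `NODE_SECTIONS` set is a hypothetical stand-in for the planned annotation, and the `HAS_CHILD` relationship name is made up. Writing the result to Neo4j is deliberately left out.

```python
# Hypothetical stand-in for a metainfo annotation: sections with these names
# become nodes, all other sections are only traversed.
NODE_SECTIONS = {"workflow", "system", "method"}


def archive_to_graph(archive, node_sections=NODE_SECTIONS):
    """Return (nodes, edges) for the annotated sections of a nested dict."""
    nodes, edges = [], []

    def visit(name, section, parent_id):
        node_id = parent_id
        if name in node_sections:
            node_id = len(nodes)
            # Scalar quantities become node properties.
            properties = {
                key: value
                for key, value in section.items()
                if not isinstance(value, (dict, list))
            }
            nodes.append({"id": node_id, "label": name, "properties": properties})
            if parent_id is not None:
                # Parent/child relationships between nodes become edges.
                edges.append((parent_id, "HAS_CHILD", node_id))
        for key, value in section.items():
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, dict):
                    visit(key, child, node_id)

    visit("archive", archive, None)
    return nodes, edges
```

Sections that are not annotated are skipped as nodes, but their annotated descendants are still attached to the nearest annotated ancestor, which keeps the graph connected.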
Relevant at least for: @hnaesstroem, @g-michaelgoette
Hampus Naesstroem
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1549
Representing notebooks as custom ELN schema
2023-11-13T14:04:04Z
Adam Fekete
Adam Fekete