nomad-FAIR issues (https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues, 2024-03-28)

Issue 1958: Fix FastAPI response model for datasets (Lauri Himanen, 2024-03-28)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1958
Related to https://github.com/nomad-coe/nomad/issues/101
Assignee: Lauri Himanen

Issue 1947: Metainfo improvements (Markus Scheidgen, 2024-03-20)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1947

### The DataType idea
Move all types into their own `DataType` classes and remove the "if then else" logic from the *set*, *get*, and *(de)serialize* operations. E.g. `NPArray(dtype)` and `PythonType(type)` should deal with numpy and Python types respectively. This should make it much easier to extend `NPArray(dtype)` with something like `HDF5Dataset(dtype)` in the future. It might also make it easier to add more specialised types. Also, `m_to_dict`, `__set_normalized`, etc. look horrible in their current state.
### Parameters for DataTypes
These new types will potentially need parameters like `dtype` or `type`. Note that types can have parameters: `Reference` and `SectionReference`, for example, already take the referenced section definition as a parameter.
### Unit and shape
The unit and the shape, however, should stay in the quantity definition. In all `DataType` operations you have access to the quantity definition anyway. Maybe `DataType` needs a field like `supports_shapes`. The *set*, *get*, and *(de)serialize* operations will need to know whether they should manage the list-vs-scalar handling or whether the data type does this itself. For something like `Quantity(type=Datetime, shape=['*'])`, `m_to_dict` would create a list and ask `Datetime` to serialize the elements. For `Quantity(type=np.float64, shape=[1, 2])`, `m_to_dict` would just call `serialize` on `NPArray` and expect it to serialize the whole array, not just an element. For data types that do not support shapes themselves, only scalars and lists would work; for higher shapes we would throw an error.
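A minimal sketch of how this dispatch could look (all class and function names are illustrative, not the actual metainfo API): a `DataType` declares whether it handles full arrays itself, and the serializer either maps over list elements or delegates the whole value.

```python
from datetime import datetime, timezone

class DataType:
    # If True, serialize() receives the whole (possibly shaped) value;
    # if False, the caller maps serialize() over list elements.
    supports_shapes = False

    def serialize(self, value):
        return value

class Datetime(DataType):
    # Scalar type: only scalars and flat lists are supported.
    def serialize(self, value):
        return value.isoformat()

class NPArray(DataType):
    # Array type: serializes the whole array, whatever its shape.
    supports_shapes = True

    def serialize(self, value):
        return value.tolist() if hasattr(value, 'tolist') else value

def serialize_quantity(data_type, value, shape):
    if data_type.supports_shapes:
        return data_type.serialize(value)
    if not shape:  # scalar
        return data_type.serialize(value)
    if len(shape) == 1:  # flat list: map over the elements
        return [data_type.serialize(item) for item in value]
    raise TypeError('data type does not support higher-dimensional shapes')
```

For example, `serialize_quantity(Datetime(), values, ['*'])` produces a list of ISO strings, while a `[1, 2]`-shaped `Datetime` quantity raises the error described above.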
### Backwards compatibility in type definitions
The `QuantityType` (the type for types) could duck-type and help with backwards compatibility. E.g. every time you use `type=np.float64`, `QuantityType` replaces it in its `set_normalized` function with `NPArray(np.float64)`. Keep in mind that we also need backwards compatibility in how `QuantityType` serializes types. For example, an `NPArray(np.float64)` should still (de)serialize to `{type_kind: 'numpy', type_data: 'float64'}`, etc.
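A sketch of what this could look like; the `{type_kind, type_data}` layout is taken from the text above, while the method names (`serialize_type`, `normalize_type`) are illustrative assumptions:

```python
import numpy as np

class NPArray:
    def __init__(self, dtype):
        self.dtype = np.dtype(dtype)

    def serialize_type(self):
        # Keep the legacy {type_kind, type_data} layout for backwards compatibility.
        return {'type_kind': 'numpy', 'type_data': self.dtype.name}

    @staticmethod
    def deserialize_type(data):
        assert data['type_kind'] == 'numpy'
        return NPArray(np.dtype(data['type_data']))

def normalize_type(type_spec):
    # Duck-typing: a plain `type=np.float64` is replaced by NPArray(np.float64);
    # anything else passes through unchanged.
    if isinstance(type_spec, type) and issubclass(type_spec, np.generic):
        return NPArray(type_spec)
    return type_spec
```

This keeps old schema files readable: `normalize_type(np.float64)` yields a wrapper that still serializes to `{'type_kind': 'numpy', 'type_data': 'float64'}`.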
### Duck-typing and type conversion
When we need to map the metainfo to other systems like Pydantic, Optimade, Mongo, etc., we often make use of `MTypes` to figure out if a quantity value is compatible with a respective foreign Pydantic, Optimade, Mongo, etc. type. Here `MTypes` provides lists of types called `number`, `numpy`, etc. Maybe `DataType` can define functions to implement this more explicitly:
```py
class DataType:
    def compatible_with(self, target_type: Type) -> bool:
        """
        Returns true if the given type is compatible. All compatible types can be used in `convert`.
        Also, values of all compatible types can be assigned to quantities with this type.
        """
        return target_type == self

    def convert(self, target_type: Type[T], value) -> T:
        """
        Converts the given value into a value of the given compatible type.
        This will not assert that the given type is actually compatible.
        Use `compatible_with` to check.
        """
        return value


class NPArray(DataType):
    def __init__(self, dtype):
        self.dtype = dtype

    def compatible_with(self, target_type):
        if self.dtype.type in [np.float64, np.float32]:
            return target_type == float
        if self.dtype.type in [np.int64, np.uint64]:
            return target_type == int
        return target_type == self.dtype.type

    def convert(self, target_type, value):
        if target_type == self.dtype.type:
            return value
        return target_type(value)
```
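As a usage sketch (hypothetical, but matching the pseudocode above): a Mongo or Pydantic mapper would then ask the data type directly instead of consulting the `MTypes` lists. The stand-in classes below are self-contained so the example runs on its own; `to_mongo_type` is an invented helper name.

```python
import numpy as np

# Minimal stand-in for the NPArray sketched above, runnable standalone.
class NPArray:
    def __init__(self, dtype):
        self.dtype = np.dtype(dtype)

    def compatible_with(self, target_type):
        if self.dtype.type in (np.float64, np.float32):
            return target_type is float
        if self.dtype.type in (np.int64, np.uint64):
            return target_type is int
        return target_type is self.dtype.type

    def convert(self, target_type, value):
        if target_type is self.dtype.type:
            return value
        return target_type(value)

def to_mongo_type(data_type):
    # A mapper picks the first compatible foreign type explicitly,
    # instead of pattern-matching against MTypes.
    for candidate in (float, int, str):
        if data_type.compatible_with(candidate):
            return candidate
    raise TypeError('no compatible Mongo type')
```

With this, `to_mongo_type(NPArray(np.float64))` resolves to `float`, and `convert` falls back to `target_type(value)` for compatible foreign values.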
### Smaller things:
- Maybe we also add more standard types, e.g. `Pydantic(pydantic_model)`, `Dataframe(...)` for table data.
- Non standard data types should be moved to `nomad.datamodel.metainfo`. Ideally, the `nomad.metainfo` could be reduced to pure Python (no numpy, no pandas, no nomad.config). Ways to possibly inject dependencies are specialisations of `DataType` and `Context`.
- Cleanup: a concise way to define annotations
- Cleanup: remove "more" attributes
- Cleanup: remove/deprecate `label` property
- Cleanup: Remove unused submodules: `benchmarks`, `legacy`, `generate`
- Cleanup: Deprecate `Category`
- Cleanup: Remove `Environments` completely
- Split the package vertically (metainfo, extensions, annotations, context, datatypes) and not horizontally (metainfo, utils). This will be hard as a lot of stuff between `MSection`, `Definition`, `Datatype`, `Annotation`, `Context` is cyclic by nature.
- Similar to the context and annotation implementations, the extensions should also go into `nomad.datamodel.metainfo`, where they are only imported if actually needed and may depend on more than basic Python packages.
- Also, `nexus` should definitely move: first into `nomad.datamodel.metainfo`, but eventually into its own plugin.
Not everything has to be in one MR.
Assignee: Theodore Chang

Issue 1946: Improved plugin mechanism (Lauri Himanen, 2024-03-20)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1946
- [ ] Use `entry_points` to load additional plugins when plugins are lazy-loaded
- [ ] Create pydantic base models for different plugin types
- [ ] Create a new plugin repo that demonstrates the new plugin structure
Assignee: Lauri Himanen

Issue 1945: Add unit support for percentages (Lauri Himanen, 2024-03-21)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1945
Related to https://github.com/nomad-coe/nomad/issues/100. Starting from 0.21, Pint has built-in support for percentages. Unfortunately, also starting from that exact version, [this Pint issue](https://github.com/hgrecco/pint/issues/1809) is preventing an easy migration, as it breaks our parser code.
Tasks:
- [ ] Upgrade Pint once the multiplication issue is solved.
- [x] Ensure that new unit definitions get migrated to the front-end.
- [x] Update JS unit code to support percentages.
- [x] Add tests for the JS unit code.
Assignee: Lauri Himanen

Issue 1938: Make link on logo in top left corner configurable on an Oasis (Michael Götte, 2024-03-20)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1938
Right now it redirects to the central NOMAD, which is confusing.

Issue 1936: Gallery search results (Markus Scheidgen, 2024-03-15)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1936
Some pseudo code:
```js
const GalleryCard = ({title, entryId, children}) => {
  return <Paper>
    ...
    {children}
  </Paper>
}

const H5WebGalleryCard = ({result, archive_path}) => {
  return <GalleryCard {...}><H5Web path={result.mainfile + archive_path}></H5Web></GalleryCard>
}

const DosGalleryCard = ({result}) => {
  return <GalleryCard {...}><DOS entryId={result.entryId}/></GalleryCard>
}

const galleryComponents = {
  'h5web': H5WebGalleryCard,
  'dos': DosGalleryCard
}

app = {
  search: {
    gallery: {
      component: 'h5web',
      props: {
        archive_path: '/nexus/NXmpes/entry/data/data'
      }
    }
  }
}

<SearchGallery> // a new alternative for SearchResults
  const {results, app, callbacks...} = useSearchContext() // ask lauri
  const [pinnedResults, setPinnedResults] = useState()
  <PinnedGallery>
    <MUIGrid>
      {pinnedResults.map(...)}
    </MUIGrid>
  </PinnedGallery>
  <MainGallery>
    <MUIGrid>
      {results.map(result => React.createElement(
        galleryComponents[app.search.gallery.component],
        {result, ...app.search.gallery.props}))}
    </MUIGrid>
  </MainGallery>
</SearchGallery>
```
Assignee: Sherjeel Shabih

Issue 1933: NOMAD Oasis demonstrator deployment (Markus Scheidgen, 2024-03-13)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1933

Issue 1915: Consistent labels in archive browser and ELNs (Markus Scheidgen, 2024-03-28)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1915
This started with a discussion on Discord: https://discord.com/channels/1201445470485106719/1213106750291574794
Currently, the labeling of quantities and sub-sections in the archive browser, the ELN, and the search GUI is very inconsistent:
- Edit-quantity annotations allow overwriting quantity labels; this does not exist for sub-sections or any other use of quantities.
- Edit quantities use modified quantity names as labels, but different types of edit quantities do this slightly differently, and the logic is only applied to edit quantities.
- Overall, this is pretty messy in the code, with lots of places influencing the appearance of names and labels.
- An attempt to fix this was also inconsistent and stirred the aforementioned discussion: !1684
We should expand the `Display-` annotations and use them as a new and more consistent mechanism to overwrite the labeling behaviour. The annotation could ideally allow specifying a more human-readable alternative. Where this human-readable alternative is used remains a bit unclear: should we use it always, or only when editing the data and in the search interface?
Assignee: Mohammad Nakhaee

Issue 1882: rework elabftw parser (Amir Golparvar, 2024-03-04)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1882
- [x] use API to fetch experiments, linked items and references
- [x] re-use Base sections for already existing ones
- [x] customized schema
Assignee: Amir Golparvar

Issue 1874: Performance improvement, generalized file parser (Alvin Noe Ladines, 2024-01-31)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1874
A recent upload of a file >1 GB (upload id n1jui6XHTFmQkeeoagS2ag) exposes the limitations of the current file parser. To solve this, directly write the parsed data to the archive. We use the m_annotations to specify parser directives. This would then give rise to a generalized file parser.
Assignee: Alvin Noe Ladines
Due: 2024-02-29

Issue 1872: pyiron integration (Amir Golparvar, 2024-03-22)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1872
Pyiron IDE is a Jupyter-based code for high-performance computing in materials science.
Assignee: Amir Golparvar

Issue 1854: Add bond information to structure files (Lauri Himanen, 2024-01-18)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1854
Sometimes simulations contain explicit bond information. This is currently stored in `system.atoms.bond_list`. Whenever this information is present, we should add it to the structure files produced by the systems API endpoint. Not all formats will support this, but it should be added at least to the .pdb format that currently drives the Overview visualization.
With the bond information, NGL can also display the bonds.
Assignee: Lauri Himanen

Issue 1834: Group-based user and user-rights management (Markus Scheidgen, 2024-03-21)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1834
We replace users with groups and use groups to manage edit and view rights on datasets.
Virtual groups capturing each individual user or “all” users provide additional flexibility.
- group management API
- updating rights queries to use groups
- respective GUI functionality in the revised GUI
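The rights-query update could, as a rough sketch, look like this; all field names (`viewer_groups`) and the virtual-group naming scheme are assumptions, not the actual NOMAD datamodel. Instead of matching individual users, a query matches the union of the user's groups plus the virtual per-user and "all" groups.

```python
def visibility_filter(user_id, user_groups):
    """Build a Mongo-style filter for uploads viewable by the given user.

    Virtual groups: 'user:<id>' captures an individual user and 'all'
    captures every user, providing the flexibility described above.
    Field and group names are illustrative only.
    """
    effective_groups = set(user_groups) | {f'user:{user_id}', 'all'}
    return {'viewer_groups': {'$in': sorted(effective_groups)}}
```

A single `viewer_groups` field on the upload then replaces separate user and "everyone" visibility fields.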
The issues below will be covered by this issue.
Related (old) issues:
- NOMAD Oasis UX Test: user roles (#757)
- Better user-management (#628)
- Test access with allowed_users in config (#1657)
- Groups and upload visibility (#1760)
Assignee: Sascha Klawohn

Issue 1774: Incorporate more ruff linting rules (Ahmed Ilyas, 2023-11-29)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1774
Integrate additional Ruff [rules](https://docs.astral.sh/ruff/rules/) to improve code quality:
- [x] Pyflakes: Focuses on cleanliness, eliminating unnecessary code, and enhancing performance.
- [x] Pyupgrade: Modernizes Python code with new features and idioms.
- [ ] pydocstyle: Ensures compliance with Python docstring conventions.
- [ ] flake8-bugbear: Detects programming errors and enhances code quality.
- [ ] pep8-naming: Enforces naming conventions for clarity.
- [ ] flake8-async: Identifies asynchronous code issues.
- [ ] flake8-bandit: Enhances security by detecting vulnerabilities.
- [ ] flake8-boolean-trap: Identifies issues with boolean expressions.
- [ ] flake8-builtins: Enforces best practices for Python built-ins.
- [ ] flake8-comprehensions: Enhances comprehension readability.
- [ ] flake8-pie: Implements miscellaneous lints.
- [ ] flake8-simplify: Simplifies code and removes complexity.
- [ ] Perflint: Identifies performance bottlenecks.
- [ ] refurb: Improves code maintainability.
- [ ] Ruff-specific rules: Covers various code quality and style aspects.
- [ ] flake8-copyright: Checks for copyright notices in all Python files.
Assignee: Ahmed Ilyas

Issue 1713: Improved widget axis settings (Lauri Himanen, 2023-12-21)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1713
We need to add more configuration options for the widget axes used by the histogram and scatter plot widgets:
- [ ] Ability to specify the unit
- [ ] Ability to specify a custom label
- [ ] Ability to specify the label "mode": if no custom label is defined, how many levels of the hierarchy should be displayed?
- [ ] The histogram x-axis title should be moved to the bottom and a y-axis should be added, as reported in #1697.
- [ ] Improve the UX for the statistics scaling:
- Instead of showing the scaling option name, show a 'graph' icon that opens a dropdown.
- Add a tooltip for each option that fully defines the scaling mathematically
  - Replace the fairly odd options (1/2, 1/4, 1/8) with a simple `log` option?
Assignee: Lauri Himanen

Issue 1710: Refactor normalizing workflow tests (Alvin Noe Ladines, 2023-12-21)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1710
Some tests under normalizing/test_workflow.py should technically be in datamodel/metainfo/test_workflow.py. It is understandable that the two are confused, but I personally think that all tests regarding the metainfo defs, including the implementation of normalize, should be under the latter. Only normalizations done in nomad/normalizing/workflow.py should be in the other.
@pizarroj
Assignee: Alvin Noe Ladines

Issue 1641: Neo4j prototype (Lauri Himanen, 2023-08-16)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1641
We should build a small demo to see if Neo4j could be used as a graph db solution within NOMAD.
Tasks:
- [x] Add Neo4j to the dev docker compose
- [ ] Build Annotation class for Graph database
- [ ] Build function for creating the mapping/schema for Neo4j based on annotations in the metainfo
- [ ] Build function for ingesting several archives into Neo4j based on the annotations
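The annotation-driven mapping in the tasks above could, as a rough sketch, generate parameterized Cypher statements like this. The section name, the annotated-keys mechanism (standing in for a graph-database annotation on the metainfo definition), and the helper name are all made up; a real implementation would run the statements through the neo4j Python driver.

```python
def archive_to_cypher(section_name, data, annotated_keys):
    """Turn one archive section into a Cypher MERGE statement plus parameters.

    Only keys listed in `annotated_keys` (a stand-in for keys marked by a
    hypothetical Neo4j annotation in the metainfo) become node properties.
    """
    props = {key: data[key] for key in annotated_keys if key in data}
    prop_str = ', '.join(f'{key}: ${key}' for key in sorted(props))
    statement = f'MERGE (n:{section_name} {{{prop_str}}})'
    return statement, props

# Example: ingest a section, skipping keys without the annotation.
statement, params = archive_to_cypher(
    'Material', {'formula': 'H2O', 'n_atoms': 3, 'internal': 'skip'},
    annotated_keys=['formula', 'n_atoms'])
```

Using query parameters (`$formula`) rather than string interpolation of values keeps the generated statements safe and cacheable by the database.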
Relevant at least for: @hnaesstroem, @g-michaelgoette
Assignee: Hampus Naesstroem

Issue 1549: Representing notebooks as custom ELN schema (Adam Fekete, 2023-11-13)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1549
Assignee: Adam Fekete

Issue 1536: Restructure parser plug ins (Alvin Noe Ladines, 2024-02-26)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1536
Currently, the simulation parser plug-ins reside under parsers/electronic, atomistic, workflow, and database, each pointing to their GitHub project. This is problematic in case a code falls into more than one category. In addition, it has the unwanted effect of splitting modules from a package, as in the case of quantumespresso, where we have the scf modules in electronicparsers/quantumespresso and the xspectra, phonon, and epw modules under workflow.
- [ ] put all simulation parsers under parsers/simulation
- [ ] create github project in fairmat-nfdi and nomad-coe and provide a mechanism to sync them similar to nomad gitlab/github
- [ ] implement interface for packages with multiple modules and outputs
Assignee: Alvin Noe Ladines

Issue 611: Statistics for "marketing" (Markus Scheidgen, 2021-12-20)
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/611
As a NOMAD PI, I want to generate marketing artefacts that show the current NOMAD data and usage. Statistics on NOMAD data can be generated via the API. Statistics on usage rely on data that is not obtained or stored by NOMAD itself, but that can be acquired depending on how NOMAD is operated.
## data sources
Statistics from NOMAD API:
- how many entries and how much metadata
- the infamous per code/method chart
- histogram of authors, uploads, growth
Statistics from the NOMAD file system (could be integrated into API)
- how much data on disk (e.g. raw files in TB)
Internal statistics, e.g. from web-server logs
- how much "traffic", DAU, visits on certain parts of the app: Encyclopedia, AI toolkit, archive, raw files
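For the API-derived numbers, a request body along these lines could drive the per-code chart. The aggregation layout mimics NOMAD's v1 entries API, but treat the exact field names and structure as assumptions to be checked against the API docs:

```python
def code_statistics_request(size=500):
    """Build a query body aggregating public entry counts per simulation code.

    The quantity path and the aggregation layout follow the NOMAD v1 API
    conventions as understood here; verify against the live API schema.
    """
    return {
        'owner': 'public',
        'aggregations': {
            'per_code': {
                'terms': {
                    'quantity': 'results.method.simulation.program_name',
                    'pagination': {'page_size': size},
                },
            },
        },
        # We only want the aggregation buckets, not the entries themselves.
        'pagination': {'page_size': 0},
    }
```

POSTing such a body to the entries query endpoint would return one bucket per code, which is exactly the data behind the "infamous per code/method chart".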
## What kind of artefacts do we want
Depending on the type of statistics/report, we could:
- add a set of statistic components (e.g. to show on the `/about` "dashboard"), also useful to visualize oasis usage
- generated piece of HTML for the web-page (or some iframe stuff)
- generated pdf report
- individual charts as pdf
Assignee: Felix Dietrich