nomad-FAIR issues — https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues (feed updated 2024-03-28)

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1958
Fix FastAPI response model for datasets (Lauri Himanen, 2024-03-28)
Assignee: Lauri Himanen

Related to https://github.com/nomad-coe/nomad/issues/101

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1957
Preview plot from CLI parse (Hampus Naesstroem, 2024-03-22)

It would be great to have some sort of preview of the plot generated from calling the CLI parse command. Currently we have to either inspect the JSON manually or run NOMAD locally to test this during plugin development.

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1956
Booleans are not displayed on overview page (Hampus Naesstroem, 2024-03-25)

It seems like float, str, and int are all displayed properly on the overview page, but not booleans. The data is still visible from the data tab.
Tested on staging with nomad version 1.2.2.dev483+gd2702d604.
`test.archive.yaml` to reproduce images:
```yaml
definitions:
  sections:
    MySection:
      quantities:
        my_float:
          type: float
        my_int:
          type: int
        my_str:
          type: str
        my_bool:
          type: bool
data:
  m_def: MySection
  my_float: 3.14
  my_int: 1
  my_str: "Hello World!"
  my_bool: true
```
![image](/uploads/588dd1f34187780981af62d63dec210a/image.png)
![image](/uploads/640f537273e9cb9fadf5db7b76214eca/image.png)

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1955
Extend contexts to provide same functionality in CLI, on Server, and during Tests (Hampus Naesstroem, 2024-03-22)

We should extend the Client and Server context to provide a create_archive function which works on the server, in the CLI, and during tests.
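A minimal sketch of what such a shared helper could look like; `raw_path` and the JSON-only handling here are assumptions for illustration, not the actual Context API:

```python
import json
import os


def create_archive(entry_dict: dict, context, filename: str) -> str:
    """Write an archive file so the same code works on the server, in the CLI, and in tests."""
    # On the server, the context is assumed to expose the upload's raw directory
    # as `raw_path`; in the CLI and in tests we fall back to the working directory.
    base_dir = getattr(context, 'raw_path', None) or os.getcwd()
    path = os.path.join(base_dir, filename)
    if not os.path.exists(path):
        with open(path, 'w') as f:
            json.dump(entry_dict, f, indent=2)
    return path
```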
---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1954
Include SMILES and IUPAC name in System (Jose Marquez Prieto, 2024-03-26)

There is an increasing need to enhance the representation of some materials beyond inorganic compounds. The minimum thing we should try to add to `System` is a quantity to store the `SMILES` and the `iupac_name`. Other potential quantities could include some of the content that we already have in our [Substance class](https://nomad-lab.eu/prod/v1/staging/gui/analyze/metainfo/nomad/section_definitions@nomad.datamodel.metainfo.eln.Substance).

Later on, we can think about how we can derive additional quantities on the fly with SMILES. Some code that might be useful for that task is [this](https://github.com/OpenBioML/chemnlp/blob/7612d91c232af64fad0e464424d35dbbe716e30a/src/chemnlp/data/reprs.py#L53-L78) from @g-kevinjablonka.
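For illustration, a hedged sketch of the kind of on-the-fly derivation the linked chemnlp code does, here with RDKit (an assumption, not an agreed dependency); note that RDKit cannot generate the `iupac_name`, which would need an external tool or service:

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, rdMolDescriptors


def derive_from_smiles(smiles: str) -> dict:
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f'Could not parse SMILES: {smiles}')
    return {
        # canonical form, useful as a normalized identifier
        'canonical_smiles': Chem.MolToSmiles(mol, canonical=True),
        'inchi': Chem.MolToInchi(mol),
        'inchi_key': Chem.MolToInchiKey(mol),
        'molecular_formula': rdMolDescriptors.CalcMolFormula(mol),
        'molecular_weight': Descriptors.MolWt(mol),  # g/mol
    }


print(derive_from_smiles('CC(=O)Oc1ccccc1C(=O)O'))  # aspirin
```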
Interested people, thumbs up if you agree on this: @hnaesstroem, @jrudz, @pizarroj, @lucamghi, @himanel1

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1947
Metainfo improvements (Markus Scheidgen, 2024-03-20)
Assignee: Theodore Chang

### The DataType idea
Move all types into their own `DataTypes` and remove the "if then else" logic from *set*, *get*, and *(de)serialize* operations. E.g. `NPArray(dtype)` and `PythonType(type)` should deal with numpy and Python types respectively. This should make it much easier to extend `NPArray(dtype)` with something like `HDF5Dataset(dtype)` in the future. It might also make it easier to add more specialised types. Also, `m_to_dict`, `__set_normalized`, etc. look horrible in their current state.
### Parameters for DataTypes
Potentially, those new types need parameters like `dtype` or `type`. Note that types can have parameters: `Reference` and `SectionReference`, for example, already take the referenced section definition as a parameter.
### Unit and shape
But the unit and the shape should stay in the quantity definition. In all `DataType` operations, you will have access to the quantity definition anyway. Maybe `DataType` needs a field `supports_shapes` or something. Maybe *set*, *get*, and *(de)serialize* will need to know whether they should manage the list-vs-scalar distinction or whether the data type is actually doing this. For something like `Quantity(type=Datetime, shape=['*'])`, `m_to_dict` would create a list and would ask `Datetime` to serialize the elements. For `Quantity(type=np.float64, shape=[1, 2])`, `m_to_dict` would just call `serialize` on `NPArray` and would expect it to serialize the whole array and not just an element. For data types that do not support shapes themselves, only scalars and lists would work; for higher shapes we would throw an error.
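A minimal sketch of the dispatch this paragraph describes; all names (`serialize_quantity`, `supports_shapes`) are illustrative, not the actual implementation:

```py
# Illustrative only: how m_to_dict could delegate shape handling.
def serialize_quantity(quantity_def, value):
    data_type = quantity_def.type  # e.g. Datetime() or NPArray(np.float64)
    shape = quantity_def.shape     # e.g. [], ['*'], or [1, 2]

    if getattr(data_type, 'supports_shapes', False):
        # e.g. NPArray serializes the whole (possibly multi-dimensional) array
        return data_type.serialize(value)
    if not shape:
        return data_type.serialize(value)  # plain scalar
    if len(shape) == 1:
        # plain list: serialize element by element
        return [data_type.serialize(item) for item in value]
    raise TypeError(f'{type(data_type).__name__} does not support shape {shape}')
```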
### Backwards compatibility in type definitions
The `QuantityType` (the type for types) could duck-type and help with backwards compatibility. E.g. every time you use `type=np.float64`, `QuantityType` replaces it in its `set_normalized` function with `NPArray(np.float64)`. Keep in mind that we also need backwards compatibility in how `QuantityType` serializes types. For example, an `NPArray(np.float64)` should still (de)serialize to `{type_kind: 'numpy', type_data: 'float64'}`, etc.
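How this duck-typing and backwards-compatible serialization could look, as a rough sketch (reusing the `NPArray` wrapper from the code block further down; not the actual implementation):

```py
import numpy as np


class QuantityType:
    def set_normalized(self, value):
        # legacy plain numpy dtypes get wrapped into the new DataType
        if isinstance(value, type) and issubclass(value, np.generic):
            return NPArray(value)
        return value

    def serialize(self, data_type):
        # NPArray(np.float64) still serializes to the legacy form
        if isinstance(data_type, NPArray):
            return {'type_kind': 'numpy', 'type_data': np.dtype(data_type.dtype).name}
        ...
```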
### Duck-typing and type conversion
When we need to map the metainfo to other systems like Pydantic, Optimade, Mongo, etc., we often make use of `MTypes` to figure out if a quantity value is compatible with the respective foreign Pydantic, Optimade, Mongo, etc. type. Here `MTypes` provides lists of types that are called `number`, `numpy`, etc. Maybe `DataType` can define functions to implement this more explicitly:
```py
from typing import Type, TypeVar

import numpy as np

T = TypeVar('T')


class DataType:
    def compatible_with(self, target_type: Type) -> bool:
        """
        Returns true if the given type is compatible. All compatible types can be used in `convert`.
        Also values in all compatible types can be assigned to quantities with self type.
        """
        return target_type == self

    def convert(self, target_type: Type[T], value) -> T:
        """
        Converts the given value into a value of the given compatible type.
        This will not assert if the given type is actually compatible.
        Use `compatible_with` to check.
        """
        return value


class NPArray(DataType):
    def __init__(self, dtype):
        self.dtype = dtype

    def compatible_with(self, target_type):
        if self.dtype.type in [np.float64, np.float32]:
            return target_type == float
        if self.dtype.type in [np.int64, np.uint64]:
            return target_type == int
        return target_type == self.dtype.type

    def convert(self, target_type, value):
        if target_type == self.dtype.type:
            return value
        return target_type(value)
```
### Smaller things:
- Maybe we also add more standard types, e.g. `Pydantic(pydantic_model)`, `Dataframe(...)` for table data.
- Non standard data types should be moved to `nomad.datamodel.metainfo`. Ideally, the `nomad.metainfo` could be reduced to pure Python (no numpy, no pandas, no nomad.config). Ways to possibly inject dependencies are specialisations of `DataType` and `Context`.
- Cleanup: a concise way to define annotations
- Cleanup: remove "more" attributes
- Cleanup: remove/deprecate `label` property
- Cleanup: Remove unused submodules: `benchmarks`, `legacy`, `generate`
- Cleanup: Deprecate `Category`
- Cleanup: Remove `Environments` completely
- Split the package vertically (metainfo, extensions, annotations, context, datatypes) and not horizontally (metainfo, utils). This will be hard as a lot of stuff between `MSection`, `Definition`, `Datatype`, `Annotation`, `Context` is cyclic by nature.
- Similar to the context and annotation implementations, the extensions should also go into `nomad.datamodel.metainfo`, where they are only imported if actually needed and may depend on more than basic Python packages.
- Also `nexus` should definitely move: first into `nomad.datamodel.metainfo`, but eventually into its own plugin.
Not everything has to be in one MR.

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1946
Improved plugin mechanism (Lauri Himanen, 2024-03-20)
Assignee: Lauri Himanen

- [ ] Use `entry_points` to load additional plugins when plugins are lazy-loaded (see the sketch after this list)
- [ ] Create pydantic base models for different plugin types
- [ ] Create a new plugin repo that demonstrates the new plugin structure
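A rough sketch of what `entry_points`-based lazy loading could look like; the group name `'nomad.plugin'` and the `pyproject.toml` snippet are assumptions for illustration (requires Python 3.10+ or the `importlib_metadata` backport):

```python
# A plugin package would declare in its pyproject.toml (hypothetical group name):
#
#   [project.entry-points.'nomad.plugin']
#   myparser = 'my_plugin:parser_entry_point'
#
from importlib.metadata import entry_points


def discover_plugins():
    # Only enumerates entry points; plugin code is imported lazily on `.load()`.
    return {ep.name: ep for ep in entry_points(group='nomad.plugin')}


plugins = discover_plugins()
# parser = plugins['myparser'].load()  # the actual import happens here, on demand
```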
---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1945
Add unit support for percentages (Lauri Himanen, 2024-03-21)
Assignee: Lauri Himanen

Related to https://github.com/nomad-coe/nomad/issues/100. Starting from 0.21, Pint has built-in support for percentages. Unfortunately, also starting from that exact version, [this Pint issue](https://github.com/hgrecco/pint/issues/1809) is preventing an easy migration, as it breaks our parser code.

Tasks:
- [ ] Upgrade Pint, once the multiplication issue is solved.
- [x] Ensure that new unit definitions get migrated to the front-end.
- [x] Update JS unit code to support percentages.
- [x] Add tests for the JS unit code.
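For context, a minimal example of what the built-in percent support enables, assuming Pint ≥ 0.21 and a default unit registry:

```python
import pint

ureg = pint.UnitRegistry()

efficiency = ureg.Quantity(25, 'percent')
print(efficiency)                      # 25 percent
print(efficiency.to('dimensionless'))  # 0.25 dimensionless
```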
---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1943
Need to exclude sections from custom index (Michael Götte, 2024-03-19)

Dear Nomad, I recently hit the 10000-values limit for an entry. The measurement has 625 different spot measurements, each with multiple values.
It is not important to have all of them indexed.
Ideally I can just skip the index, or is there an easy way to increase the Elasticsearch limit?
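If the limit in question is Elasticsearch's mapping limit, it can in principle be raised via index settings; a sketch, assuming direct access to the ES instance, an elasticsearch-py 7.x client, and a hypothetical index name (NOMAD's own 10000-values limit may be enforced elsewhere):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')
# Raise the total-fields limit on the (hypothetical) entries index.
es.indices.put_settings(
    index='nomad_entries_v1',
    body={'index.mapping.total_fields.limit': 20000},
)
```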
Best
micha

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1939
Simulated X-ray powder diffraction patterns from nomad bulk systems (Jose Marquez Prieto, 2024-03-15)

There is a growing request to be able to display simulated XRD patterns from NOMAD systems. In Python, this could be done with several packages, including `ase`, `pymatgen`, or `xrayutilities`. `ase` felt like a slower solution when given realistic values, and `xrayutilities` would be an additional dependency that we are aiming to avoid.
For reference: on a random PC, calculating this for a MOF structure ([AgC38N4H28.cif](/uploads/d2806a85b6d63e63ec39c9ef2a2c81ca/AgC38N4H28.cif)) took several seconds, and for a more normal symmetrized structure like [Calcite.cif](/uploads/e9d4ad4e8dce28694976e6c7dae2aacf/Calcite.cif) about 0.2 s.
There are several options for this, but after conversations with @mscheidg and @himanel1, the ideal case would be to do it on the fly.
1. One option would be to do this directly in JS, but I am not aware of anything existing that does this on the fly.
2. Calculate it in Python and serve it to the GUI with an API endpoint. This could maybe be done in `nomad/app/v1/routers/systems.py`, using `pymatgen` or other packages... Below is a very quick draft for calculating this from a structure file or from atoms at any NOMAD path (not tested).
```python
# Note: NOMAD-internal helpers used below (User, NOMADAtoms, router,
# create_user_dependency, query_list_to_dict, answer_entry_archive_request,
# deep_get, ase_atoms_from_nomad_atoms) are assumed to be available in
# nomad/app/v1/routers/systems.py.
from typing import List, Union

import numpy as np
from fastapi import Depends, File, HTTPException, Query, UploadFile
from pydantic import BaseModel
from pymatgen.analysis.diffraction.xrd import XRDCalculator
from pymatgen.core import Structure
from pymatgen.io.ase import AseAtomsAdaptor


class DiffractionPattern(BaseModel):
    two_theta: List[float]
    q_values: List[float]
    intensities: List[float]
    d_spacing: List[float]


async def extract_atoms_from_path(entry_id: str, path: str, user: User) -> NOMADAtoms:
    # extracted from Lauri's system endpoint
    for prefix in ['#/']:
        if path.startswith(prefix):
            path = path[len(prefix):]
    query_list: List[Union[str, int]] = []
    paths = [x for x in path.split('/') if x != '']
    i = 0
    while i < len(paths):
        query_list.append(paths[i])
        try:
            query_list.append(int(paths[i + 1]))
            i += 1
        except (IndexError, ValueError):
            pass
        i += 1
    value = {'atoms': '*'}
    if 'topology' in path:
        value['atoms_ref'] = 'include-resolved'
        value['indices'] = '*'
    required = query_list_to_dict(query_list, value)
    required['resolve-inplace'] = True
    query = {'entry_id': entry_id}
    try:
        archive = answer_entry_archive_request(query, required=required, user=user)[
            'data'
        ]['archive']
    except Exception:
        raise HTTPException(
            status_code=404, detail='Archive data not found or access denied.'
        )
    try:
        result_dict = deep_get(
            archive, *[0 if isinstance(x, int) else x for x in query_list]
        )
        atoms_data = result_dict.get('atoms', result_dict.get('atoms_ref'))
        if atoms_data is None:
            raise HTTPException(
                status_code=404, detail='Atoms data not found in the archive.'
            )
        return NOMADAtoms.m_from_dict(atoms_data)
    except Exception:
        raise HTTPException(
            status_code=404, detail='Failed to extract atoms data from the archive.'
        )


@router.post('/calculate_diffraction_pattern', response_model=DiffractionPattern)
async def calculate_diffraction_pattern_from_file(
    structure_file: UploadFile = File(...),
    wavelength: float = Query(
        default=1.54056,  # Default to Cu Kα radiation
        description='The wavelength of the X-ray source in Ångstroms.',
    ),
):
    try:
        structure_data = await structure_file.read()
        structure = Structure.from_str(
            structure_data.decode('utf-8'), fmt=structure_file.filename.split('.')[-1]
        )
        xrd_calculator = XRDCalculator(wavelength=wavelength)
        pattern = xrd_calculator.get_pattern(structure)
        # Explicit conversion of numpy.float64 to Python float for JSON serialization
        q_values = [
            float(value)
            for value in (4 * np.pi * np.sin(np.radians(pattern.x / 2)) / wavelength)
        ]
        two_theta = [float(value) for value in pattern.x]
        intensities = [float(value) for value in pattern.y]
        d_spacing = [float(value) for value in pattern.d_hkls]
        return DiffractionPattern(
            two_theta=two_theta,
            q_values=q_values,
            intensities=intensities,
            d_spacing=d_spacing,
        )
    except Exception as e:
        raise HTTPException(
            status_code=400, detail=f'Error calculating diffraction pattern: {str(e)}'
        )
    finally:
        await structure_file.close()


@router.get(
    '/calculate_diffraction_pattern_from_path', response_model=DiffractionPattern
)
async def calculate_diffraction_pattern_from_path(
    entry_id: str,
    path: str,
    wavelength: float = Query(default=1.54056),
    user: User = Depends(create_user_dependency(signature_token_auth_allowed=True)),
):
    atoms = await extract_atoms_from_path(entry_id, path, user)
    ase_atoms = ase_atoms_from_nomad_atoms(atoms)
    structure = AseAtomsAdaptor.get_structure(ase_atoms)
    xrd_calculator = XRDCalculator(wavelength=wavelength)
    pattern = xrd_calculator.get_pattern(structure)
    return DiffractionPattern(
        two_theta=[float(value) for value in pattern.x.tolist()],
        q_values=[
            float(value)
            for value in (4 * np.pi * np.sin(np.radians(pattern.x / 2)) / wavelength)
        ],
        intensities=[float(value) for value in pattern.y.tolist()],
        d_spacing=[float(value) for value in pattern.d_hkls.tolist()],
    )
```

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1938
Make link on logo in top left corner configurable on an Oasis (Michael Götte, 2024-03-20)

Right now it redirects to the central NOMAD, which is confusing.

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1936
Gallery search results (Markus Scheidgen, 2024-03-15)
Assignee: Sherjeel Shabih

Some pseudo code:
```js
const GalleryCard = (title, entryId) => {
  return <Paper>
    ...
    {children}
  </Paper>
}

const H5WebGalleryCard = (result, archive_path) => {
  return <GalleryCard {...}><H5Web path={result.mainfile + archive_path}></H5Web></GalleryCard>
}

const DosGalleryCard = (result) => {
  return <GalleryCard {...}><DOS entryId={result.entryId}/></GalleryCard>
}

const galleryComponents = {
  'h5web': H5WebGalleryCard,
  'dos': DosGalleryCard
}

app = {
  search: {
    gallery: {
      component: 'h5web',
      props: {
        archive_path: '/nexus/NXmpes/entry/data/data'
      }
    }
  }
}

<SearchGallery> // a new alternative for SearchResults
  const {results, app, callbacks...} = useSearchContext() // ask lauri
  const [pinnedResults, setPinnedResults] = useState()
  <PinnedGallery>
    <MUIGrid>
      {pinnedResults.map(...)}
    </MUIGrid>
  </PinnedGallery>
  <MainGallery>
    <MUIGrid>
      {results.map(result => React.createElement(galleryComponents[app.gallery.component], {result, ...app.gallery.props}))}
    </MUIGrid>
  </MainGallery>
</SearchGallery>
```

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1934
Parsing nexus files generates a lot of warnings and errors (Florian Dobener, 2024-03-25)

When I parse a NeXus file into NOMAD, a lot of errors and warnings are generated. The metainfo gets populated, but some of the fields and attributes are missing.
This can be reproduced by running the mpes example on staging. Make sure that the `MoTe2.mpes.nxs` file was generated (you might need to reprocess once after the upload has been created and all of the files downloaded). While this report uses this example, the same error exists for every NeXus file we try to process.
Here is the error log I get while processing: [mpes_error.log](/uploads/1f8d1cefe39076678414eaa6a0d9a604/mpes_error.log)
These are the two most common errors I get from the processing:
```
nomad_oasis_worker | ERROR nomad.processing 2024-03-13T14:45:33 could not normalize section
nomad_oasis_worker | - exception: Traceback (most recent call last):
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/normalizing/metainfo.py", line 35, in normalize_section
nomad_oasis_worker | normalize(self.entry_archive, logger)
nomad_oasis_worker | TypeError: 'MSubSectionList' object is not callable
```
```
nomad_oasis_worker | WARNING nomad.processing 2024-03-13T14:45:31 Error while setting attribute.
nomad_oasis_worker | - nomad.commit:
nomad_oasis_worker | - nomad.deployment: oasis
nomad_oasis_worker | - nomad.entry_id: Da5PJyBHOYH_drwzEfokahG1Cf3e
nomad_oasis_worker | - nomad.mainfile: MoTe2.mpes.nxs
nomad_oasis_worker | - nomad.processing.exe_info: Cannot find a proper definition for name depends_on__field
nomad_oasis_worker | - nomad.processing.logger: nomad.processing
nomad_oasis_worker | - nomad.processing.parser: parsers/nexus
nomad_oasis_worker | - nomad.processing.proc: Entry
nomad_oasis_worker | - nomad.processing.process: process_entry
nomad_oasis_worker | - nomad.processing.process_status: RUNNING
nomad_oasis_worker | - nomad.processing.process_worker_id: 6XXPTPE8S7ehdxNZRoy0Lg
nomad_oasis_worker | - nomad.processing.step: parsers/nexus
nomad_oasis_worker | - nomad.processing.target_name: depends_on__attribute
nomad_oasis_worker | - nomad.service: unknown nomad service
nomad_oasis_worker | - nomad.upload_id: N_uam4l-Q1-zRhkMNmZ35A
nomad_oasis_worker | - nomad.version: 1.2.2.dev497+g150c1828d
```
Also, when I try to parse the IV_temp example in MR !1717 (which deactivates the nexus obj file that produced another error when loading the file), I get an additional error. This fails the whole processing, and no metainfo is generated for this file.
```
nomad_oasis_worker | ERROR nomad.metainfo 2024-03-13T14:54:52 error in indexing dynamic quantity
nomad_oasis_worker | - exception: Traceback (most recent call last):
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/processing/data.py", line 1488, in parsing
nomad_oasis_worker | parser.parse(
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/parsing/parser.py", line 460, in parse
nomad_oasis_worker | self.mainfile_parser.parse(mainfile, archive, logger)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/parsing/nexus/nexus.py", line 346, in parse
nomad_oasis_worker | nexus_helper.process_nexus_master_file(self.__nexus_populate)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 733, in process_nexus_master_file
nomad_oasis_worker | self.full_visit(self.in_file, self.in_file, "", self.visit_node)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 717, in full_visit
nomad_oasis_worker | self.full_visit(root, child, full_name, func)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 717, in full_visit
nomad_oasis_worker | self.full_visit(root, child, full_name, func)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 717, in full_visit
nomad_oasis_worker | self.full_visit(root, child, full_name, func)
nomad_oasis_worker | [Previous line repeated 2 more times]
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 712, in full_visit
nomad_oasis_worker | func(name, hdf_node)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 656, in visit_node
nomad_oasis_worker | process_node(hdf_node, "/" + hdf_name, self.parser, self.logger)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/pynxtools/nexus/nexus.py", line 361, in process_node
nomad_oasis_worker | parser(
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/parsing/nexus/nexus.py", line 306, in __nexus_populate
nomad_oasis_worker | self._populate_data(depth, nx_path, nx_def, hdf_node, current)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/parsing/nexus/nexus.py", line 239, in _populate_data
nomad_oasis_worker | if metainfo_def.use_full_storage:
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/metainfo/metainfo.py", line 3501, in __getattr__
nomad_oasis_worker | return super().__getattr__(name)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/metainfo/metainfo.py", line 1471, in __getattr__
nomad_oasis_worker | raise AttributeError(name)
nomad_oasis_worker | AttributeError: use_full_storage
nomad_oasis_worker |
nomad_oasis_worker | During handling of the above exception, another exception occurred:
nomad_oasis_worker |
nomad_oasis_worker | Traceback (most recent call last):
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/processing/base.py", line 969, in proc_task
nomad_oasis_worker | rv = unwrapped_func(proc, *args, **kwargs)
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/processing/data.py", line 1247, in process_entry
nomad_oasis_worker | self._process_entry_local()
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/processing/data.py", line 1341, in _process_entry_local
nomad_oasis_worker | self.parsing()
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/processing/data.py", line 1496, in parsing
nomad_oasis_worker | raise ProcessFailure(
nomad_oasis_worker | nomad.processing.base.ProcessFailure: parser failed with exception
nomad_oasis_worker |
nomad_oasis_worker | During handling of the above exception, another exception occurred:
nomad_oasis_worker |
nomad_oasis_worker | Traceback (most recent call last):
nomad_oasis_worker | File "/usr/local/lib/python3.9/site-packages/nomad/metainfo/elasticsearch_extension.py", line 1614, in create_searchable_quantity
nomad_oasis_worker | value = float(value)
nomad_oasis_worker | TypeError: float() argument must be a string or a number, not 'dict'
nomad_oasis_worker | - exception_hash: 17kZjSVVAOoLB3FaSS-eAoOrnW5H
nomad_oasis_worker | - nomad.commit:
nomad_oasis_worker | - nomad.deployment: oasis
nomad_oasis_worker | - nomad.metainfo.path_archive: nexus.NXiv_temp.ENTRY.0.DATA.0.DATA__field
nomad_oasis_worker | - nomad.service: unknown nomad service
nomad_oasis_worker | - nomad.version: 1.2.2.dev497+g150c1828d
```

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1933
NOMAD Oasis demonstrator deployment (Markus Scheidgen, 2024-03-13)
Assignee: Markus Scheidgen

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1932
Theming, profiles, and launch urls for NORTH (Markus Scheidgen, 2024-03-14)
Assignee: Adam Fekete

This is a continuation of #1897 (!1677)
- [ ] NOMAD and FAIRmat logos for JupyterHub, Jupyter, and JupyterLab
- [ ] !1677 added code to generate profiles for the k8s JupyterHub, but it does not play well with NOMAD launching containers via services, and it also does not work for webtop tools
- [ ] profiles should allow hub urls that launch a tool (without nomad support, e.g. mounted uploads)
- [ ] hub should use the nomad api to determine the user options

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1930
workflow2 gets broken when reprocessing (Andrea Albino, 2024-03-12)

I noticed that when I reprocess some uploads, some archives will have their workflow with TaskReference not found, like in the following. Also, some boxes are smaller.
Maybe with @ladinesa we can try to take a look. The code lives in a plugin: https://github.com/FAIRmat-NFDI/AreaA-data_modeling_and_schemas/tree/main/IKZ_plugin
![Screenshot_from_2024-03-12_12-00-12](/uploads/9df98508bcb19a94178e823175323959/Screenshot_from_2024-03-12_12-00-12.png)
![Screenshot_from_2024-03-12_12-00-30](/uploads/716dccea56880825318eab9c996a5800/Screenshot_from_2024-03-12_12-00-30.png)

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1925
HDF5 support (Markus Scheidgen, 2024-03-07)
- theory parsers use HDF for large data (e.g. MD)
- as a HDF "sibling"-file to the archive .msg, but placed in the raw files
- written via the ServerContext
- if numpy quantity has HDF5Reference type, the values are written to HDF5
- if numpy quantity has annotation, the values are written to HDF5
- for both annotations and HDF5Reference the archive browser opens H5Web
- problems
  - we create more raw files even though it should be part of the archive
  - once published, h5grove cannot work with the HDF5 raw files anymore
  - entries with these HDF5 files cannot be reprocessed, because the raw files are immutable
  - transparently resolving HDF5 references in the archive API
- solution
  - move the archive HDF5 actually into the archive folder
  - publish the archive in a way that the HDF5 stays usable (h5grove)
  - publish the raw files in a way that the HDF5 stays usable (h5grove)
- bigger idea: replace the archive with hdf5 ..., even bigger idea: replacing files with s3, object storage via HSDS
  - fully replace
  - hybrid
  - whole entry, whole upload
  - always have both + transparent NOMAD API
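The HDF5Reference bullets above could work roughly like this; a minimal sketch assuming a sibling `.hdf5` file per entry and a hypothetical reference-string format:

```python
import h5py
import numpy as np


def write_hdf5_value(upload_dir, entry_id, path_in_archive, value):
    # large numpy values go into a sibling HDF5 file instead of the .msg archive
    h5_path = f'{upload_dir}/{entry_id}.hdf5'
    with h5py.File(h5_path, 'a') as f:
        f[path_in_archive] = value  # creates intermediate groups as needed
    # this string is what the archive would store instead of the data itself
    return f'{entry_id}.hdf5#{path_in_archive}'


ref = write_hdf5_value('.', 'entry1', 'results/md/trajectory', np.random.rand(1000, 3))
```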
```
unpublished-upload:
    raw/** (/<entry_id>.hdf5)
    archive/<entry_id>.msg
published-upload:
    raw-...-.zip
        raw/**/*.hdf
    archive-...-.msg
    - archive/<entry_id>.hdf5
    - archive-...-.hdf5
        - <entry-id> / ...

architecture:
    h5web   | nomad GUI |
    h5grove | HSDS      | archive API |
    hdf5    | msg       | parquet
    s3, fs  | fs        | fs
```
tasks:
- suite of benchmarks
  - generic benchmark
  - impl for .msg/archive
  - impl for .hdf5
  - impl for .parquet
- investigate if h5grove could read HDF from within a .zip file

---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1924
Tests leave nexus.obj behind (Sascha Klawohn, 2024-03-12)

The tests below (and more) leave a `nexus.obj` behind at the current working directory:
- `tests/app/v1/routers/uploads/test_basic_uploads.py::test_put_upload_raw_path[multipart]`
- `tests/app/v1/routers/uploads/test_basic_uploads.py::test_post_upload_edit[edit-all]`
- `tests/processing/test_data.py::test_processing`
It was introduced by ba44dcae6d39404a9dd70626b1d771beac90b586.
Could be solved by adding `pathlib.Path('nexus.obj').unlink(missing_ok=True)` to the test functions, but I'm not sure whether the creation of this file is on purpose in this case, especially at that location. If it is, an assertion that it was created would be better before deleting it.
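One way to centralize the suggested cleanup, sketched as an autouse pytest fixture (hypothetical placement in `conftest.py`; asserting creation would only make sense once we know the file is intended):

```python
import pathlib

import pytest


@pytest.fixture(autouse=True)
def cleanup_nexus_obj():
    yield
    # remove the stray file if a test left it behind in the cwd
    pathlib.Path('nexus.obj').unlink(missing_ok=True)
```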
---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1923
New "SearchEditQuantity" to reference entries based on search (Sarthak Kapoor, 2024-03-12)
Assignee: Mohammad Nakhaee

Currently, to reference multiple entries in a schema, one has to use a repeating subsection that contains a `ReferenceEditQuantity`. Each entry has to be linked individually, and this becomes a tedious task if one wants to refer to a large number of entries.

Having a new EditQuantity, perhaps `SearchEditQuantity` for lack of a better name, that combines the existing search/explore and `ReferenceEditQuantity` interfaces could be a great solution. It could be used to collect entries based on a search query and generate a list of references pointing to each of the entries. There could be an upper limit to the number of entries that can be handled here.
Some background:
- I had a related discussion with @mscheidg, where he pointed out two ways in which such an EditQuantity could be persisted: saving the query or saving the query results.
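The two persistence options could be modelled roughly like this (names are hypothetical):

```python
from typing import List, Optional

from pydantic import BaseModel


class SearchEditQuantityValue(BaseModel):
    # option 1: persist the query itself (stays live, results may change over time)
    query: Optional[dict] = None
    # option 2: persist the resolved entry references (a frozen snapshot)
    entry_references: Optional[List[str]] = None
```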
---
https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1922
Staging files sometimes are not removed. (Markus Scheidgen, 2024-03-05)

This causes problems in displaying published entries.
We saw this behaviour when files like `.nfs000000000c92634f0000001c` were part of the uploaded data. Everything gets deleted but those files. This is also not causing any errors while deleting, because we use `shutil.rmtree(..., ignore_errors=True)`.
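A small sketch of how the deletion could surface leftovers instead of silently ignoring them, using `shutil.rmtree`'s `onerror` hook (illustrative; the actual call site is inside NOMAD's upload file handling):

```python
import shutil


def remove_staging_dir(path, logger):
    errors = []

    def on_error(func, p, exc_info):
        # collect failures (e.g. busy .nfs* files) instead of swallowing them
        errors.append((p, exc_info[1]))

    shutil.rmtree(path, onerror=on_error)
    for p, exc in errors:
        logger.warning('could not delete staging file %s: %s', p, exc)
```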