analytics issues
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues
2022-07-08T20:30:06Z

Docker images taking too much space
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/18
2022-07-08T20:30:06Z
Nicolas Fabas

Hi all!
We noticed here at MPCDF that Docker images are taking ~300GB on the container registry of this project and are really starting to fill up the partition. We are talking about images with names like
'develop' + git commit id
Is there a way that you could reassess the usefulness of these images?
Thank you in advance,
Nicolas Fabas

Notebook: NequIP tutorial
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/29
2023-03-24T00:00:33Z
Adam Fekete

@lsbailo, @sshlu, @lucamghi
Repository:
- https://gitlab.mpcdf.mpg.de/nomad-lab/analytics-nequip
Major dependencies:
- LAMMPS - stable release 23/06/22. Install LAMMPS as a shared library using cmake, as described here https://docs.lammps.org/Howto_pylammps.html, to use the 'lammps' Python package.
- torch, nequip, ase, sklearn
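Because LAMMPS can be driven either through the `lammps` Python package or as a separate process, a notebook could check at startup which of the two is available. A minimal sketch, assuming the standalone executable is called `lmp` (an assumption, the name depends on the build); `pick_lammps_backend` is a hypothetical helper, not part of the tutorial repository:

```python
import importlib.util
import shutil


def pick_lammps_backend(executable: str = "lmp") -> str:
    """Return "python", "subprocess", or "missing" (hypothetical helper)."""
    # The `lammps` module is only importable when LAMMPS was built as a
    # shared library and its Python package was installed.
    if importlib.util.find_spec("lammps") is not None:
        return "python"
    # Otherwise fall back to a standalone executable found on PATH.
    # The default name "lmp" is an assumption.
    if shutil.which(executable) is not None:
        return "subprocess"
    return "missing"


print(pick_lammps_backend())
```

The check only inspects the environment; it does not start LAMMPS, so it is cheap to run in the first cell of a notebook.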
Known issue:
- [ ] running lammps using lammps package or as a subprocess
- [x] the major problem is that the NequIP-LAMMPS pairing package uses CUDA

The cleanup process of container images doesn't work
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/27
2022-07-08T20:30:06Z
Adam Fekete

Refactoring the infrastructure
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/26
2022-07-08T13:42:38Z
Adam Fekete

Add lint to notebooks
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/25
2021-11-29T10:55:37Z
Luigi Sbailo

Notebooks currently do not follow any architecture.
Luigi Sbailo

Add password to sensitive notebooks
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/24
2021-08-31T09:16:35Z
Luigi Sbailo

To protect sensitive data, notebooks in develop should have the option to be protected with a password.
Submodules storing sensitive notebooks should also be private.
Currently, submodules need to be public repositories.

Documentation on notebook development roadmap
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/23
2021-08-30T13:50:54Z
Luca Massimiliano Ghiringhelli

It would be good to have documentation (written + tutorial video(s)) on the steps to be taken to develop a new tutorial (and update an existing one). This should include
- fully online. Show how to create a new (empty) notebook, and how to modify an existing notebook and save the modified version. Show also how to manage the personal "Work" directory (e.g., create a data directory, upload local files, rename, delete, etc.). Show how to go from the files saved in Work to committing the change into gitlab (upload the modified files to gitlab, etc.)
- via docker. For now, assume people can install docker without further help; just document how to use a local IDE (e.g., anaconda) for development, docker for preliminary testing, upload to gitlab, etc.
- point out that metainfo.json is also to be created by the developers as part of the development process
Luigi Sbailo

Visualizer improvements
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/17
2021-02-08T15:11:20Z
Luigi Sbailo

- Implement a feature to avoid plotting overlapping points.
- Add structure visualization as optional requirement
Luigi Sbailo

Size of the image is quite big
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/16
2020-11-12T07:06:42Z
Adam Fekete

It would make sense to try to reduce the size of these images.
Here is a summary of the sizes in the latest develop image:
```bash
$ du -shL tutorials/data/*
448M tutorials/data/cmlkit
21M tutorials/data/co2
736K tutorials/data/compressed_sensing
11M tutorials/data/convolutional_nn
0 tutorials/data/data
118M tutorials/data/decision_tree
880K tutorials/data/descriptor_role
109M tutorials/data/domain_of_applicability
123M tutorials/data/error_estimates
180K tutorials/data/exploratory_analysis
137M tutorials/data/grain_boundaries
14M tutorials/data/hisisso_perovskites
250M tutorials/data/kaggle_competition
1.1G tutorials/data/krr4mat
253M tutorials/data/nn_regression
252K tutorials/data/perovskites_tolerance_factor
15M tutorials/data/query_nomad_archive
3.4M tutorials/data/soap_atomic_charges
1.2M tutorials/data/tcmi
209M tutorials/data/tetradymite_PRM2020
```
```bash
$ du -sh /opt/*
3.8G /opt/conda
1.1G /opt/cpp_sisso
75M /opt/qmmlpack
877M /opt/quip
2.8G /opt/tutorials
```
```bash
$ du -sh /usr/*
173M /usr/bin
0 /usr/games
28M /usr/include
833M /usr/lib
28K /usr/local
3.0M /usr/sbin
1.6G /usr/share
0 /usr/src
```
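To compare the listings above at a glance, the `du` output can be converted to bytes and sorted. A minimal sketch; `parse_du` is a hypothetical helper, and it assumes the binary (1024-based) suffixes that `du -h` prints:

```python
import re

# Binary multipliers for the suffixes that `du -h` prints.
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_du(text: str) -> dict[str, int]:
    """Map each path in `du -sh` output to an approximate size in bytes."""
    sizes = {}
    for line in text.strip().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip lines that are not "<size> <path>"
        match = re.fullmatch(r"([\d.]+)([KMGT]?)", parts[0])
        if match is None:
            continue  # skip sizes in an unexpected format
        sizes[parts[1]] = int(float(match.group(1)) * UNITS[match.group(2)])
    return sizes


# Values copied from the /opt listing above.
opt = parse_du("""
3.8G /opt/conda
1.1G /opt/cpp_sisso
75M /opt/qmmlpack
877M /opt/quip
2.8G /opt/tutorials
""")

# Print the largest directories first.
for path, size in sorted(opt.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{size / 1024**3:5.2f} GiB  {path}")
```

Sorted this way, `/opt/conda` and `/opt/tutorials` come out on top, so they are the first places to look for savings.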
It is quite normal that the tutorials keep getting bigger (for about 20 tutorials, 3GB shouldn't be a problem).
As far as I can see, one of the possible ways to reduce the size is to do a multi-stage build.

CO2 tutorial improvements
https://gitlab.mpcdf.mpg.de/nomad-lab/analytics/-/issues/15
2020-09-21T12:56:27Z
Luigi Sbailo

A few things to be addressed in the future:
- printing a list of top k (k is user given) subgroups, not just the top one
- allowing selection of features in TR. I do not think I got an answer here: which features are offered to the TR training, in the notebook and in the learning done for the manuscript? All the features listed in the section "physical features"?
- remove hard-coded data in the script and all bad programming choices, such as relative changes of directory
- a tough but very important task: providing the scripts to extract the data from the raw FHI-aims calculations, essentially a meta-notebook that provides the input data to this notebook. As usual, if the processing is slow (to be defined), the data table is pre-loaded like now, but each entry points back to an FHI-aims calculation and a script for reproducing the reported values.
This task will very probably be tackled by a new person who will work 100% on publishing notebooks, but they will need assistance from Alex.
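
The point about hard-coded data and relative directory changes could be handled by resolving every file against one configurable root instead of changing the working directory. A minimal sketch: the helper `data_path`, the environment variable `TUTORIAL_DATA_DIR` and the file name `features.csv` are hypothetical, and the default folder `tutorials/data/co2` is assumed from the image layout:

```python
import os
from pathlib import Path
from typing import Optional


def data_path(name: str, root: Optional[str] = None) -> Path:
    """Resolve a data file against a single root without changing the cwd."""
    # Hypothetical lookup order: explicit argument, then the environment
    # variable TUTORIAL_DATA_DIR, then the assumed default data folder.
    base = Path(root or os.environ.get("TUTORIAL_DATA_DIR", "tutorials/data/co2"))
    return (base / name).resolve()


# "features.csv" is a placeholder file name, not one from the notebook.
print(data_path("features.csv"))
```

With a helper like this, the notebook never needs `os.chdir`, and the same code runs both locally and inside the Docker image by changing one setting.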