Updated query nomad archive
- Aakash Ashok Naik authored
query_nomad_archive.ipynb
0 → 100644
+ 787
− 0
The NOMAD package allows us to retrieve data from the NOMAD Archive by means of a script, as shown below. In this script we specify the metadata that characterize the materials we aim to retrieve. In this case, we select ternary materials containing oxygen. We also require that the simulations were carried out with the VASP code using GGA exchange-correlation (xc) functionals. Values are retrieved from the simulation run that reached geometric convergence within a threshold value of 1e-20.
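To make the selection criteria concrete, the sketch below applies the same filters (ternary, contains oxygen, VASP, GGA) to a few in-memory records. It does not call the NOMAD client: the records, the field names (`elements`, `code`, `xc`) and the helper `matches_filters` are hypothetical and serve only to illustrate the logic of the filters.

```python
# Hypothetical records: field names are illustrative and are NOT NOMAD's metadata schema.
records = [
    {"formula": "BaTiO3", "elements": ["Ba", "Ti", "O"], "code": "VASP", "xc": "GGA"},
    {"formula": "MgO", "elements": ["Mg", "O"], "code": "VASP", "xc": "GGA"},
    {"formula": "SrTiO3", "elements": ["Sr", "Ti", "O"], "code": "VASP", "xc": "LDA"},
    {"formula": "LiFeO2", "elements": ["Li", "Fe", "O"], "code": "FHI-aims", "xc": "GGA"},
    {"formula": "CuInSe2", "elements": ["Cu", "In", "Se"], "code": "VASP", "xc": "GGA"},
]


def matches_filters(record):
    """Return True for ternary, oxygen-containing VASP/GGA records."""
    elements = set(record["elements"])
    return (
        len(elements) == 3           # ternary: exactly three distinct elements
        and "O" in elements          # must contain oxygen
        and record["code"] == "VASP"
        and record["xc"] == "GGA"
    )


selected = [r["formula"] for r in records if matches_filters(r)]
print(selected)
```

Only the first record passes all four conditions; each of the others fails exactly one of them.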
```
The 'required' condition ensures that all the quantities of the simulation run that we are interested in are fetched during the query. Examples are 'chemical_composition', which gives the composition of the material, and 'atom_positions', which contains the positions of all atoms after geometric convergence.
The 'query' variable also contains a number of other parameters: the 'max' value sets the maximum number of entries that can be retrieved; the 'per_page' value gives the number of entries fetched at each API call; the 'parallel' value gives the number of parallel calls performed at each iteration.
```
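As a rough illustration of how 'max', 'per_page' and 'parallel' interact, the sketch below estimates how many API calls and how many iterations a query would need. It assumes the reading given above (one call fetches 'per_page' entries, one iteration issues 'parallel' calls); the function `paging_plan` and the numbers are hypothetical and not part of the NOMAD client.

```python
import math


def paging_plan(max_entries, per_page, parallel):
    """Estimate (api_calls, iterations) needed to fetch up to max_entries.

    Assumes each API call returns per_page entries and each iteration
    issues `parallel` calls at once.
    """
    api_calls = math.ceil(max_entries / per_page)
    iterations = math.ceil(api_calls / parallel)
    return api_calls, iterations


api_calls, iterations = paging_plan(max_entries=1000, per_page=100, parallel=4)
print(api_calls, iterations)  # 10 calls, grouped into 3 iterations
```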
```
To retrieve data and place it within a framework, we use a 'for' loop that iteratively fetches all entries up to the maximum number, which is given by 'max_entries'. Because some links in the query might be broken, the resulting 'IndexError' exception is handled within the 'for' loop, which skips over the broken entry. In addition, we make sure that each entry contains the simulation cell value we are interested in, and that all elements in the material have an admissible atomic number.
```
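The loop described above can be sketched on mock data as follows. The entry layout (`runs`, `cell`, `atomic_numbers`) and the bound `MAX_ATOMIC_NUMBER` are assumptions made for illustration; the real archive objects are accessed differently.

```python
MAX_ATOMIC_NUMBER = 118  # assumed upper bound for an admissible atomic number

cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

# Hypothetical entries standing in for the query results.
entries = [
    {"runs": [{"cell": cubic, "atomic_numbers": [56, 22, 8, 8, 8]}]},  # valid
    {"runs": []},                                                      # broken: no run
    {"runs": [{"cell": None, "atomic_numbers": [3, 26, 8, 8]}]},       # missing cell
    {"runs": [{"cell": cubic, "atomic_numbers": [12, 0, 8]}]},         # invalid Z
    {"runs": [{"cell": cubic, "atomic_numbers": [38, 22, 8, 8, 8]}]},  # beyond the limit
]
max_entries = 4

kept = []
for index, entry in enumerate(entries):
    if index >= max_entries:
        break  # stop once the maximum number of entries is reached
    try:
        run = entry["runs"][-1]  # raises IndexError for a broken entry
    except IndexError:
        continue  # skip the broken entry and carry on
    if run["cell"] is None:
        continue  # the simulation cell is required
    if not all(1 <= z <= MAX_ATOMIC_NUMBER for z in run["atomic_numbers"]):
        continue  # reject inadmissible atomic numbers
    kept.append(run)

print(len(kept))
```

Catching the exception inside the loop body, rather than around the whole loop, is what lets a single broken entry be skipped without stopping the retrieval.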
```
```
```
Different entries may share the same chemical composition, for example because simulations were performed for the same material with settings that are not among the filters of our query. Each of these simulations might have produced a slightly different value of the atomic density of the material. As the data come from heterogeneous simulations carried out in different laboratories, we do not aim to evaluate all possible parameters of each simulation. Instead, we average the atomic density over all entries with the same chemical composition.
```
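A minimal sketch of the averaging step, using only the standard library. The compositions and density values below are made up for illustration, and the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (composition, atomic density) pairs; the numbers are illustrative only.
rows = [
    ("BaTiO3", 0.078),
    ("BaTiO3", 0.080),
    ("SrTiO3", 0.084),
    ("BaTiO3", 0.079),
]

# Collect all density values that belong to the same chemical composition.
by_composition = defaultdict(list)
for composition, density in rows:
    by_composition[composition].append(density)

# Replace each group of values by its average.
averaged = {composition: mean(values) for composition, values in by_composition.items()}
print(averaged)
```

After this step each chemical composition appears once, whatever the number of simulations that contributed to it.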
```
The plots show the predictions of the trained model on the test set, together with the known values of the training set, which are taken as reference values. Each point also shows the standard deviation. We emphasize that, since each value on the plot is an average over all elements in the periodic table, the standard deviation cannot go to zero by construction, even in the limit of taking all possible combinations. We therefore aim for the averages and standard deviations predicted by our model to be comparable to those of the reference values.
```
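One possible reading of how each plotted point is built: the values of all compounds that share a given element are grouped, and the mean and standard deviation of that group are reported. The sketch below follows this reading on made-up numbers; the data layout and the names are assumptions, not the notebook's own code.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical predicted atomic densities, keyed by the elements of each compound.
predictions = {
    ("Ba", "Ti", "O"): 0.079,
    ("Sr", "Ti", "O"): 0.084,
    ("Li", "Fe", "O"): 0.090,
}

# Group the values by every element that appears in a compound.
per_element = defaultdict(list)
for elements, density in predictions.items():
    for element in elements:
        per_element[element].append(density)

# Mean and population standard deviation for each element.
summary = {el: (mean(vals), pstdev(vals)) for el, vals in per_element.items()}
print(summary["O"])
```

Because compounds that share an element generally have different densities, the spread within a group stays finite however many compounds are added, which is consistent with the remark that the standard deviation cannot vanish by construction.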