From 9977d3a56a088a673fce67017a9663f092f9b232 Mon Sep 17 00:00:00 2001
From: Andreas Leitherer <leitherer@fhi-berlin.mpg.de>
Date: Wed, 23 Dec 2020 11:00:02 +0100
Subject: [PATCH] Small tuning of text

---
 nn_regression.ipynb | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/nn_regression.ipynb b/nn_regression.ipynb
index bdc0511..52d91f8 100644
--- a/nn_regression.ipynb
+++ b/nn_regression.ipynb
@@ -281,6 +281,13 @@
     "## 2. Neural network regression example - \"ElemNet\""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the following, we will consider a specific application of multilayer perceptrons. The idea of this section is that you first read through and run the cells, and then return to them later (in particular, to change the neural-network parameter settings) when answering the questions at the end."
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -313,7 +320,7 @@
    "source": [
     "### 2.1 The task and dataset creation\n",
     "\n",
-    "Given only the chemical composition of an inorganic compound, the goal is to predict the volume per atom, which is the total volume of the unit cell divided by the number of atoms in the unit cell. Thus, this property provides an average characterization of the atomic arrangement and therefore becomes interesting in context of crystal-structure prediction (see for instance [here](https://www.nature.com/articles/s41578-019-0101-8) for a review on this topic). Note, however, that this quantity is only an average assessment of the unit cell geometry and thus further properties have to be known (for instance total volume, space group) to get a geometrical understanding of the crystal that is sufficient to actually predict the atomic arrangement."
+    "Given only the chemical composition of an inorganic compound, the goal is to predict the volume per atom, which is the total volume of the unit cell divided by the number of atoms in the unit cell. This property thus provides an average characterization of the atomic arrangement and is therefore of interest in the context of crystal-structure prediction (we refer the interested reader to [this review](https://www.nature.com/articles/s41578-019-0101-8) on the topic). Note, however, that this quantity is only an average assessment of the unit-cell geometry, so further properties (for instance, the total volume and the space group) have to be known to obtain a geometrical understanding of the crystal that is sufficient to actually predict the atomic arrangement."
    ]
   },
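+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In other words, denoting the unit-cell volume by $V_{\\mathrm{cell}}$ and the number of atoms in the unit cell by $N_{\\mathrm{atoms}}$, the target is\n",
+    "\n",
+    "$$v = \\frac{V_{\\mathrm{cell}}}{N_{\\mathrm{atoms}}},$$\n",
+    "\n",
+    "so that, for example, a unit cell with a volume of 40 Å³ containing 4 atoms has a volume per atom of 10 Å³."
+   ]
+  },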
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -431,7 +438,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "From the bar plot, we see that most of the materials contain oxygen. This reflects the bias of experimentalists and theoreticians towards investigating specific systems (i.e., those that are either well-known or currently of interest)."
+    "From the bar plot, we see that most of the materials contain oxygen. This also reflects the bias of researchers towards investigating specific systems (i.e., those that are either well-known or currently of interest)."
    ]
   },
  {
@@ -489,7 +496,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The first step is to split the data into training and test set, the latter not being touched during optimization and only at the end for the final model evaluation. Note that it is essential to report split ratio and random state to enable reproducibility:"
+    "The first step is to split the data into a training and a test set, the latter not being touched during optimization and used only at the end for the final model test. Note that it is essential to report the split ratio and the random state to enable reproducibility:"
    ]
   },
  {
@@ -503,7 +510,7 @@
    },
    "outputs": [],
    "source": [
-    "# Very important for reproducibility!!\n",
+    "# Very important for reproducibility!\n",
     "RANDOM_STATE = 42\n",
     "split_ratio = 0.2\n",
     "\n",
@@ -564,7 +571,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "If we are not satisfied with the validation performance, we can take a step back, change the hyperparameters (such as the number of layers or number of neurons in each layer), and restart the training process. This way, we can optimize the generalization ability of the model, before the final evaluation on the test set. Ideally, one should also consider different splits (for instance via *cross-validation*), which would be too time-consuming for this tutorial. Furthermore, we provide code only for hand-tuning the hyperparameters while specifying the references on more advanced tuning methods at the end (e.g., Bayesian optimization). "
+    "If we are not satisfied with the validation performance, we can take a step back, change the hyperparameters (such as the number of layers or the number of neurons in each layer), and restart the training process. This way, we can optimize the generalization ability of the model before the final evaluation on the test set. Ideally, one should also consider different splits (for instance via *cross-validation*, see chapter 7.10 of this [book](https://link.springer.com/book/10.1007/978-0-387-84858-7)), which would be too time-consuming for this tutorial; a minimal sketch is shown below. Furthermore, we provide code only for tuning the hyperparameters by hand; references on more advanced tuning methods (e.g., Bayesian optimization) are given at the end."
    ]
   },
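+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Although we do not run it here, such a cross-validation could be set up along the following lines with scikit-learn. This is a minimal sketch, not part of the original tutorial: it assumes the arrays `x_train` and `y_train` from the split above, and the actual model training inside the loop is only indicated by comments:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Minimal cross-validation sketch (illustration only; assumes x_train/y_train\n",
+    "# from the split above and RANDOM_STATE as defined earlier).\n",
+    "from sklearn.model_selection import KFold\n",
+    "\n",
+    "kfold = KFold(n_splits=5, shuffle=True, random_state=RANDOM_STATE)\n",
+    "for fold, (train_idx, val_idx) in enumerate(kfold.split(x_train)):\n",
+    "    print(f\"Fold {fold}: {len(train_idx)} training / {len(val_idx)} validation points\")\n",
+    "    # Here one would train a fresh model on x_train[train_idx], y_train[train_idx]\n",
+    "    # and evaluate it on x_train[val_idx], y_train[val_idx]."
+   ]
+  },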
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -578,7 +585,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We use the python library Keras to create the multilayer perceptron architecture (the documentation https://keras.io/ is an excellent resource for more details). The function defined below will generate a model object from a dictionary \"params\", which will be defined later and contains specifications about the model architecture:"
+    "We use the Python library Keras to create the multilayer perceptron architecture (the online documentation is an excellent resource for more details; please refer to the side remark at the beginning of the tutorial for the URLs). The function defined below will generate a model object from a dictionary \"params\", which will be defined later and contains the specifications of the model architecture:"
    ]
   },
  {
@@ -852,7 +859,7 @@
     "\n",
     "* Evaluate the results: What can you infer from the plot showing training/validation MSE vs. epoch number (e.g., is overfitting observed)? What do the individual performance measures tell you - especially in comparison with the dataset statistics?\n",
     "\n",
-    "* Feel free to change hyperparameters by hand to get a feeling for them and possibly to improve model performance. In case training takes too long, reduce the number of neural-network parameters and/or the number of epochs.\n",
+    "* Feel free to change the hyperparameters by hand to get a feeling for them and possibly to improve model performance. If training takes too long, reduce the neural-network size and/or the number of epochs.\n",
     "\n",
     "* Compare the results of this tutorial to the ones in the [original reference](https://www.nature.com/articles/npjcompumats201628?report=reader) (Table 1). If they are worse, try to explain what could be improved (tip: think about how the information on inorganic compounds is encoded in \"ElemNet\" and compare it to the one of the reference. Also have a look at [this documentation](https://scikit-learn.org/stable/modules/cross_validation.html)). Furthermore, even if the performance (as, for instance, indicated by MSE) is worse, why are neural networks still useful / why do they provide an advantage compared to other \"standard\" machine learning methods (hint: Figure 6 of the original ElemNet reference and/or google \"Representation learning\")?\n",
     "\n",
@@ -865,7 +872,7 @@
    "source": [
     "### 2.6 Test\n",
     "\n",
-    "Once model optimization is finished, we are ready to investigate model performance on the test set - set the following keyword to True and then run the subsequent cell."
+    "Once model optimization is finished, we are ready to investigate the model performance on the test set: set the following keyword to True and then run the subsequent cell. Evaluate and interpret your final results."
    ]
   },
  {
@@ -937,6 +944,11 @@
     "* L. Ward, A. Agrawal, A. Choudhary and C. Wolverton. A general-purpose machine learning framework for predicting\n",
     "properties of inorganic materials. npj Computational Materials 2, 16028 (2016)\n",
     "\n",
+    "You may also have a look at this review:\n",
+    "\n",
+    "* J. Schmidt, M. R. Marques, S. Botti and M. A. Marques. Recent advances and applications of machine learning in\n",
+    "solid-state materials science. npj Computational Materials 5, 1–36 (2019)\n",
+    "\n",
     "Furthermore, more information on (advanced) hyperparameter tuning techniques (for instance grid search, random search or Bayesian optimization for tuning the number of layers, neurons, and dropout ratios) can be found in the following references:\n",
     "\n",
     "* As a start: https://scikit-learn.org/stable/modules/grid_search.html\n",
-- GitLab