From 1fe4667e952ea43e39e1371591613323fcd41ccc Mon Sep 17 00:00:00 2001
From: Andreas Leitherer <leitherer@fhi-berlin.mpg.de>
Date: Mon, 4 Jan 2021 09:02:12 +0100
Subject: [PATCH] Small changes in text

---
 nn_regression.ipynb | 433 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 398 insertions(+), 35 deletions(-)

diff --git a/nn_regression.ipynb b/nn_regression.ipynb
index 52d91f8..f0c4153 100644
--- a/nn_regression.ipynb
+++ b/nn_regression.ipynb
@@ -72,7 +72,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:48:01.350451Z",
@@ -80,7 +80,15 @@
     },
     "scrolled": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Using TensorFlow backend.\n"
+     ]
+    }
+   ],
    "source": [
     "# Plotting\n",
     "%matplotlib inline\n",
@@ -161,7 +169,7 @@
     "\n",
     "$(\\mathbf{x}^{(1)}, y^{(1)}), (\\mathbf{x}^{(2)}, y^{(2)}), ..., (\\mathbf{x}^{(m)}, y^{(m)})$.\n",
     "\n",
-    "Weights and bias term of the model (here: $w_1, w_2, b$) are optimized by minimizing a *loss function* $L(w_1, w_2, b)$ that quantifies how the predicted values $\\hat{y}^{(i)}$ deviate from the true values $y^{(i)}$ - as a function of the model parameters. We will see an explicit form for $L(w_1, w_2, b)$ in context of regression in section 2 of this tutorial.   Usually gradient decent is used to find the parameters minimizing $L$ (with modifications of gradient decent enabling faster convergence, see, for instance chapter 5 and 8 of [this](https://www.deeplearningbook.org/) standard reference). After finishing optimization, in case of classification, model performance can be assessed via the classification accuracy (# of correct predictions divided by # of total predictions).\n",
+    "The weights and bias term of the model (here: $w_1, w_2, b$) are optimized by minimizing a *loss function* $L(w_1, w_2, b)$ that quantifies how the predicted values $\\hat{y}^{(i)}$ deviate from the true values $y^{(i)}$ as a function of the model parameters. We will see an explicit form for $L(w_1, w_2, b)$ in the context of regression in section 2 of this tutorial. Usually, gradient descent is used to find the parameters minimizing $L$ (modifications of gradient descent enable, for instance, faster convergence; see chapters 5 and 8 of [this](https://www.deeplearningbook.org/) standard reference). After finishing optimization, in the case of classification, model performance can be assessed via the classification accuracy (# of correct predictions divided by # of total predictions).\n",
     "\n",
     "We will explain the optimization procedure in more detail in section 2 of the tutorial. In this example, one can think of the training / optimization phase as changing the model parameters such that the optimal position of a straight line (see above figure, right) is found, which serves as a decision boundary between the two classes. "
    ]
@@ -178,7 +186,7 @@
     "\n",
     "Rectified linear unit (ReLU): $f(x) = max(0, x)$\n",
     "\n",
-    "The ReLU activation function is most frequently used. Non-linear functions are essential to increase the space of possible (complex) functions that the model can learn. If  no activation function would be used, i.e., the identity - also called *linear activation function*- the class of possible functions that the model can represent would be drastically reduced.\n",
+    "The ReLU activation function is the most frequently used. Note that the use of non-linear functions is essential: if no activation function were used (i.e., only the identity, also called the *linear activation function*), the class of possible functions that the model can represent would be drastically reduced.\n",
     "\n",
     "![activation_functions.png](./assets/nn_regression/activation_functions.png)"
    ]
@@ -189,7 +197,7 @@
    "source": [
     "### 1.2 Multilayer peceptron\n",
     "\n",
-    "Extending the idea of simple perceptrons, one can construct multilayer perceptrons as sequences of layers. \n",
+    "Extending the idea of simple perceptrons, one can construct multilayer perceptrons as a sequence of layers. \n",
     "Each layer consists of a predefined number of neurons, where the neurons of the first layer (the *input layer*) correspond to the input features $\\mathbf{x} = (x_1, x_2, ...)$. The subsequent layers are called *hidden layers*. The individual neurons in each hidden layer are a linear combination of neurons from the previous layer. For instance, the *activation value* $a_1$ highlighted in the figure below is computed the following way:\n",
     "\n",
     "$\\begin{equation*}\n",
@@ -223,7 +231,19 @@
     "\n",
     "The final activation function $f^\\prime$ is chosen in a specific way, usually depending on the task being either  regression or classification - we will come back to this later.\n",
     "\n",
-    "To simplify the above expression for $\\mathbf{o}$, one can change the definition of input vector and weight matrices such that the bias terms can be omitted. We denote the input vector as before and introduce weight matrices W, W$^\\prime$, which yields a more compact expression for the output:  \n",
+    "To simplify the above expression for $\\mathbf{o}$, it is common to change the definition of the input vector and weight matrices such that the bias terms can be omitted. To illustrate this, we consider the simplified case of two input features and two activations $a_1 = w_{11}x_1 + w_{12}x_2 + b_1$ and $a_2 = w_{21}x_1 + w_{22}x_2 + b_2$. Then, $A$ and $\\mathbf{b}$ are defined as  \n",
+    "$\\begin{equation*}\n",
+    "A = \\begin{bmatrix}w_{11} & w_{12}\\\\w_{21} & w_{22}\\end{bmatrix}, \\quad \\mathbf{b} = \\begin{bmatrix}b_1 \\\\ b_2 \\end{bmatrix}.\n",
+    "\\end{equation*}$\n",
+    "\n",
+    "Introducing the new definitions \n",
+    "\n",
+    "$\\begin{equation*}\n",
+    "W = \\begin{bmatrix} b_1 & w_{11} & w_{12}\\\\ b_2 & w_{21} & w_{22}\\end{bmatrix}, \\quad \\mathbf{x} = \\begin{bmatrix} 1 \\\\ x_1 \\\\ x_2 \\end{bmatrix}\n",
+    "\\end{equation*}$\n",
+    "\n",
+    "allows us to omit the bias term and replace $A\\mathbf{x}+\\mathbf{b}$ with $W\\mathbf{x}$. Coming back to the previous example, \n",
+    "we introduce weight matrices $W$, $W^\\prime$, which yields a more compact expression for the output:  \n",
     "\n",
     "$\\begin{equation*}\n",
     "\\mathbf{o} = f^\\prime (W^\\prime \\mathbf{a}) = f^\\prime (W^\\prime f(W \\mathbf{x})).\n",
@@ -259,7 +279,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To illustrate the usefulness of softmax activation functions, let us consider the case of crystal-structure classification. The task is to assign the correct (symmetry) label to a given, unknown crystal structure as defined by atomic positions and chemical species. For instance, possible assignments could be face-centered-cubic, body-centered-cubic, diamond or hexagonal closed packed - a collection of structures that covers more than 80% of the elemental solids. More thorough explanations on deep learning applied to crystal-structure classification can be found  [here](https://www.nature.com/articles/s41467-018-05169-6). When applying the multilayer perceptron architecture which we introduced above, each of the four output neurons correspond to a specific crystal structure. The use of the softmax activation function guarantees that all output activations sum to one, which is why the output vector $\\mathbf{o}$ can be considered as a vector of classification probabilities. For instance, if $\\mathbf{o} = (1, 0, 0, 0)$, the input structure is predicted to have fcc symmetry with 100\\% probability (see figure below). This is also called \"one-hot-encoding\" and corresponds to representing a given number N of classes in the standard basis in $\\mathbb{R}^\\text{N}$, i.e., by N vectors $e_i = (0, ...0, 1, 0, ..., 0)$, for $i=1, ..., N$ and all components of $e_i$ being zero except for the $i$th entry. \n",
+    "To illustrate the usefulness of softmax activation functions, let us consider the case of crystal-structure classification. The task is to assign the correct (symmetry) label to a given, unknown crystal structure as defined by atomic positions and chemical species. For instance, possible assignments could be face-centered cubic, body-centered cubic, diamond, or hexagonal close-packed, a collection of structures that covers more than 80% of the elemental solids. More thorough explanations on deep learning applied to crystal-structure classification can be found [here](https://www.nature.com/articles/s41467-018-05169-6). When applying the multilayer perceptron architecture that we introduced above, each of the four output neurons corresponds to a specific crystal structure. The use of the softmax activation function guarantees that all output activations sum to one, which is why the output vector $\\mathbf{o}$ can be considered as a vector of classification probabilities. For instance, if $\\mathbf{o} = (1, 0, 0, 0)$, the input structure is predicted to have fcc symmetry with 100\\% probability (see figure below). This is also called \"one-hot encoding\" and corresponds to representing a given number of classes N in the standard basis of $\\mathbb{R}^\\text{N}$, i.e., by N vectors $e_i = (0, ..., 0, 1, 0, ..., 0)$ for $i=1, ..., N$, with all components of $e_i$ being zero except for the $i$th entry. \n",
     "\n",
     "<img src=\"./assets/nn_regression/cs_classification_first_example.png\" width=\"1700\">\n",
     "\n"
@@ -303,7 +323,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The deep-learning model behind \"ElemNet\" is essentially a multilayer perceptron with the input vector (the representation, i.e, the descriptor of inorganic compounds) being chosen in a very specific way: \n",
+    "The deep-learning model behind \"ElemNet\" is essentially a multilayer perceptron with the input vector being chosen in a very specific way: \n",
     "Each compound is represented by a feature vector $\\mathbf{f}$ of fixed length, whose components correspond to the elements of the periodic table. They are sorted according to the atomic number Z in ascending order (i.e., the first component of $\\mathbf{f}$ corresponds to hydrogen, the second to Helium etc.).\n",
     "For instance, given a binary compound $\\text{A}_x \\text{B}_y$ with $x+y=1$, all entries of $\\mathbf{f}$ are zero except those corresponding to element A and B. For these entries, the relative stoichiometric attributes x and y are assigned. In case of NaCl (rock salt), the representation would be \n",
     "\n",
@@ -341,7 +361,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:25.380307Z",
@@ -349,7 +369,88 @@
     },
     "scrolled": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>vol_per_atom</th>\n",
+       "      <th>composition</th>\n",
+       "      <th>number_of_elements</th>\n",
+       "      <th>stoichiometry_dicts</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>17.8351</td>\n",
+       "      <td>Li1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>{'Li': 1}</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>22.9639</td>\n",
+       "      <td>Mg1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>{'Mg': 1}</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>41.4146</td>\n",
+       "      <td>Kr1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>{'Kr': 1}</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>32.9826</td>\n",
+       "      <td>Na1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>{'Na': 1}</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>15.2088</td>\n",
+       "      <td>Pd1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>{'Pd': 1}</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   vol_per_atom composition  number_of_elements stoichiometry_dicts\n",
+       "0       17.8351         Li1                   1           {'Li': 1}\n",
+       "1       22.9639         Mg1                   1           {'Mg': 1}\n",
+       "2       41.4146         Kr1                   1           {'Kr': 1}\n",
+       "3       32.9826         Na1                   1           {'Na': 1}\n",
+       "4       15.2088         Pd1                   1           {'Pd': 1}"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "df = pd.read_pickle('./data/nn_regression/OQMD_Ward_et_al_2016_df.pkl')\n",
     "\n",
@@ -376,14 +477,45 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:25.710996Z",
      "start_time": "2020-05-22T14:08:25.382690Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "image/png": "\n",
+      "text/plain": [
+       "<Figure size 432x288 with 2 Axes>"
+      ]
+     },
+     "metadata": {
+      "needs_background": "light"
+     },
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Statistics of the target property:\n",
+      "\n",
+      "count    347227.000000\n",
+      "mean         22.157036\n",
+      "std           8.613623\n",
+      "min           2.723960\n",
+      "25%          16.116350\n",
+      "50%          20.881500\n",
+      "75%          26.504150\n",
+      "max          99.884500\n",
+      "Name: vol_per_atom, dtype: float64\n"
+     ]
+    }
+   ],
    "source": [
     "df.hist()\n",
     "plt.show()\n",
@@ -400,14 +532,51 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.547127Z",
      "start_time": "2020-05-22T14:08:25.713822Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Total number of datapoints: 347227\n",
+      "\n",
+      "Compounds with 1 element(s) appear 2472 times in the dataset\n",
+      "Compounds with 2 element(s) appear 99411 times in the dataset\n",
+      "Compounds with 3 element(s) appear 235658 times in the dataset\n",
+      "Compounds with 4 element(s) appear 8248 times in the dataset\n",
+      "Compounds with 5 element(s) appear 1320 times in the dataset\n",
+      "Compounds with 6 element(s) appear 112 times in the dataset\n",
+      "Compounds with 7 element(s) appear 6 times in the dataset\n",
+      "\n",
+      "The following elements (in total 89) appear in the dataset:\n",
+      "\n",
+      " ['H' 'He' 'Li' 'Be' 'B' 'C' 'N' 'O' 'F' 'Ne' 'Na' 'Mg' 'Al' 'Si' 'P' 'S'\n",
+      " 'Cl' 'Ar' 'K' 'Ca' 'Sc' 'Ti' 'V' 'Cr' 'Mn' 'Fe' 'Co' 'Ni' 'Cu' 'Zn' 'Ga'\n",
+      " 'Ge' 'As' 'Se' 'Br' 'Kr' 'Rb' 'Sr' 'Y' 'Zr' 'Nb' 'Mo' 'Tc' 'Ru' 'Rh' 'Pd'\n",
+      " 'Ag' 'Cd' 'In' 'Sn' 'Sb' 'Te' 'I' 'Xe' 'Cs' 'Ba' 'La' 'Ce' 'Pr' 'Nd' 'Pm'\n",
+      " 'Sm' 'Eu' 'Gd' 'Tb' 'Dy' 'Ho' 'Er' 'Tm' 'Yb' 'Lu' 'Hf' 'Ta' 'W' 'Re' 'Os'\n",
+      " 'Ir' 'Pt' 'Au' 'Hg' 'Tl' 'Pb' 'Bi' 'Ac' 'Th' 'Pa' 'U' 'Np' 'Pu']\n"
+     ]
+    },
+    {
+     "data": {
+      "image/png": "\n",
+      "text/plain": [
+       "<Figure size 1800x720 with 1 Axes>"
+      ]
+     },
+     "metadata": {
+      "needs_background": "light"
+     },
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "print(\"Total number of datapoints: {}\\n\".format(len(y_vol_per_atom)))\n",
     "\n",
@@ -450,7 +619,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.640466Z",
@@ -473,14 +642,22 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.645227Z",
      "start_time": "2020-05-22T14:08:26.642010Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Input shape = (347227, 89), target shape = (347227,)\n"
+     ]
+    }
+   ],
    "source": [
     "print(\"Input shape = {}, target shape = {}\".format(X_ElemNet.shape, y_vol_per_atom.shape))"
    ]
@@ -501,7 +678,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.793126Z",
@@ -529,14 +706,25 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.799383Z",
      "start_time": "2020-05-22T14:08:26.795789Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'\\nscaler = StandardScaler()\\nX = scaler.fit_transform(X)\\nX_test = scaler.transform(X_test)\\n'"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "\"\"\"\n",
     "scaler = StandardScaler()\n",
@@ -554,7 +742,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.921060Z",
@@ -590,7 +778,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 10,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:26.928103Z",
@@ -663,14 +851,66 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 11,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:08:27.092710Z",
      "start_time": "2020-05-22T14:08:26.930206Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Model: \"model_1\"\n",
+      "_________________________________________________________________\n",
+      "Layer (type)                 Output Shape              Param #   \n",
+      "=================================================================\n",
+      "x_input (InputLayer)         (None, 89)                0         \n",
+      "_________________________________________________________________\n",
+      "dense_1 (Dense)              (None, 512)               46080     \n",
+      "_________________________________________________________________\n",
+      "dropout_1 (Dropout)          (None, 512)               0         \n",
+      "_________________________________________________________________\n",
+      "dense_2 (Dense)              (None, 256)               131328    \n",
+      "_________________________________________________________________\n",
+      "dropout_2 (Dropout)          (None, 256)               0         \n",
+      "_________________________________________________________________\n",
+      "dense_3 (Dense)              (None, 128)               32896     \n",
+      "_________________________________________________________________\n",
+      "dropout_3 (Dropout)          (None, 128)               0         \n",
+      "_________________________________________________________________\n",
+      "dense_4 (Dense)              (None, 64)                8256      \n",
+      "_________________________________________________________________\n",
+      "dropout_4 (Dropout)          (None, 64)                0         \n",
+      "_________________________________________________________________\n",
+      "dense_5 (Dense)              (None, 32)                2080      \n",
+      "_________________________________________________________________\n",
+      "dropout_5 (Dropout)          (None, 32)                0         \n",
+      "_________________________________________________________________\n",
+      "dense_6 (Dense)              (None, 18)                594       \n",
+      "_________________________________________________________________\n",
+      "dropout_6 (Dropout)          (None, 18)                0         \n",
+      "_________________________________________________________________\n",
+      "dense_7 (Dense)              (None, 8)                 152       \n",
+      "_________________________________________________________________\n",
+      "dropout_7 (Dropout)          (None, 8)                 0         \n",
+      "_________________________________________________________________\n",
+      "dense_8 (Dense)              (None, 4)                 36        \n",
+      "_________________________________________________________________\n",
+      "dropout_8 (Dropout)          (None, 4)                 0         \n",
+      "_________________________________________________________________\n",
+      "dense_9 (Dense)              (None, 1)                 5         \n",
+      "=================================================================\n",
+      "Total params: 221,427\n",
+      "Trainable params: 221,427\n",
+      "Non-trainable params: 0\n",
+      "_________________________________________________________________\n",
+      "None\n"
+     ]
+    }
+   ],
    "source": [
     "batch_size = 64\n",
     "epochs = 30\n",
@@ -698,7 +938,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 12,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:15:47.803232Z",
@@ -706,7 +946,75 @@
     },
     "scrolled": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Train on 222224 samples, validate on 55557 samples\n",
+      "Epoch 1/30\n",
+      "222224/222224 [==============================] - 11s 52us/step - loss: 12.1375 - val_loss: 6.3870\n",
+      "Epoch 2/30\n",
+      "222224/222224 [==============================] - 11s 49us/step - loss: 6.3106 - val_loss: 6.1132\n",
+      "Epoch 3/30\n",
+      "222224/222224 [==============================] - 11s 49us/step - loss: 5.9615 - val_loss: 5.2210\n",
+      "Epoch 4/30\n",
+      "222224/222224 [==============================] - 12s 55us/step - loss: 5.6623 - val_loss: 5.3325\n",
+      "Epoch 5/30\n",
+      "222224/222224 [==============================] - 12s 54us/step - loss: 5.5168 - val_loss: 5.2384\n",
+      "Epoch 6/30\n",
+      "222224/222224 [==============================] - 11s 50us/step - loss: 5.3398 - val_loss: 6.1497\n",
+      "Epoch 7/30\n",
+      "222224/222224 [==============================] - 11s 50us/step - loss: 5.2586 - val_loss: 4.9812\n",
+      "Epoch 8/30\n",
+      "222224/222224 [==============================] - 12s 54us/step - loss: 5.1311 - val_loss: 5.3434\n",
+      "Epoch 9/30\n",
+      "222224/222224 [==============================] - 12s 53us/step - loss: 5.0557 - val_loss: 4.9242\n",
+      "Epoch 10/30\n",
+      "222224/222224 [==============================] - 12s 53us/step - loss: 4.9974 - val_loss: 5.0442\n",
+      "Epoch 11/30\n",
+      "222224/222224 [==============================] - 12s 53us/step - loss: 4.9424 - val_loss: 4.9540\n",
+      "Epoch 12/30\n",
+      "222224/222224 [==============================] - 12s 54us/step - loss: 4.8965 - val_loss: 5.1100\n",
+      "Epoch 13/30\n",
+      "222224/222224 [==============================] - 11s 51us/step - loss: 4.8423 - val_loss: 5.0179\n",
+      "Epoch 14/30\n",
+      "222224/222224 [==============================] - 11s 48us/step - loss: 4.7907 - val_loss: 4.7615\n",
+      "Epoch 15/30\n",
+      "222224/222224 [==============================] - 11s 51us/step - loss: 4.7590 - val_loss: 4.8152\n",
+      "Epoch 16/30\n",
+      "222224/222224 [==============================] - 11s 49us/step - loss: 4.7547 - val_loss: 4.7692\n",
+      "Epoch 17/30\n",
+      "222224/222224 [==============================] - 11s 48us/step - loss: 4.6649 - val_loss: 5.0429\n",
+      "Epoch 18/30\n",
+      "222224/222224 [==============================] - 12s 56us/step - loss: 4.6749 - val_loss: 4.7545\n",
+      "Epoch 19/30\n",
+      "222224/222224 [==============================] - 11s 50us/step - loss: 4.6553 - val_loss: 4.7230\n",
+      "Epoch 20/30\n",
+      "222224/222224 [==============================] - 11s 51us/step - loss: 4.6322 - val_loss: 4.7478\n",
+      "Epoch 21/30\n",
+      "222224/222224 [==============================] - 12s 52us/step - loss: 4.5951 - val_loss: 4.7244\n",
+      "Epoch 22/30\n",
+      "222224/222224 [==============================] - 12s 55us/step - loss: 4.5834 - val_loss: 4.7551\n",
+      "Epoch 23/30\n",
+      "222224/222224 [==============================] - 11s 52us/step - loss: 4.5497 - val_loss: 4.6306\n",
+      "Epoch 24/30\n",
+      "222224/222224 [==============================] - 13s 59us/step - loss: 4.5465 - val_loss: 4.7262\n",
+      "Epoch 25/30\n",
+      "222224/222224 [==============================] - 13s 56us/step - loss: 4.5242 - val_loss: 4.6449\n",
+      "Epoch 26/30\n",
+      "222224/222224 [==============================] - 13s 57us/step - loss: 4.5048 - val_loss: 4.7496\n",
+      "Epoch 27/30\n",
+      "222224/222224 [==============================] - 14s 62us/step - loss: 4.4913 - val_loss: 4.5988\n",
+      "Epoch 28/30\n",
+      "222224/222224 [==============================] - 15s 67us/step - loss: 4.4824 - val_loss: 4.7221\n",
+      "Epoch 29/30\n",
+      "222224/222224 [==============================] - 13s 59us/step - loss: 4.4582 - val_loss: 4.6347\n",
+      "Epoch 30/30\n",
+      "222224/222224 [==============================] - 13s 58us/step - loss: 4.4414 - val_loss: 4.8915\n"
+     ]
+    }
+   ],
    "source": [
     "history = model.fit(X_train, y_train,\n",
     "                    validation_data = (X_val, y_val),\n",
@@ -723,14 +1031,27 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 13,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:15:47.945244Z",
      "start_time": "2020-05-22T14:15:47.805420Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "image/png": "\n",
+      "text/plain": [
+       "<Figure size 432x288 with 1 Axes>"
+      ]
+     },
+     "metadata": {
+      "needs_background": "light"
+     },
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "import matplotlib.pyplot as plt\n",
     "\n",
@@ -768,14 +1089,27 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 14,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:15:53.924363Z",
      "start_time": "2020-05-22T14:15:47.947273Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Training: MAE: 1.011 | MSE: 4.305 | RMSE: 2.075 | Pearson: 0.972\n",
+      "Dataset: Min: 2.726 | Max: 98.717 | Mean: 22.149 | Std.dev.: 8.588\n",
+      "\n",
+      "\n",
+      "Validation: MAE: 1.070 | MSE: 4.892 | RMSE: 2.212 | Pearson: 0.969\n",
+      "Dataset: Min: 2.724 | Max: 99.106 | Mean: 22.167 | Std.dev.: 8.672\n"
+     ]
+    }
+   ],
    "source": [
     "def rmse(y_pred, y_true):\n",
     "    return np.sqrt(((y_pred - y_true) ** 2).mean())\n",
@@ -830,14 +1164,25 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 15,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:15:56.919219Z",
      "start_time": "2020-05-22T14:15:53.926422Z"
     }
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "image/png": "\n",
+      "text/plain": [
+       "<Figure size 432x432 with 3 Axes>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "y_true = y_true_val\n",
     "y_pred = y_pred_val\n",
@@ -877,7 +1222,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 16,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:15:56.923919Z",
@@ -891,7 +1236,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 17,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2020-05-22T14:16:02.573238Z",
@@ -899,7 +1244,25 @@
     },
     "scrolled": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Test: MAE: 1.082 | MSE: 4.913 | RMSE: 2.217 | Pearson: 0.968\n"
+     ]
+    },
+    {
+     "data": {
+      "image/png": "\n",
+      "text/plain": [
+       "<Figure size 432x432 with 3 Axes>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "if investigate_test_set:\n",
     "    y_pred_test = model.predict(X_test).flatten()\n",
-- 
GitLab