From fe3dfbecc267e70312c3650e9ce6a55c7442faa1 Mon Sep 17 00:00:00 2001
From: Luigi Sbailo <sbailo@fhi-berlin.mpg.de>
Date: Tue, 8 Sep 2020 18:38:34 +0200
Subject: [PATCH] Minor

---
 descriptor_role.ipynb | 116 ++++++++++++++++++++++--------------------
 1 file changed, 61 insertions(+), 55 deletions(-)

diff --git a/descriptor_role.ipynb b/descriptor_role.ipynb
index 2c32679..39a3ce7 100644
--- a/descriptor_role.ipynb
+++ b/descriptor_role.ipynb
@@ -66,11 +66,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 16,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:48.760201Z",
-     "start_time": "2020-09-08T13:45:48.753441Z"
+     "end_time": "2020-09-08T16:36:45.838092Z",
+     "start_time": "2020-09-08T16:36:45.831885Z"
     },
     "init_cell": true
    },
@@ -159,8 +159,6 @@
     "               \n",
     "The prediction of the ground-state structure for binary compounds from a simple descriptor has a notable history in materials science [1-7], where descriptors were designed by chemically/physically-inspired intuition. The tool presented here allows for the machine-learning-aided automatic discovery of a descriptor and a model for the prediction of the difference in energy between a selected pair of structures for 82 octet binary materials.\n",
     "\n",
-    "By running the tutorial with the default setting, the (RS vs. ZB) results of the <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.10550\" target=\"_blank\">PRL 2015</a> identified by the LASSO+$\\ell_0$ method can be recovered. SISSO and LASSO+$\\ell_0$ do not always yield the same results (see <a href=\"http://analytics-toolkit.nomad-coe.eu/tutorial-SIS\">compressed-sensing tutorial</a>) but in this case the default model parameters were tuned to obtain the same results.\n",
-    "Additionally, in <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.10550\" target=\"_blank\">PRL-2015</a>, a slightly different criterion for the construction of the feature set was adopted, compared to <a href=\"https://journals.aps.org/prmaterials/abstract/10.1103/PhysRevMaterials.2.083802\" target=\"_blank\">PRM-2018</a>. For the sake of reproducing exactly the results of <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.10550\" target=\"_blank\">PRL-2015</a>, the default settings in the input widget include \"PRL2015\" as choice for \"SISSO rung\".\n",
     "               \n",
     "   References:\n",
     "            <ol>\n",
@@ -186,16 +184,20 @@
    "source": [
     "The idea demonstrated in this tutorial is to start from simple physical quantities (\"primary features\", here properties of the constituent free atoms such as orbital radii), to generate millions (or billions) of candidate formulas by applying arithmetic operations combining primary features. These candidate formulas constitute the so-called \"feature space\". Then, SISSO is used to select only a few of these formulas that explain the data.\n",
     "\n",
-    "By clicking directly on \"Run\" below, you can reproduce results from the above publication, or you can modify the settings to produce your own results. To the purpose, in the panel below, you can select primary features and allowed operations by clicking the check-boxes. You can also select the SISSO rung (i.e., the number of iterations in the construction of the feature space), the number of features that are selected at each iteration of the SIS step, and the max number of dimensions of the model. Then, press \"Run\". After the results are shown for all models from one dimensional to the max chosen dimension, you can press \"Plot interactive map\" to reveal a map of the RS vs ZB relative stability, for the highest dimensional model. If the highest dimension model is 2D, the separation line between the two phases (i.e., the locus where the predicted $\\Delta$E is zero) is shown. For higher dimensional models, the 3rd and 4th dimensions can be controlled by the size of the Marker or the Color. Drop-down menus allow to assign axes, markers, and colors, to the descriptor components of choice"
+    "By clicking \"Run\" below directly, you can reproduce the results of the above publication, or you can modify the settings to produce your own results. To this end, the panel below lets you select primary features and allowed operations by clicking the check-boxes. You can also select the SISSO rung (i.e., the number of iterations in the construction of the feature space), the number of features that are selected at each iteration of the SIS step, and the maximum number of dimensions of the model. Then, press \"Run\". After the results are shown for all models from one-dimensional up to the maximum chosen dimension, you can press \"Plot interactive map\" to reveal a map of the RS vs. ZB relative stability for the highest-dimensional model. If the highest-dimensional model is 2D, the separation line between the two phases (i.e., the locus where the predicted $\\Delta$E is zero) is shown. For higher-dimensional models, the 3rd and 4th dimensions can be shown through the marker size or the color. Drop-down menus let you assign axes, markers, and colors to the descriptor components of your choice.\n",
+    "\n",
+    "\n",
+    "Running the tutorial with the default settings recovers the (RS vs. ZB) results of <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.10550\" target=\"_blank\">PRL-2015</a>, which were identified by the LASSO+$\\ell_0$ method. SISSO and LASSO+$\\ell_0$ do not always yield the same results (see <a href=\"http://analytics-toolkit.nomad-coe.eu/tutorial-SIS\">compressed-sensing tutorial</a>), but in this case the default model parameters were tuned to obtain the same results.\n",
+    "Additionally, <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.10550\" target=\"_blank\">PRL-2015</a> adopted a slightly different criterion for constructing the feature set than <a href=\"https://journals.aps.org/prmaterials/abstract/10.1103/PhysRevMaterials.2.083802\" target=\"_blank\">PRM-2018</a>. To reproduce the results of <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.10550\" target=\"_blank\">PRL-2015</a> exactly, the default settings in the input widget use \"PRL2015\" as the choice for \"SISSO rung\". To unlock the feature and operator selection, first select a rung other than \"PRL2015\"."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 17,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:48.781091Z",
-     "start_time": "2020-09-08T13:45:48.763626Z"
+     "end_time": "2020-09-08T16:36:45.855476Z",
+     "start_time": "2020-09-08T16:36:45.842007Z"
     },
     "init_cell": true,
     "tags": [
@@ -236,11 +238,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 18,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:48.931420Z",
-     "start_time": "2020-09-08T13:45:48.785741Z"
+     "end_time": "2020-09-08T16:36:45.953707Z",
+     "start_time": "2020-09-08T16:36:45.857052Z"
     },
     "init_cell": true,
     "tags": [
@@ -300,11 +302,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 19,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:48.935450Z",
-     "start_time": "2020-09-08T13:45:48.932798Z"
+     "end_time": "2020-09-08T16:36:45.957601Z",
+     "start_time": "2020-09-08T16:36:45.955224Z"
     },
     "init_cell": true,
     "tags": [
@@ -324,11 +326,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 20,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:48.953536Z",
-     "start_time": "2020-09-08T13:45:48.937202Z"
+     "end_time": "2020-09-08T16:36:45.975731Z",
+     "start_time": "2020-09-08T16:36:45.958855Z"
     },
     "init_cell": true,
     "tags": [
@@ -386,11 +388,11 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 21,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:48.966509Z",
-     "start_time": "2020-09-08T13:45:48.955002Z"
+     "end_time": "2020-09-08T16:36:45.988530Z",
+     "start_time": "2020-09-08T16:36:45.977227Z"
     },
     "init_cell": true,
     "tags": [
@@ -448,47 +450,51 @@
     "            \n",
     "        global feat_space\n",
     "        global sisso\n",
-    "        feat_space, sisso = get_feat_space_and_sr(\n",
-    "            df = df_reduced,\n",
-    "            ops = allowed_operations,\n",
-    "            cols = selected_features,\n",
-    "            max_phi = tier,\n",
-    "            n_sis_select = feat_per_iter_selection.value,\n",
-    "            remove_double_divison=True,\n",
-    "            max_dim = dimension_selection.value,\n",
-    "            n_residual = 1,\n",
-    "            default = default)\n",
     "        \n",
-    "        clear_output()\n",
-    "        if (dimension_selection.value>1):\n",
-    "            plot_button.disabled=False\n",
-    "        else:\n",
-    "            plot_button.disabled=True\n",
-    "\n",
-    "        print(\"Number of features generated: \" + str(feat_space.n_feat))\n",
-    "\n",
     "        try:\n",
-    "            sisso.fit()\n",
-    "            for i in range(dimension_selection.value):\n",
-    "                print(str(i+1)+'D model')\n",
-    "                print(\"RMSE: {:.4} | Descriptor: {}\".format(sisso.models[i][0].rmse, sisso.models[i][0]))\n",
-    "                string = \"c0:{:.4}\".format(sisso.models[i][0].coefs[0][-1])\n",
-    "                for j in range(i+1):\n",
-    "                    string = string + str(\"  |  a\"+str(j)+\":{:.4}\".format(sisso.models[i][0].coefs[0][j]))\n",
-    "                print(string + '\\n')\n",
-    "                \n",
-    "                    \n",
-    "        except RuntimeError:\n",
-    "            print(\"\\nThe number of selected features per SIS iteration is bigger than the number of features available. Please reduce the number of selected features per SIS iteration (number of features generated / max number of dimensions) or increase the number of selected features and operations.\")"
+    "            feat_space, sisso = get_feat_space_and_sr(\n",
+    "                df = df_reduced,\n",
+    "                ops = allowed_operations,\n",
+    "                cols = selected_features,\n",
+    "                max_phi = tier,\n",
+    "                n_sis_select = feat_per_iter_selection.value,\n",
+    "                remove_double_divison=True,\n",
+    "                max_dim = dimension_selection.value,\n",
+    "                n_residual = 1,\n",
+    "                default = default)\n",
+    "            \n",
+    "            clear_output()\n",
+    "            if (dimension_selection.value>1):\n",
+    "                plot_button.disabled=False\n",
+    "            else:\n",
+    "                plot_button.disabled=True\n",
+    "\n",
+    "            print(\"Number of features generated: \" + str(feat_space.n_feat))\n",
+    "\n",
+    "            try:\n",
+    "                sisso.fit()\n",
+    "                for i in range(dimension_selection.value):\n",
+    "                    print(str(i+1)+'D model')\n",
+    "                    print(\"RMSE: {:.4} | Descriptor: {}\".format(sisso.models[i][0].rmse, sisso.models[i][0]))\n",
+    "                    string = \"c0:{:.4}\".format(sisso.models[i][0].coefs[0][-1])\n",
+    "                    for j in range(i+1):\n",
+    "                        string = string + str(\"  |  a\"+str(j)+\":{:.4}\".format(sisso.models[i][0].coefs[0][j]))\n",
+    "                    print(string + '\\n')\n",
+    "\n",
+    "\n",
+    "            except RuntimeError:\n",
+    "                print(\"\\nThe number of selected features per SIS iteration is larger than the number of features available. Please reduce the number of selected features per SIS iteration (to at most the number of features generated divided by the max number of dimensions) or select more primary features and operations.\")\n",
+    "        except Exception:\n",
+    "            print('The present selection does not lead to the creation of any derived features in the highest selected rung. Please select at least one binary or power operator, or reduce the maximum rung.')"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 22,
    "metadata": {
     "ExecuteTime": {
-     "end_time": "2020-09-08T13:45:49.247498Z",
-     "start_time": "2020-09-08T13:45:48.968337Z"
+     "end_time": "2020-09-08T16:36:46.292035Z",
+     "start_time": "2020-09-08T16:36:45.989512Z"
     },
     "init_cell": true,
     "scrolled": false,
@@ -500,7 +506,7 @@
     {
      "data": {
       "application/vnd.jupyter.widget-view+json": {
-       "model_id": "917dfd1e0b234443b3d6311c50764996",
+       "model_id": "ab40adb717384a099214dbf571713e57",
        "version_major": 2,
        "version_minor": 0
       },
-- 
GitLab