"object": "<script>\nvar beaker = bkHelper.getBeakerObject().beakerObj;\n</script>\n<style type=\"text/css\">\n/*!\n * Nomad Beaker Notebook Template\n *\n * @copyright Copyright 2017 Fritz Haber Institute of the Max Planck Society,\n * Benjamin Regler - Apache 2.0 License\n * @license http://www.apache.org/licenses/LICENSE-2.0\n * @author Benjamin Regler\n * @version 1.0.0\n *\n * Licensed under the Apache License, Version 2.0 (the \"License\");\n * you may not use this file except in compliance with the License.\n * You may obtain a copy of the License at\n * \n * http://www.apache.org/licenses/LICENSE-2.0\n *\n * Unless required by applicable law or agreed to in writing, software\n * distributed under the License is distributed on an \"AS IS\" BASIS,\n * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n * See the License for the specific language governing permissions and\n * limitations under the License.\n */\np{margin-bottom:1.3em}h1,h2,h3,h4{margin:1.414em 0 .5em;font-weight:inherit;line-height:1.2}h1{margin-top:0;font-size:3.998em}h2{font-size:2.827em}h3{font-size:1.999em}h4{font-size:1.414em}.font_small,small{font-size:.707em}.notebook-container{font-size:16px}.notebook-container .bkr{font-size:100%;font-weight:400;line-height:1.45;color:#333}.nomad--header h2{color:#20335d;font-weight:700;margin:0 0 .2em}.nomad--header h3{color:#20335d;font-weight:700;margin-top:0;text-indent:-1em;padding-left:1em}.nomad--header h3:before{content:\"\\2014\";padding-right:.25em}.nomad--header .nomad--description{margin:-1em 0 0 2em}.atomic-data--block,.nomad--last-updated{display:inline-block;margin-top:1em}.nomad--last-updated{color:grey;float:right;position:relative;z-index:1}.nomad--last-updated::before{bottom:-75%;content:attr(data-version);font-size:4em;font-weight:700;opacity:.2;position:absolute;right:0}.atomic-data label{display:block;font-size:medium;font-weight:700}.atomic-data--select,.chosen-container{width:100%!important}.atomic-data--select:disabled{color:#d3d3d3}.atomic-data--reset-buton{display:inline-block;margin-top:1.6em;width:100%}.modal-dialog{max-width:1000px;width:80%}.modal-header h1{font-size:2em;line-height:1.2}.modal-dialog h2{font-size:1.414em}.modal-dialog h2:first-child{margin-top:0}.modal-dialog h3{font-size:1.2em}.modal-dialog dt{font-size:larger;margin-top:1.414em}.modal-dialog img{width:100%}.modal-dialog .authors{text-transform:uppercase}\n</style>"
" <h3> Prediction of topological quantum phase transitions </h3>",
" <p class=\"nomad--description\">",
" created by:",
" Carlos Mera Acosta<sup>1,2</sup>,",
" Emre Ahmetcik<sup> 1</sup>,",
" Adalberto Fazzio<sup>2</sup>, ",
" Luca Ghiringhelli<sup>1</sup>,",
" and Matthias Scheffler<sup>1</sup> <br><br>",
" ",
" <sup>1</sup> Fritz Haber Institute of the Max Planck Society, Faradayweg 4-6, D-14195 Berlin, Germany <br>",
" <sup>2</sup> Brazilian Nanotechnology National Laboratory, Campinas, SP, Brazil <br>",
" <span class=\"nomad--last-updated\" data-version=\"v1.0.0\">[Last updated: November 9, 2017]</span>",
" </p>",
"</div>",
"",
"</div> ",
"",
"<div style='text-align: right;'>",
"<a href=\"https://analytics-toolkit.nomad-coe.eu/home/\" class=\"btn btn-primary\" style=\"font-size:larger;\">Back to Analytics Home</a> ",
"<a href=\"https://www.nomad-coe.eu/\" class=\"btn btn-primary\" style=\"font-size:larger;\">Back to NOMAD CoE Home</a> ",
"</div>",
""
],
"hidden": true
},
"output": {
"state": {},
"selectedType": "BeakerDisplay",
"height": 391,
"result": {
"type": "BeakerDisplay",
"innertype": "Html",
"object": "<script>\nvar beaker = bkHelper.getBeakerObject().beakerObj;\n</script>\n<div id=\"teaser\" style=\"background-color: rgba(149,170,79, 1.0); background-position: right center; background-size: 200px; background-repeat: no-repeat; \n padding-top: 20px;\n padding-right: 10px;\n padding-bottom: 50px;\n padding-left: 80px;\"> \n\n <div class=\"nomad--header\">\n <div style=\"text-align:center\">\n <h2> <img id=\"nomad\" src=\"https://nomad-coe.eu/uploads/nomad/images/NOMAD_Logo2.png\" alt=\"NOMAD Logo\" height=\"100\"> NOMAD Analytics Toolkit \n <img id=\"nomad\" src=\"https://www.nomad-coe.eu/uploads/nomad/backgrounds/head_big-data_analytics_2.png\" alt=\"NOMAD Logo\" height=\"80\"> </h2>\n </div>\n <h3> Prediction of topological quantum phase transitions </h3>\n <p class=\"nomad--description\">\n created by:\n Carlos Mera Acosta<sup>1,2</sup>,\n Emre Ahmetcik<sup> 1</sup>,\n Adalberto Fazzio<sup>2</sup>, \n Luca Ghiringhelli<sup>1</sup>,\n and Matthias Scheffler<sup>1</sup> <br><br>\n \n <sup>1</sup> Fritz Haber Institute of the Max Planck Society, Faradayweg 4-6, D-14195 Berlin, Germany <br>\n <sup>2</sup> Brazilian Nanotechnology National Laboratory, Campinas, SP, Brazil <br>\n <span class=\"nomad--last-updated\" data-version=\"v1.0.0\">[Last updated: November 9, 2017]</span>\n </p>\n</div>\n\n</div> \n\n<div style=\"text-align: right;\">\n<a href=\"https://analytics-toolkit.nomad-coe.eu/home/\" class=\"btn btn-primary\" style=\"font-size:larger;\">Back to Analytics Home</a> \n<a href=\"https://www.nomad-coe.eu/\" class=\"btn btn-primary\" style=\"font-size:larger;\">Back to NOMAD CoE Home</a> \n</div>\n"
},
"elapsedTime": 0
},
"evaluatorReader": true,
"lineCount": 33
},
{
"id": "markdownKOPYxK",
"type": "markdown",
"body": [
"### Introduction ",
"",
"This tutorial shows how to find descriptive parameters (short formulas) for the prediction of topological phase transitions. As an example, we address the topological classification of two-dimensional functionalized honeycomb-lattice materials, which are formally described by the Z<sub>2</sub> topological invariant, i.e., Z<sub>2</sub>=0 for trivial (normal) insulators and Z<sub>2</sub>=1 for two-dimensional topological insulators (quantum spin Hall insulators).",
"Using a recently developed machine learning based on compressed sensing, we then derive a map of these materials, in which metals, trivial insulators, and quantum spin Hall insulators are separated in different spatial domains. The axes of this map are given by a physically meaningful descriptor, i.e., a non-linear analytic function that only depends on the properties of the material's constituent atoms, but not on the properties of the material itself.",
"The method is based on the algorithm <u>s</u>ure <i><u>i</u>ndependence <u>s</u>creening and <u>s</u>parsifying <u>o</u>perator</i> (SISSO), which enables to search for optimal descriptors by scanning huge feature spaces. ",
"R. Ouyang, S. Curtarolo, E. Ahmetcik, M. Scheffler, L. M. Ghiringhelli: <span style=\"font-style: italic;\">SISSO: a compressed-sensing method for systematically identifying efficient physical models of materials properties, </span> <a href=\"https://arxiv.org/abs/1710.03319\">https://arxiv.org/abs/1710.03319</a> (2017). <br>",
"You can download the code <a href=\"https://github.com/rouyang2017/SISSO\">here</a> .",
"</div>",
" Click first <b>Reference settings</b> and afterwards <b>RUN</b> to reproduce the results from this publication; click <b>Background</b> for an explanation of the approach; or, modify <b>Settings</b> to produce your own results.",
"",
"<b> Idea: </b> Starting from simple physical quantities (\"building blocks\", here properties of the constituent free atoms such as orbital radii), millions (or billions) of candidate formulas are generated by applying arithmetic operations combining building blocks, for example forming sums and products of them. These candidate formulas constitute the so-called \"feature space\". Then a feature selection method (SISSO) is used to select only a few of these formulas that explain the data."
" <p>We present a tool for predicting topological phase transition in functionalized ghaphene-like materials, by using a set of descriptive parameters (a descriptor) based on free-atom data of the atomic species constituting the material. We first compute the Z<sub>2</sub>-invariant for a representative set of materials from first principles, identifying trivial insulators, QSHIs that are independent on the functionalization (FI-QSHIs), QSHIs that depend on the functionalization (FD-QSHIs), and metals. ",
"We then apply a newly developed method: sure independence screening and sparsifying operator (SISSO), that allows to find an optimal descriptor in a huge feature space containing billions of features. In this tutorial an $\\ell$<sub>0</sub>-optimization is used as the sparsifying operator. The method is described in:",
"R. Ouyang, S. Curtarolo, E. Ahmetcik, M. Scheffler, L. M. Ghiringhelli: <span style=\"font-style: italic;\">SISSO: a compressed-sensing method for systematically identifying efficient physical models of materials properties, </span> <a href=\" https://arxiv.org/abs/1710.03319\">https://arxiv.org/abs/1710.03319</a> (2017). </div>",
"By running the tutorial with the reference settings ( click <b>Reference settings</b> , then <b>RUN</b>) the classification for trivial insulators, FI-QSHIs, and FD-QSHIs is obtained. In particular, by clicking on “View interactive 2D plot”, an interactive classification map will be opened in a new tab:",
"<figcaption>Perfect classification (100\\%) of trivial/QSHIs for 160 materials (80 trivial insulators, 22 FD-QSHIs, and 58 FI-QSHIs). Symbols: $EA$ electron affinity, $E_{h}$ atomic HOMO, $r$<sub>s, p</sub> orbital extansion of s- and p-orbitals, and $Z$ atomic number. ",
"The symbols' color is used to distinguish between FD-QSHIs~(blue), FI-QSHIs~(green), and trivial insulators~(black). The same color-code is used to highlight the different regions identified by the SISSO descriptors.</figcaption>",
"<figure>",
"</center>",
"</p>",
"",
"<p>SISSO($\\ell_{0}$) for classification seeks in an iterative approach for that desriptor in whose space the overlap between class domains is minimal, where a class domain is represented by the convex hull of the training data. In the first iteration, a number $k$ of features is collected which seperate the convex hulls best. The feature with the lowest domain overlap is simply the 1D descriptor. In the next iteration, a new set of $k$ features is selected, now as those seperating the unclassified data from the first iteration best. The 2D descriptor is the pair of features that yield the smallest overlap region, among all possible pairs contained in the union of the sets selected in this and the first iteration. In each next iteration new set of $k$ features is extracted as those that separate the unclassified data from the previous step best. The $n$D descriptor is the $n$-tuple of features that yield the smallest convex hull overlap regions, among all possible $n$-tuples contained in the union of the sets obtained in each new iteration and all the previous iterations. ",
"</p>",
"",
" <p>Note that this tutorial is constrained to seek for 2D descriptors only. SISSO can also be applied to regressions problems as i.e. described in the <a href=\" https://analytics-toolkit.nomad-coe.eu/notebook-edit/data/shared/tutorials/sis_cscl.bkr\"> SISSO tutorial for the prediction of the energy difference between crystal structures</a>. <br>",
"The SISSO approach was developed following a compressed-sensing based methodology to solve materials science problems as introduced in",
"L. M. Ghiringhelli, J. Vybiral, S. V. Levchenko, C. Draxl, M. Scheffler: <span style=\"font-style: italic;\">Big Data of Materials Science: Critical Role of the Descriptor</span>, Phys. Rev. Lett. 114, 105503 (2015) <a href=\"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.105503\">[PDF]</a>,",
"</div>",
"in which the LASSO+$\\ell_0$ method was proposed.",
" <p>In this example, you can run the SISSO($\\ell_{0}$) algorithm for finding the optimal descriptor that classifies functionalized Graphene-like materials in trivial and QSHIs. The descriptor is selected out of a large number of candidates constructed as functions of basic input features, the primary features. </p>",
"",
"<p>In the settings you can select the primary features as well as which kind of unary and binary operations are allowed during feature space construction from the checklist below. Moreover the following two parameters of the SISSO($\\ell_0$) algorithm can be specified: ",
" <ul>",
" <li>Number of iterations for the construction for the feature space: How often the selected operations are applied to build the feature space. At each step the operations are applied on all features created until the current step. </li>",
" <li>Number of collected features per SIS iteration.</li>",
" </ul> ",
" ",
" ",
"<p> After the preferred settings have been adjusted, click <b>RUN</b> for performing the calculations (creation of the feature space and optimization via SISSO($\\ell_0$)). </p>",
"",
"During the run, a brief summary is printed out below the <b>RUN</b> button. At the end of the run: ",
" <ul>",
" <li> the solution (identified 2D descriptor and the number of data points in the overlap region) is printed out.</li>",
"<li> the “View interactive 2D scatter plot” button unlocks. By clicking this button, the scatter plot with the 2D descriptor appears in a separate tab.</li>",
"</ul>",
"<p>Note: the scatter plot remains active even if another run is performed, which enables the output of several sets of input parameters to be compared.</p>",
"object": "<script>\nvar beaker = bkHelper.getBeakerObject().beakerObj;\n</script>\n<div class=\"lasso_control\">\n\n<button type=\"button\" class=\"btn btn-info\" style=\"font-size:larger; color: #ffffff; font-weight: bold;\" onclick=\"run_lasso()\">RUN</button>\n\n<button type=\"button\" class=\"btn btn-default\" style=\"font-size:larger;\" onclick=\"reset_lasso()\">Reset</button>\n\n<label title=\"This button becomes active when the\nrun is finished. By clicking it, an interactive plot of the first 2\ndimensions of the optimized descriptor will be opened\"> \n<a href=\"/user/tmp/c9d9e53485326f4e.html\" target=\"_blank\" class=\"btn btn-primary active\" style=\"font-size:larger;\" id=\"lasso_result_button\">View interactive 2D scatter plot</a> </label>\n </div>"
"<p style=\"text-align: left; color: #aa3311; font-weight: 400; font-size: 16pt;\"> You will automatically generate atomic descriptors, select an <i>optimal combination</i> among <i>millions or billions</i> of candidates, and produce a <i>fully-interactive plot</i> of your results. <br>"
],
"hidden": true
},
"output": {
"state": {},
"selectedType": "BeakerDisplay",
"height": 168,
"result": {
"type": "BeakerDisplay",
"innertype": "Html",
"object": "<script>\nvar beaker = bkHelper.getBeakerObject().beakerObj;\n</script>\n<p style=\"text-align: left; color: #aa3311; font-weight: 900; font-size: 20pt;\"> Note</p>\n<p style=\"text-align: left; color: #aa3311; font-weight: 400; font-size: 16pt;\"> You will automatically generate atomic descriptors, select an <i>optimal combination</i> among <i>millions or billions</i> of candidates, and produce a <i>fully-interactive plot</i> of your results. <br></p>"
"f_lists_by_units_flat = [b for a in f_lists_by_units for b in a]",
"",
"# set unit_classes",
"",
"feature_unit_classes = ['no_unit' if f not in f_lists_by_units_flat else [i_class for i_class, dimension_group in enumerate(f_lists_by_units) if f in dimension_group][0] for f in selected_feature_list]",
"",
"# input descriptors and target",
"D = df[selected_feature_list].values",
"P = df['target'].tolist()",
"P = [0 if t=='Trivial' else 1 if t=='FIQSHI' else 2 for t in P ]"
" f.write(\"SETTINGS\\nPrimary features: %s \\nAllowed operations : %s\\nNumber of iterations for the construction for the feature space: %s\\nNumber of collected features per SIS iteration: %s\\n\\nOUTPUT\\nDescriptor: %s,%s \\nNumber of samples in the overlap region: %s\" ",