Update logos

ab761447 · Luigi Sbailo · af59d1e2 · ab761447 · ab761447 · ab761447
Commit ab761447 authored 3 years ago by Luigi Sbailo
--- a/assets/grain_boundaries/logo_HU.png
+++ b/assets/grain_boundaries/logo_HU.png
--- a/assets/grain_boundaries/logo_MPG.png
+++ b/assets/grain_boundaries/logo_MPG.png
--- a/grain_boundaries.ipynb
+++ b/grain_boundaries.ipynb
@@ -11,7 +11,9 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "<img  style=\"float: left;\" src=\"assets/grain_boundaries/logo_NOMAD.png\" width=300>"
+    "<img style=\"float: left;\" src=\"assets/grain_boundaries/logo_MPG.png\" width=150>\n",
+    "<img style=\"float: left; margin-top: -10px\" src=\"assets/grain_boundaries/logo_NOMAD.png\" width=250>\n",
+    "<img style=\"float: left; margin-top: -5px\" src=\"assets/grain_boundaries/logo_HU.png\" width=130>"
   ]
  },
  {
@@ -1008,7 +1010,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -1022,7 +1024,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.3"
+   "version": "3.9.7"
  },
  "pycharm": {
   "stem_cell": {

 %% Cell type:markdown id: tags:
 <img  src="assets/grain_boundaries/header.jpg" width="900">
 %% Cell type:markdown id: tags:
-<img  style="float: left;" src="assets/grain_boundaries/logo_NOMAD.png" width=300>
+<img style="float: left;" src="assets/grain_boundaries/logo_MPG.png" width=150>
+<img style="float: left; margin-top: -10px" src="assets/grain_boundaries/logo_NOMAD.png" width=250>
+<img style="float: left; margin-top: -5px" src="assets/grain_boundaries/logo_HU.png" width=130>
 %% Cell type:markdown id: tags:
 ## Introduction
 In this tutorial we will be using a machine learning method (clustering) to analyse results of Grain Boundary (GB) calculations of $\alpha$-iron. Along the way we will learn about different methods to describe local atomic environment in order to calculate properties of GBs. We will use these properties to separate the different regions of the GB using clustering methods. Finally we will determine how the energy of the GB is changing according to the angle difference of the regions.
 ### Tutorial overview:
 1. [The data (Nomad, Imeall)](#The-data)
 2. [Analysis of the data - Definition of Local Atomic Enviroment](#2.-Analysis-of-the-data---Definition-of-Local-Atomic-Enviroment)
    1. [coordination number](#Coordination-Numbers)
    2. [centrosymmetry parameter analysis](#Centrosymmetry-parameter-analysis)
    3. [polyhedral template matching](#Polyhedral-Template-Matching)
 3. [Machine Learning methods: Clustering](#3.-Machine-Learning-methods:--Clustering)
 %% Cell type:markdown id: tags:
 ## 1.  A look at the data: Grain boundaries
 Grain boundaries are 2D defects in the crystal structure, studying them is important because they can change the mechanical, electrical and thermal properties of the material. GBs can also play a significant role in how metals  break or become brittle and fracture due to the introduction and subsequent diffusion of hydrogen into the metal.
 Mainly there are two types of GBs; the schematic below represents a tilt (top) and a twist (bottom) boundary between two idealised grains.
 <img align="center" width="30%" src="https://upload.wikimedia.org/wikipedia/commons/9/96/TiltAndTwistBoundaries_remade.svg">
 ### The data
 In this tutorial we will use a tiny subset of the Imeall database (http://www.imeall.co.uk). All the calculations of this subset are **relaxed structures of tilt GBs**, calculated using **PotHB potential** and stored in **extended xyz** file format of **bcc Fe**.
 ***
 %% Cell type:markdown id: tags:
 First we need to load the python packages that we use in this notebook.
 %% Cell type:code id: tags:
 ``` python
 %matplotlib inline
 from os import listdir
 from pathlib import Path
 import numpy as np
 import matplotlib.pyplot as plt
 from scipy.spatial import cKDTree
 from ase import Atoms
 from ase.io import read
 from asap3 import FullNeighborList
 from asap3.analysis import CoordinationNumbers, FullCNA, PTM
 from sklearn.cluster import KMeans
 from sklearn.mixture import GaussianMixture
 from grain_boundaries import AtomViewer
 ```
 %% Cell type:markdown id: tags:
 Define some parameters of alpha-Fe (BCC):
 %% Cell type:code id: tags:
 ``` python
 # potential energy of the perfect crystal according to a specific potential
 Fe_BCC_energy_per_atom = -4.01298214176 # alpha-Fe PotBH
 Fe_BCC_lattice_constant = 2.856
 ```
 %% Cell type:markdown id: tags:
 List the first 10 files in the directory of tilt GB
 %% Cell type:code id: tags:
 ``` python
 dir_path = Path("data/grain_boundaries/GB_alphaFe_001_tilt")
 listdir(str(dir_path))[:10], 'etc.'
 ```
 %% Cell type:markdown id: tags:
 Load a selected calculation using ase.io.read function into an ase.Atoms object which contains the properties of the calculation and the list of atoms.
 You can choose any of them.
 %% Cell type:code id: tags:
 ``` python
 # filepath = dir_path / '0012807140_v6bxv2_tv0.4bxv0.0_d1.8z_traj.xyz'
 filepath = dir_path / '0016193350_v6bxv2_tv0.4bxv0.4_d2.0z_traj.xyz'
 atoms = read(str(filepath))
 ```
 %% Cell type:markdown id: tags:
 We can use the help function, or "?" mark to get more details about attributes and methods of the ase.Atoms object:
 %% Cell type:code id: tags:
 ``` python
 # ?Atoms
 ```
 %% Cell type:markdown id: tags:
 Print some properties of the calculation:
 %% Cell type:code id: tags:
 ``` python
 print('number of atoms: {:d}\n'.format(atoms.get_number_of_atoms()),
      'total_energy: {:.4f} eV\n'.format(atoms.get_total_energy()),
      'cell voluem:  {:.4f} A^3\n'.format(atoms.get_volume()),
      'periodic boundary: {}'.format(atoms.get_pbc()))
 ```
 %% Cell type:markdown id: tags:
 We can calculate the GB's energy as the energy difference between the perfect crystal and the actual calculation diveded by the area of the GB
 %% Cell type:code id: tags:
 ``` python
 def gb_energy(total_energy, n_atoms, area):
    eV = 1.6021766208e-19
    Angstrom = 1.e-10
    return 1 / (2 * area * Angstrom**2) * \
        (total_energy - Fe_BCC_energy_per_atom * n_atoms) * eV
 cell = atoms.get_cell_lengths_and_angles()
 area = cell[0] * cell[1]
 E_gb = gb_energy(atoms.get_total_energy(), len(atoms), area)
 print('energy of grain boundary: {:.4f} J/m^2\n'.format(E_gb),
      'area: {:.4f} A^2'.format(area))
 ```
 %% Cell type:markdown id: tags:
 Let's visualise the atomic structure. AtomViewer is capable of visualising ase.Atoms object in jupyter notebook environment. We can also represent each atom with a different colour
 %% Cell type:code id: tags:
 ``` python
 atom_index = range(len(atoms))
 view = AtomViewer(atoms, atom_index)
 view.gui
 ```
 %% Cell type:markdown id: tags:
 ## 2. Analysis of the data - Definition of Local Atomic Enviroment
 In this part we will see methods for describing the local atomic environment (LEA) based on the atomic coordinates only. Later we will use these LAE parameters to construct the feature space for clustering. Most of the methods are invariant under translation and rotation. Usually this is useful, but we will see that in our tutorial we need to use orientation information for proper clustering.
 Methods:
 1. [coordination number](#Coordination-Numbers)
 2. [centrosymmetry parameter analysis](#Centrosymmetry-parameter-analysis)
 3. [polyhedral template matching](#Polyhedral-Template-Matching)
 You can find more details about each method at the end of the tutorial.
 %% Cell type:markdown id: tags:
 ### Coordination Numbers
 The coordination number of an atom is the number of its nearest neighbor atoms. In a realistic system, it is not necessarily well defined if two atoms are nearest neighbors, so the coordination number is defined as the number of neighbors within a certain distance.
 **Task:**
 - Try to use different values for cutoff radius.
 - Find a reasonable value for cutoff radious.<br>
  *Hint: optimal value should be between the first and second shell*<br>
  $\frac{\sqrt{3}}{2} \alpha < r_{cut} < \alpha$, <br>
  - where $\alpha$ is the lattice constant.
 %% Cell type:code id: tags:
 ``` python
 # ?CoordinationNumbers
 ```
 %% Cell type:code id: tags:
 ``` python
 coord_num = CoordinationNumbers(atoms, rCut=0.93*Fe_BCC_lattice_constant)
 ```
 %% Cell type:markdown id: tags:
 Below you can find th python implementation of the CN method:
 %% Cell type:code id: tags:
 ``` python
 # ?FullNeighborList
 ```
 %% Cell type:code id: tags:
 ``` python
 nblist = FullNeighborList(0.93*Fe_BCC_lattice_constant, atoms=atoms)
 coord_num = np.zeros(len(atoms))
 for i, (atom, neighbor) in enumerate(zip(atoms, nblist)):
    coord_num[i] = len(neighbor)
 ```
 %% Cell type:markdown id: tags:
 Let's visualise the result! On the following histogram we can see that most atoms have the same amount of neighbors.
 %% Cell type:code id: tags:
 ``` python
 fig, ax = plt.subplots(figsize=(12,5))
 bins = np.arange(-0.5, max(coord_num) + 1)
 ax.hist(coord_num, bins)
 ax.set_title('Coordination number')
 ax.set_xlabel("Number of nearest neighbors")
 ax.set_ylabel("Number of atoms")
 plt.show()
 ```
 %% Cell type:markdown id: tags:
 We can visualise indiviual values. Here the colour represents the number of neighbors for each atom.
 %% Cell type:code id: tags:
 ``` python
 view = AtomViewer(atoms, coord_num)
 view.gui
 ```
 %% Cell type:markdown id: tags:
 We can see that the value of the coordination number is capable of identifying the grain boundary, but is highly sensitive to the chosen cutoff radius.
 %% Cell type:code id: tags:
 ``` python
 r_scale_list = np.linspace(0.7, 1.2, 20)
 avg_neihbour = []
 for r_scale in r_scale_list:
    coord_num = CoordinationNumbers(atoms=atoms, rCut=r_scale * Fe_BCC_lattice_constant)
    avg_neihbour.append(np.average(coord_num))
 fig, ax = plt.subplots(figsize=(12,5))
 ax.plot(r_scale_list, avg_neihbour)
 ax.set_xlabel("r_scale")
 ax.set_ylabel("Average of nearest neighbors")
 ax.set_title("Coordination Numbers")
 plt.show()
 ```
 %% Cell type:markdown id: tags:
 ### Centrosymmetry parameter analysis
 The centrosymmetry property of some lattices (e.g. fcc and bcc) can be used to
 distinguish them from other structures such as crystal defects where the local bond
 symmetry is broken. The CSP of an atom having N nearest neighbors is defined as
 $$CPS = \sum\limits_{i=1}^{N/2} \left| \mathbf{r_i} + \mathbf{r}_{i+N/2}\right|^2$$
 - where $\mathbf{r}_i$ and $\mathbf{r}_{i+N/2}$ are vectors from the central atom to a pair of opposite neighbors.
 %% Cell type:markdown id: tags:
 You can find a practical implementation below. For BCC structure we need to use the 8 nearest neighbor.
 %% Cell type:code id: tags:
 ``` python
 def CentroSymmentryParameter(atoms, n):
    atoms.wrap()
    coordinates = atoms.get_positions()
    box = np.diag(atoms.get_cell())
    # Building the nearest neighbor list
    nblist = cKDTree(coordinates, boxsize=box)
    distances, nblist = nblist.query(coordinates, k=n+1)
    csp=np.zeros(len(atoms))
    for neighbors in nblist:
        atom_index = neighbors[0]
        n_indecies = neighbors[1:]
        N = len(n_indecies)
        r = atoms[n_indecies].get_positions() - atoms[atom_index].position
        # fixing periodic boundary
        r = np.where(abs(r) < abs(r - box), r, r - box)
        r = np.where(abs(r) < abs(r + box), r, r + box)
        pairs = []
        for i, r_i in enumerate(r):
            pairs.append(np.linalg.norm(r_i + r[i+1:,:], axis=1))
        pairs = np.hstack(pairs)
        pairs.sort()
        csp[atom_index] = np.sum(pairs[:N//2])
    return csp
 csp = CentroSymmentryParameter(atoms, n=8)
 ```
 %% Cell type:markdown id: tags:
 The result show how symmetric is the LOA - 0 means perfectly symmetric.
 %% Cell type:code id: tags:
 ``` python
 fig, ax = plt.subplots(figsize=(12,5))
 ax.hist(csp, bins=20)
 ax.set_title('Distribution of Centro Symmentry Parameter')
 ax.set_xlabel("Centro Symmentry Parameter")
 ax.set_ylabel("Number of atoms")
 # ax.set_yscale('symlog')
 plt.show()
 ```
 %% Cell type:code id: tags:
 ``` python
 view = AtomViewer(atoms, csp)
 view.gui
 ```
 %% Cell type:markdown id: tags:
 ### Polyhedral Template Matching
 Polyhedral Template Matching (PTM) is a new alternative to the popular Common Neigbor Analysis, providing raughly the same advantages, but with a greater robustness against thermal vibrations, and does not depend critically on a cutoff.
 The PTM classifies the local crystalline order, and identifies local simple cubic (SC), face-centered cubic (FCC), body-centered cubic (FCC), hexagonal closed-packed (HCP) and icosahedral (ICO) order. In addition, some ordered alloys based on the FCC and BCC structures are also detected, namely L1_0, L1_2 and B2 structures.
 https://wiki.fysik.dtu.dk/asap/Local%20crystalline%20order#polyhedral-template-matching-ptm
 %% Cell type:code id: tags:
 ``` python
 # help(PTM)
 # Imprtant key names of returned data:
 # 'structure': The local crystal structure around atom i, if any.
 #              0 = none; 1 = FCC; 2 = HCP; 3 = BCC; 4 = Icosahedral; 5 = SC.
 # 'rmsd': The RMSD error in the fitting to the template, or INF if no structure was identified.
 # 'scale': The average distance to the nearest neighbors for structures 1-4;
 #          or the average distance to nearest and next-nearest neighbors for structure 5;
 #          or INF if no structure was identified.
 # 'orientation': The orientation of the crystal lattice, expressed as a unit quaternion.
 #                If no structure was found, the illegal value (0, 0, 0, 0) is returned.
 # ?PTM
 ```
 %% Cell type:code id: tags:
 ``` python
 ptm = PTM(atoms=atoms, cutoff=8.)
 ptm.keys()
 ```
 %% Cell type:code id: tags:
 ``` python
 fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16,5))
 ax1.hist(ptm['structure'], range(0, 7))
 # ax1.set_yscale('symlog')
 ax1.set_xticks([x + .5 for x in range(6)])
 ax1.set_xticklabels(['None', 'FCC', 'HCP', 'BCC', 'Ic', 'SC'])
 ax1.set_ylabel('# of atoms')
 ax1.set_title('Structure')
 ax2.hist(ptm['scale'])
 # ax2.set_yscale('symlog')
 ax1.set_xlabel('distance scale')
 ax1.set_ylabel('# of atoms')
 ax2.set_title('Distribution of distance scale')
 plt.show()
 ```
 %% Cell type:code id: tags:
 ``` python
 view = AtomViewer(atoms, ptm['scale'])
 view.view.center()
 view.gui
 ```
 %% Cell type:markdown id: tags:
 ## 3. Machine Learning methods:  Clustering
 Clustering - grouping a set of objects - is an unsupervised machine learning problem. Like for most machine learning algorithms, finding the proper features is one of the most important tasks.
 We can find a summary about the clustering methods below:
 http://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html
 <img align="left" width="90%" src="http://scikit-learn.org/stable/_images/sphx_glr_plot_cluster_comparison_001.png">
 %% Cell type:markdown id: tags:
 Let's design the feature space for clustering. Here we can use the following local properties for constructing the feature space:
    - coord_num
    - csp
    - ptm['orientation']
    - ptm['scale']
    - ptm['rmsd']
    - ptm['structure']
 We can also define which clustering method we want to use:
    - pred = KMeans(n_clusters=n_clusters).fit_predict(X)
    - pred = GaussianMixture(n_components=n_clusters, covariance_type='full').fit(X).predict(X)
 %% Cell type:code id: tags:
 ``` python
 # ?KMeans
 ```
 %% Cell type:code id: tags:
 ``` python
 # ?GaussianMixture
 ```
 %% Cell type:code id: tags:
 ``` python
 # Feature space
 # X = np.hstack([ptm['scale'][:, np.newaxis], csp[:, np.newaxis]])
 X = np.hstack([ptm['orientation'], csp[:, np.newaxis]])
 # X = ptm['orientation']
 # Number of clusters
 n_clusters=10
 # Clustering method
 pred = KMeans(n_clusters=n_clusters).fit_predict(X)
 # pred = GaussianMixture(n_components=n_clusters, covariance_type='full').fit(X).predict(X)
 ```
 %% Cell type:markdown id: tags:
 Our goal is to find two groups / the two largest groups of atoms after clustering. We can validate the results by checking the histogram of the prediction. We can see that we must use the descriptor which contains information about the local crystal orientation.
 %% Cell type:code id: tags:
 ``` python
 count = np.bincount(pred)
 fig, ax = plt.subplots(figsize=(12,5))
 ax.bar(range(n_clusters),count)
 ax.set_title('Histogram')
 ax.set_xlabel("classes")
 ax.set_ylabel("# of atoms")
 plt.show()
 ```
 %% Cell type:markdown id: tags:
 By visualising the results, we can see which atoms belong together.
 %% Cell type:code id: tags:
 ``` python
 view = AtomViewer(atoms, pred)
 view.view.center()
 view.gui
 ```
 %% Cell type:markdown id: tags:
 We can use the two largest cluster for calculating the avarage angle difference between the grains.
 %% Cell type:code id: tags:
 ``` python
 # select the 2 largest
 index = np.argsort(count)[-2:]
 orientation0 = np.average(ptm['orientation'][pred==index[0], :], axis=0)
 orientation1 = np.average(ptm['orientation'][pred==index[1], :], axis=0)
 angle_difference = 2 * np.arccos(np.dot(orientation0, np.conj(orientation1)))*180/np.pi
 print('The angle difference between the grains is: {:.3f} degree'.format(angle_difference))
 ```
 %% Cell type:markdown id: tags:
 # Finally:
 We can construct a function and run it on all available data to show how the GB energy changes by changing the angle of the grains:
 %% Cell type:code id: tags:
 ``` python
 def diff_angle(filepath):
    atoms = read(str(filepath))
    cell = atoms.get_cell_lengths_and_angles()
    area = cell[0] * cell[1]
    E_gb = gb_energy(atoms.get_total_energy(), len(atoms), area)
    ptm = PTM(atoms=atoms, cutoff=8.)
    # X = ptm['orientation']
    X = np.hstack([ptm['orientation'], ptm['scale'][:, np.newaxis]])
    n_clusters=20
    pred = KMeans(n_clusters=n_clusters).fit_predict(X)
    count = np.bincount(pred)
    index = np.argsort(count)[-2:]
    orientation0 = np.average(ptm['orientation'][pred==index[0], :], axis=0)
    orientation1 = np.average(ptm['orientation'][pred==index[1], :], axis=0)
    angle_difference = 2 * np.arccos(np.dot(orientation0, np.conj(orientation1)))*180/np.pi
    return angle_difference, E_gb
 ```
 %% Cell type:code id: tags:
 ``` python
 filelist = list(dir_path.glob('**/*.xyz'))
 result = np.zeros((len(filelist), 2))
 for i, file in enumerate(filelist):
    result[i,:] = np.array(diff_angle(file))
    print('{:6.3f} deg {:7.4f} J/m^2 {}'.format(result[i,0], result[i,1], file.name))
 ```
 %% Cell type:markdown id: tags:
 Using the calculated angle difference we can show how the boundary energy change:
 %% Cell type:code id: tags:
 ``` python
 result_sorted = result[result[:,0].argsort(),:]
 fig, ax = plt.subplots(figsize=(10,6))
 ax.plot(result_sorted[:,0], result_sorted[:,1], 'o-')
 plt.show()
 ```
 %% Cell type:markdown id: tags:
 ## Take home massage:
 - machine learning methods are useful tools to analyse datasets without any a priori information.
 - we need to find the proper features for a certain application (choosing the right properties (features) is more important than the machine learning method itself)
 %% Cell type:markdown id: tags:
 ### Future readings:
 - Rosenbrock, Conrad W., et al. "Discovering the Building Blocks of Atomic Systems using Machine Learning." arXiv preprint arXiv:1703.06236 (2017).
 - Stukowski, Alexander. "Structure identification methods for atomistic simulations of crystalline materials." Modelling and Simulation in Materials Science and Engineering 20.4 (2012): 045021.
 - Larsen, Peter Mahler, Søren Schmidt, and Jakob Schiøtz. "Robust structural identification via polyhedral template matching." Modelling and Simulation in Materials Science and Engineering 24.5 (2016): 055007.
 %% Cell type:code id: tags:
 ``` python
 ```