In this tutorial we will use a machine learning method (clustering) to analyse the results of Grain Boundary (GB) calculations of $\alpha$-iron. Along the way we will learn about different methods for describing the local atomic environment in order to calculate properties of GBs. We will use these properties to separate the different regions of the GB using clustering methods. Finally, we will determine how the energy of the GB changes with the angle difference between the regions.
### Tutorial overview:
1. [The data (Nomad, Imeall)](#The-data)
2. [Analysis of the data - Definition of Local Atomic Enviroment](#2.-Analysis-of-the-data---Definition-of-Local-Atomic-Enviroment)
Grain boundaries are 2D defects in the crystal structure. Studying them is important because they can change the mechanical, electrical and thermal properties of the material. GBs can also play a significant role in how metals break, or become brittle and fracture, due to the introduction and subsequent diffusion of hydrogen into the metal.
There are two main types of GBs; the schematic below represents a tilt (top) and a twist (bottom) boundary between two idealised grains.
In this tutorial we will use a tiny subset of the Imeall database (http://www.imeall.co.uk). All the calculations in this subset are **relaxed structures of tilt GBs** in **bcc Fe**, calculated using the **PotHB potential** and stored in the **extended xyz** file format.
***
%% Cell type:markdown id: tags:
First we need to load the Python packages that we use in this notebook.
Load a selected calculation with the `ase.io.read` function into an `ase.Atoms` object, which contains the properties of the calculation and the list of atoms.
%% Cell type:code id: tags:
``` python
print('energy of grain boundary: {:.4f} J/m^2\n'.format(E_gb),
      'area: {:.4f} A^2'.format(area))
```
%% Cell type:markdown id: tags:
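The printed GB energy is an excess energy per unit area. As a rough illustration of how such a number can be obtained, the sketch below uses a common definition for a periodic simulation cell: the total energy minus the energy of the same number of bulk atoms, divided by the total boundary area. The function name `gb_energy`, the assumption of two boundaries per cell, and all numbers are hypothetical; they are not taken from the Imeall database.

```python
# 1 eV/A^2 expressed in J/m^2 (1 eV = 1.602176634e-19 J, 1 A^2 = 1e-20 m^2)
EV_PER_A2_TO_J_PER_M2 = 16.02176634

def gb_energy(e_total, n_atoms, e_bulk_per_atom, area, n_boundaries=2):
    """Excess energy per unit GB area in J/m^2.

    e_total and e_bulk_per_atom are in eV, area is in A^2.
    n_boundaries=2 assumes a periodic cell that contains two boundaries.
    """
    excess = e_total - n_atoms * e_bulk_per_atom   # eV
    return excess / (n_boundaries * area) * EV_PER_A2_TO_J_PER_M2

# Hypothetical values, chosen only to show the arithmetic
e_gb_example = gb_energy(e_total=-400.0, n_atoms=100,
                         e_bulk_per_atom=-4.01, area=50.0)
print('energy of grain boundary: {:.4f} J/m^2'.format(e_gb_example))
```

Here the excess energy is 1 eV spread over 2 x 50 A^2, i.e. 0.01 eV/A^2, which converts to roughly 0.16 J/m^2.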
Let's visualise the atomic structure. `AtomViewer` can display an `ase.Atoms` object in the Jupyter notebook environment. We can also give each atom a different colour; here the colour encodes the atom index.
%% Cell type:code id: tags:
``` python
# Colour each atom by its index in the structure
atom_index=range(len(atoms))
view=AtomViewer(atoms,atom_index)
view.gui
```
%% Cell type:markdown id: tags:
## 2. Analysis of the data - Definition of Local Atomic Enviroment
In this part we will look at methods for describing the local atomic environment (LAE) based on the atomic coordinates only. Later we will use these LAE parameters to construct the feature space for clustering. Most of the methods are invariant under translation and rotation. Usually this is useful, but we will see that in this tutorial we need orientation information for proper clustering.
You can find more details about each method at the end of the tutorial.
%% Cell type:markdown id: tags:
### Coordination Numbers
The coordination number of an atom is the number of its nearest-neighbor atoms. In a realistic system it is not always well defined whether two atoms are nearest neighbors, so the coordination number is defined as the number of neighbors within a certain cutoff distance.
**Task:**
- Try different values for the cutoff radius.
- Find a reasonable value for the cutoff radius.<br>
*Hint: the optimal value should be between the first and second neighbor shells*<br>
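To make the hint concrete, here is a minimal NumPy sketch of a cutoff-based coordination number on an ideal bcc block. It is not the notebook's own implementation: the helper `coordination_numbers`, the lattice constant of about 2.87 A and the small non-periodic block are assumptions made for illustration.

```python
import numpy as np

def coordination_numbers(positions, cutoff):
    """Count, for every atom, the neighbours closer than `cutoff` (no periodic images)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)          # an atom is not its own neighbour
    return (dist < cutoff).sum(axis=1)

a = 2.87                                    # approximate bcc Fe lattice constant in A (assumed)
n = 5                                       # 5 x 5 x 5 conventional cells
cells = np.array([[i, j, k] for i in range(n) for j in range(n) for k in range(n)],
                 dtype=float)
positions = a * np.vstack([cells, cells + 0.5])   # cube corners + body centres

first_shell = a * np.sqrt(3) / 2            # nearest-neighbour distance in bcc
second_shell = a                            # second-neighbour distance in bcc
cutoff = 0.5 * (first_shell + second_shell) # halfway between the two shells

cn = coordination_numbers(positions, cutoff)
centre = np.argmin(np.linalg.norm(positions - positions.mean(axis=0), axis=1))
print(cn[centre])                           # bulk-like atom in the middle of the block
```

With the cutoff between the two shells, the bulk-like atom in the middle of the block has 8 neighbours; moving the cutoff just past the second shell (for example `1.1 * a`) raises that count to 14. Atoms at the surface of the block have fewer neighbours, in the same way that atoms at a grain boundary deviate from the bulk value.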
Let's visualise the result! In the following histogram we can see that most atoms have the same number of neighbors.
%% Cell type:code id: tags:
``` python
fig,ax=plt.subplots(figsize=(12,5))
# Bin edges at half-integers, so each integer coordination number gets its own bar
bins=np.arange(-0.5,max(coord_num)+1)
ax.hist(coord_num,bins)
ax.set_title('Coordination number')
ax.set_xlabel("Number of nearest neighbors")
ax.set_ylabel("Number of atoms")
plt.show()
```
%% Cell type:markdown id: tags:
We can visualise individual values. Here the colour represents the number of neighbors for each atom.
%% Cell type:code id: tags:
``` python
view=AtomViewer(atoms,coord_num)
view.gui
```
%% Cell type:markdown id: tags:
We can see that the coordination number is capable of identifying the grain boundary, but it is highly sensitive to the chosen cutoff radius.
The centro-symmetry parameter (CSP) shows how symmetric the LAE is: 0 means perfectly symmetric.
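The idea behind this value can be sketched in a few lines of NumPy. The version below follows one common formulation of the centro-symmetry parameter: form the sum of every pair of neighbour vectors and add up the N/2 smallest squared lengths. The function name `centro_symmetry` and the toy neighbour shell are assumptions for illustration; the routine used in the notebook may differ in normalisation and neighbour selection.

```python
import numpy as np

def centro_symmetry(neighbour_vectors):
    """CSP from the N vectors pointing from an atom to its N nearest neighbours.

    Sums the N/2 smallest values of |r_i + r_j|^2 over all neighbour pairs.
    """
    r = np.asarray(neighbour_vectors, dtype=float)
    n = len(r)
    i, j = np.triu_indices(n, k=1)                  # all pairs with i < j
    pair_sums = np.sum((r[i] + r[j]) ** 2, axis=1)  # |r_i + r_j|^2 per pair
    return np.sort(pair_sums)[: n // 2].sum()

# The 8 first-shell neighbours of a bcc atom, in units of a/2
bcc_shell = np.array([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                     dtype=float)
csp_perfect = centro_symmetry(bcc_shell)

# Displace one neighbour to break the inversion symmetry
distorted_shell = bcc_shell.copy()
distorted_shell[0] += [0.3, 0.0, 0.0]
csp_distorted = centro_symmetry(distorted_shell)

print(csp_perfect, csp_distorted)
```

In the perfect shell every neighbour has an exact opposite, so the four smallest pair sums vanish and the parameter is 0. Displacing one neighbour by 0.3 leaves one opposite pair with a residual of 0.3, which gives a parameter of 0.3^2 = 0.09.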
%% Cell type:code id: tags:
``` python
fig,ax=plt.subplots(figsize=(12,5))
ax.hist(csp,bins=20)
ax.set_title('Distribution of Centro Symmetry Parameter')
ax.set_xlabel("Centro Symmetry Parameter")
ax.set_ylabel("Number of atoms")
# ax.set_yscale('symlog')
plt.show()
```
%% Cell type:code id: tags:
``` python
view=AtomViewer(atoms,csp)
view.gui
```
%% Cell type:markdown id: tags:
### Polyhedral Template Matching
Polyhedral Template Matching (PTM) is a new alternative to the popular Common Neighbor Analysis, providing roughly the same advantages, but with greater robustness against thermal vibrations and without a critical dependence on a cutoff.
PTM classifies the local crystalline order and identifies local simple cubic (SC), face-centered cubic (FCC), body-centered cubic (BCC), hexagonal close-packed (HCP) and icosahedral (ICO) order. In addition, some ordered alloys based on the FCC and BCC structures are also detected, namely the L1_0, L1_2 and B2 structures.
Clustering, i.e. grouping a set of objects, is an unsupervised machine learning problem. As with most machine learning algorithms, finding the proper features is one of the most important tasks.
A summary of the clustering methods can be found below:
%% Cell type:code id: tags:
``` python
# pred = GaussianMixture(n_components=n_clusters, covariance_type='full').fit(X).predict(X)
```
%% Cell type:markdown id: tags:
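To show what a clustering step of this kind does, here is a minimal sketch that uses a plain k-means loop written in NumPy as a stand-in; it is not the estimator used in the notebook. The feature matrix is synthetic: each "atom" gets a hypothetical in-plane orientation angle, encoded as (cos, sin), with two grains of 100 atoms each. The function name `kmeans` and all numbers are assumptions.

```python
import numpy as np

def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Plain k-means: assign points to the nearest centre, then update the centres."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Keep the old centre if a cluster happens to be empty
        new_centres = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centres[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

# Synthetic orientation angles (radians) for two grains of 100 atoms each
rng = np.random.default_rng(42)
theta = np.concatenate([rng.normal(0.1, 0.02, 100), rng.normal(0.6, 0.02, 100)])
features = np.column_stack([np.cos(theta), np.sin(theta)])

labels, centres = kmeans(features, n_clusters=2)
counts = np.bincount(labels, minlength=2)
print(counts)
```

Because the features carry orientation information, the two grains form two well-separated groups and each cluster collects the atoms of one grain. A rotation-invariant descriptor would give both grains the same bulk value and could not separate them.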
Our goal is to find two groups (or the two largest groups) of atoms after clustering, one for each grain. We can validate the results by checking the histogram of the predictions. We can see that we must use a descriptor which contains information about the local crystal orientation.
%% Cell type:code id: tags:
``` python
# minlength keeps one bar per cluster even if the highest labels are unused
count=np.bincount(pred,minlength=n_clusters)
fig,ax=plt.subplots(figsize=(12,5))
ax.bar(range(n_clusters),count)
ax.set_title('Histogram')
ax.set_xlabel("classes")
ax.set_ylabel("# of atoms")
plt.show()
```
%% Cell type:markdown id: tags:
By visualising the results, we can see which atoms belong together.
%% Cell type:code id: tags:
``` python
view=AtomViewer(atoms,pred)
view.view.center()
view.gui
```
%% Cell type:markdown id: tags:
We can use the two largest clusters to calculate the average angle difference between the grains.
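One way to turn two grain orientations into a single angle is sketched below. It assumes that each of the two largest clusters has already been reduced to one mean rotation matrix, and it uses the standard relation between the trace of a rotation matrix and its rotation angle. The helpers `rot_z` and `misorientation_angle` are hypothetical names, and the sketch does not reduce the result by the cubic symmetry operations, which a full treatment would need.

```python
import numpy as np

def rot_z(angle_deg):
    """Rotation matrix for a rotation by `angle_deg` degrees about the z axis."""
    t = np.radians(angle_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def misorientation_angle(R_a, R_b):
    """Angle in degrees of the rotation that takes orientation a to orientation b."""
    R = R_a.T @ R_b
    # trace(R) = 1 + 2 cos(theta); clip guards against rounding just outside [-1, 1]
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))

# Two grains tilted by 10 and 35 degrees about a common axis (illustrative values)
angle = misorientation_angle(rot_z(10.0), rot_z(35.0))
print('angle difference: {:.2f} degrees'.format(angle))
```

For two rotations about the same tilt axis the result is simply the difference of the two tilt angles, here 25 degrees.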
- Machine learning methods are useful tools for analysing datasets without any a priori information.
- We need to find the proper features for a given application: choosing the right properties (features) is more important than the machine learning method itself.
%% Cell type:markdown id: tags:
### Further reading:
- Rosenbrock, Conrad W., et al. "Discovering the Building Blocks of Atomic Systems using Machine Learning." arXiv preprint arXiv:1703.06236 (2017).
- Stukowski, Alexander. "Structure identification methods for atomistic simulations of crystalline materials." Modelling and Simulation in Materials Science and Engineering 20.4 (2012): 045021.
- Larsen, Peter Mahler, Søren Schmidt, and Jakob Schiøtz. "Robust structural identification via polyhedral template matching." Modelling and Simulation in Materials Science and Engineering 24.5 (2016): 055007.