
Descriptor Embedding and Clustering for Atomistic-environment Framework (DECAF)
https://gitlab.mpcdf.mpg.de/klai/decaf.git
Tutorials
For tutorials with examples, please visit the GitLab Pages site: https://klai.pages.mpcdf.de/decaf/
Description
This is a Python package that provides a workflow for clustering the local atomic environments in a dataset of structures.
Please refer to the methodology paper "A Fuzzy Classification Framework to Identify Equivalent Atoms in Complex Materials and Molecules"[1] for details.
It mainly provides the following functions:
- Computing SOAP descriptors from an input atomic structure given as an ASE Atoms object.
- Applying classical multidimensional scaling (MDS) to a dataset of SOAP descriptors.
- Differentiating atomic environments in the embedded dataset using mean shift clustering (MSC).
- Embedding and classifying environments outside of the MDS-MSC dataset.
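The embedding steps listed above can be illustrated with a minimal NumPy sketch of textbook classical MDS, followed by the standard out-of-sample projection (Gower's formula). This is not the code behind get_cMDS or embed_cMDS: the function names classical_mds and embed_new_point are hypothetical, and the random 2D points are toy data standing in for SOAP descriptors.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Embed points from a symmetric distance matrix via classical MDS."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    top = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top)

def embed_new_point(dist_new, dist_train, coords):
    """Project one out-of-sample point into an existing embedding (Gower)."""
    sq_new = np.asarray(dist_new, dtype=float) ** 2
    col_means = (np.asarray(dist_train, dtype=float) ** 2).mean(axis=0)
    # Column norms of the embedding equal the retained eigenvalues.
    eigvals = (coords ** 2).sum(axis=0)
    return -0.5 * (coords.T @ (sq_new - col_means)) / eigvals

# Toy data (assumption): random 2D points, not real SOAP vectors.
rng = np.random.default_rng(0)
train = rng.normal(size=(6, 2))
new = rng.normal(size=2)
dist = np.linalg.norm(train[:, None, :] - train[None, :, :], axis=-1)
new_dist = np.linalg.norm(train - new, axis=1)

coords = classical_mds(dist, n_components=2)
new_coords = embed_new_point(new_dist, dist, coords)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because the toy points are two-dimensional, a two-component embedding reproduces all pairwise distances up to rotation and reflection, and the projected point keeps its distances to the training points. With real descriptors the embedding is generally only approximate.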
Optional functions are also provided:
- Applying kernel principal component analysis (kPCA), principal component analysis (PCA), or Sketch-Map[2] for embedding.
- Applying HDBSCAN[3] for clustering.
References linking the journal article[1] and the code
The lists below map each method and demonstration in the article[1] to the location in the code that implements it.
For details on how to use each function, please refer to decaf/examples/sample_code.ipynb or the comments in decaf/src/decaf.py.
Methodology involved in the main text[1]:
- double-SOAP (Sec. 2A[1]): decaf/src/decaf.py, function get_SOAP
- classical MDS (Sec. 2B[1]): decaf/src/decaf.py, function get_cMDS
- embedding any SOAP vector with the obtained model: decaf/src/decaf.py, function embed_cMDS
- MSC (Sec. 2C[1]): decaf/src/decaf.py, function get_MeanShift
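As a rough illustration of the clustering step, the following is a minimal flat-kernel mean shift written in plain NumPy. It is a sketch of the general technique only and does not reproduce get_MeanShift: the function name mean_shift, the bandwidth value, and the merge threshold are assumptions chosen for this example. scikit-learn also ships an implementation as sklearn.cluster.MeanShift.

```python
import numpy as np

def mean_shift(points, bandwidth, max_iter=100, tol=1e-8):
    """Flat-kernel mean shift; returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(max_iter):
        # Distances from every current mode estimate to every data point.
        dists = np.linalg.norm(modes[:, None, :] - points[None, :, :], axis=-1)
        inside = (dists <= bandwidth).astype(float)
        # Move each estimate to the mean of the data points inside its window.
        updated = inside @ points / inside.sum(axis=1, keepdims=True)
        converged = np.abs(updated - modes).max() < tol
        modes = updated
        if converged:
            break
    # Merge estimates that ended up at (almost) the same mode.
    centers, labels = [], []
    for mode in modes:
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < 0.5 * bandwidth:
                labels.append(k)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return np.array(labels), np.array(centers)

# Toy embedded coordinates (assumption): two well-separated groups.
embedded = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels, centers = mean_shift(embedded, bandwidth=1.0)
print(labels)   # [0 0 0 1 1 1]
print(centers)  # one row per detected mode
```

The bandwidth is the only hyperparameter here. A larger value merges nearby groups of environments into one cluster, while a smaller value splits them further.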
Demonstrations in the main text[1]:
- PAH examples (Sec. 3A[1]): decaf/examples/sample_code.ipynb, block PAH Example
- Pd surfaces examples (Sec. 3B[1]): decaf/examples/sample_code.ipynb, block Pd Surfaces Demonstration
- Out-of-sample classification of a Pd nanoparticle (Sec. 3C[1]): decaf/examples/sample_code.ipynb, block Classification Demonstration
Demonstrations in the supplementary information (SI)[1]:
- kPCA (Sec. S1A[1]): decaf/examples/sample_code.ipynb, block kPCA Embedding
- Sketch-Map (Sec. S1B[1]): decaf/examples/sample_code.ipynb, block Sketch Map Embedding
- HDBSCAN (Sec. S2A[1]): decaf/examples/sample_code.ipynb, block HDBSCAN Clustering
- The demonstrations in Sec. S3-S4 can be reproduced by changing the (hyper)parameters according to the SI, using the functions in decaf/examples/sample_code.ipynb, block Pd Surfaces Demonstration
- The demonstration in Sec. S5: the MD settings and analysis are given in the main text and are reproducible, so they are omitted from the examples here.
Installation
You can install the package by running the following command from the repository root (the folder containing pyproject.toml):
pip install . --user
Then import the package in Python with:
import decaf
Dependencies:
NumPy, ASE, DScribe, scikit-learn, SciPy
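If the import fails, a quick check of which dependencies are missing can help. The snippet below is a small sketch that uses only the standard library. The helper missing_dependencies is hypothetical and not part of DECAF, and the mapping assumes the usual import names of the packages listed above.

```python
from importlib.util import find_spec

# Display name -> import name (assumed to be the usual top-level module names).
REQUIRED = {
    "NumPy": "numpy",
    "ASE": "ase",
    "DScribe": "dscribe",
    "scikit-learn": "sklearn",
    "SciPy": "scipy",
}

def missing_dependencies(required=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]

missing = missing_dependencies()
print("Missing:", ", ".join(missing) if missing else "none")
```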
Repository Structure:
decaf
├── examples                  # Folder containing examples of applying DECAF
│   ├── Compiled_SketchMap    # Folder containing compiled SketchMap if needed
│   ├── sample_code.ipynb     # Sample code of DECAF applied to the demonstration cases
│   └── Structures            # Folder containing atomic structures for the demonstration cases
│       └── **.con
├── pyproject.toml            # Setup code for installing DECAF
├── README.md                 # The readme you are reading now
└── src                       # Folder containing the source code of DECAF
    └── decaf.py              # Source code of DECAF
References
[1] K. C. Lai, S. Matera, C. Scheurer, and K. Reuter, "A Fuzzy Classification Framework to Identify Equivalent Atoms in Complex Materials and Molecules," J. Chem. Phys. 159(2) (2023). DOI: 10.1063/5.0160369.
[2] M. Ceriotti, G. A. Tribello, and M. Parrinello, "Simplifying the representation of complex free-energy landscapes using sketch-map," Proc. Natl. Acad. Sci. U.S.A. 108, 13023–13028 (2011).
[3] L. McInnes, J. Healy, and S. Astels, "hdbscan: Hierarchical density based clustering," J. Open Source Softw. 2, 205 (2017).
Authors and Affiliation
Authors:
King Chun Lai, Sebastian Matera, Christoph Scheurer, Karsten Reuter
Affiliation:
Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
Support
King Chun Lai : lai@fhi-berlin.mpg.de
License
Descriptor Embedding and Clustering for Atomistic-environment Framework by King Chun Lai, Sebastian Matera, Christoph Scheurer, and Karsten Reuter is licensed under CC BY 4.0.