"The materials of interest are represented as vectors in a vector space $X$ according to some chosen representation. The coordinates $x_i$ could represent features of the material e.g. bond distance, lattice parameters, composition of elements etc. The ML moodels try to predict a target property $y$ with the minimal error according to some loss function. In our case, $y$ is the formation energy. For this example, three ML models have been used. Specifically, kernel-ridge-regression models were trained by using three different descriptor of the atomc structures: <a href="https://arxiv.org/abs/1704.06439" target="_blank">MBTR</a>, <a href="https://arxiv.org/abs/1502.01366" target="_blank">SOAP/a>, <a href="https://www.nature.com/articles/s41524-019-0239-3" target="_blank">$n$-gram</a>. Additionally, calculation were performed on a simple representation just containing atomic properties, which is expected to produce much larger errors. All of this data is compiled into ``data.csv``."

"The materials of interest are represented as vectors in a vector space $X$ according to some chosen representation. The coordinates $x_i$ could represent features of the material e.g. bond distance, lattice parameters, composition of elements etc. The ML moodels try to predict a target property $y$ with the minimal error according to some loss function. In our case, $y$ is the formation energy. For this example, three ML models have been used. Specifically, kernel-ridge-regression models were trained by using three different descriptor of the atomc structures: <a href="https://arxiv.org/abs/1704.06439" target="_blank">MBTR</a>, <a href="https://arxiv.org/abs/1502.01366" target="_blank">SOAP</a>, <a href="https://www.nature.com/articles/s41524-019-0239-3" target="_blank">$n$-gram</a>. Additionally, calculation were performed on a simple representation just containing atomic properties, which is expected to produce much larger errors. All of this data is compiled into ``data.csv``."