diff --git a/CO2_SGD.ipynb b/CO2_SGD.ipynb
index eff272eec83c8b50c7dcbcad4ecd1a23755eca81..dc3a2592c740bcf1f558d902fbe47f7ebe2fce69 100644
--- a/CO2_SGD.ipynb
+++ b/CO2_SGD.ipynb
@@ -125,7 +125,7 @@
     "\\begin{equation}\n",
     " F(Z) = \\frac{s(Z)}{s(Y)}\\cdot\\left(\\frac{med(Z)-med(Y)}{mm(Y)-med(Y)}\\right)\\cdot\\left( 1-\\frac{amd(Z)}{amd(Y)}\\right)\n",
     "\\end{equation}\n",
-    "where $Y$ is the whole data set, $Z$ - a subgroup, $s$ - sampling size, $med$ - median of the target property, $amd$ - absolute median deviation calculated around the median value of the target property, $mm$ – minimal or maximal value of the target property in the whole sampling. So, the second term supports two options for the quality function: one can do the search of subgroups in the area of smaller values of a target property (as we do for OCO-angles below), or in the area of larger values (for C-O bond distance). The first term in the quality function supports that subgroups are not too small, and the third term supports that the subgroup has a narrow distribution.\n",
+    "where $Y$ is the whole data set, $Z$ - a subgroup, $s$ - sampling size, $med$ - median of the target property, $amd$ - absolute median deviation calculated around the median value of the target property, $mm$ – minimal or maximal value of the target property in the whole sampling. The first term in the quality function supports that subgroups are not too small. The second term supports that the target property is either maximized (as for l(C-O)) or minimized (as for OCO angle) in the subgroup. The third term supports that the subgroup has a narrow distribution.\n",
     "\n",
     "The number of physical features that are offered to the approach includes properties of gas-phase atoms, surface slabs, and adsorption sites."
    ]