Commit 90857cc5 authored by Thomas Purcell

Update python docstrings

Update the most used functions
parent 5f2b7251
......@@ -48,8 +48,9 @@ public:
*
* @param feat_space The feature space to run SISSO on
* @param prop_unit The unit of the property
* @param prop_label The header of the property column in the csv file
* @param prop Vector storing all data to train the SISSO models with
* @param prpo_test Vector storing all data to test the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
* @param task_sizes_train Number of training samples per task
* @param task_sizes_test Number of testing samples per task
* @param leave_out_inds List of indexes from the initial data file in the test set
......@@ -144,6 +145,7 @@ public:
* @brief Constructor for the Classifier that takes in python objects (cpp definition in <python/descriptor_identifier/SISSOClassifier.cpp>)
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......@@ -174,6 +176,7 @@ public:
* @brief Constructor for the Classifier that takes in python objects (cpp definition in <python/descriptor_identifier/SISSOClassifier.cpp>)
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......
......@@ -33,6 +33,7 @@ public:
* @brief Constructor for the Regressor
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......@@ -112,6 +113,7 @@ public:
* @brief Constructor for the Regressor that takes in python objects (cpp definition in <python/descriptor_identifier/SISSOLogRegressor.cpp>)
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......@@ -143,6 +145,7 @@ public:
* @brief Constructor for the Regressor that takes in python objects (cpp definition in <python/descriptor_identifier/SISSOLogRegressor.cpp>)
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......
......@@ -38,6 +38,7 @@ public:
* @brief Constructor for the Regressor
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......@@ -148,6 +149,7 @@ public:
* @brief Constructor for the Regressor that takes in python objects (cpp definition in <python/descriptor_identifier/SISSORegressor.cpp>)
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......@@ -179,6 +181,7 @@ public:
* @brief Constructor for the Regressor that takes in python objects (cpp definition in <python/descriptor_identifier/SISSORegressor.cpp>)
*
* @param feat_space The feature space to run SISSO on
* @param prop_label The header of the property column in the csv file
* @param prop_unit The unit of the property
* @param prop Vector storing all data to train the SISSO models with
* @param prop_test Vector storing all data to test the SISSO models with
......
......@@ -134,7 +134,6 @@ public:
* @param mpi_comm MPI communicator for the calculations
* @param phi_0 The initial set of features to combine
* @param allowed_ops list of allowed operators
* @param allowed_param_ops dictionary of the parameterizable operators and their associated free parameters
* @param prop The property to be learned (training data)
* @param task_sizes The number of samples per task
* @param project_type The projection operator to use
......@@ -145,7 +144,6 @@ public:
* @param cross_corr_max Maximum cross-correlation used for selecting features
* @param min_abs_feat_val minimum absolute feature value
* @param max_abs_feat_val maximum absolute feature value
* @param max_param_depth the maximum parameterization depths for features
*/
FeatureSpace(
std::shared_ptr<MPI_Interface> mpi_comm,
......@@ -388,7 +386,6 @@ public:
*
* @param phi_0 The initial set of features to combine
* @param allowed_ops list of allowed operators
* @param allowed_param_ops dictionary of the parameterizable operators and their associated free parameters
* @param prop The property to be learned (training data)
* @param task_sizes The number of samples per task
* @param project_type The projection operator to use
......@@ -399,7 +396,6 @@ public:
* @param cross_corr_max Maximum cross-correlation used for selecting features
* @param min_abs_feat_val minimum absolute feature value
* @param max_abs_feat_val maximum absolute feature value
* @param max_param_depth the maximum parameterization depths for features
*/
FeatureSpace(
py::list phi_0,
......@@ -499,7 +495,7 @@ public:
py::list phi_0,
np::ndarray prop,
py::list task_sizes,
std::string project_type="pearson",
std::string project_type="regression",
int n_sis_select=1,
double cross_corr_max=1.0
);
......@@ -509,8 +505,8 @@ public:
* @details constructs the feature space from an initial set of features and a file containing postfix expressions for the features (cpp definition in <python/feature_creation/FeatureSpace.cpp>)
*
* @param feature_file The file with the postfix expressions for the feature space
* @param prop The property to be learned (training data)
* @param phi_0 The initial set of features to combine
* @param prop The property to be learned (training data)
* @param task_sizes The number of samples per task
* @param project_type The projection operator to use
* @param n_sis_select number of features to select during each SIS step
......@@ -521,7 +517,7 @@ public:
py::list phi_0,
py::list prop,
py::list task_sizes,
std::string project_type="pearson",
std::string project_type="regression",
int n_sis_select=1,
double cross_corr_max=1.0
);
......
......@@ -74,16 +74,17 @@ def generate_phi_0_from_csv(
prop_key (str): The key corresponding to which column in the csv file the property is stored in
cols (list or str): The columns to include in the initial feature set
task_key (str): The key corresponding to which column in the csv file the task differentiation is stored in
leave_out_frac (list): List of indices to pull from the training data to act as a test set
leave_out_frac (float): The fraction (as a decimal) of indices to leave out of the calculations
leave_out_inds (list): List of indices to pull from the training data to act as a test set
max_rung (int): Maximum rung of a feature
Returns:
phi_0 (list of FeatureNodes): The list of primary features
prop_label (str): The label used to describe the property
prop_unit (Unit): The unit of the property
prop_train (np.ndarray): The property values for the training data
prop_test (np.ndarray): The property values for the test data
task_sizes_train (list): The number of samples in the training data for each task
task_sizes_test (list): The number of samples in the test data for each task
leave_out_frac (float): Fraction of samples to leave out
leave_out_inds (list): Indices to use as the test set
"""
if not max_rung:
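The updated docstring distinguishes `leave_out_frac` (a fraction of samples) from `leave_out_inds` (an explicit list of row indices). The sketch below illustrates one way the two arguments could interact. It is not the library's implementation: the helper name `split_leave_out`, the `seed` argument, the rule that explicit indices take precedence, and the lack of per-task handling are all assumptions made for illustration.

```python
import random


def split_leave_out(n_samples, leave_out_frac=0.0, leave_out_inds=None, seed=0):
    """Hypothetical helper: return (train_inds, test_inds) for n_samples rows."""
    if leave_out_inds:
        # Explicit indices take precedence (an assumption of this sketch).
        test = sorted(set(leave_out_inds))
        if test[0] < 0 or test[-1] >= n_samples:
            raise ValueError("leave_out_inds must lie in [0, n_samples)")
    else:
        # Otherwise draw round(n_samples * leave_out_frac) rows at random.
        n_test = int(round(n_samples * leave_out_frac))
        test = sorted(random.Random(seed).sample(range(n_samples), n_test))
    held_out = set(test)
    train = [i for i in range(n_samples) if i not in held_out]
    return train, test


if __name__ == "__main__":
    # Explicit indices: rows 2 and 5 become the test set.
    print(split_leave_out(10, leave_out_inds=[2, 5]))
    # Fraction: 20% of 10 rows are held out at random.
    print(split_leave_out(10, leave_out_frac=0.2))
```

The real function also splits per task and returns the property values, so treat this only as a reading aid for the two parameters.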
......@@ -219,6 +220,7 @@ def generate_fs_csv(
n_sis_select,
task_key=None,
leave_out_frac=0.0,
leave_out_inds=None,
):
"""Generate a FeatureSet for the calculation
......@@ -227,11 +229,13 @@ def generate_fs_csv(
prop_key (str): The key corresponding to which column in the csv file the property is stored in
allowed_ops (list): List of operations used to combine the features
allowed_param_ops (dict): A dict describing the desired non-linear parameterization
calc_type (str): The type of projection to use (regression, classification, or log_regression)
cols (list or str): The columns to include in the initial feature set
max_phi (int): Maximum rung for the calculation
n_sis_select (int): number of features to select in each round of SIS
task_key (str): The key corresponding to which column in the csv file the task differentiation is stored in
leave_out_frac (list): List of indices to pull from the training data to act as a test set
leave_out_frac (float): The fraction (as a decimal) of indices to leave out of the calculations
leave_out_inds (list): List of indices to pull from the training data to act as a test set
Returns:
fs (FeatureSpace): The FeatureSpace for the calculation
......@@ -313,8 +317,8 @@ def generate_fs(
task_sizes_train (list): The number of samples in the training data for each task
allowed_ops (list): List of operations used to combine the features
allowed_param_ops (dict): A dict describing the desired non-linear parameterization
max_phi (int): Maximum rung for the calculation
calc_type (str): type of calculation regression or classification
max_phi (int): Maximum rung for the calculation
n_sis_select (int): number of features to select in each round of SIS
Returns:
......@@ -397,8 +401,9 @@ def generate_fs_sr_from_csv(
max_phi (int): Maximum rung for the calculation
n_sis_select (int): number of features to select in each round of SIS
max_dim (int): Maximum dimension of the models to learn
calc_type (str): type of calculation regression or classification
calc_type (str): type of calculation regression, log_regression, or classification
n_residuals (int): number of residuals to use for the next SIS step when learning higher dimensional models
n_model_store (int): number of models to store as output files
task_key (str): The key corresponding to which column in the csv file the task differentiation is stored in
leave_out_frac (float): Fraction of samples to leave out
leave_out_inds (list): Indices to use as the test set
......@@ -451,6 +456,20 @@ def generate_fs_sr_from_csv(
n_residuals,
n_model_store,
)
elif calc_type.lower() == "log_regression":
sr = SISSOLogRegressor(
fs,
prop_label,
prop_unit,
prop,
prop_test,
task_sizes_train,
task_sizes_test,
leave_out_inds,
max_dim,
n_residuals,
n_model_store,
)
else:
sr = SISSOClassifier(
fs,
......
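The hunk above adds a `"log_regression"` branch to the `calc_type` dispatch, with any unrecognised string falling through to the classifier. A minimal sketch of the same selection written as a lookup table is shown below. The function name `pick_solver` is hypothetical, it returns class names as strings so that it runs without the package installed, and raising on unknown values is a deliberate deviation from the `else` branch in the diff.

```python
# Map each documented calc_type value to the solver class it selects.
SOLVERS = {
    "regression": "SISSORegressor",
    "log_regression": "SISSOLogRegressor",
    "classification": "SISSOClassifier",
}


def pick_solver(calc_type):
    """Hypothetical helper: return the solver class name for calc_type."""
    key = calc_type.lower()  # the diff also lower-cases before comparing
    if key not in SOLVERS:
        # Unlike the diff's bare `else`, reject misspelled calc_type values.
        raise ValueError(f"unknown calc_type: {calc_type!r}")
    return SOLVERS[key]


if __name__ == "__main__":
    print(pick_solver("log_regression"))
```

Failing on an unknown string would surface a typo such as `"log_regresion"` instead of silently building a classifier; whether that is desirable for this code base is a design choice for its maintainers.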
......@@ -64,15 +64,26 @@ def plot_2d_map(
"""Plot a 2D map of a model (2 selected features)
Args:
model: Model to plot the map of
df: Dataframe from data.csv file
feats: List of 2 features to plot
index: row index to take default values for features not being plotted (from df, default is avg)
data_filename: Filename to store the data in
filename: Filename for the figure
fig_settings: Settings used to augment the plot
model (Model): Model to plot the map of
df (pd.DataFrame): Dataframe from data.csv file
feats (List of str): List of 2 features to plot
index (str): row index to take default values for features not being plotted (from df, default is avg)
data_filename (str): Filename to store the data in
filename (str): Filename for the figure
fig_settings (dict): Settings used to augment the plot
n_points (int): Number of points to plot
levels (int): Number of levels to plot for the contour plot
cmap (Colormap): Colormap to use for the figure
vmin (float): Minimum value of the colorbar axis
vmax (float): Maximum value of the colorbar axis
contour_lc (None or str): Color of the contour lines (if None then don't use them)
fig (mpl.Figure): The matplotlib Figure object
ax (mpl.Axis): The matplotlib axis for the plot
colorbar (bool): If True add a colorbar
Returns:
matplotlib.pyplot.Figure: The machine learning plot of the given model
matplotlib.pyplot.Figure: The pyplot Figure for the plot
matplotlib.pyplot.Axis: The pyplot axis for the plot
matplotlib.pyplot.Colorbar: The pyplot colorbar for the plot
"""
if len(feats) != 2:
raise ValueError("feats must be of length 2")
......@@ -178,9 +189,9 @@ def plot_model_ml_plot(model, filename=None, fig_settings=None):
"""Wrapper to plot_model for a set of training and testing data
Args:
model: The model to plot
model (Model): The model to plot
filename (str): Name of the file to store the plot in
fig_settings (adict): non-default plot settings
fig_settings (dict): non-default plot settings
Returns:
matplotlib.pyplot.Figure: The machine learning plot of the given model
......@@ -325,7 +336,7 @@ def generate_plot(dir_expr, filename=None, fig_settings=None):
Args:
dir_expr (str): Regular expression for the directory list
filename (str): Name of the file to store the plot in
fig_settings (adict): non-default plot settings
fig_settings (dict): non-default plot settings
"""
# Set up the figure
......
......@@ -16,3 +16,6 @@ config = toml.load(str(parent / "config.toml"))
# size
if "height" not in config["size"]:
config["size"]["height"] = config["size"]["width"] / config["size"]["ratio"]
mpl.rcParams['pdf.fonttype'] = 42
mpl.rcParams['ps.fonttype'] = 42
......@@ -2,8 +2,11 @@ from matplotlib.patches import PathPatch
import numpy as np
def adjust_box_widths(ax, fac):
"""
Adjust the widths of a seaborn-generated boxplot.
"""Adjust the widths of a seaborn-generated boxplot.
Args:
ax (pyplot.Axis): Axis to replot
fac (float): Rescaling factor
"""
# iterating through axes artists:
for c in ax.get_children():
......