auto_prep.modeling package

Submodules

auto_prep.modeling.handler module

class auto_prep.modeling.handler.ModelHandler[source]

Bases: object

Class responsible for loading and handling machine learning models and pipelines.

static generate_shap(X_test: DataFrame, model: BaseEstimator, model_idx: int, task: str)[source]

Generates SHAP plots for a given model.

Parameters:
  • X_test (pd.DataFrame) – Test data for SHAP analysis.

  • model (BaseEstimator) – Trained model for generating SHAP values.

  • model_idx (int) – Identifier for the model.

  • task (str) – regression / classification

static load_models(task: str) List[BaseEstimator][source]
static load_modules(package: str) List[str][source]

Loads the modules in the specified package that contain models (module names starting with model_).

Parameters:

package (str) – The package to load modules from.

Returns:

found module names.

Return type:

List[str]
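
The prefix-based discovery described above can be sketched with the standard library alone. This is a minimal illustration, not the auto_prep implementation: the helper name find_model_modules, the throwaway package demo_modeling and its file names are all assumptions made for the example.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path
from typing import List


def find_model_modules(package: str, prefix: str = "model_") -> List[str]:
    """Return the names of submodules of `package` that start with `prefix`."""
    pkg = importlib.import_module(package)
    return sorted(
        info.name
        for info in pkgutil.iter_modules(pkg.__path__)
        if info.name.startswith(prefix)
    )


# Build a throwaway package (hypothetical layout) so the sketch runs anywhere.
tmp_root = tempfile.mkdtemp()
pkg_dir = Path(tmp_root) / "demo_modeling"
pkg_dir.mkdir()
for filename in ("__init__.py", "handler.py", "model_A.py", "model_B.py"):
    (pkg_dir / filename).write_text("")
sys.path.insert(0, tmp_root)
importlib.invalidate_caches()

found = find_model_modules("demo_modeling")
print(found)  # handler is skipped because it lacks the model_ prefix
```

Filtering on the module name keeps helper modules such as a handler out of the result without importing every submodule first.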

load_pipelines() List[BaseEstimator] | List[str][source]

Loads pipelines from the directory specified in config.

Returns:

List[BaseEstimator]: loaded pipelines. List[str]: pipeline file names.

Return type:

List[BaseEstimator]

run(X_train: DataFrame, y_train: Series, X_valid: DataFrame, y_valid: Series, X_test: DataFrame, y_test: Series, task: str)[source]

Performs model fitting and selection.

Parameters:
  • X_train (pd.DataFrame) – Training feature dataset.

  • y_train (pd.Series) – Training target dataset.

  • X_valid (pd.DataFrame) – Validation feature dataset.

  • y_valid (pd.Series) – Validation target dataset.

  • X_test (pd.DataFrame) – Test feature dataset.

  • y_test (pd.Series) – Test target dataset.

  • task (str) – regression / classification

static tune_model(scoring_func: callable, model_cls: BaseEstimator, best_k: int, X_train: DataFrame, y_train: Series, X_valid: DataFrame | None = None, y_valid: Series | None = None) dict | List[dict] | int[source]

Tunes a model’s hyperparameters using RandomizedSearchCV and returns the best model and related information.

Parameters:
  • scoring_func (Callable) – Scoring function for evaluating models.

  • model_cls (BaseEstimator) – Model class to be trained.

  • best_k (int) – Return the parameters of up to k best models.

  • X_train (pd.DataFrame) – Training feature dataset.

  • y_train (pd.Series) – Training target dataset.

  • X_valid (pd.DataFrame, optional) – Validation feature dataset. Defaults to None.

  • y_valid (pd.Series, optional) – Validation target dataset. Defaults to None.

Returns:

dict: training meta info. List[dict]: results. int: number of models tested.

Return type:

dict
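
The idea of sampling hyperparameter combinations at random and keeping up to best_k of them can be illustrated without scikit-learn. The sketch below is a standard-library stand-in for that idea, not RandomizedSearchCV and not the auto_prep code: sample_and_rank, toy_score and the small grid are hypothetical names and data chosen for the example.

```python
import itertools
import random
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, Any]


def sample_and_rank(
    param_grid: Dict[str, List[Any]],
    score: Callable[[Params], float],
    n_iter: int,
    best_k: int,
    seed: int = 42,
) -> Tuple[List[Params], int]:
    """Score up to n_iter random combinations; return the best_k and the count tried."""
    keys = sorted(param_grid)
    combos = [
        dict(zip(keys, values))
        for values in itertools.product(*(param_grid[k] for k in keys))
    ]
    rng = random.Random(seed)
    tried = rng.sample(combos, min(n_iter, len(combos)))
    ranked = sorted(tried, key=score, reverse=True)  # higher score is better
    return ranked[:best_k], len(tried)


def toy_score(params: Params) -> float:
    """Hypothetical score: prefers C close to 1.0 and fit_intercept=True."""
    return -abs(params["C"] - 1.0) + (0.5 if params["fit_intercept"] else 0.0)


grid = {"C": [0.1, 1.0, 10.0], "fit_intercept": [True, False]}
best, n_tested = sample_and_rank(grid, toy_score, n_iter=6, best_k=2)
print(best, n_tested)
```

Here n_iter equals the size of the grid, so every combination is tried. With a smaller n_iter only a random subset is scored, which is the trade-off a randomized search makes against an exhaustive one.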

write_to_raport(raport)[source]

Writes the overview section to a report.

auto_prep.modeling.handler.custom_sort(key_value)[source]
auto_prep.modeling.handler.format_shape(df)

auto_prep.modeling.model_BayesianRidgeRegressor module

class auto_prep.modeling.model_BayesianRidgeRegressor.ModelBayesianRidgeRegressor(max_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, **kwargs)[source]

Bases: BayesianRidge, Regressor

This class implements a Bayesian Ridge Regressor model, which is a linear regression model with Bayesian regularization.

PARAM_GRID

A dictionary containing the parameter grid for hyperparameter tuning.

Type:

dict

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'alpha_1': [1e-06, 1e-07, 1e-08], 'alpha_2': [1e-06, 1e-07, 1e-08], 'lambda_1': [1e-06, 1e-07, 1e-08], 'lambda_2': [1e-06, 1e-07, 1e-08], 'max_iter': [300, 400, 500], 'tol': [0.001, 0.0001, 1e-05]}
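
To get a feel for the size of the search space this grid defines, the combinations can be counted directly. The sketch below copies the grid shown above into a plain dictionary and uses only the standard library. The variable names are illustrative and not part of auto_prep.

```python
import itertools
import math

# Copied from the PARAM_GRID value documented above.
param_grid = {
    "alpha_1": [1e-06, 1e-07, 1e-08],
    "alpha_2": [1e-06, 1e-07, 1e-08],
    "lambda_1": [1e-06, 1e-07, 1e-08],
    "lambda_2": [1e-06, 1e-07, 1e-08],
    "max_iter": [300, 400, 500],
    "tol": [0.001, 0.0001, 1e-05],
}

# The grid size is the product of the number of candidates per parameter.
n_combinations = math.prod(len(values) for values in param_grid.values())

# Cross-check by enumerating every combination explicitly.
n_enumerated = sum(1 for _ in itertools.product(*param_grid.values()))

print(n_combinations)  # 729
```

Six parameters with three candidates each give 3^6 = 729 combinations, which is why sampling a subset is usually cheaper than an exhaustive search.
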
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelBayesianRidgeRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, return_std: bool | None | str = '$UNCHANGED$') ModelBayesianRidgeRegressor

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

return_std (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for return_std parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelBayesianRidgeRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_DecisionTreeClassifier module

class auto_prep.modeling.model_DecisionTreeClassifier.ModelDecisionTreeClassifier(criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, random_state=42, **kwargs)[source]

Bases: DecisionTreeClassifier, Classifier

This class extends the DecisionTreeClassifier and Classifier classes to provide a decision tree classifier model with additional functionality.

PARAM_GRID

A dictionary containing the parameter grid for hyperparameter tuning.

Type:

dict

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'criterion': ['gini', 'entropy'], 'max_depth': [None, 5, 10, 15, 20], 'min_samples_leaf': [1, 2, 4], 'min_samples_split': [2, 5, 10], 'random_state': [42], 'splitter': ['best', 'random']}
set_fit_request(*, check_input: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') ModelDecisionTreeClassifier

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • check_input (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for check_input parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_proba_request(*, check_input: bool | None | str = '$UNCHANGED$') ModelDecisionTreeClassifier

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict_proba.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

check_input (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for check_input parameter in predict_proba.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, check_input: bool | None | str = '$UNCHANGED$') ModelDecisionTreeClassifier

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

check_input (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for check_input parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelDecisionTreeClassifier

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_GaussianNaiveClassifier module

class auto_prep.modeling.model_GaussianNaiveClassifier.ModelGaussianNaiveClassifier(priors=None, var_smoothing=1e-09, **kwargs)[source]

Bases: GaussianNB, Classifier

This class implements a Gaussian Naive Bayes classifier, which is a probabilistic classifier based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.

PARAM_GRID

A dictionary containing the parameter grid for hyperparameter tuning. It includes:

  • “priors”: List of prior probabilities of the classes. Default is [None].

  • “var_smoothing”: List of float values for the portion of the largest variance of all features that is added to variances for calculation stability.

Type:

dict

__init__()[source]
to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'priors': [None], 'var_smoothing': [1e-09, 1e-07, 1e-05]}
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelGaussianNaiveClassifier

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') ModelGaussianNaiveClassifier

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to partial_fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in partial_fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelGaussianNaiveClassifier

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_GradientBoostingRegressor module

class auto_prep.modeling.model_GradientBoostingRegressor.ModelGradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, min_samples_split=2, min_samples_leaf=1, subsample=1.0, random_state=42, **kwargs)[source]

Bases: GradientBoostingRegressor, Regressor

This class implements a Gradient Boosting Regressor model with a predefined parameter grid for hyperparameter tuning.

PARAM_GRID

A dictionary containing the parameter grid for hyperparameter tuning.

Type:

dict

__init__()[source]

Initializes the Gradient Boosting Regressor model and logs the initialization.

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'learning_rate': [0.1, 0.05, 0.02], 'max_depth': [4, 6, 8], 'min_samples_leaf': [1, 2, 4], 'min_samples_split': [2, 5, 10], 'n_estimators': [100, 200, 300], 'random_state': [42], 'subsample': [1.0, 0.5]}
set_fit_request(*, monitor: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') ModelGradientBoostingRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • monitor (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for monitor parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelGradientBoostingRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_KNeighboursClassifier module

class auto_prep.modeling.model_KNeighboursClassifier.ModelKNeighboursClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, **kwargs)[source]

Bases: KNeighborsClassifier, Classifier

K Neighbours Classifier model.

PARAM_GRID

Parameter grid for hyperparameter tuning.

Type:

dict

__init__()[source]

Initializes the K Neighbours Classifier model.

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'], 'leaf_size': [30, 40, 50], 'n_neighbors': [5, 10, 15], 'p': [1, 2], 'weights': ['uniform', 'distance']}
set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelKNeighboursClassifier

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_KNeighboursRegressor module

class auto_prep.modeling.model_KNeighboursRegressor.ModelKNeighboursRegressor(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, **kwargs)[source]

Bases: KNeighborsRegressor, Regressor

K Neighbours Regressor model.

PARAM_GRID

Parameter grid for hyperparameter tuning.

Type:

dict

__init__()[source]

Initializes the K Neighbours Regressor model.

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'], 'leaf_size': [30, 40, 50], 'n_neighbors': [5, 10, 15], 'p': [1, 2], 'weights': ['uniform', 'distance']}
set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelKNeighboursRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_LinearRegression module

class auto_prep.modeling.model_LinearRegression.ModelLinearRegression(fit_intercept=True, **kwargs)[source]

Bases: LinearRegression, Regressor

Linear regression model with added description method (to_tex()) and predefined PARAM_GRID that may be used in GridSearch.

PARAM_GRID = {'fit_intercept': [True, False]}
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelLinearRegression

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelLinearRegression

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a description of the model in a dictionary format.

Returns:

A dictionary containing the model's name, description and hyperparameters.

Return type:

dict
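
The shape of such a description dictionary might look like the sketch below. Everything in it is an assumption made for illustration: DescribedModel is a hypothetical stand-in class, and the key names "name", "desc" and "params" are not taken from auto_prep.

```python
from typing import Any, Dict, List


class DescribedModel:
    """Hypothetical stand-in for a model class; not part of auto_prep."""

    # A predefined grid kept on the class, as the documented models do.
    PARAM_GRID: Dict[str, List[Any]] = {"fit_intercept": [True, False]}

    def __init__(self, fit_intercept: bool = True) -> None:
        self.fit_intercept = fit_intercept

    def to_tex(self) -> Dict[str, Any]:
        # The key names below are assumptions made for this sketch.
        return {
            "name": type(self).__name__,
            "desc": "Ordinary least squares linear regression.",
            "params": {"fit_intercept": self.fit_intercept},
        }


description = DescribedModel(fit_intercept=False).to_tex()
print(description["name"], description["params"])
```

Returning a plain dictionary keeps the description independent of any report layout, so the code that writes the report can decide how to render each field.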

auto_prep.modeling.model_LinearSVR module

class auto_prep.modeling.model_LinearSVR.ModelLinearSVR(epsilon=0, C=1.0, loss='epsilon_insensitive', fit_intercept=True, **kwargs)[source]

Bases: LinearSVR, Regressor

Linear SVR model with added description method (to_tex()) and predefined PARAM_GRID that may be used in GridSearch.

PARAM_GRID = {'C': [0.1, 1.0, 10.0, 100.0], 'epsilon': [0.0, 0.1, 0.2, 0.5, 1.0], 'fit_intercept': [True, False], 'loss': ['epsilon_insensitive', 'squared_epsilon_insensitive']}
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelLinearSVR

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelLinearSVR

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a description of the model as a dictionary.

Returns:

A dictionary containing the model's name, description, and hyperparameters.

Return type:

dict

auto_prep.modeling.model_LogisticRegression module

class auto_prep.modeling.model_LogisticRegression.ModelLogisticRegression(penalty='l2', C=1.0, solver='lbfgs', l1_ratio=None, **kwargs)[source]

Bases: LogisticRegression, Classifier

Logistic regression model with an added description method (to_tex()) and a predefined PARAM_GRID that can be used for hyperparameter search (for example with scikit-learn's GridSearchCV).

PARAM_GRID = [{'C': [0.01, 0.1, 1, 10], 'penalty': ['l1'], 'solver': ['liblinear', 'saga']}, {'C': [0.01, 0.1, 1, 10], 'penalty': ['l2'], 'solver': ['lbfgs', 'liblinear', 'saga', 'newton-cg']}, {'C': [0.01, 0.1, 1, 10], 'l1_ratio': [0.5, 0.7], 'penalty': ['elasticnet'], 'solver': ['saga']}]
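
Unlike the other grids in this package, this PARAM_GRID is a list of dictionaries. In scikit-learn's search utilities each dictionary is expanded on its own, which keeps unsupported penalty and solver pairs (such as l1 with lbfgs) out of the search. The following standard-library sketch illustrates the expansion; expand_grids is a hypothetical helper, not part of auto_prep.

```python
from itertools import product

# Grid copied from the PARAM_GRID attribute documented above.
PARAM_GRID = [
    {"C": [0.01, 0.1, 1, 10], "penalty": ["l1"], "solver": ["liblinear", "saga"]},
    {"C": [0.01, 0.1, 1, 10], "penalty": ["l2"],
     "solver": ["lbfgs", "liblinear", "saga", "newton-cg"]},
    {"C": [0.01, 0.1, 1, 10], "l1_ratio": [0.5, 0.7], "penalty": ["elasticnet"],
     "solver": ["saga"]},
]


def expand_grids(grids):
    """Expand each sub-grid separately and concatenate the results (hypothetical helper)."""
    candidates = []
    for grid in grids:
        keys = sorted(grid)
        for values in product(*(grid[k] for k in keys)):
            candidates.append(dict(zip(keys, values)))
    return candidates


candidates = expand_grids(PARAM_GRID)
# Distinct (penalty, solver) pairs that actually occur in the search space.
pairs = sorted({(c["penalty"], c["solver"]) for c in candidates})

print(len(candidates))  # 32 = 8 + 16 + 8
print(pairs)
```
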
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelLogisticRegression

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelLogisticRegression

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a description of the model in dictionary format.

Returns:

A dictionary containing the model's name, description, and hyperparameters.

Return type:

dict

auto_prep.modeling.model_RandomForestRegressor module

class auto_prep.modeling.model_RandomForestRegressor.ModelRandomForestRegressor(n_estimators=100, max_depth=None, min_samples_split=2, min_samples_leaf=1, max_features=1, bootstrap=True, random_state=42, **kwargs)[source]

Bases: RandomForestRegressor, Regressor

Random Forest Regressor model.

PARAM_GRID

Parameter grid for hyperparameter tuning.

Type:

dict

__init__()[source]

Initializes the Random Forest Regressor model.

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'bootstrap': [True, False], 'max_depth': [None, 5, 10, 15, 20], 'max_features': ['sqrt', 'log2', None], 'min_samples_leaf': [1, 2, 4], 'min_samples_split': [2, 5, 10], 'n_estimators': [100, 200, 300], 'random_state': [42]}
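
This grid is large enough that exhaustive search becomes expensive, which is one reason a randomized search such as the RandomizedSearchCV used by ModelHandler.tune_model is attractive. The sketch below counts the full grid and draws a handful of random candidates with the standard library. The helper sample_candidates is hypothetical, and it samples with replacement for simplicity, whereas scikit-learn samples without replacement when every parameter is given as a list.

```python
import random
from math import prod

# Grid copied from the PARAM_GRID attribute documented above.
PARAM_GRID = {
    "bootstrap": [True, False],
    "max_depth": [None, 5, 10, 15, 20],
    "max_features": ["sqrt", "log2", None],
    "min_samples_leaf": [1, 2, 4],
    "min_samples_split": [2, 5, 10],
    "n_estimators": [100, 200, 300],
    "random_state": [42],
}

# Size of the exhaustive grid: the product of the option counts.
total = prod(len(values) for values in PARAM_GRID.values())


def sample_candidates(grid, n_iter, seed=0):
    """Draw n_iter random parameter settings (hypothetical helper, with replacement)."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in grid.items()} for _ in range(n_iter)]


sampled = sample_candidates(PARAM_GRID, n_iter=10)
print(total)  # 810 = 2 * 5 * 3 * 3 * 3 * 3 * 1
print(len(sampled))  # 10
```
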
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelRandomForestRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelRandomForestRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_SVC module

class auto_prep.modeling.model_SVC.ModelSVC(C=1.0, kernel='rbf', degree=3, gamma='scale', random_state=42, probability=True, **kwargs)[source]

Bases: SVC, Classifier

Support Vector Classifier model.

PARAM_GRID

Parameter grid for hyperparameter tuning.

Type:

dict

__init__()[source]

Initializes the Support Vector Classifier model.

to_tex() dict[source]

Returns a short description in the form of a dictionary.

PARAM_GRID = {'C': [0.1, 1, 10, 100, 1000], 'degree': [3, 4, 5], 'gamma': ['scale', 'auto'], 'kernel': ['linear', 'poly', 'rbf', 'sigmoid'], 'random_state': [42]}
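
Not every combination in this grid yields a distinct model. In scikit-learn's SVC, degree is used only by the poly kernel and gamma is not used by the linear kernel, so some candidates differ only in parameters that are ignored. The following standard-library sketch counts the distinct configurations; effective_key is a hypothetical helper written for this illustration.

```python
from itertools import product

# Grid copied from the PARAM_GRID attribute documented above.
PARAM_GRID = {
    "C": [0.1, 1, 10, 100, 1000],
    "degree": [3, 4, 5],
    "gamma": ["scale", "auto"],
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
    "random_state": [42],
}


def effective_key(params):
    """Drop parameters that the chosen kernel ignores (hypothetical helper)."""
    key = dict(params)
    if key["kernel"] != "poly":
        key.pop("degree")  # degree only affects the polynomial kernel
    if key["kernel"] == "linear":
        key.pop("gamma")  # gamma is unused by the linear kernel
    return tuple(sorted(key.items()))


keys = list(PARAM_GRID)
all_candidates = [dict(zip(keys, values)) for values in product(*PARAM_GRID.values())]
distinct = {effective_key(c) for c in all_candidates}

print(len(all_candidates))  # 120 = 5 * 3 * 2 * 4 * 1
print(len(distinct))  # 55 = 30 (poly) + 10 (rbf) + 10 (sigmoid) + 5 (linear)
```

Splitting the grid into one dictionary per kernel, in the style of the logistic regression grid, would be one way to avoid the redundant candidates.
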
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelSVC

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ModelSVC

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

to_tex() dict[source]

Returns a short description in the form of a dictionary.

Returns:

A dictionary containing the name and description of the model.

Return type:

dict

auto_prep.modeling.model_XGBoost module

Module contents