auto_prep.visualization package

Submodules

auto_prep.visualization.categorical module

class auto_prep.visualization.categorical.CategoricalVisualizer

Bases: object

Contains methods that generate EDA charts for categorical data. The class receives only the categorical columns of the original dataset. Methods are called in the sequence listed in the order attribute. Each called method should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). A method that produces several charts returns a list of such tuples, as categorical_distribution_chart does. Charts should be saved via save_chart.

static categorical_distribution_chart(X: DataFrame, y: Series) -> List[Tuple[str, str]]

Generates a plot to visualize the distribution of categorical features.

order = ['categorical_distribution_chart']
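
The calling convention described for this class (methods listed in order, each returning a path and title pair or a list of such pairs) can be sketched as follows. This is a minimal illustration and not auto_prep's actual runner: DemoVisualizer, run_visualizer, and the chart paths are hypothetical names, and no figures are drawn or saved.

```python
from typing import List, Tuple


class DemoVisualizer:
    """Hypothetical visualizer that follows the documented contract."""

    # Methods are invoked in exactly this sequence.
    order = ["first_chart", "skipped_chart", "paged_chart"]

    @staticmethod
    def first_chart(X, y) -> Tuple[str, str]:
        # A real method would draw a figure and save it via save_chart.
        return ("charts/first.png", "First chart")

    @staticmethod
    def skipped_chart(X, y) -> Tuple[str, str]:
        # Nothing to plot: signal that with a pair of empty strings.
        return ("", "")

    @staticmethod
    def paged_chart(X, y) -> List[Tuple[str, str]]:
        # Methods that produce several charts return a list of pairs.
        return [("charts/page_1.png", "Page 1"), ("charts/page_2.png", "Page 2")]


def run_visualizer(visualizer, X, y) -> List[Tuple[str, str]]:
    """Call every method named in `order` and collect the non-empty results."""
    charts: List[Tuple[str, str]] = []
    for name in visualizer.order:
        result = getattr(visualizer, name)(X, y)
        # Normalize a single pair to a one-element list.
        items = result if isinstance(result, list) else [result]
        # Drop ("", "") placeholders, which mean "no chart generated".
        charts.extend((path, title) for path, title in items if path)
    return charts
```

Normalizing both return shapes in one place keeps each chart method free to return whichever form matches its signature.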

auto_prep.visualization.eda module

class auto_prep.visualization.eda.EdaVisualizer

Bases: object

Contains methods that generate basic EDA charts. The class receives the entire original dataset. Methods are called in the sequence listed in the order attribute. Each called method should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.

static missing_values_chart(X: DataFrame, y: Series) -> Tuple[str, str]

Generates a plot to visualize the percentage of missing values for each feature in the given DataFrame.
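
The quantity this chart displays, the percentage of missing values per feature, can be computed as in the sketch below. It is an illustration only and not the library's implementation: missing_percentages is a hypothetical helper, and it works on a plain dict of column lists so that it runs without third-party packages. On a pandas DataFrame the same numbers are given by X.isna().mean() * 100.

```python
import math
from typing import Any, Dict, List


def missing_percentages(columns: Dict[str, List[Any]]) -> Dict[str, float]:
    """Return the percentage of missing entries (None or NaN) per column."""
    result: Dict[str, float] = {}
    for name, values in columns.items():
        missing = sum(
            1
            for v in values
            if v is None or (isinstance(v, float) and math.isnan(v))
        )
        # An empty column is reported as 0% missing to avoid dividing by zero.
        result[name] = 100.0 * missing / len(values) if values else 0.0
    return result
```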

order = ['target_distribution_chart', 'missing_values_chart']

static target_distribution_chart(X: DataFrame, y: Series, task: str = 'classification') -> Tuple[str, str]

Generates a plot to visualize the distribution of the target variable.

Parameters:
  • X (pd.DataFrame) – Input features (not used directly, included for API consistency).

  • y (pd.Series) – Target variable to visualize.

  • task (str) – Type of task, either “classification” or “regression”.

Returns:

Path to the saved chart and a description of the chart.

Return type:

Tuple[str, str]
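
The documentation does not say how the chart differs between the two task values. One plausible reading, stated here purely as an assumption, is that a classification target is summarized by class counts and a regression target by a histogram. The sketch below illustrates that branching with a hypothetical summarize_target helper; the bins parameter and the bin labels are also invented for the example.

```python
from collections import Counter
from typing import Dict, Sequence


def summarize_target(y: Sequence, task: str = "classification", bins: int = 5) -> Dict[str, int]:
    """Summarize a target column according to the task type (assumed behavior)."""
    if task == "classification":
        # One bar per class label.
        return {str(label): count for label, count in Counter(y).items()}
    if task == "regression":
        # Equal-width histogram over the observed range.
        lo, hi = min(y), max(y)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant target
        counts = [0] * bins
        for value in y:
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1
        return {f"bin_{i}": count for i, count in enumerate(counts)}
    raise ValueError("task must be 'classification' or 'regression'")
```

Rejecting any other task value early surfaces a typo such as "regresion" as an error instead of a misleading chart.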

auto_prep.visualization.numerical module

class auto_prep.visualization.numerical.NumericalVisualizer

Bases: object

Contains methods that generate EDA charts for numerical data. The class receives only the numerical columns of the original dataset. Methods are called in the sequence listed in the order attribute. Each called method should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). A method that produces several charts returns a list of such tuples, as numerical_distribution_chart and numerical_features_boxplot_chart do. Charts should be saved via save_chart.

static correlation_heatmap_chart(X: DataFrame, y: Series) -> Tuple[str, str]

Generates a plot to visualize the correlation between features.

static numerical_distribution_chart(X: DataFrame, y: Series) -> List[Tuple[str, str]]

Generates a plot to visualize the distribution of numerical features.

static numerical_features_boxplot_chart(X: DataFrame, y: Series) -> List[Tuple[str, str]]

Generates boxplots for numerical features, split into multiple pages if necessary.
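
Splitting features across pages, with one (path, title) pair per page, might look like the sketch below. The page size, file names, and titles are assumptions made for illustration; the documentation does not state how many boxplots the library places on a page or how it names the files.

```python
from typing import List, Sequence, Tuple


def paginate(columns: Sequence[str], per_page: int) -> List[List[str]]:
    """Split column names into consecutive pages of at most `per_page` items."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [list(columns[i:i + per_page]) for i in range(0, len(columns), per_page)]


def boxplot_pages(columns: Sequence[str], per_page: int) -> List[Tuple[str, str]]:
    """Build one (path, title) pair per page; the names are hypothetical."""
    pages = paginate(columns, per_page)
    total = len(pages)
    return [
        (f"boxplot_page_{number}.png", f"Boxplots, page {number} of {total}")
        for number, _ in enumerate(pages, start=1)
    ]
```

With no numerical columns the result is an empty list, so the caller has nothing to include in the report.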

order = ['numerical_distribution_chart', 'correlation_heatmap_chart', 'numerical_features_boxplot_chart']

Module contents