auto_prep.visualization package
Submodules
auto_prep.visualization.categorical module
- class auto_prep.visualization.categorical.CategoricalVisualizer[source]
Bases: object
Contains methods that generate EDA charts for categorical data. The class is fed only the categorical columns of the original dataset. All methods are called in the order given by the order attribute. Each method that is called should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.
- static categorical_distribution_chart(X: DataFrame, y: Series) → List[Tuple[str, str]][source]
Generates a plot to visualize the distribution of categorical features.
- order = ['categorical_distribution_chart']
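The calling convention described for this class (methods run in the sequence listed in order, each returning a path and title, with a pair of empty strings meaning "no chart") can be sketched as follows. This is a minimal illustration, not the package's actual runner: DemoVisualizer, run_visualizer, and the file paths are hypothetical, and accepting either a single tuple or a list of tuples is an assumption based on the List[Tuple[str, str]] return annotation.

```python
from typing import Any, List, Tuple, Union

ChartResult = Union[Tuple[str, str], List[Tuple[str, str]]]


class DemoVisualizer:
    """Hypothetical visualizer that follows the documented contract."""

    # Methods are called in exactly this sequence.
    order = ["first_chart", "skipped_chart", "multi_page_chart"]

    @staticmethod
    def first_chart(X: Any, y: Any) -> Tuple[str, str]:
        # A real method would draw and save a figure; this stand-in only
        # returns a made-up path and title.
        return ("charts/first.png", "First chart")

    @staticmethod
    def skipped_chart(X: Any, y: Any) -> Tuple[str, str]:
        # Nothing to plot: signal it with a pair of empty strings.
        return ("", "")

    @staticmethod
    def multi_page_chart(X: Any, y: Any) -> List[Tuple[str, str]]:
        return [("charts/p1.png", "Page 1"), ("charts/p2.png", "Page 2")]


def run_visualizer(visualizer: type, X: Any, y: Any) -> List[Tuple[str, str]]:
    """Call each method named in `order` and collect the non-empty results."""
    charts: List[Tuple[str, str]] = []
    for name in visualizer.order:
        result: ChartResult = getattr(visualizer, name)(X, y)
        # Normalise a single tuple to a one-element list.
        items = result if isinstance(result, list) else [result]
        # Drop the ("", "") placeholder that means "no chart".
        charts.extend(item for item in items if item != ("", ""))
    return charts


charts = run_visualizer(DemoVisualizer, X=None, y=None)
```

Driving the calls from a list of method names keeps the chart sequence explicit and independent of the order in which methods are defined in the class body.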
auto_prep.visualization.eda module
- class auto_prep.visualization.eda.EdaVisualizer[source]
Bases: object
Contains methods that generate basic EDA charts. The class is fed the entire original dataset. All methods are called in the order given by the order attribute. Each method that is called should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.
- static missing_values_chart(X: DataFrame, y: Series) → Tuple[str, str][source]
Generates a plot to visualize the percentage of missing values for each feature in the given DataFrame.
- order = ['target_distribution_chart', 'missing_values_chart']
- static target_distribution_chart(X: DataFrame, y: Series, task: str = 'classification') → Tuple[str, str][source]
Generates a plot to visualize the distribution of the target variable.
- Parameters:
X (pd.DataFrame) – Input features (not used directly, included for API consistency).
y (pd.Series) – Target variable to visualize.
task (str) – Type of task, either "classification" or "regression".
- Returns:
Path to the saved chart and a description of the chart.
- Return type:
Tuple[str, str]
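The quantity that missing_values_chart is described as plotting, the percentage of missing values per feature, can be illustrated with a small standard-library sketch. The function name missing_percentages and the sample data are hypothetical, and the package itself works on a pandas DataFrame rather than a dict of lists, so this shows only the arithmetic, not the library's implementation.

```python
import math
from typing import Dict, List


def missing_percentages(columns: Dict[str, List[object]]) -> Dict[str, float]:
    """Return the percentage of missing entries in each column."""
    percentages: Dict[str, float] = {}
    for name, values in columns.items():
        if not values:
            # An empty column has nothing missing by convention here.
            percentages[name] = 0.0
            continue
        # Treat both None and float NaN as missing.
        missing = sum(
            1 for v in values
            if v is None or (isinstance(v, float) and math.isnan(v))
        )
        percentages[name] = 100.0 * missing / len(values)
    return percentages


data = {
    "age": [31.0, None, 45.0, float("nan")],
    "city": ["Oslo", "Rome", "Lima", "Kyiv"],
}
percentages = missing_percentages(data)
```

With a pandas DataFrame X, the same per-column figures are given by X.isna().mean() * 100.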
auto_prep.visualization.numerical module
- class auto_prep.visualization.numerical.NumericalVisualizer[source]
Bases: object
Contains methods that generate EDA charts for numerical data. The class is fed only the numerical columns of the original dataset. All methods are called in the order given by the order attribute. Each method that is called should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.
- static correlation_heatmap_chart(X: DataFrame, y: Series) → Tuple[str, str][source]
Generates a plot to visualize the correlation between features.
- static numerical_distribution_chart(X: DataFrame, y: Series) → List[Tuple[str, str]][source]
Generates a plot to visualize the distribution of numerical features.
- static numerical_features_boxplot_chart(X: DataFrame, y: Series) → List[Tuple[str, str]][source]
Generates boxplots for numerical features, split into multiple pages if necessary.
- order = ['numerical_distribution_chart', 'correlation_heatmap_chart', 'numerical_features_boxplot_chart']
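Splitting boxplots into multiple pages, as described for numerical_features_boxplot_chart, explains why that method returns a list of (path, title) tuples rather than a single one. A minimal sketch of such pagination follows. The helper paginate, the page size, the feature names, and the file names are all hypothetical, and the way the package actually chooses its page size is not documented here.

```python
from typing import List, Sequence


def paginate(columns: Sequence[str], per_page: int) -> List[List[str]]:
    """Split column names into consecutive pages of at most `per_page`."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [
        list(columns[start:start + per_page])
        for start in range(0, len(columns), per_page)
    ]


features = ["age", "income", "height", "weight", "score"]
pages = paginate(features, per_page=2)

# One (path, title) tuple per page, mirroring the List[Tuple[str, str]]
# return type. A real method would draw and save a figure for each page.
results = [
    (f"charts/boxplot_page_{number}.png", f"Boxplots, page {number} of {len(pages)}")
    for number, _ in enumerate(pages, start=1)
]
```

Here five features with two per page yield three pages, the last holding a single feature.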