auto_prep.visualization package
Submodules
auto_prep.visualization.categorical module
- class auto_prep.visualization.categorical.CategoricalVisualizer[source]
Bases: object
Contains methods that generate EDA charts for categorical data. The class is fed only the categorical columns of the original dataset. All methods are called in the order defined in the order attribute. Each called method should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.
- static categorical_distribution_chart(X: DataFrame, y: Series) List[Tuple[str, str]] [source]
Generates a plot to visualize the distribution of categorical features.
- order = ['categorical_distribution_chart']
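The contract described above (methods listed in order, each returning a (path, title) pair or an empty pair) can be illustrated with a minimal sketch. DemoVisualizer and run_visualizer are hypothetical names invented for this example and are not part of auto_prep; how the package actually consumes the results is an assumption. Because the documented signatures show both a single tuple and a list of tuples as return types, the sketch normalizes both shapes.

```python
from typing import Any, List, Tuple, Union

ChartResult = Union[Tuple[str, str], List[Tuple[str, str]]]


class DemoVisualizer:
    """Hypothetical visualizer that follows the documented contract."""

    # Methods are called in exactly this order.
    order = ["first_chart", "skipped_chart", "paged_chart"]

    @staticmethod
    def first_chart(X: Any, y: Any) -> Tuple[str, str]:
        # A real method would draw a figure and save it via save_chart.
        return ("charts/first.png", "First chart")

    @staticmethod
    def skipped_chart(X: Any, y: Any) -> Tuple[str, str]:
        # Nothing to plot: return the documented empty pair.
        return ("", "")

    @staticmethod
    def paged_chart(X: Any, y: Any) -> List[Tuple[str, str]]:
        # Some documented methods return one pair per generated page.
        return [("charts/page_1.png", "Page 1"), ("charts/page_2.png", "Page 2")]


def run_visualizer(visualizer: type, X: Any, y: Any) -> List[Tuple[str, str]]:
    """Call every method named in `order` and collect the non-empty results."""
    charts: List[Tuple[str, str]] = []
    for name in visualizer.order:
        result: ChartResult = getattr(visualizer, name)(X, y)
        # Normalize a single pair to a one-element list.
        pairs = result if isinstance(result, list) else [result]
        # Drop the ("", "") marker that means "no chart generated".
        charts.extend(pair for pair in pairs if pair != ("", ""))
    return charts
```

Calling run_visualizer(DemoVisualizer, None, None) yields the three non-empty pairs in the order given by order; the skipped method contributes nothing.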
auto_prep.visualization.eda module
- class auto_prep.visualization.eda.EdaVisualizer[source]
Bases: object
Contains methods that generate basic EDA charts. The class is fed the entire original dataset. All methods are called in the order defined in the order attribute. Each called method should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.
- static missing_values_chart(X: DataFrame, y: Series) Tuple[str, str] [source]
Generates a plot to visualize the percentage of missing values for each feature in the given DataFrame.
- order = ['target_distribution_chart', 'missing_values_chart']
- static target_distribution_chart(X: DataFrame, y: Series, task: str = 'classification') Tuple[str, str] [source]
Generates a plot to visualize the distribution of the target variable.
- Parameters:
X (pd.DataFrame) – Input features (not used directly, included for API consistency).
y (pd.Series) – Target variable to visualize.
task (str) – Type of task, either “classification” or “regression”.
- Returns:
Path to the saved chart and a description of the chart.
- Return type:
Tuple[str, str]
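The task parameter suggests that the target is summarized differently for the two task types. The sketch below is an assumption about what such a branch could compute, not the package's implementation: class counts for "classification" and fixed-width histogram bin counts for "regression". The helper name summarize_target and the bins parameter are hypothetical, and the sketch stops short of drawing or saving a figure.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Union


def summarize_target(
    y: Sequence, task: str = "classification", bins: int = 4
) -> Union[Dict[Hashable, int], List[int]]:
    """Return the data a target-distribution plot could be drawn from.

    Hypothetical helper: class counts for classification,
    equal-width histogram counts for regression.
    """
    if task == "classification":
        # One bar per class label.
        return dict(Counter(y))
    if task == "regression":
        lo, hi = min(y), max(y)
        # Fall back to a width of 1.0 when all values are identical.
        width = (hi - lo) / bins or 1.0
        counts = [0] * bins
        for value in y:
            # The maximum value belongs to the last bin.
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1
        return counts
    raise ValueError(f"Unknown task: {task!r}")
```

Rejecting an unknown task explicitly is a design choice of this sketch; the documentation does not say how the real method handles other values.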
auto_prep.visualization.numerical module
- class auto_prep.visualization.numerical.NumericalVisualizer[source]
Bases: object
Contains methods that generate EDA charts for numerical data. The class is fed only the numerical columns of the original dataset. All methods are called in the order defined in the order attribute. Each called method should return a tuple of (path_to_chart, chart title for LaTeX); if no chart needs to be generated, it should return ("", ""). Charts should be saved via save_chart.
- static correlation_heatmap_chart(X: DataFrame, y: Series) Tuple[str, str] [source]
Generates a plot to visualize the correlation between features.
- static numerical_distribution_chart(X: DataFrame, y: Series) List[Tuple[str, str]] [source]
Generates a plot to visualize the distribution of numerical features.
- static numerical_features_boxplot_chart(X: DataFrame, y: Series) List[Tuple[str, str]] [source]
Generates boxplots for numerical features, split into multiple pages if necessary.
- order = ['numerical_distribution_chart', 'correlation_heatmap_chart', 'numerical_features_boxplot_chart']
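Splitting boxplots "into multiple pages if necessary" amounts to chunking the feature columns and emitting one (path, title) pair per chunk, which matches the List[Tuple[str, str]] return type. The following is a minimal sketch under that assumption; the helper names, the per_page default, and the file naming scheme are hypothetical, and no figure is actually drawn.

```python
from typing import List, Sequence, Tuple


def paginate_columns(columns: Sequence[str], per_page: int = 6) -> List[List[str]]:
    """Split column names into consecutive pages of at most `per_page` items."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [
        list(columns[start:start + per_page])
        for start in range(0, len(columns), per_page)
    ]


def boxplot_page_results(
    columns: Sequence[str], per_page: int = 6
) -> List[Tuple[str, str]]:
    """Build one (path, title) pair per page, as a real method might return."""
    pages = paginate_columns(columns, per_page)
    return [
        # Hypothetical path and title format, chosen for illustration only.
        (f"charts/boxplot_page_{number}.png",
         f"Boxplots (page {number} of {len(pages)})")
        for number, _ in enumerate(pages, start=1)
    ]
```

With seven columns and per_page=3, the columns fall into pages of three, three, and one, so three pairs are returned.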