mllm_shap.shap.monte_carlo package

Submodules

mllm_shap.shap.monte_carlo.limited module

Limited Monte Carlo approximation SHAP explainer implementation.

class mllm_shap.shap.monte_carlo.limited.LimitedMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Limited Monte Carlo SHAP implementation.

allow_mask_duplicates: Whether to allow duplicate masks during generation.

embedding_model: The external embedding model to use. If provided, overrides mode.

embedding_reducer: The embedding reduction strategy to use.

fraction: Fraction of the total number of possible masks to generate when num_samples is None.

include_minimal_masks = True: Whether to include minimal masks (single-feature masks and the empty mask) in the sampling.

last_observability_sink = None: Sink used for the most recent call (if observability was enabled).

mode: The SHAP mode, either STATIC or CONTEXTUAL. Used only if no embedding_model is provided.

normalizer: The SHAP value normalizer to use.

num_samples: Number of random masks to generate. If None, fraction is used instead. A value of -1 selects the minimal number of samples (only the single-feature masks and the empty mask).

similarity_measure: The embedding similarity measure to use.

total_n_calls = 0: Total number of MLLM calls made for the last explanation.
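
The interaction between num_samples and fraction described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's code: resolve_num_masks is a hypothetical helper, the total number of possible masks is assumed to be 2 ** n_features, and the rounding rule (floor) is a guess.

```python
import math
from typing import Optional


def resolve_num_masks(
    n_features: int,
    num_samples: Optional[int] = None,
    fraction: float = 0.6,
) -> int:
    """Hypothetical helper: how many masks a sampler would draw.

    Assumes 2 ** n_features possible masks and floor rounding; the real
    explainer may resolve these settings differently.
    """
    if n_features < 1:
        raise ValueError("n_features must be positive")
    total = 2 ** n_features
    if num_samples is None:
        # No explicit count: fall back to a fraction of all possible masks.
        if not 0.0 < fraction <= 1.0:
            raise ValueError("fraction must be in (0, 1]")
        return max(1, math.floor(fraction * total))
    if num_samples == -1:
        # Minimal budget: one mask per single feature plus the empty mask.
        return n_features + 1
    if num_samples < 1:
        raise ValueError("num_samples must be positive, -1, or None")
    return num_samples
```

With four features, fraction=0.5 resolves to 8 of the 16 possible masks, while num_samples=-1 resolves to 5 (four single-feature masks plus the empty mask).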

mllm_shap.shap.monte_carlo.standard module

Standard Monte Carlo approximation SHAP explainer implementation.

class mllm_shap.shap.monte_carlo.standard.StandardMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Standard Monte Carlo SHAP Explainer.

allow_mask_duplicates: Whether to allow duplicate masks during generation.

embedding_model: The external embedding model to use. If provided, overrides mode.

embedding_reducer: The embedding reduction strategy to use.

fraction: Fraction of the total number of possible masks to generate when num_samples is None.

include_minimal_masks: bool = False: Whether to include minimal masks (single-feature masks and the empty mask) in the sampling.

last_observability_sink = None: Sink used for the most recent call (if observability was enabled).

mode: The SHAP mode, either STATIC or CONTEXTUAL. Used only if no embedding_model is provided.

normalizer: The SHAP value normalizer to use.

num_samples: Number of random masks to generate. If None, fraction is used instead. A value of -1 selects the minimal number of samples (only the single-feature masks and the empty mask).

similarity_measure: The embedding similarity measure to use.

total_n_calls = 0: Total number of MLLM calls made for the last explanation.
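
To illustrate what allow_mask_duplicates controls, here is a minimal sketch of uniform random mask sampling with and without duplicates. The function sample_masks, its seed parameter, and the convention that True means "feature kept" are all assumptions made for this example; the explainer's own sampler may work differently.

```python
import random
from typing import List, Optional, Tuple

Mask = Tuple[bool, ...]


def sample_masks(
    n_features: int,
    num_masks: int,
    allow_duplicates: bool = False,
    seed: Optional[int] = None,
) -> List[Mask]:
    """Hypothetical sampler: draw feature masks uniformly at random."""
    rng = random.Random(seed)
    total = 2 ** n_features
    if allow_duplicates:
        # Independent draws: the same mask may appear more than once.
        ids = [rng.randrange(total) for _ in range(num_masks)]
    else:
        if num_masks > total:
            raise ValueError("cannot draw more unique masks than exist")
        # Sampling without replacement guarantees distinct masks.
        ids = rng.sample(range(total), num_masks)
    # Decode each integer id into a boolean mask, one bit per feature.
    return [
        tuple(bool((i >> bit) & 1) for bit in range(n_features))
        for i in ids
    ]
```

Without duplicates, every drawn mask triggers a distinct model call; with duplicates allowed, repeated masks could in principle reuse a cached result, although whether the library caches is not stated here.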

mllm_shap.shap.monte_carlo.utils module

Utility functions for Monte Carlo sampling in SHAP explanations.

mllm_shap.shap.monte_carlo.utils.approximate_budget(error_bound: float, confidence: float) → int

Calculate the approximate number of samples needed to achieve a desired error bound with a specified confidence level using Hoeffding’s inequality.

Parameters:

error_bound (float) – The maximum allowable error.
confidence (float) – The desired confidence level (between 0 and 1).

Returns:

The calculated number of samples needed.

Return type:

int
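
For reference, a sample-size bound of this kind can be derived from the two-sided Hoeffding inequality. The sketch below is not the library's implementation: hoeffding_budget and its value_range parameter are hypothetical, and it assumes the sampled quantities are independent and bounded in an interval of width value_range, so that n >= value_range^2 * ln(2 / (1 - confidence)) / (2 * error_bound^2). The actual approximate_budget() may use different constants or a one-sided bound.

```python
import math


def hoeffding_budget(
    error_bound: float,
    confidence: float,
    value_range: float = 1.0,
) -> int:
    """Hypothetical sketch of a Hoeffding-based sample budget.

    Returns the smallest n such that the mean of n independent samples,
    each bounded in an interval of width value_range, deviates from its
    expectation by more than error_bound with probability at most
    1 - confidence.
    """
    if error_bound <= 0:
        raise ValueError("error_bound must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    delta = 1.0 - confidence  # allowed failure probability
    n = (value_range ** 2) * math.log(2.0 / delta) / (2.0 * error_bound ** 2)
    return math.ceil(n)


print(hoeffding_budget(0.1, 0.95))  # 185
```

Because the bound scales with 1 / error_bound^2, halving the error bound roughly quadruples the required number of samples, whereas tightening the confidence level only adds a logarithmic factor.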

Module contents

Monte Carlo SHAP explainers.

All Monte Carlo SHAP explainers approximate SHAP values by Monte Carlo sampling over feature masks. Unlike standard Monte Carlo methods, they can include first-order-omission masks, that is, masks that omit exactly one feature from the set. Whether these masks are included is configurable.

LimitedMcShapExplainer implements a limited Monte Carlo sampling that always includes first-order-omission masks.
StandardMcShapExplainer does not include first-order-omission masks.
approximate_budget() is a utility function that uses Hoeffding’s inequality to estimate the number of samples required to achieve a desired error bound at a specified confidence level.
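
As a small illustration of the definition above, first-order-omission masks for n features can be enumerated as below. The helper name and the convention that True marks a kept feature are assumptions made for this sketch, not part of the package's API.

```python
from typing import List, Tuple


def first_order_omission_masks(n_features: int) -> List[Tuple[bool, ...]]:
    """Hypothetical helper: all masks that omit exactly one feature.

    True marks a kept feature and False the omitted one, so mask i
    drops feature i and keeps every other feature.
    """
    return [
        tuple(j != i for j in range(n_features))
        for i in range(n_features)
    ]


for mask in first_order_omission_masks(3):
    print(mask)
# (False, True, True)
# (True, False, True)
# (True, True, False)
```

There are exactly n such masks, so forcing them into the sample adds a cost that is linear in the number of features.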

class mllm_shap.shap.monte_carlo.LimitedMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Limited Monte Carlo SHAP implementation.

allow_mask_duplicates: Whether to allow duplicate masks during generation.

embedding_model: The external embedding model to use. If provided, overrides mode.

embedding_reducer: The embedding reduction strategy to use.

fraction: Fraction of the total number of possible masks to generate when num_samples is None.

include_minimal_masks = True: Whether to include minimal masks (single-feature masks and the empty mask) in the sampling.

last_observability_sink = None: Sink used for the most recent call (if observability was enabled).

mode: The SHAP mode, either STATIC or CONTEXTUAL. Used only if no embedding_model is provided.

normalizer: The SHAP value normalizer to use.

num_samples: Number of random masks to generate. If None, fraction is used instead. A value of -1 selects the minimal number of samples (only the single-feature masks and the empty mask).

similarity_measure: The embedding similarity measure to use.

total_n_calls = 0: Total number of MLLM calls made for the last explanation.

class mllm_shap.shap.monte_carlo.StandardMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Standard Monte Carlo SHAP Explainer.

allow_mask_duplicates: Whether to allow duplicate masks during generation.

embedding_model: The external embedding model to use. If provided, overrides mode.

embedding_reducer: The embedding reduction strategy to use.

fraction: Fraction of the total number of possible masks to generate when num_samples is None.

include_minimal_masks: bool = False: Whether to include minimal masks (single-feature masks and the empty mask) in the sampling.

last_observability_sink = None: Sink used for the most recent call (if observability was enabled).

mode: The SHAP mode, either STATIC or CONTEXTUAL. Used only if no embedding_model is provided.

normalizer: The SHAP value normalizer to use.

num_samples: Number of random masks to generate. If None, fraction is used instead. A value of -1 selects the minimal number of samples (only the single-feature masks and the empty mask).

similarity_measure: The embedding similarity measure to use.

total_n_calls = 0: Total number of MLLM calls made for the last explanation.

mllm_shap.shap.monte_carlo.approximate_budget(error_bound: float, confidence: float) → int

Calculate the approximate number of samples needed to achieve a desired error bound with a specified confidence level using Hoeffding’s inequality.

Parameters:

error_bound (float) – The maximum allowable error.
confidence (float) – The desired confidence level (between 0 and 1).

Returns:

The calculated number of samples needed.

Return type:

int