mllm_shap.shap.monte_carlo package

Submodules

mllm_shap.shap.monte_carlo.limited module

Limited Monte Carlo approximation SHAP explainer implementation.

class mllm_shap.shap.monte_carlo.limited.LimitedMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Limited Monte Carlo SHAP implementation.

allow_mask_duplicates: bool

Whether to allow duplicate masks during generation.

embedding_model: BaseExternalEmbedding | None

The external embedding model to use. If provided, it overrides mode.

embedding_reducer: BaseEmbeddingReducer

The embedding reduction strategy to use.

fraction: float | None

Fraction of the total number of possible masks to generate when num_samples is None.

include_minimal_masks: bool = True

Whether to include minimal masks (single-feature masks and the empty mask) in the sampling.

mode: Mode

The SHAP mode, either STATIC or CONTEXTUAL. Used only if no embedding_model is provided.

normalizer: BaseNormalizer

The SHAP value normalizer to use.

num_samples: int | None

Number of random masks to generate. If None, fraction is used instead. A value of -1 selects the minimal number of samples (only the single-feature masks and the empty mask).

similarity_measure: BaseEmbeddingSimilarity

The embedding similarity measure to use.

total_n_calls: int = 0

Total number of MLLM calls made for the last explanation.
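
The interplay of num_samples, fraction, and allow_mask_duplicates described above can be illustrated with a small standalone sketch. This is not the library's code: the helper names resolve_num_masks and sample_masks are hypothetical, and the sketch assumes that the "total possible masks" for n features is 2**n, that the fractional budget is rounded up, and that each feature is kept with probability 0.5.

```python
import math
import random


def resolve_num_masks(n_features, num_samples=None, fraction=0.6):
    """Hypothetical helper: turn (num_samples, fraction) into a mask count."""
    if num_samples == -1:
        # Minimal budget: one mask per single feature plus the empty mask.
        return n_features + 1
    if num_samples is None:
        if fraction is None:
            raise ValueError("either num_samples or fraction must be set")
        # Assumption: the mask space has 2**n_features elements; round up.
        return max(1, math.ceil(fraction * 2 ** n_features))
    return num_samples


def sample_masks(n_features, count, allow_duplicates=False, seed=0):
    """Hypothetical helper: draw `count` random boolean masks (True = keep)."""
    if not allow_duplicates and count > 2 ** n_features:
        raise ValueError("not enough distinct masks for the requested count")
    rng = random.Random(seed)
    masks, seen = [], set()
    while len(masks) < count:
        mask = tuple(rng.random() < 0.5 for _ in range(n_features))
        if not allow_duplicates:
            if mask in seen:
                continue  # reject a mask that was already drawn
            seen.add(mask)
        masks.append(mask)
    return masks
```

For example, resolve_num_masks(4, num_samples=-1) yields 5 under these assumptions (four single-feature masks plus the empty mask), while resolve_num_masks(4, fraction=0.5) yields 8.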

mllm_shap.shap.monte_carlo.standard module

Standard Monte Carlo approximation SHAP explainer implementation.

class mllm_shap.shap.monte_carlo.standard.StandardMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Standard Monte Carlo SHAP explainer.

allow_mask_duplicates: bool

Whether to allow duplicate masks during generation.

embedding_model: BaseExternalEmbedding | None

The external embedding model to use. If provided, it overrides mode.

embedding_reducer: BaseEmbeddingReducer

The embedding reduction strategy to use.

fraction: float | None

Fraction of the total number of possible masks to generate when num_samples is None.

include_minimal_masks: bool = False

Whether to include minimal masks (single-feature masks and the empty mask) in the sampling.

mode: Mode

The SHAP mode, either STATIC or CONTEXTUAL. Used only if no embedding_model is provided.

normalizer: BaseNormalizer

The SHAP value normalizer to use.

num_samples: int | None

Number of random masks to generate. If None, fraction is used instead. A value of -1 selects the minimal number of samples (only the single-feature masks and the empty mask).

similarity_measure: BaseEmbeddingSimilarity

The embedding similarity measure to use.

total_n_calls: int = 0

Total number of MLLM calls made for the last explanation.

mllm_shap.shap.monte_carlo.utils module

Utility functions for Monte Carlo sampling in SHAP explanations.

mllm_shap.shap.monte_carlo.utils.approximate_budget(error_bound: float, confidence: float) → int

Calculate the approximate number of samples needed to achieve a desired error bound with a specified confidence level using Hoeffding’s inequality.

Parameters:
  • error_bound (float) – The maximum allowable error.

  • confidence (float) – The desired confidence level (between 0 and 1).

Returns:

The calculated number of samples needed.

Return type:

int
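
For reference, the two-sided Hoeffding bound for the mean of n independent samples confined to an interval of width R gives n >= R^2 * ln(2 / delta) / (2 * eps^2), where eps is the error bound and delta = 1 - confidence. The sketch below computes that bound. It is an illustration of the inequality, not the body of approximate_budget: the name hoeffding_budget and the value_range parameter are hypothetical, and the library may use different constants or a one-sided form.

```python
import math


def hoeffding_budget(error_bound, confidence, value_range=1.0):
    """Sample count from the two-sided Hoeffding bound (illustrative only)."""
    if error_bound <= 0:
        raise ValueError("error_bound must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    delta = 1.0 - confidence  # allowed failure probability
    # n >= R^2 * ln(2 / delta) / (2 * eps^2), rounded up to an integer.
    n = (value_range ** 2) * math.log(2.0 / delta) / (2.0 * error_bound ** 2)
    return math.ceil(n)
```

Under these assumptions, an error bound of 0.1 at 95% confidence needs 185 samples, and halving the error bound to 0.05 roughly quadruples the budget to 738.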

Module contents

Monte Carlo SHAP explainers.

All explainers in this package approximate SHAP values by Monte Carlo sampling over feature masks. Unlike plain Monte Carlo sampling, they can additionally include first-order-omission masks, that is, masks that omit exactly one feature from the set. Whether these masks are included is configurable.

  • LimitedMcShapExplainer implements a limited Monte Carlo sampling that always includes first-order-omission masks.

  • StandardMcShapExplainer does not include first-order-omission masks.

  • approximate_budget() is a utility function that estimates the number of samples required to achieve a desired error bound at a specified confidence level, using Hoeffding’s inequality.
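
To make the notion of a first-order-omission mask concrete, here is a minimal sketch that enumerates them and merges them into a set of already sampled masks. The function names and the boolean-tuple representation (True means the feature is kept) are assumptions for illustration and are not taken from the library.

```python
def first_order_omission_masks(n_features):
    """Return the n masks that each omit exactly one feature (True = keep)."""
    return [
        tuple(j != i for j in range(n_features))
        for i in range(n_features)
    ]


def with_first_order_omission(sampled_masks, n_features):
    """Prepend first-order-omission masks to sampled ones, skipping repeats."""
    merged = []
    seen = set()
    for mask in first_order_omission_masks(n_features) + list(sampled_masks):
        if mask not in seen:
            seen.add(mask)
            merged.append(mask)
    return merged
```

With three features, first_order_omission_masks(3) gives (False, True, True), (True, False, True), and (True, True, False). A sampler in the spirit of LimitedMcShapExplainer would always contain these, whereas one in the spirit of StandardMcShapExplainer would contain them only if they happened to be drawn.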

class mllm_shap.shap.monte_carlo.LimitedMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Limited Monte Carlo SHAP implementation.

class mllm_shap.shap.monte_carlo.StandardMcShapExplainer(*args: Any, num_samples: int | None = None, fraction: float = 0.6, **kwargs: Any)

Bases: BaseMcShapExplainer

Standard Monte Carlo SHAP Explainer.

mllm_shap.shap.monte_carlo.approximate_budget(error_bound: float, confidence: float) → int

Calculate the approximate number of samples needed to achieve a desired error bound with a specified confidence level using Hoeffding’s inequality.

Parameters:
  • error_bound (float) – The maximum allowable error.

  • confidence (float) – The desired confidence level (between 0 and 1).

Returns:

The calculated number of samples needed.

Return type:

int