Model Randomization Metric

The Model Randomization metric tests whether explanations degrade when model parameters are randomized. It implements a sanity check to verify that explainers are sensitive to the model's learned parameters.

Quote

We propose sanity checks for saliency methods. [...] We find that some widely deployed saliency methods are independent of both the data the model was trained on, and the model parameters.

-- Sanity Checks for Saliency Maps (2018)

For each sample \((x, y)\):

  1. Compute explanation under the original model
  2. Randomize model parameters according to a strategy
  3. Compute explanation under the randomized model
  4. Measure Spearman rank correlation between both explanations

Info

A low Spearman correlation indicates that explanations are sensitive to the model parameters (desirable for a faithful explainer).

Score Interpretation

  • Lower scores are better: A low correlation means the explanations change significantly when model parameters are randomized, indicating the explainer properly depends on the learned weights.
  • Values range from -1 to 1, where 1 means perfectly correlated explanations.
  • High correlation values suggest the explainer may not be faithfully using model information.
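Concretely, the correlation in step 4 compares the rank ordering of attribution values. A minimal numpy sketch (the random arrays below merely stand in for attribution maps; Xplique computes this internally):

```python
import numpy as np

def rank_correlation(a, b):
    # Spearman correlation via ranks (no tie handling, which is
    # fine for continuous attribution values).
    ranks_a = a.ravel().argsort().argsort()
    ranks_b = b.ravel().argsort().argsort()
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

rng = np.random.default_rng(0)
expl_original = rng.normal(size=(8, 8))    # explanation, original model

# Same ranking -> correlation of 1: the explainer ignored the randomization.
print(rank_correlation(expl_original, expl_original))

# Unrelated explanation -> correlation near 0: the explainer is
# sensitive to the model parameters (the desirable outcome).
expl_randomized = rng.normal(size=(8, 8))  # explanation, randomized model
print(rank_correlation(expl_original, expl_randomized))
```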

Randomization Strategies

The metric supports different randomization strategies via ModelRandomizationStrategy:

ProgressiveLayerRandomization

Randomizes model weights layer-by-layer, starting from the output layers (by default) and progressing toward the input layers.

from xplique.metrics import ProgressiveLayerRandomization

# Randomize top 25% of layers (default)
strategy = ProgressiveLayerRandomization(stop_layer=0.25)

# Randomize up to a specific layer
strategy = ProgressiveLayerRandomization(stop_layer='conv2')

# Randomize from input toward output
strategy = ProgressiveLayerRandomization(stop_layer=3, reverse=False)

Example

from xplique.metrics import ModelRandomizationMetric, ProgressiveLayerRandomization
from xplique.attributions import Saliency

# load images, labels and model
# ...
explainer = Saliency(model)

# Use default strategy (randomize top 25% of layers)
metric = ModelRandomizationMetric(model, inputs, labels)
score = metric.evaluate(explainer)

# Or specify a custom strategy
strategy = ProgressiveLayerRandomization(stop_layer=0.5)
metric = ModelRandomizationMetric(model, inputs, labels, randomization_strategy=strategy)
score = metric.evaluate(explainer)

Warning

This metric clones the model internally for randomization. The original model weights are preserved.
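The layer-wise behaviour can be pictured framework-free. In this hedged sketch (an illustration, not Xplique's implementation), a model is reduced to a list of weight arrays ordered from input to output, and the defaults of `ProgressiveLayerRandomization` are mimicked on a copy, leaving the original untouched:

```python
import numpy as np

def progressive_randomization(weights, stop_layer=0.25, reverse=True, seed=42):
    """Re-initialize a fraction (float) or count (int) of layers on a copy.

    `reverse=True` starts from the output side, mirroring the
    metric's default direction.
    """
    rng = np.random.default_rng(seed)
    randomized = [w.copy() for w in weights]  # original list stays intact
    if isinstance(stop_layer, float):
        n = max(1, int(round(stop_layer * len(weights))))
    else:
        n = stop_layer
    indices = range(len(weights) - n, len(weights)) if reverse else range(n)
    for i in indices:
        randomized[i] = rng.normal(scale=0.05, size=weights[i].shape)
    return randomized
```

With four layers and the default `stop_layer=0.25`, only the last (output-side) layer is re-initialized; `stop_layer=3, reverse=False` re-initializes the first three layers instead.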

ModelRandomizationMetric

Model Randomization metric.

__init__(self,
         model: Callable,
         inputs: Union[tf.data.Dataset, tf.Tensor, np.ndarray],
         targets: Union[tf.Tensor, np.ndarray, None],
         randomization_strategy: ModelRandomizationStrategy = None,
         batch_size: Optional[int] = 64,
         activation: Optional[str] = None,
         seed: int = 42)

Parameters

  • model : Callable

    • Model to be randomized.

  • inputs : Union[tf.data.Dataset, tf.Tensor, np.ndarray]

    • Input samples.

  • targets : Union[tf.Tensor, np.ndarray, None]

    • One-hot encoded labels, shape (N, C), or integer labels which will be one-hot encoded.

  • randomization_strategy : ModelRandomizationStrategy = None

    • Strategy to randomize the model parameters. If None, a progressive randomization of top 25% layers is used.

  • batch_size : Optional[int] = 64

    • Number of samples to evaluate at once.

  • activation : Optional[str] = None

    • Optional activation applied to the model output; kept for interface consistency with other metrics and not used directly by this metric.

  • seed : int = 42

    • Random seed for reproducibility.

evaluate(self,
         explainer: Union[WhiteBoxExplainer, BlackBoxExplainer]) -> float

Compute the mean Spearman rank correlation between the original-model and randomized-model explanations over the dataset.
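The aggregation performed by evaluate can be sketched in numpy (a hedged illustration: the two lists are assumed to hold per-sample attribution maps already computed under the original and randomized models):

```python
import numpy as np

def rank_correlation(a, b):
    # Spearman correlation via ranks (no tie handling).
    ranks_a = a.ravel().argsort().argsort()
    ranks_b = b.ravel().argsort().argsort()
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

def mean_randomization_score(originals, randomizeds):
    # Mean Spearman correlation over all samples; lower is better.
    scores = [rank_correlation(a, b) for a, b in zip(originals, randomizeds)]
    return float(np.mean(scores))
```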