Average Stability

Average Stability is a Stability metric that measures how similar the explanations of similar inputs are.

Quote

[...] We want to ensure that, if inputs are near each other and their model outputs are similar, then their explanations should be close to each other.

-- Evaluating and Aggregating Feature-based Model Explanations (2020)

Formally, given a predictor \(f\), an explanation function \(g\), a point \(x\), a radius \(r\), and two distance metrics, \(\rho\) over the inputs and \(D\) over the explanations, the Average Stability is defined as:

\[ S = \underset{z : \rho(x, z) \leq r}{\int} D(g(f, x), g(f, z))\ dz \]
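
In practice, this integral is estimated by sampling: for each input, several noisy neighbors \(z_i\) within the radius are drawn and their explanation distances to the original explanation are averaged (see the description below). As a sketch, with \(N\) sampled neighbors:

\[ S \approx \frac{1}{N} \sum_{i=1}^{N} D\big(g(f, x),\ g(f, z_i)\big), \qquad \rho(x, z_i) \leq r \]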

Info

The better the method, the smaller the score.

Example

from xplique.metrics import AverageStability
from xplique.attributions import Saliency

# load images, labels and model
# ...
explainer = Saliency(model)

metric = AverageStability(model, inputs, labels)
score = metric.evaluate(explainer)

AverageStability

Used to compute the average sensitivity (or stability) metric. This metric ensures that close inputs with similar predictions yield similar explanations. For each input, we randomly sample noise to add to the input and compute the explanation for the noisy input. We then take the average distance between the original explanations and the noisy explanations.
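
A minimal NumPy sketch of this procedure (illustrative only; the helper name and shapes are assumptions, while the uniform noise and the l2 distance mirror the defaults described below):

import numpy as np

def average_stability_score(explain_fn, inputs, radius=0.1, nb_samples=20):
    # explain_fn: callable returning an explanation array for a batch of inputs
    base_explanations = explain_fn(inputs)
    distances = []
    for _ in range(nb_samples):
        # perturb the inputs with uniform noise bounded by the radius
        noise = np.random.uniform(-radius, radius, size=inputs.shape)
        noisy_explanations = explain_fn(inputs + noise)
        # l2 distance between original and noisy explanations, per sample
        diff = (base_explanations - noisy_explanations).reshape(len(inputs), -1)
        distances.append(np.linalg.norm(diff, axis=-1))
    # average over the sampled neighbors and over the inputs
    return float(np.mean(distances))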

__init__(self,
         model: Callable,
         inputs: Union[tf.data.Dataset, tf.Tensor, numpy.ndarray],
         targets: Union[tf.Tensor, numpy.ndarray, None] = None,
         batch_size: Optional[int] = 64,
         radius: float = 0.1,
         distance: Union[str, Callable] = 'l2',
         nb_samples: int = 20)

Parameters

  • model : Callable

    • Model used for computing metric.

  • inputs : Union[tf.data.Dataset, tf.Tensor, numpy.ndarray]

    • Input samples under study.

  • targets : Union[tf.Tensor, numpy.ndarray, None] = None

    • One-hot encoded labels or regression targets (e.g. {+1, -1}), one for each sample.

  • batch_size : Optional[int] = 64

    • Number of samples to explain at once; if None, all inputs are processed at once.

  • radius : float = 0.1

    • Maximum value of the uniform noise added to the inputs before recalculating their explanations.

  • distance : Union[str, Callable] = 'l2'

    • Distance metric between the explanations, given as a string (e.g. 'l2') or a callable (see the sketch after this parameter list).

  • nb_samples : int = 20

    • Number of different neighboring points sampled around each input to measure the stability.
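
A callable can also be passed as the distance. In the sketch below, it is assumed that the callable receives the two explanations to compare (this signature is an assumption, not documented above); model, inputs and labels are as in the example at the top of the page:

import numpy as np
from xplique.metrics import AverageStability

# hypothetical custom distance (assumed signature): l1 distance between two explanations
def l1_distance(explanation_a, explanation_b):
    return np.sum(np.abs(explanation_a - explanation_b))

metric = AverageStability(model, inputs, labels,
                          radius=0.05, distance=l1_distance, nb_samples=10)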

evaluate(self,
         explainer: Callable,
         base_explanations: Union[tf.Tensor, numpy.ndarray, None] = None) -> float

Evaluate the stability score.

Parameters

  • explainer : Callable

    • Explainer, or explanations associated with each input.

  • base_explanations : Union[tf.Tensor, numpy.ndarray, None] = None

    • Explanations for the inputs under study; they are computed automatically if not provided (see the example after the return value).

Return

  • stability_score : float

    • Average distance between the original and the noisy explanations.
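
If the explanations of the original inputs are already available (for instance to reuse them across several metrics), they can be passed to evaluate directly. A short sketch, assuming the usual Xplique calling convention for the explainer:

# pre-compute the explanations of the original inputs once, then reuse them
base_explanations = explainer(inputs, labels)
score = metric.evaluate(explainer, base_explanations=base_explanations)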


Warning

AverageStability computes explanations several times for all the inputs (perturbed more or less severely). It can therefore take a long time to run, especially if the explainer is already time-consuming.