Average Stability¶
Average Stability is a Stability metric that measures how similar the explanations of similar inputs are.
Quote
[...] We want to ensure that, if inputs are near each other and their model outputs are similar, then their explanations should be close to each other.
-- Evaluating and Aggregating Feature-based Model Explanations (2020) [1]
Formally, given a predictor \(f\), an explanation function \(g\), a point \(x\), a radius \(r\), and two distance metrics, \(\rho\) over the inputs and \(D\) over the explanations, the Average Stability is defined as:

\[
S = \underset{z \,:\, \rho(x, z) \leq r}{\mathbb{E}} \; D\big(g(f, x),\, g(f, z)\big)
\]
Info
The better the method, the smaller the score.
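In practice this expectation is estimated by sampling: each input is perturbed \(N\) times with uniform noise of magnitude at most \(r\), and the distances between the original and perturbed explanations are averaged. Here \(N\) plays the role of the nb_samples parameter and \(r\) of the radius parameter described below:

\[
\hat{S}(x) = \frac{1}{N} \sum_{i=1}^{N} D\big(g(f, x),\, g(f, x + \varepsilon_i)\big),
\qquad \varepsilon_i \sim \mathcal{U}(-r, r)
\]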
Example¶
```python
from xplique.metrics import AverageStability
from xplique.attributions import Saliency

# load images, labels and model
# ...

explainer = Saliency(model)
metric = AverageStability(model, inputs, labels)
score = metric.evaluate(explainer)
```
AverageStability¶
Used to compute the average sensitivity metric (also called average stability). This metric ensures that close
inputs with similar predictions yield similar explanations. For each input, we randomly
sample noise, add it to the input, and compute the explanation of the noisy input. We then
take the average distance between the original explanations and the noisy explanations.
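Conceptually, the evaluation loop looks roughly like the following NumPy sketch. This is illustrative only, not the library's actual implementation; explain_fn stands in for any attribution method applied to a single input:

```python
import numpy as np

def average_stability_sketch(explain_fn, x, radius=0.1, nb_samples=20):
    """Illustrative only: average the L2 distances between the explanation
    of x and the explanations of uniformly perturbed copies of x."""
    base_explanation = explain_fn(x)
    distances = []
    for _ in range(nb_samples):
        # uniform noise bounded by `radius`, as for the `radius` parameter below
        noise = np.random.uniform(-radius, radius, size=x.shape)
        noisy_explanation = explain_fn(x + noise)
        distances.append(np.linalg.norm(base_explanation - noisy_explanation))
    return float(np.mean(distances))
```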
__init__(self,
model: Callable,
inputs: Union[tf.Dataset, tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None,
batch_size: Optional[int] = 64,
radius: float = 0.1,
distance: Union[str, Callable] = 'l2',
nb_samples: int = 20)
Parameters
-
model : Callable
Model used to compute the metric.
-
inputs : Union[tf.Dataset, tensorflow.python.framework.tensor.Tensor, numpy.ndarray]
Input samples under study.
-
targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None
One-hot encoded labels or regression targets (e.g. {+1, -1}), one for each sample.
-
batch_size : Optional[int] = 64
Number of samples to explain at once; if None, compute all at once.
-
radius : float = 0.1
Maximum value of the uniform noise added to the inputs before recalculating their explanations.
-
distance : Union[str, Callable] = 'l2'
Distance metric between the explanations.
-
nb_samples : int = 20
Number of noisy neighbor points sampled around each input to measure stability.
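As an illustration, distance also accepts a callable instead of the built-in 'l2'. The sketch below assumes the callable maps two explanation arrays to a scalar; this signature is an assumption, so check the library source if in doubt:

```python
import numpy as np
from xplique.metrics import AverageStability

# Hypothetical L1 distance between two explanations (assumed signature:
# two arrays in, one scalar out -- verify against the library source).
def l1_distance(expl_a, expl_b):
    return np.sum(np.abs(expl_a - expl_b))

metric = AverageStability(model, inputs, labels,
                          radius=0.05,           # tighter neighborhood
                          distance=l1_distance,  # custom distance
                          nb_samples=50)         # more neighbors, smoother estimate
```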
evaluate(self,
explainer: Callable,
base_explanations: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None) -> float
Evaluate the stability score.
Parameters
-
explainer : Callable
Explainer used to compute the explanations, or precomputed explanations associated with each input.
-
base_explanations : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None
Explanations for the inputs under study; they are computed automatically if not provided.
Return
-
stability_score : float
Average distance between the original and perturbed explanations.
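If explanations for the unperturbed inputs are already available, passing them as base_explanations avoids recomputing them. A short usage sketch, assuming the explainer can be called directly on (inputs, labels) to produce explanations:

```python
# Reuse precomputed explanations for the original inputs so that
# evaluate() only has to explain the perturbed copies.
base_explanations = explainer(inputs, labels)
score = metric.evaluate(explainer, base_explanations=base_explanations)
```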
Warning
AverageStability computes explanations several times for all the inputs (perturbed more or less severely). It can therefore take a long time to compute, especially if the explainer is itself time-consuming.