
Deletion

The Deletion Fidelity metric measures how well a saliency-map–based explanation localizes the important features.

Quote

The deletion metric measures the drop in the probability of a class as important pixels (given by the saliency map) are gradually removed from the image. A sharp drop, and thus a small area under the probability curve, are indicative of a good explanation.

-- RISE: Randomized Input Sampling for Explanation of Black-box Models (2018) [1]
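The procedure described in the quote can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: `toy_model`, `deletion_curve` and `area_under_curve` are hypothetical names, the "image" is a flat list of six features, and removal means replacing a feature with a baseline value of 0.

```python
import math

def toy_model(x):
    """Hypothetical stand-in for a classifier: a sigmoid over a weighted sum."""
    weights = [3.0, 2.0, 1.0, 0.5, 0.25, 0.1]
    z = sum(w * v for w, v in zip(weights, x)) - 2.0
    return 1.0 / (1.0 + math.exp(-z))

def deletion_curve(model, x, attributions, baseline=0.0):
    """Scores after removing 0, 1, ..., n features, most important first."""
    order = sorted(range(len(x)), key=lambda i: attributions[i], reverse=True)
    perturbed = list(x)
    scores = [model(perturbed)]
    for i in order:
        perturbed[i] = baseline  # "remove" the feature
        scores.append(model(perturbed))
    return scores

def area_under_curve(scores):
    """Trapezoidal area with the x-axis normalized to [0, 1]."""
    dx = 1.0 / (len(scores) - 1)
    return sum((a + b) / 2.0 * dx for a, b in zip(scores, scores[1:]))

x = [1.0] * 6
faithful = [3.0, 2.0, 1.0, 0.5, 0.25, 0.1]  # ranks features like the model weights
misleading = faithful[::-1]                  # ranks them in the opposite order

auc_faithful = area_under_curve(deletion_curve(toy_model, x, faithful))
auc_misleading = area_under_curve(deletion_curve(toy_model, x, misleading))
print(auc_faithful < auc_misleading)
```

Both curves start and end at the same score; only the speed of the drop differs, which is what the area captures.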

Score interpretation

The interpretation of the score depends on your operator, that is, the metric you use to evaluate your model. For metrics where the score increases with the performance of the model (such as accuracy), an accurate explanation makes the score fall quickly to that of a random model. In this case, a lower score represents a more accurate explanation.

For metrics where the score decreases with the performance of the model (such as losses), an accurate explanation makes the score rise quickly to that of a random model. In this case, a higher score represents a more accurate explanation.
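A small sketch may help to see why the direction flips. The probabilities below are made-up illustrative values, not measurements, and `mean_score` is a hypothetical helper that averages the operator over the curve as a simple proxy for its area.

```python
import math

# Illustrative, made-up class probabilities along two deletion curves.
sharp_drop = [0.95, 0.40, 0.15, 0.10, 0.10]  # accurate explanation
slow_drop = [0.95, 0.90, 0.80, 0.50, 0.10]   # inaccurate explanation

def probability(p):
    """Score that increases with model performance."""
    return p

def cross_entropy(p):
    """Score that decreases with model performance."""
    return -math.log(p)

def mean_score(curve, operator):
    """Average of the operator over the curve, a simple proxy for its area."""
    return sum(operator(p) for p in curve) / len(curve)

# With a probability-like operator, the accurate explanation scores lower.
print(mean_score(sharp_drop, probability) < mean_score(slow_drop, probability))
# With a loss-like operator, the accurate explanation scores higher.
print(mean_score(sharp_drop, cross_entropy) > mean_score(slow_drop, cross_entropy))
```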

Remarks

This metric only evaluates the order of importance between features, not the magnitude of the attributions.

The parameters metric, steps and max_percentage_perturbed may drastically change the score:

  • For inputs with many features, increasing the number of steps lets you capture the difference between attribution methods more precisely.

  • The order of features with low importance may not matter; hence, decreasing max_percentage_perturbed may make the score more relevant.
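To make the effect of these two parameters concrete, here is a rough sketch of how the evaluation points could be spread over the features. The evenly spaced scheme and the helper name `perturbation_schedule` are assumptions made for illustration; the library may compute its steps differently.

```python
import math

def perturbation_schedule(nb_features, steps=10, max_percentage_perturbed=1.0):
    """Number of features perturbed at each evaluation point (assumed scheme)."""
    max_nb_perturbed = math.floor(nb_features * max_percentage_perturbed)
    if steps == -1:  # documented meaning: compute all possible steps
        steps = max_nb_perturbed
    step_size = max(1, max_nb_perturbed // steps)
    return list(range(0, max_nb_perturbed + 1, step_size))

# A 28x28 input has 784 features.
print(perturbation_schedule(784))                                # coarse, whole input
print(perturbation_schedule(784, max_percentage_perturbed=0.2))  # finer, top 20% only
print(len(perturbation_schedule(784, steps=-1)))                 # one point per feature
```

Under this scheme, lowering max_percentage_perturbed with the same number of steps concentrates the evaluation points on the most important features.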

Some attribution methods also return negative attributions. For those methods, do not take the absolute value before computing insertion and deletion metrics. Otherwise, strongly negative attributions end up with high absolute values, and the order of importance between features changes. Take these remarks into account to get a relevant score.
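The following toy example shows how taking the absolute value reorders features. The attribution values are made up for illustration, and `ranking` is a hypothetical helper.

```python
attributions = [0.6, -0.9, 0.3, -0.1]

def ranking(values):
    """Feature indices sorted from most to least important."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

signed_order = ranking(attributions)
absolute_order = ranking([abs(a) for a in attributions])

print(signed_order)    # [0, 2, 3, 1]: feature 1 is removed last
print(absolute_order)  # [1, 0, 2, 3]: feature 1 is removed first
```

Feature 1, which argues against the class, moves from last to first once the sign is dropped, so the deletion curve no longer reflects the explanation as it was produced.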

Example

from xplique.metrics import Deletion
from xplique.attributions import Saliency

# load images, targets and model
# ...
explainer = Saliency(model)
explanations = explainer(inputs, targets)

metric = Deletion(model, inputs, targets)
score = metric.evaluate(explanations)

Deletion

The deletion metric measures the drop in the probability of a class as important pixels (given by the saliency map) are gradually removed from the image. A sharp drop, and thus a small area under the probability curve, are indicative of a good explanation.

__init__(self,
         model: keras.engine.training.Model,
         inputs: Union[tf.Dataset, tf.Tensor, numpy.ndarray],
         targets: Union[tf.Tensor, numpy.ndarray, None] = None,
         batch_size: Optional[int] = 64,
         baseline_mode: Union[float, Callable] = 0.0,
         steps: int = 10,
         max_percentage_perturbed: float = 1.0,
         operator: Optional[Callable] = None)

Parameters

  • model : keras.engine.training.Model

    • Model used for computing metric.

  • inputs : Union[tf.Dataset, tf.Tensor, numpy.ndarray]

    • Input samples under study.

  • targets : Union[tf.Tensor, numpy.ndarray, None] = None

    • One-hot encoded labels or regression targets (e.g. {+1, -1}), one for each sample.

  • batch_size : Optional[int] = 64

    • Number of samples to explain at once. If None, all samples are computed at once.

  • baseline_mode : Union[float, Callable] = 0.0

    • Value of the baseline state. If it is a function, it will be called with the inputs.

  • steps : int = 10

    • Number of steps between the start and the end state.

      Can be set to -1 for all possible steps to be computed.

  • max_percentage_perturbed : float = 1.0

    • Maximum percentage of the input perturbed.

  • operator : Optional[Callable] = None

    • Function g to explain. g takes three parameters (f, x, y) and should return a scalar, with f the model, x the inputs and y the targets. If None, the standard operator g(f, x, y) = f(x)[y] is used.
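As a rough illustration of the (f, x, y) signature of operator, the sketch below writes the standard operator and one alternative in plain Python for a single sample. `toy_classifier` and `margin_operator` are hypothetical names introduced here; a real operator would work on the tensors that the model receives and returns.

```python
def toy_classifier(x):
    """Hypothetical model f: turns three positive values into class probabilities."""
    total = sum(x)
    return [v / total for v in x]

def standard_operator(f, x, y):
    """g(f, x, y) = f(x)[y] with y one-hot: the score of the target class."""
    return sum(p * t for p, t in zip(f(x), y))

def margin_operator(f, x, y):
    """Hypothetical alternative: target score minus the best non-target score."""
    probs = f(x)
    target = sum(p * t for p, t in zip(probs, y))
    best_other = max(p for p, t in zip(probs, y) if t == 0)
    return target - best_other

x = [2.0, 1.0, 1.0]
y = [1, 0, 0]  # one-hot target: class 0

print(standard_operator(toy_classifier, x, y))  # 0.5
print(margin_operator(toy_classifier, x, y))    # 0.25
```

Both operators increase with the performance of the model, so for either of them a lower deletion score indicates a more accurate explanation.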

detailed_evaluate(self,
                  explanations: Union[tf.Tensor, numpy.ndarray]) -> Dict[int, float]

Evaluate model performance for successive perturbations of an input. Used to compute the causal score.

Parameters

  • explanations : Union[tf.Tensor, numpy.ndarray]

    • Explanations to evaluate, one for each (input, label) pair.

Return

  • causal_score_dict : Dict[int, float]

    • Dictionary of scores obtained for different perturbations. Keys are the steps, i.e. the number of features perturbed. Values are the scores of the model on the inputs with the corresponding number of features perturbed.
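One way to read such a dictionary is to look for the point where the score collapses. The values below are made up and only shaped like the documented return value; `first_step_below` is a hypothetical helper, not part of the library.

```python
# Illustrative, made-up result: keys are numbers of perturbed features,
# values are the model scores at that point.
causal_score_dict = {0: 0.92, 78: 0.55, 156: 0.31, 234: 0.20, 312: 0.12}

def first_step_below(scores, threshold):
    """Smallest number of perturbed features whose score is under the threshold."""
    for nb_perturbed in sorted(scores):
        if scores[nb_perturbed] < threshold:
            return nb_perturbed
    return None  # the score never drops below the threshold

half_of_initial = causal_score_dict[0] / 2
print(first_step_below(causal_score_dict, half_of_initial))  # 156
```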


evaluate(self,
         explanations: Union[tf.Tensor, numpy.ndarray]) -> float

Evaluate the causal score.

Parameters

  • explanations : Union[tf.Tensor, numpy.ndarray]

    • Explanations to evaluate, one for each (input, label) pair.

Return

  • causal_score : float

    • Metric score: area under the deletion (lower is better) or insertion (higher is better) curve.