
Average Drop

Average Drop (AD) measures the relative decrease in the model's confidence when the input is masked according to the explanation. A good explanation should highlight the features that are sufficient for the prediction: when the input is masked so that only those features are kept, the model's score should drop as little as possible.

Quote

We define Average Drop % as the percentage of positive drops averaged over all images.

-- Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks (2018) [1]

Formally, for each sample \(i\):

\[ AD_i = \frac{\text{ReLU}(base_i - after_i)}{base_i + \epsilon} \]

Where:

  • \(base_i = g(f, x_i, y_i)\) is the model score on the original input
  • \(after_i = g(f, x_i \odot M_i, y_i)\) is the model score after masking the input with the explanation-derived mask \(M_i\)
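
As a quick worked example, the per-sample formula can be checked with a few lines of NumPy (the epsilon value below is arbitrary and purely illustrative):

import numpy as np

def average_drop(base, after, eps=1e-8):
    # per-sample Average Drop: ReLU(base - after) / (base + eps)
    return np.maximum(base - after, 0.0) / (base + eps)

# toy confidences on the original vs. masked inputs
base = np.array([0.90, 0.50, 0.30])
after = np.array([0.60, 0.55, 0.03])
print(average_drop(base, after))  # [0.333..., 0.0, 0.9]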

Info

The better the explanation, the lower the Average Drop score. A low score indicates that the explanation correctly identifies important features.

Score Interpretation

  • Lower scores are better: A low Average Drop means that the features identified as important by the explanation are correctly captured. When the input is masked so that only these features are kept, the model's confidence barely drops (see the sketch after this list).
  • The ReLU ensures we only measure drops, not increases in confidence.
  • The normalization by \(base_i\) makes the metric scale-invariant.
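
To make the masking step concrete, here is a minimal sketch of how an attribution map could be turned into a soft mask and applied to the input. The per-sample min-max normalisation is an assumption for illustration only, not necessarily how the library builds \(M_i\):

import tensorflow as tf

def base_and_masked_scores(model, x, explanations, class_index):
    # assumption: min-max normalise each attribution map to [0, 1] and use it as a soft mask
    e_min = tf.reduce_min(explanations, axis=(1, 2), keepdims=True)
    e_max = tf.reduce_max(explanations, axis=(1, 2), keepdims=True)
    mask = (explanations - e_min) / (e_max - e_min + 1e-8)
    base = model(x)[:, class_index]                            # score on the original inputs
    after = model(x * mask[..., tf.newaxis])[:, class_index]   # keep only the highlighted regions
    return base, after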

Example

from xplique.metrics import AverageDropMetric
from xplique.attributions import Saliency

# load images, labels and model
# ...
explainer = Saliency(model)
explanations = explainer(inputs, labels)

metric = AverageDropMetric(model, inputs, labels, activation="softmax")
score = metric.evaluate(explanations)  # single float averaged over the dataset, lower is better

Tip

This metric works best with probabilistic outputs. Set activation="softmax" or "sigmoid" if your model returns logits.
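
For instance, with a binary classifier whose last layer returns a raw logit, the metric can be built as in this sketch (model and data loading are assumed to be set up as in the example above):

# model returns logits, so apply a sigmoid before scoring
metric = AverageDropMetric(model, inputs, labels, activation="sigmoid")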

AverageDropMetric

Average Drop (AD) — measures relative decrease in the model score when the input is masked by the explanation (lower AD is better).

__init__(self,
         model: keras.src.engine.training.Model,
         inputs: Union[tf.Dataset, tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
         targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None,
         batch_size: Optional[int] = 64,
         operator: Union[str, Callable, None] = None,
         activation: Optional[str] = None)

Parameters

  • model : keras.src.engine.training.Model

    • Model used for computing the metric.

  • inputs : Union[tf.Dataset, tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Input samples under study.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • One-hot encoded labels or regression targets (e.g. {+1, -1}), one for each sample.

  • batch_size : Optional[int] = 64

    • Number of samples to process at once; if None, process all samples in a single batch.

  • operator : Union[str, Callable, None] = None

    • Scoring function g. It should take 3 parameters (f, x, y) and return a scalar, with f the model, x the inputs, and y the targets. If None, the standard operator g(f, x, y) = f(x)[y] is used.

  • activation : Optional[str] = None

    • One of [None, 'sigmoid', 'softmax']. Specifies whether an activation should be applied to the model's output before computing the score.
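
As a sketch of the operator argument, the function below is hypothetical and only illustrates the expected (f, x, y) signature; it returns the target-class probability for each sample:

import tensorflow as tf

def probability_operator(model, inputs, targets):
    # g(f, x, y): probability assigned to the target class of each sample
    probs = tf.nn.softmax(model(inputs), axis=-1)
    return tf.reduce_sum(probs * targets, axis=-1)

metric = AverageDropMetric(model, inputs, labels, operator=probability_operator)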

detailed_evaluate(self,
                  inputs: tensorflow.python.framework.tensor.Tensor,
                  targets: tensorflow.python.framework.tensor.Tensor,
                  explanations: tensorflow.python.framework.tensor.Tensor) -> numpy.ndarray

Compute Average Drop scores for a batch of samples.

Parameters

  • inputs : tf.Tensor

    • Batch of input samples. Shape: (B, H, W, C) for images, (B, T, F) for time series, or (B, ...) for other data types.

  • targets : tf.Tensor

    • Batch of target labels. Shape: (B, num_classes) for one-hot encoded, or (B,) for class indices/regression targets.

  • explanations : tf.Tensor

    • Batch of attribution maps. Shape must be compatible with inputs (same spatial/temporal dimensions, optionally without channel dimension).

Return

  • scores : np.ndarray

    • Per-sample Average Drop scores, shape (B,).

      Values range from 0 (no drop, or an increase in confidence) to ~1 (complete drop).

      Lower values indicate better explanations.
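
A minimal usage sketch, assuming batch_inputs, batch_targets and batch_explanations are tensors for a single batch with the shapes described above:

scores = metric.detailed_evaluate(batch_inputs, batch_targets, batch_explanations)
print(scores.shape)   # (B,) one Average Drop value per sample
print(scores.mean())  # batch-level average, lower is better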


evaluate(self,
         explanations: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]) -> float

Evaluate the metric over the entire dataset by iterating over batches, returning the average score as a single float.