Average Drop¶
Average Drop (AD) measures the relative decrease in the model's confidence when the input is masked according to the explanation. A good explanation should identify the features the model actually relies on, so that when the input is reduced to those features, the model's score barely drops.
Quote
We define Average Drop % as the percentage of positive drops averaged over all images.
-- Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks (2018)
Formally, for each sample \(i\):
\[
AD_i = \frac{\max(0,\ base_i - after_i)}{base_i}
\]
The final score is the average of \(AD_i\) over all samples.
Where:
- \(base_i = g(f, x_i, y_i)\) is the model score on the original input
- \(after_i = g(f, x_i \odot M_i, y_i)\) is the model score after masking with the explanation-derived mask \(M_i\)
Info
The better the explanation, the lower the Average Drop score. A low score indicates that the explanation correctly identifies important features.
Score Interpretation¶
- Lower scores are better: A low Average Drop means the explanation correctly captures the important features; when only these features are kept (via the explanation-derived mask), the model's confidence barely drops.
- The ReLU ensures we only measure drops, not increases in confidence.
- The normalization by \(base_i\) makes the metric scale-invariant.
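To make the formula concrete, here is a minimal NumPy sketch of the per-sample computation; the arrays base_scores and after_scores are hypothetical model scores before and after masking.
import numpy as np

# Hypothetical model scores on the original inputs and on the inputs
# masked by the explanation (illustrative values only).
base_scores = np.array([0.90, 0.75, 0.60])
after_scores = np.array([0.85, 0.30, 0.70])

# Per-sample Average Drop: max(0, .) keeps only positive drops (the ReLU),
# dividing by the base score makes the value scale-invariant.
drops = np.maximum(0.0, base_scores - after_scores) / base_scores  # [0.056, 0.6, 0.0]

# The reported metric is the mean over all samples (lower is better).
average_drop = drops.mean()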
Example¶
from xplique.metrics import AverageDropMetric
from xplique.attributions import Saliency
# load images, labels and model
# ...
explainer = Saliency(model)
explanations = explainer(inputs, labels)
metric = AverageDropMetric(model, inputs, labels, activation="softmax")
score = metric.evaluate(explanations)  # mean Average Drop over all samples, lower is better
Tip
This metric works best with probabilistic outputs. Set activation="softmax" or "sigmoid" if your model returns logits.
AverageDropMetric¶
Average Drop (AD) measures the relative decrease in the model score when the input is masked by the explanation (lower AD is better).
__init__(self,
         model: keras.Model,
         inputs: Union[tf.data.Dataset, tf.Tensor, np.ndarray],
         targets: Union[tf.Tensor, np.ndarray, None] = None,
         batch_size: Optional[int] = 64,
         operator: Union[str, Callable, None] = None,
         activation: Optional[str] = None)¶
Parameters
- model : keras.Model
  Model used to compute the metric.
- inputs : Union[tf.data.Dataset, tf.Tensor, np.ndarray]
  Input samples under study.
- targets : Union[tf.Tensor, np.ndarray, None] = None
  One-hot encoded labels or regression targets (e.g. {+1, -1}), one for each sample.
- batch_size : Optional[int] = 64
  Number of samples to process at once; if None, process all samples in a single batch.
- operator : Union[str, Callable, None] = None
  Function g to explain. It should take three parameters (f, x, y) and return a scalar, where f is the model, x the inputs and y the targets. If None, the standard operator g(f, x, y) = f(x)[y] is used (see the sketch after this list).
- activation : Optional[str] = None
  One of [None, 'sigmoid', 'softmax']. Specifies whether an activation function should be applied to the model's output before computing the score.
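For illustration only, here is a minimal sketch of a custom operator with this signature; it assumes one-hot encoded targets and reuses the model, inputs and labels from the example above. The function name is hypothetical and not part of the library.
import tensorflow as tf

def target_score_operator(model, inputs, targets):
    # One scalar score per sample: the equivalent of the standard operator
    # g(f, x, y) = f(x)[y] when targets are one-hot encoded.
    return tf.reduce_sum(model(inputs) * targets, axis=-1)

metric = AverageDropMetric(model, inputs, labels,
                           operator=target_score_operator,
                           activation="softmax")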
detailed_evaluate(self,
                  inputs: tf.Tensor,
                  targets: tf.Tensor,
                  explanations: tf.Tensor) -> np.ndarray¶
Compute Average Drop scores for a batch of samples.
Parameters
- inputs : tf.Tensor
  Batch of input samples. Shape: (B, H, W, C) for images, (B, T, F) for time series, or (B, ...) for other data types.
- targets : tf.Tensor
  Batch of target labels. Shape: (B, num_classes) for one-hot encoded labels, or (B,) for class indices or regression targets.
- explanations : tf.Tensor
  Batch of attribution maps. Shape must be compatible with inputs (same spatial or temporal dimensions, optionally without the channel dimension).
Return
- scores : np.ndarray
  Per-sample Average Drop scores, shape (B,). Values range from 0 (no drop, or an increase in confidence) to ~1 (near-complete drop). Lower values indicate better explanations.
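As a usage sketch, assuming a metric instance built as in the example above and batch tensors shaped as described in the parameters (the batch_* names are placeholders):
# Per-sample Average Drop scores for one batch (placeholder tensor names).
per_sample_scores = metric.detailed_evaluate(batch_inputs, batch_targets, batch_explanations)
print(per_sample_scores.shape)  # (B,) -- lower values indicate better explanations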
evaluate(self,
         explanations: Union[tf.Tensor, np.ndarray]) -> float¶
Evaluate the metric over the entire dataset by iterating over batches and return the mean Average Drop score (lower is better).