Average Gain
Average Gain (AG) measures the relative increase in the model's confidence when the input is masked according to the explanation. This metric is complementary to Average Drop and evaluates whether explanations capture truly discriminative features.
Quote
We define Opti-CAM, a saliency method that obtains the saliency map by directly optimizing for the Average Gain metric.
-- Opti-CAM: Optimizing saliency maps for interpretability (2024)1
Formally, for each sample \(i\):

\[
\mathrm{AG} = \frac{1}{N} \sum_{i=1}^{N} \frac{\max\left(0,\ after_i - base_i\right)}{1 - base_i}
\]

Where:

- \(base_i = g(f, x_i, y_i)\) is the model score on the original input
- \(after_i = g(f, x_i \odot M_i, y_i)\) is the model score after masking with the explanation-derived mask \(M_i\)
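The definitions above can be checked with a minimal NumPy sketch. The function name `average_gain` and the sample values are illustrative, not part of the Xplique API; per-sample scores are assumed to lie in \([0, 1)\):

```python
import numpy as np

def average_gain(base, after):
    # Illustrative sketch of the AG formula, not the Xplique API.
    # base / after: per-sample model scores in [0, 1).
    base = np.asarray(base, dtype=float)
    after = np.asarray(after, dtype=float)
    # The ReLU keeps only confidence increases; (1 - base) normalizes
    # each gain by the remaining headroom.
    gains = np.maximum(0.0, after - base) / (1.0 - base)
    return gains.mean()

# Sample 1: base 0.6 -> after 0.9 gains 0.3 out of 0.4 headroom = 0.75.
# Sample 2: base 0.8 -> after 0.7 is a drop, so it contributes 0.
score = average_gain([0.6, 0.8], [0.9, 0.7])  # mean of 0.75 and 0.0 -> 0.375
```

Note how a confidence decrease contributes exactly zero rather than a negative value, which is what makes AG complementary to Average Drop.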
Info
The better the explanation, the higher the Average Gain score. A high score indicates that isolated important features are sufficient to maintain or increase the model's confidence.
Score Interpretation
- Higher scores are better: A high Average Gain means that the explanation successfully identifies features which, when isolated, are sufficient to maintain or increase the model's confidence.
- The ReLU ensures we only measure gains, not decreases in confidence.
- The normalization by \((1 - base_i)\) accounts for the remaining headroom to achieve a perfect score.
Example
```python
from xplique.metrics import AverageGainMetric
from xplique.attributions import Saliency

# load images, labels and model
# ...

explainer = Saliency(model)
explanations = explainer(inputs, labels)

metric = AverageGainMetric(model, inputs, labels, activation="softmax")
score = metric.evaluate(explanations)
```
Tip
This metric is intended for scores in [0, 1]. If your model outputs logits, use activation="softmax" or "sigmoid" at construction to operate on probabilities.
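For intuition, here is a minimal sketch (plain NumPy, not the library's internal implementation) of what a softmax activation does to raw logits, mapping them into the \([0, 1]\) range the metric expects:

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 0.5, -1.0]])  # unbounded raw scores
probs = softmax(logits)                # each row sums to 1, values in (0, 1)
```

Without this step, \(1 - base_i\) can be negative for logits above 1, and the normalization by headroom loses its meaning.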
AverageGainMetric
Average Gain (AG) — normalized relative increase when the input is masked
by the explanation (higher AG is better, complementary to AD).
```python
__init__(self,
         model: keras.src.engine.training.Model,
         inputs: Union[tf.data.Dataset, tf.Tensor, numpy.ndarray],
         targets: Union[tf.Tensor, numpy.ndarray, None] = None,
         batch_size: Optional[int] = 64,
         operator: Union[str, Callable, None] = None,
         activation: Optional[str] = None)
```
Parameters

- model : keras.src.engine.training.Model
  Model used to compute the metric.
- inputs : Union[tf.data.Dataset, tf.Tensor, numpy.ndarray]
  Input samples under study.
- targets : Union[tf.Tensor, numpy.ndarray, None] = None
  One-hot encoded labels or regression targets (e.g. {+1, -1}), one for each sample.
- batch_size : Optional[int] = 64
  Number of samples to process at once; if None, process all samples in a single batch.
- operator : Union[str, Callable, None] = None
  Function g to explain. It should take three parameters (f, x, y) and return a scalar, where f is the model, x the inputs and y the targets. If None, the standard operator g(f, x, y) = f(x)[y] is used.
- activation : Optional[str] = None
  One of [None, 'sigmoid', 'softmax']. Specifies whether an activation layer should be applied to the model output before computing the metric.
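To illustrate the operator contract, the sketch below reimplements the standard g(f, x, y) = f(x)[y] for one-hot targets. The function name and the dummy model are hypothetical; only the (f, x, y) signature and the default behaviour come from the documentation above:

```python
import numpy as np

def one_hot_score_operator(model, inputs, targets):
    # Sketch of the standard operator g(f, x, y) = f(x)[y] with one-hot y.
    # `model` is any callable returning (batch, num_classes) scores.
    scores = model(inputs)
    # The one-hot mask selects the target-class score of each sample.
    return np.sum(scores * targets, axis=-1)

# Dummy 3-class "model" returning constant probabilities (illustrative).
dummy_model = lambda x: np.tile([[0.2, 0.7, 0.1]], (len(x), 1))
x = np.zeros((2, 4))                  # two fake samples
y = np.array([[0, 1, 0], [1, 0, 0]])  # one-hot targets: classes 1 and 0
target_scores = one_hot_score_operator(dummy_model, x, y)  # [0.7, 0.2]
```

A custom callable with the same shape of inputs and outputs can be passed as `operator` to score something other than the target-class probability.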
```python
detailed_evaluate(self,
                  inputs: tf.Tensor,
                  targets: tf.Tensor,
                  explanations: tf.Tensor) -> numpy.ndarray
```
Compute Average Gain scores for a batch of samples.
Parameters

- inputs : tf.Tensor
  Batch of input samples. Shape: (B, H, W, C) for images, (B, T, F) for time series, or (B, ...) for other data types.
- targets : tf.Tensor
  Batch of target labels. Shape: (B, num_classes) for one-hot encoded labels, or (B,) for class indices / regression targets.
- explanations : tf.Tensor
  Batch of attribution maps. Shape must be compatible with inputs (same spatial/temporal dimensions, optionally without the channel dimension).

Return

- scores : np.ndarray
  Per-sample Average Gain scores, shape (B,). Values range from 0 (no gain, or a decrease) and can exceed 1. Higher values indicate better explanations.
```python
evaluate(self,
         explanations: Union[tf.Tensor, numpy.ndarray]) -> float
```
Evaluate the metric over the entire dataset by iterating over batches.
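The batching behaviour can be pictured with a small sketch (plain NumPy, assuming a per-batch scoring function; all names are illustrative, not the Xplique internals). Concatenating per-sample scores before averaging keeps the result exact even when the last batch is smaller than `batch_size`:

```python
import numpy as np

def evaluate_in_batches(score_fn, data, batch_size=64):
    # score_fn returns per-sample scores for one batch of data.
    batches = [score_fn(data[i:i + batch_size])
               for i in range(0, len(data), batch_size)]
    # Concatenate then average: equivalent to one pass over the full dataset,
    # unlike averaging per-batch means when batch sizes differ.
    return float(np.concatenate(batches).mean())

scores = np.array([0.0, 0.5, 1.0, 0.25, 0.75])  # precomputed per-sample scores
result = evaluate_in_batches(lambda b: b, scores, batch_size=2)  # 0.5
```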