Random Logit Metric¶
The Random Logit Invariance metric tests whether explanations change when the target logit is switched to a randomly drawn different class. It is a sanity check verifying that explainers are sensitive to the target label.
Quote
We propose sanity checks for saliency methods. [...] We find that some widely deployed saliency methods are independent of both the data the model was trained on, and the model parameters.

Sanity Checks for Saliency Maps (Adebayo et al., 2018)
For each sample \((x, y)\):
- Compute explanation for the true class \(y\)
- Randomly draw an off-class \(y' \neq y\)
- Compute explanation for \(y'\)
- Measure SSIM (Structural Similarity Index) between both explanations (a minimal sketch of this procedure follows the list)
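The sketch below illustrates this per-sample procedure. It assumes an Xplique-style explainer that can be called as `explainer(inputs, targets)` with one-hot targets, and uses `tf.image.ssim` as the similarity measure; the `random_logit_ssim` helper is purely illustrative and not the library's implementation.

```python
import numpy as np
import tensorflow as tf

def random_logit_ssim(explainer, x, y_true, num_classes, seed=42):
    """Illustrative sketch: SSIM between the explanation for the true class
    and the explanation for a randomly drawn off-class."""
    rng = np.random.default_rng(seed)

    # explanation for the true class y (one-hot target)
    phi_true = explainer(x[None, ...], tf.one_hot([y_true], num_classes))

    # randomly draw an off-class y' != y and explain it
    y_off = rng.choice([c for c in range(num_classes) if c != y_true])
    phi_off = explainer(x[None, ...], tf.one_hot([y_off], num_classes))

    def to_image(phi):
        # normalize to [0, 1] and add a channel axis so tf.image.ssim accepts the map
        phi = tf.cast(phi, tf.float32)
        phi = (phi - tf.reduce_min(phi)) / (tf.reduce_max(phi) - tf.reduce_min(phi) + 1e-8)
        return phi[..., None] if phi.shape.rank == 3 else phi

    # SSIM between both explanations; low values indicate class-sensitive explanations
    return float(tf.image.ssim(to_image(phi_true), to_image(phi_off), max_val=1.0)[0])
```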
Info
A low SSIM indicates that explanations are sensitive to the target label (desirable if we expect class-specific explanations).
Score Interpretation¶
- Lower scores are better: A low SSIM means the explanations change significantly when the target class changes, indicating the explainer is properly sensitive to the target.
- Values range from -1 to 1, where 1 means identical explanations.
- High SSIM values suggest the explainer may not be faithfully explaining class-specific features.
Example¶
```python
from xplique.metrics import RandomLogitMetric
from xplique.attributions import Saliency

# load images, labels and model
# ...

explainer = Saliency(model)
metric = RandomLogitMetric(model, inputs, labels)
score = metric.evaluate(explainer)
```
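The snippet above assumes that images, labels and a trained model are already available. Below is a self-contained variant using random data and a tiny Keras classifier as stand-ins, kept only to make the snippet runnable; the class and method names follow the documentation above.

```python
import numpy as np
import tensorflow as tf

from xplique.attributions import Saliency
from xplique.metrics import RandomLogitMetric

num_classes = 10

# toy stand-ins for real images and one-hot labels of shape (N, C)
inputs = np.random.rand(32, 32, 32, 3).astype(np.float32)
labels = np.eye(num_classes, dtype=np.float32)[np.random.randint(num_classes, size=32)]

# tiny classifier standing in for a trained model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes),
])

explainer = Saliency(model)
metric = RandomLogitMetric(model, inputs, labels)
score = metric.evaluate(explainer)  # lower scores indicate class-sensitive explanations
print(score)
```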
Warning
This metric requires one-hot encoded labels with shape (N, C) where C is the number of classes.
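If the labels at hand are integer class indices rather than one-hot vectors, they can be converted to the required `(N, C)` format, for example with `tf.one_hot`:

```python
import tensorflow as tf

sparse_labels = [3, 1, 4]                              # integer class indices, shape (N,)
one_hot_labels = tf.one_hot(sparse_labels, depth=10)   # one-hot targets, shape (N, C) with C = 10
```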
RandomLogitMetric¶
Random Logit Invariance metric.
__init__(self,
model: Callable,
inputs: Union[tf.Dataset, tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None],
batch_size: Optional[int] = 64,
activation: Optional[str] = None,
seed: int = 42)¶
Parameters
- model : Callable
  Model used to compute explanations.
- inputs : Union[tf.Dataset, tensorflow.python.framework.tensor.Tensor, numpy.ndarray]
  Input samples.
- targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None]
  One-hot encoded labels, shape (N, C).
- batch_size : Optional[int] = 64
  Number of samples to evaluate at once.
- activation : Optional[str] = None
  Optional activation applied on top of the model (e.g. "softmax" or "sigmoid"); not used directly by this metric.
- seed : int = 42
  Random seed used when sampling off-classes.
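For instance, the defaults can be overridden at construction time (a sketch reusing the parameters documented above; `model`, `inputs`, `labels` and `explainer` are assumed to be defined as in the example section):

```python
metric = RandomLogitMetric(
    model,
    inputs,
    labels,          # one-hot targets, shape (N, C)
    batch_size=32,   # evaluate 32 samples at a time
    seed=0,          # controls the off-class sampling
)
score = metric.evaluate(explainer)
```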
evaluate(self,
explainer: Union[xplique.attributions.base.WhiteBoxExplainer, xplique.attributions.base.BlackBoxExplainer]) -> float¶
Compute the mean similarity (SSIM) score over the dataset.