Complexity¶
Complexity measures the entropy of attribution maps to evaluate how concentrated or diffuse the explanations are. Lower complexity (lower entropy) indicates more concentrated, sparse explanations, which are often considered more interpretable.
Quote
We measure the complexity of an explanation as the entropy of the fractional contribution of feature attributions.
-- Evaluating and Aggregating Feature-based Model Explanations (2020)
Formally, the complexity is defined as the Shannon entropy of the normalized absolute attributions:

$$\mu_C(\phi) = - \sum_{i=1}^{n} p_i \log(p_i)$$

Where \(p_i\) is the normalized absolute attribution for feature \(i\):

$$p_i = \frac{|\phi_i|}{\sum_{j=1}^{n} |\phi_j|}$$
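As a concrete reference, here is a minimal NumPy sketch of this computation; the function name `complexity` and the `eps` smoothing term are illustrative choices, not Xplique's internal implementation:

```python
import numpy as np

def complexity(attribution: np.ndarray, eps: float = 1e-12) -> float:
    """Shannon entropy of the normalized absolute attributions (illustrative)."""
    phi = np.abs(attribution).ravel()           # |phi_i|, flattened
    p = phi / (phi.sum() + eps)                 # p_i = |phi_i| / sum_j |phi_j|
    return float(-np.sum(p * np.log(p + eps)))  # -sum_i p_i log(p_i)
```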
Info
The better the explanation, the lower the Complexity score. Lower entropy indicates more concentrated, interpretable explanations.
Score Interpretation¶
- Lower scores are better: A low Complexity score indicates that the explanation is concentrated in a few key features, making it easier to interpret.
- Higher entropy values (approaching \(\log(n)\), where \(n\) is the number of features) indicate more uniform, diffuse explanations; a worked example follows this list.
- For image explanations with 4D tensors (B, H, W, C), channels are averaged before computing complexity.
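For instance, with \(n = 4\) features, a perfectly uniform attribution \(p = (0.25, 0.25, 0.25, 0.25)\) reaches the maximum entropy \(-\sum_i 0.25 \log(0.25) = \log(4) \approx 1.386\), while a concentrated attribution such as \(p = (0.97, 0.01, 0.01, 0.01)\) yields only about \(0.17\).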
Example¶
```python
from xplique.metrics import Complexity
from xplique.attributions import Saliency

# load images, labels and model
# ...

# generate attribution maps with any explainer, e.g. Saliency
explainer = Saliency(model)
explanations = explainer(inputs, labels)

# evaluate the explanations directly; no model or targets needed
metric = Complexity()
score = metric.evaluate(explanations)
```
Note
Unlike fidelity metrics, Complexity does not require the model or targets; it only evaluates the explanation itself.
Complexity¶
Entropy-based complexity of attribution maps.
__init__(self, batch_size: Optional[int] = 32)¶
detailed_evaluate(self, explanations: tf.Tensor) -> np.ndarray¶
Compute the Shannon entropy for each explanation in the batch.
Parameters
- explanations : tf.Tensor

  Attribution maps of shape (B, H, W) or (B, H, W, C), where:

  - B: batch size
  - H, W: spatial dimensions
  - C: channels (optional; averaged if present)
Return
- np.ndarray

  Entropy values of shape (B,), one per sample. Values are non-negative, with a theoretical maximum of \(\log(H \times W)\) for a uniform attribution map.
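As a usage sketch, the per-sample entropies from `detailed_evaluate` can be rescaled by this theoretical maximum; the random stand-in explanations and the normalization step below are illustrative, not part of the API:

```python
import numpy as np
import tensorflow as tf
from xplique.metrics import Complexity

# stand-in attribution maps of shape (B, H, W); in practice these
# would come from an explainer
explanations = tf.random.uniform((8, 32, 32))

metric = Complexity()
entropies = metric.detailed_evaluate(explanations)  # shape (B,)

# illustrative rescaling: 0 = fully concentrated, 1 = fully uniform
h, w = explanations.shape[1], explanations.shape[2]
normalized = entropies / np.log(h * w)
print(normalized)
```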