Complexity

Complexity measures the entropy of attribution maps to evaluate how concentrated or diffuse the explanations are. Lower complexity (lower entropy) indicates more concentrated, sparse explanations, which are often considered more interpretable.

Quote

We measure the complexity of an explanation as the entropy of the fractional contribution of feature attributions.

-- Evaluating and Aggregating Feature-based Model Explanations (2020)1

Formally, the complexity is defined as the Shannon entropy of the normalized absolute attributions:

\[ \text{Complexity} = -\sum_{i} p_i \log(p_i) \]

Where \(p_i\) is the normalized absolute attribution for feature \(i\):

\[ p_i = \frac{|a_i|}{\sum_j |a_j|} \]
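To make the two formulas concrete, here is a minimal NumPy sketch that computes the fractional contributions and their Shannon entropy for a single, made-up attribution vector. Only the two equations above are assumed; the numbers are illustrative.

import numpy as np

# hypothetical attribution vector for one input (values made up for the example)
a = np.array([0.7, -0.2, 0.05, 0.05])

# fractional contribution of each feature: p_i = |a_i| / sum_j |a_j|
p = np.abs(a) / np.sum(np.abs(a))

# Shannon entropy of the fractional contributions
complexity = -np.sum(p * np.log(p))
print(complexity)  # ≈ 0.87, well below log(4) ≈ 1.39, since most mass sits on one feature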

Info

The better the explanation, the lower the Complexity score. Lower entropy indicates more concentrated, interpretable explanations.

Score Interpretation

  • Lower scores are better: A low Complexity score indicates that the explanation is concentrated in a few key features, making it easier to interpret.
  • Higher entropy values (approaching \(\log(n)\), where \(n\) is the number of features) indicate more uniform, diffuse explanations; both extremes are illustrated in the sketch after this list.
  • For image explanations with 4D tensors (B, H, W, C), channels are averaged before computing complexity.
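To make the bounds concrete, the short sketch below (plain NumPy, arrays made up for illustration, not part of the Xplique API) evaluates the entropy of a fully concentrated attribution vector and of a perfectly uniform one.

import numpy as np

def entropy(a):
    # Shannon entropy of the normalized absolute attributions
    p = np.abs(a) / np.sum(np.abs(a))
    p = p[p > 0]  # drop zero contributions (0 * log 0 is taken as 0)
    return -np.sum(p * np.log(p))

n = 16
concentrated = np.zeros(n)
concentrated[0] = 1.0          # all attribution mass on a single feature
diffuse = np.ones(n) / n       # perfectly uniform attributions

print(entropy(concentrated))   # 0.0 -> lowest possible complexity
print(entropy(diffuse))        # log(16) ≈ 2.77 -> highest possible complexity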

Example

from xplique.metrics import Complexity
from xplique.attributions import Saliency

# load images, labels and model
# ...
explainer = Saliency(model)
explanations = explainer(inputs, labels)

metric = Complexity()
score = metric.evaluate(explanations)
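When per-sample values are needed rather than the single score returned above, the detailed_evaluate method documented below returns one entropy value per explanation. This snippet reuses the metric and explanations objects from the example.

# per-sample complexity scores, shape (B,); see detailed_evaluate below
per_sample_scores = metric.detailed_evaluate(explanations)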

Note

Unlike fidelity metrics, Complexity does not require the model or targets—it only evaluates the explanation itself.

Complexity

Entropy-based complexity of attribution maps.

__init__(self,
         batch_size: Optional[int] = 32)

detailed_evaluate(self,
                  explanations: tensorflow.python.framework.tensor.Tensor) -> numpy.ndarray

Compute the Shannon entropy for each explanation in the batch.

Parameters

  • explanations : tf.Tensor

    • Attribution maps of shape (B, H, W) or (B, H, W, C), where:
        • B: batch size
        • H, W: spatial dimensions
        • C: channels (optional, averaged if present)

Return

  • np.ndarray

    • Entropy values of shape (B,), one per sample. Values are non-negative, with a theoretical maximum of log(H*W) for uniform distributions.
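As a point of reference, the behaviour documented above (channel averaging, normalization of absolute attributions, Shannon entropy per sample) can be sketched in plain NumPy. This is an illustrative re-implementation under those assumptions, not the library's code; the complexity_per_sample name and the random maps are made up for the example.

import numpy as np

def complexity_per_sample(explanations: np.ndarray) -> np.ndarray:
    """Shannon entropy of normalized absolute attributions, one value per sample."""
    a = np.asarray(explanations, dtype=np.float64)
    if a.ndim == 4:                                  # (B, H, W, C): average the channel axis
        a = a.mean(axis=-1)
    a = np.abs(a).reshape(a.shape[0], -1)            # (B, H*W)
    p = a / a.sum(axis=1, keepdims=True)             # fractional contributions per sample
    p = np.where(p > 0, p, 1.0)                      # treat 0 * log(0) as 0
    return -(p * np.log(p)).sum(axis=1)              # shape (B,)

# example: a batch of 8 random "attribution maps" of shape (8, 32, 32, 3)
maps = np.random.rand(8, 32, 32, 3)
print(complexity_per_sample(maps).shape)  # (8,)
# values lie between 0 and log(32 * 32) ≈ 6.93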