Integrated Gradients¶
View colab tutorial · View source · 📰 Paper
Integrated Gradients is a visualization technique resulting from a theoretical search for an explanatory method that satisfies two axioms, Sensitivity and Implementation Invariance (Sundararajan et al.^{1}).
Quote
We consider the straightline path (in \(R^n\)) from the baseline \(\bar{x}\) to the input \(x\), and compute the gradients at all points along the path. Integrated gradients are obtained by cumulating these gradients.
Rather than calculating only the gradient relative to the image, the method consists of averaging the gradient values along the path from a baseline state to the current value. The baseline state is often set to zero, representing the complete absence of features.
More precisely, with \(x_0\) the baseline state, \(x\) the image and \(f\) our classifier, the Integrated Gradients attribution is defined as

\[ IG_i(x) = (x_i - x_{0,i}) \int_0^1 \frac{\partial f\big(x_0 + \alpha (x - x_0)\big)}{\partial x_i} \, d\alpha \]
In order to approximate the integral from a finite number of steps, the implementation here uses the trapezoidal rule^{3} rather than a left-Riemann summation, which allows for more accurate results and improved performance (see the paper below for a comparison of the methods^{2}).
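As a minimal sketch of the idea (not Xplique's internal implementation), the path integral can be approximated with the trapezoidal rule in plain NumPy; `f` here is a hypothetical stand-in for a differentiable model, chosen so its gradient is analytic:

```python
import numpy as np

# Toy differentiable "model": f(x) = sum(x^2), with analytic gradient 2x.
def f(x):
    return np.sum(x ** 2)

def grad_f(x):
    return 2.0 * x

def integrated_gradients(x, baseline, steps=50):
    # Interpolation coefficients alpha evenly spaced on [0, 1].
    alphas = np.linspace(0.0, 1.0, steps)
    # Gradients at every point of the straight-line path baseline -> x.
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    # Trapezoidal rule: average consecutive gradient pairs, then scale
    # by the input difference (x - baseline).
    avg_grads = ((grads[:-1] + grads[1:]) / 2.0).mean(axis=0)
    return (x - baseline) * avg_grads

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros_like(x)
ig = integrated_gradients(x, baseline, steps=200)
# Completeness axiom: the attributions sum to f(x) - f(baseline).
```

The final check illustrates the completeness property that Integrated Gradients satisfies by construction: the attributions sum to the difference between the model's output at the input and at the baseline.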
Example¶
from xplique.attributions import IntegratedGradients
# load images, labels and model
# ...
method = IntegratedGradients(model, steps=50, baseline_value=0.0)
explanations = method.explain(images, labels)
Notebooks¶
IntegratedGradients¶
Used to compute the Integrated Gradients, by cumulating the gradients along a path from a
baseline to the desired point.
__init__(self,
model: keras.src.engine.training.Model,
output_layer: Union[str, int, None] = None,
batch_size: Optional[int] = 32,
operator: Union[xplique.commons.operators_operations.Tasks, str,
Callable[[keras.src.engine.training.Model, tf.Tensor, tf.Tensor], float], None] = None,
reducer: Optional[str] = 'mean',
steps: int = 50,
baseline_value: float = 0.0)
¶
Parameters

model : keras.src.engine.training.Model
The model from which we want to obtain explanations

output_layer : Union[str, int, None] = None
Layer to target for the outputs (e.g. logits or after softmax).
If an int is provided, it will be interpreted as a layer index. If a string is provided, it will look for the layer with that name. Defaults to the last layer.
It is recommended to use the layer before the softmax.

batch_size : Optional[int] = 32
Number of inputs to explain at once, if None compute all at once.

operator : Union[xplique.commons.operators_operations.Tasks, str, Callable[[keras.src.engine.training.Model, tf.Tensor, tf.Tensor], float], None] = None
Function g to explain; g takes 3 parameters (f, x, y) and should return a scalar, with f the model, x the inputs and y the targets. If None, uses the standard operator g(f, x, y) = f(x)[y].

reducer : Optional[str] = 'mean'
Name of the reducer to use. Either "min", "mean", "max", "sum", or None to ignore. Used only for images, to obtain explanations with shape (n, h, w, 1).

steps : int = 50
Number of points to interpolate between the baseline and the desired point.

baseline_value : float = 0.0
Scalar used to create the baseline point.
explain(self,
inputs: Union[tf.Dataset, tf.Tensor, np.ndarray],
targets: Union[tf.Tensor, np.ndarray, None] = None) -> tf.Tensor
¶
Compute the explanations of the given inputs.
Accepts Tensor, numpy array or tf.data.Dataset (in that case, targets is None).
Parameters

inputs : Union[tf.Dataset, tf.Tensor, np.ndarray]
Dataset, Tensor or Array. Input samples to be explained.
If Dataset, targets should not be provided (included in Dataset).
Expected shape among (N, W), (N, T, W), (N, H, W, C).
More information in the documentation.

targets : Union[tf.Tensor, np.ndarray, None] = None
Tensor or Array. One-hot encoding of the model's output from which an explanation is desired. One encoding per input and only one output at a time. Therefore, the expected shape is (N, output_size).
More information in the documentation.
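For illustration, such one-hot targets can be built from class indices with plain NumPy (a generic sketch, not an Xplique helper; `one_hot` and its arguments are hypothetical names):

```python
import numpy as np

def one_hot(labels, num_classes):
    # One row per input, a single 1.0 at each input's target class,
    # giving the expected shape (N, output_size).
    targets = np.zeros((len(labels), num_classes), dtype=np.float32)
    targets[np.arange(len(labels)), labels] = 1.0
    return targets

# Two inputs, a 4-class model: explain class 2 for the first input,
# class 0 for the second.
targets = one_hot([2, 0], num_classes=4)
```

In a TensorFlow workflow, `tf.one_hot(labels, depth)` or `tf.keras.utils.to_categorical` produce the same encoding.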
Return

explanations : tf.Tensor
Explanation generated by the method.