Projections

In example-based explainability, one often needs to define a notion of similarity (distance) between samples. However, the original feature space may not be the most suitable space in which to define this similarity. For instance, two images can be very similar in terms of their pixel values but very different in terms of their semantic content. In addition, computing distances in the original feature space does not take the model into account whatsoever, which calls into question the explainability of the method.

To address these issues, one can project the samples into a new space where the distances between samples are more meaningful with respect to the model's decision. Two approaches are commonly used to define this projection space: (1) use a latent space and (2) use a feature weighting scheme.
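As a minimal illustration of why the choice of space matters, the sketch below compares nearest neighbors before and after a feature weighting. The data and the weights are made up for the example; in practice the weights would come from the model.

```python
import numpy as np

def nearest(query, candidates):
    """Index of the candidate closest to the query (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(candidates - query, axis=1)))

query = np.array([0.0, 0.0])
candidates = np.array([[1.0, 0.0],    # differs from the query on feature 0
                       [0.0, 3.0]])   # differs from the query on feature 1

# Hypothetical feature weights: the model is assumed to rely mostly on feature 0.
weights = np.array([10.0, 0.1])

nearest_raw = nearest(query, candidates)
nearest_weighted = nearest(query * weights, candidates * weights)
print(nearest_raw, nearest_weighted)
```

In the original space the first candidate is the closest, whereas after weighting the second one is, because it agrees with the query on the feature assumed to matter to the model.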

Consequently, we define the general Projection class, which serves as the base class for all projection methods. This class supports either or both of the aforementioned approaches: one can instantiate a Projection object with a space_projection method, which defines a projection from the feature space to a space of interest, and a get_weights method, which defines the feature weighting scheme. The Projection class then projects a sample with the space_projection method and weights the projected sample's features with the get_weights method.
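The two-step logic (project, then weight) can be sketched with NumPy as follows. This is a toy stand-in written for illustration, not the library's implementation: the function toy_project and the helpers first_two and constant_weights are hypothetical names.

```python
import numpy as np

def toy_project(inputs, space_projection=None, get_weights=None, targets=None):
    """Toy stand-in for the project-then-weight logic (not the library code)."""
    # Step 1: move the samples to the space of interest (identity if none is given).
    projected = inputs if space_projection is None else space_projection(inputs)
    # Step 2: weight the projected features (no weighting if none is given).
    if get_weights is None:
        return projected
    # get_weights is either a callable taking (projected inputs, targets) or an array.
    weights = get_weights(projected, targets) if callable(get_weights) else get_weights
    return projected * weights

def first_two(samples):
    """Hypothetical projection: keep the first two features of each sample."""
    return samples[:, :2]

# Hypothetical weighting: constant per-feature weights in the projected space.
constant_weights = np.array([1.0, 0.5])

samples = np.array([[2.0, 4.0, 6.0],
                    [1.0, 3.0, 5.0]])
projected_samples = toy_project(samples, space_projection=first_two,
                                get_weights=constant_weights)
print(projected_samples)
```

Leaving space_projection out gives a pure weighting scheme, and leaving get_weights out gives a pure projection, which matches the "one or both" behavior described above.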

In addition, we provide concrete implementations of the Projection class: LatentSpaceProjection, AttributionProjection, and HadamardProjection.

Projection

Base class used by BaseExampleMethod to project samples to a meaningful space for the model to explain.

__init__(self,
         get_weights: Union[Callable, tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None,
         space_projection: Optional[Callable] = None,
         device: Optional[str] = None,
         mappable: bool = False,
         requires_targets: bool = False)

Parameters

  • get_weights : Union[Callable, tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Either a Tensor or a Callable.

      - In the case of a Tensor, weights are applied in the projected space.

      - In the case of a callable, a function is expected.

      It should take inputs and targets as parameters and return the weights (Tensor).

      Weights should have the same shape as the input (the number of channels may differ).

      The inputs of get_weights() correspond to the projected inputs.

      Example of get_weights() function:

          def get_weights_example(projected_inputs: Union[tf.Tensor, np.ndarray],
                                  targets: Optional[Union[tf.Tensor, np.ndarray]] = None):
              '''
              Example of function to get weights,
              projected_inputs are the elements for which weights are computed.
              targets are optional additional parameters for weights computation.
              '''
              weights = ...  # do some magic with inputs and targets, it should use the model.
              return weights

  • space_projection : Optional[Callable] = None

    • Callable that takes samples and returns a Tensor in the projected space.

      An example of projected space is the latent space of a model. See LatentSpaceProjection.

  • device : Optional[str] = None

    • Device to use for the projection, if None, use the default device.

  • mappable : bool = False

    • If True, the projection can be applied to a tf.data.Dataset through Dataset.map.

      Otherwise, the dataset projection will be done through a loop.

      Wrapped PyTorch models are not mappable.

      If you encounter errors in the project_dataset method, you can set it to False.

project(self,
        inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
        targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)

Project samples into a space meaningful for the model, either by weighting the inputs, by projecting them into a latent space, or both. This function should be called at initialization and for each explanation.

Parameters

  • inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Tensor or Array. Input samples to be explained.

      Expected shape among (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Additional parameter for self.get_weights function.

Return

  • projected_samples

    • The samples projected in the new space.


project_dataset(self,
                cases_dataset: tf.Dataset,
                targets_dataset: Optional[tf.Dataset] = None) -> Optional[tf.Dataset]

Apply the projection to a dataset through Dataset.map

Parameters

  • cases_dataset : tf.Dataset

    • Dataset of samples to be projected.

  • targets_dataset : Optional[tf.Dataset] = None

    • Dataset of targets for the samples.

Return

  • projected_dataset : Optional[tf.Dataset]

    • The projected dataset.


Info

The __call__ method is an alias for the project method.

Defining a custom projection

To define a custom projection, one needs to implement the space_projection and/or get_weights methods. The space_projection method should return the projected sample, and the get_weights method should return the weights of the features of the projected sample.

Info

The get_weights method should take as input the original sample once it has been projected using the space_projection method.

For the sake of clarity, we provide an example of a custom projection that projects the samples into a latent space (the final convolution block of the ResNet50 model) and weights the features with the gradients of the model's output with respect to the inputs once they have gone through the layers until the final convolutional layer.

import tensorflow as tf
from xplique.attributions import Saliency
from xplique.example_based.projections import Projection

# load the model
model = tf.keras.applications.ResNet50(weights="imagenet", include_top=True)

latent_layer = model.get_layer("conv5_block3_out") # output of the final convolutional block
features_extractor = tf.keras.Model(
    model.input, latent_layer.output, name="features_extractor"
)

# reconstruct the second part of the ResNet50 model (from the latent layer to the output)
second_input = tf.keras.Input(shape=latent_layer.output.shape[1:])

x = second_input
layer_found = False
for layer in model.layers:
    if layer_found:
        x = layer(x)
    if layer == latent_layer:
        layer_found = True

predictor = tf.keras.Model(
    inputs=second_input,
    outputs=x,
    name="predictor"
)

# build the custom projection
space_projection = features_extractor
get_weights = Saliency(predictor)

custom_projection = Projection(space_projection=space_projection, get_weights=get_weights, mappable=False)

# build random samples
rdm_imgs = tf.random.normal((5, 224, 224, 3))
rdm_targets = tf.random.uniform(shape=[5], minval=0, maxval=1000, dtype=tf.int32)
rdm_targets = tf.one_hot(rdm_targets, depth=1000)

# project the samples
projected_samples = custom_projection(rdm_imgs, rdm_targets)

LatentSpaceProjection

Projection that projects inputs into the model's latent space. It does not apply any weighting.

__init__(self,
         model: Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module')],
         latent_layer: Union[str, int] = -1,
         device: Union[ForwardRef('torch.device'), str] = None,
         mappable: bool = True)

Parameters

  • model : Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module')]

    • The model from which we want to obtain explanations.

      It will be split if a latent_layer is provided.

      Otherwise, it should be a tf.keras.Model.

      It is recommended to split it manually and provide the first part of the model directly.

  • latent_layer : Union[str, int] = -1

    • Layer used to split the model.

      If an int is provided it will be interpreted as a layer index.

      If a string is provided it will look for the layer name.

      To separate after the last convolution, "last_conv" can be used.

      Otherwise, -1 could be used for the last layer before softmax.

  • device : Union[ForwardRef('torch.device'), str] = None

    • Device to use for the projection, if None, use the default device.

      Only used for PyTorch models. Ignored for TensorFlow models.

  • mappable : bool = True

    • Used only if no latent_layer is provided, i.e. if the model is already split.

      Whether the model can be placed in a tf.data.Dataset mapping function.

      Wrapped PyTorch models are not mappable.

      If you encounter errors in the project_dataset method, you can set it to False.
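To picture what splitting a model at latent_layer means, here is a pure-Python sketch in which a "model" is just a list of named functions. The helpers compose and split_model are hypothetical and only mimic the index-or-name selection described above; they do not handle "last_conv" and are not how the library operates on Keras or PyTorch models.

```python
from functools import reduce
from typing import Callable, List, Tuple, Union

Layer = Tuple[str, Callable[[float], float]]  # toy layer: (name, function)

def compose(layers: List[Layer]) -> Callable[[float], float]:
    """Chain the layer functions in order."""
    return lambda x: reduce(lambda value, layer: layer[1](value), layers, x)

def split_model(layers: List[Layer], latent_layer: Union[str, int]):
    """Split after `latent_layer`, given as an index (int) or a layer name (str)."""
    names = [name for name, _ in layers]
    if isinstance(latent_layer, int):
        index = latent_layer % len(layers)  # supports negative indices
    else:
        index = names.index(latent_layer)   # raises ValueError for unknown names
    return compose(layers[: index + 1]), compose(layers[index + 1:])

toy_model = [
    ("scale", lambda x: 2.0 * x),
    ("shift", lambda x: x + 1.0),
    ("square", lambda x: x * x),
]
features_extractor, predictor = split_model(toy_model, "shift")

latent = features_extractor(3.0)  # (2 * 3) + 1
output = predictor(latent)        # latent squared, same as running the full model
print(latent, output)
```

The first part plays the role of the projection to the latent space, and chaining both parts reproduces the full model's output.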

project(self,
        inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
        targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)

Project samples into a space meaningful for the model, either by weighting the inputs, by projecting them into a latent space, or both. This function should be called at initialization and for each explanation.

Parameters

  • inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Tensor or Array. Input samples to be explained.

      Expected shape among (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Additional parameter for self.get_weights function.

Return

  • projected_samples

    • The samples projected in the new space.


project_dataset(self,
                cases_dataset: tf.Dataset,
                targets_dataset: Optional[tf.Dataset] = None) -> Optional[tf.Dataset]

Apply the projection to a dataset through Dataset.map

Parameters

  • cases_dataset : tf.Dataset

    • Dataset of samples to be projected.

  • targets_dataset : Optional[tf.Dataset] = None

    • Dataset of targets for the samples.

Return

  • projected_dataset : Optional[tf.Dataset]

    • The projected dataset.


AttributionProjection

Projection built on an attribution function to provide local projections. This class is used as the projection of the Cole similar examples method.

__init__(self,
         model: Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module')],
         attribution_method: xplique.attributions.base.BlackBoxExplainer = ,
         latent_layer: Union[str, int, None] = None,
         **attribution_kwargs)

Parameters

  • model : Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module')]

    • The model from which we want to obtain explanations.

  • latent_layer : Union[str, int, None] = None

    • Layer used to split the model, the first part will be used for projection and the second to compute the attributions. By default, the model is not split.

      For such split, the model should be a tf.keras.Model.

      If an int is provided it will be interpreted as a layer index.

      If a string is provided it will look for the layer name.

      The method as described in the paper applies the separation at the last convolutional layer.

      To do so, use "last_conv", which selects that layer.

      Otherwise, -1 could be used for the last layer before softmax.

  • attribution_method : xplique.attributions.base.BlackBoxExplainer =

    • Class of the attribution method to use for projection.

      It should inherit from xplique.attributions.base.BlackBoxExplainer.

      Ignored if a projection is given.

  • attribution_kwargs : **attribution_kwargs

    • Parameters to be passed at the construction of the attribution_method.

project(self,
        inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
        targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)

Project samples into a space meaningful for the model, either by weighting the inputs, by projecting them into a latent space, or both. This function should be called at initialization and for each explanation.

Parameters

  • inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Tensor or Array. Input samples to be explained.

      Expected shape among (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Additional parameter for self.get_weights function.

Return

  • projected_samples

    • The samples projected in the new space.


project_dataset(self,
                cases_dataset: tf.Dataset,
                targets_dataset: Optional[tf.Dataset] = None) -> Optional[tf.Dataset]

Apply the projection to a dataset through Dataset.map

Parameters

  • cases_dataset : tf.Dataset

    • Dataset of samples to be projected.

  • targets_dataset : Optional[tf.Dataset] = None

    • Dataset of targets for the samples.

Return

  • projected_dataset : Optional[tf.Dataset]

    • The projected dataset.


HadamardProjection

Projection built on the latent space and the gradient. This class is used as the projection of the Cole similar examples method.

__init__(self,
         model: Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module'), None] = None,
         latent_layer: Union[str, int, None] = None,
         operator: Optional[Callable[[keras.src.engine.training.Model, tensorflow.python.framework.tensor.Tensor, tensorflow.python.framework.tensor.Tensor], float]] = None,
         device: Union[ForwardRef('torch.device'), str] = None,
         features_extractor: Optional[keras.src.engine.training.Model] = None,
         predictor: Optional[keras.src.engine.training.Model] = None,
         mappable: bool = True)

Parameters

  • model : Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module'), None] = None

    • The model from which we want to obtain explanations.

      It can be split manually outside of the projection and provided as two models: the features_extractor and the predictor. In this case, model should be None.

      It is recommended to split it manually.

  • latent_layer : Union[str, int, None] = None

    • Layer used to split the model, the first part will be used for projection and the second to compute the attributions. By default, the model is not split.

      For such split, the model should be a tf.keras.Model.

      Ignored if model is None, i.e. if a split model is provided through the features_extractor and the predictor.

      If an int is provided it will be interpreted as a layer index.

      If a string is provided it will look for the layer name.

      The method as described in the paper applies the separation at the last convolutional layer.

      To do so, use "last_conv", which selects that layer.

      Otherwise, -1 could be used for the last layer before softmax.

  • operator : Optional[Callable[[keras.src.engine.training.Model, tensorflow.python.framework.tensor.Tensor, tensorflow.python.framework.tensor.Tensor], float]] = None

    • Operator to use to compute the explanation, if None use standard predictions.

      The default operator is the classification operator with online targets computations.

      For more information, refer to the Attribution documentation.

  • device : Union[ForwardRef('torch.device'), str] = None

    • Device to use for the projection, if None, use the default device.

      Only used for PyTorch models. Ignored for TensorFlow models.

  • features_extractor : Optional[keras.src.engine.training.Model] = None

    • The feature extraction part of the model. Mapping inputs to the latent space.

      Used to provide the first part of a split model.

      It cannot be provided if a model is provided. It should be provided with a predictor.

  • predictor : Optional[keras.src.engine.training.Model] = None

    • The prediction part of the model. Mapping the latent space to the outputs.

      Used to provide the second part of a split model.

      It cannot be provided if a model is provided.

      It should be provided with a features_extractor.

  • mappable : bool = True

    • Whether the model parts can be placed in a tf.data.Dataset mapping function.

      Wrapped PyTorch models are not mappable.

      If you encounter errors in the project_dataset method, you can set it to False.

      Used only for a split model, i.e. if model is None.
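The idea of combining the latent space with the gradient can be sketched with NumPy, under the assumption (suggested by the class name) that the two are combined through an element-wise, i.e. Hadamard, product. The two-part model below is a made-up linear toy, chosen so that the gradient of a class score with respect to the latent features is known exactly; none of these names come from the library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-part model: both parts are linear, so gradients are exact and easy to check.
extractor_weights = rng.normal(size=(4, 3))  # inputs (4 features) -> latent space (3 features)
predictor_weights = rng.normal(size=(3, 2))  # latent space -> 2 class scores

def features_extractor(inputs):
    return inputs @ extractor_weights

def predictor(latent):
    return latent @ predictor_weights

def latent_gradients(latent, targets):
    # For a linear predictor, d(target class score)/d(latent) is the matching
    # column of predictor_weights, whatever the latent value is.
    return targets @ predictor_weights.T

inputs = rng.normal(size=(5, 4))
targets = np.eye(2)[[0, 1, 1, 0, 1]]  # one-hot targets for 5 samples

latent = features_extractor(inputs)
hadamard_projection = latent * latent_gradients(latent, targets)
print(hadamard_projection.shape)
```

In this linear toy case, the projected features of each sample sum to its target class score, which shows how the weighting ties the projected space to the model's decision. With a real, non-linear predictor the gradient depends on the sample and would be computed by automatic differentiation.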

project(self,
        inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
        targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)

Project samples into a space meaningful for the model, either by weighting the inputs, by projecting them into a latent space, or both. This function should be called at initialization and for each explanation.

Parameters

  • inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Tensor or Array. Input samples to be explained.

      Expected shape among (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Additional parameter for self.get_weights function.

Return

  • projected_samples

    • The samples projected in the new space.


project_dataset(self,
                cases_dataset: tf.Dataset,
                targets_dataset: Optional[tf.Dataset] = None) -> Optional[tf.Dataset]

Apply the projection to a dataset through Dataset.map

Parameters

  • cases_dataset : tf.Dataset

    • Dataset of samples to be projected.

  • targets_dataset : Optional[tf.Dataset] = None

    • Dataset of targets for the samples.

Return

  • projected_dataset : Optional[tf.Dataset]

    • The projected dataset.