
COLE: Contributions Oriented Local Explanations

View colab tutorial | View source | 📰 Paper

COLE, for Contributions Oriented Local Explanations, was introduced by Kenny & Keane in 2019.

Quote

Our method COLE is based on the premise that the contributions of features in a model’s classification represent the most sensible basis to inform case-based explanations.

-- COLE paper [1]

The core idea of the COLE approach is to use attribution maps to define a relevant search space for the K-Nearest Neighbors (KNN) search.

More specifically, the COLE approach is based on the following steps:

  • (1) Given an input sample \(x\), compute the attribution map \(A(x)\)

  • (2) Consider the projection space defined by: \(p: x \rightarrow A(x) \odot x\) (\(\odot\) denotes the element-wise product)

  • (3) Perform a KNN search in the projection space to find the most similar training samples
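
As an illustration, here is a minimal sketch of these three steps with NumPy and scikit-learn, where attribution_fn stands for any attribution method returning \(A(x)\) (this is not the library's implementation):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def cole_search(attribution_fn, train_x, test_x, k=3):
    # steps (1) and (2): project each sample into the space A(x) ⊙ x
    def project(x):
        return (attribution_fn(x) * x).reshape(len(x), -1)

    # step (3): KNN search in the projection space
    knn = NearestNeighbors(n_neighbors=k).fit(project(train_x))
    _, indices = knn.kneighbors(project(test_x))
    return train_x[indices]  # the k most similar training samples per input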

Info

In the original paper, the authors focused on Multi-Layer Perceptrons (MLP) and four attribution methods (Hadamard, LRP, Integrated Gradients, and DeepLift). We decided to implement a COLE method that generalizes to a broader range of neural networks and attribution methods (see the API Attributions documentation for the list of available methods).

Tips

The original paper showed that the Hadamard product between the latent space and the gradient was the best-performing method. Hence, we optimized the code for this method: setting the attribution_method argument to "gradient" will run much faster.
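
For intuition, the following is a minimal sketch of this Hadamard projection, assuming a model already split into a features_extractor and a head (this is not Xplique's internal implementation):

import tensorflow as tf

# sketch: Hadamard product between the latent representation and the
# gradient of the predicted-class score with respect to that representation
def hadamard_projection(features_extractor, head, x):
    latent = features_extractor(x)  # first part of the split model
    with tf.GradientTape() as tape:
        tape.watch(latent)
        scores = tf.reduce_max(head(latent), axis=-1)  # predicted-class score
    grads = tape.gradient(scores, latent)
    return latent * grads  # element-wise (Hadamard) product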

Example

from xplique.example_based import Cole

# load the training dataset and the model
cases_dataset = ... # load the training dataset
model = ... # load the model

# load the test samples
test_samples = ... # load the test samples to search for

# parameters
k = 3
case_returns = "all"  # elements returned by the explain function
distance = "euclidean"
attribution_method = "gradient"
latent_layer = "last_conv"  # where to split your model for the projection

# instantiate the Cole object
cole = Cole(
    cases_dataset=cases_dataset,
    model=model,
    k=k,
    attribution_method=attribution_method,
    latent_layer=latent_layer,
    case_returns=case_returns,
    distance=distance,
)

# search the most similar samples with the COLE method
similar_samples = cole.explain(
    inputs=test_samples,
    targets=None,  # not necessary with default operator, they are computed internally
)
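
Since case_returns is set to "all" above, explain returns a dictionary. The exact keys are given by the returns property of the base class; "examples" and "distances" are assumed here for illustration:

# access the returned elements ("examples" and "distances" are assumed keys,
# see the base class returns property for the exhaustive list)
examples = similar_samples["examples"]
distances = similar_samples["distances"]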

Notebooks

  • See the colab tutorial linked at the top of this page.

Cole

Cole is a similar-examples method that retrieves the examples most similar to a query in a specific projection space. Cole uses the model to build this search space so that distances are meaningful to the model. It uses attribution methods to weight the inputs; for high-dimensional data such as images, these attributions may be computed in the latent space.

__init__(self,
         cases_dataset: ~DatasetOrTensor,
         model: Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module')],
         labels_dataset: Optional[~DatasetOrTensor] = None,
         targets_dataset: Optional[~DatasetOrTensor] = None,
         k: int = 1,
         distance: Union[str, Callable] = 'euclidean',
         case_returns: Union[List[str], str, None] = 'examples',
         batch_size: Optional[int] = None,
         latent_layer: Union[str, int, None] = None,
         attribution_method: Union[str, Type[xplique.attributions.base.BlackBoxExplainer]] = 'gradient',
         **attribution_kwargs)

Parameters

  • cases_dataset : ~DatasetOrTensor

    • The dataset used to train the model, examples are extracted from this dataset.

      All datasets (cases, labels, and targets) should be of the same type.

      Supported types are: tf.data.Dataset, torch.utils.data.DataLoader, tf.Tensor, np.ndarray, torch.Tensor.

      For datasets with multiple columns, the first column is assumed to be the cases, the second the labels, and the third the targets.

      Warning: datasets tend to reshuffle at each iteration; ensure the datasets are not reshuffled, as indices into the dataset are used (see the dataset sketch after this parameter list).

  • labels_dataset : Optional[~DatasetOrTensor] = None

    • Labels associated with the examples in the cases_dataset.

      It should have the same type as cases_dataset.

  • targets_dataset : Optional[~DatasetOrTensor] = None

    • Targets associated with the cases_dataset for dataset projection, oftentimes the one-hot encoding of the model's predictions. See projection for details.

      It should have the same type as cases_dataset.

      It is not necessary for all projections.

      Furthermore, projections that require it compute it internally by default.

  • k : int = 1

    • The number of examples to retrieve per input.

  • distance : Union[str, Callable] = 'euclidean'

    • Distance function for the examples search. It can be a string in {"manhattan", "euclidean", "cosine", "chebyshev", "inf"} or a Callable; "euclidean" by default.

  • case_returns : Union[List[str], str, None] = 'examples'

    • String or list of string with the elements to return in self.explain().

      See the base class returns property for details.

  • batch_size : Optional[int] = None

    • Number of samples treated simultaneously for projection and search.

      Ignored if cases_dataset is a batched tf.data.Dataset or a batched torch.utils.data.DataLoader.

  • latent_layer : Union[str, int, None] = None

    • Layer used to split the model: the first part is used for projection and the second to compute the attributions. By default, the model is not split.

      For such a split, the model should be a tf.keras.Model.

      If an int is provided, it is interpreted as a layer index.

      If a string is provided, the layer is looked up by name.

      The method as described in the paper applies the split at the last convolutional layer; passing "last_conv" will extract it automatically.

      Otherwise, -1 can be used for the last layer before the softmax.

  • attribution_method : Union[str, Type[xplique.attributions.base.BlackBoxExplainer]] = 'gradient'

    • Class of the attribution method to use for projection.

      It should inherit from xplique.attributions.base.BlackBoxExplainer.

      It can also be "gradient" to compute the Hadamard product between the latent representation and the gradient.

      It was deemed the best method in the original paper, and we optimized it for speed.

      By default, it is set to "gradient".

  • attribution_kwargs : **attribution_kwargs

    • Parameters to be passed for the construction of the attribution_method.
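
Regarding the reshuffling warning on cases_dataset above, here is a minimal tf.data sketch that keeps the order fixed across iterations (train_images is a placeholder array):

import tensorflow as tf

# shuffle once with a fixed seed, then keep the order stable across
# iterations so that indices into the dataset remain valid
cases_dataset = (
    tf.data.Dataset.from_tensor_slices(train_images)  # train_images: placeholder
    .shuffle(buffer_size=1024, seed=0, reshuffle_each_iteration=False)
    .batch(64)
)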

explain(self,
        inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
        targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)

Return the relevant examples to explain the (inputs, targets) pair. It projects inputs into the search space with self.projection and finds examples with self.search_method.

Parameters

  • inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Tensor or Array. Input samples to be explained.

      Expected shape among (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Targets associated to the inputs for projection.

      Shape: (n, nb_classes) where n is the number of samples and nb_classes is the number of classes.

      It is used in the projection, but the projection can compute it internally (see the sketch below).
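
For instance, one-hot targets with the expected (n, nb_classes) shape can be built as follows (a NumPy sketch with hypothetical class indices):

import numpy as np

# hypothetical predicted classes for two samples of a 10-class model
predicted_classes = np.array([3, 7])
targets = np.eye(10)[predicted_classes]  # shape (2, 10), one-hot encoded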

Return

  • return_dict

    • Dictionary with the elements listed in self.returns.

      The elements that can be returned are defined with the _returns_possibilities static attribute of the class.