COLE: Contributions Oriented Local Explanations
COLE, for Contributions Oriented Local Explanations, was introduced by Kenny & Keane in 2019.
Quote
Our method COLE is based on the premise that the contributions of features in a model’s classification represent the most sensible basis to inform case-based explanations.
-- COLE paper
The core idea of the COLE approach is to use attribution maps to define a relevant search space for the K-Nearest Neighbors (KNN) search.
More specifically, the COLE approach is based on the following steps:
- (1) Given an input sample \(x\), compute the attribution map \(A(x)\)
- (2) Consider the projection space defined by \(p: x \rightarrow A(x) \odot x\), where \(\odot\) denotes the element-wise product
- (3) Perform a KNN search in the projection space to find the most similar training samples
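The sketch below is a conceptual outline of steps (2) and (3) in plain NumPy, not the library implementation; `attribution_fn` is a hypothetical function returning \(A(x)\) with the same shape as \(x\).

```python
import numpy as np

def project(x, attribution_fn):
    # Step (2): p(x) = A(x) ⊙ x, element-wise product of the attribution map and the input
    return attribution_fn(x) * x

def knn_search(query, train_samples, attribution_fn, k=3):
    # Step (3): KNN search in the projection space, here with a Euclidean distance
    q = project(query, attribution_fn).ravel()
    projected = np.stack([project(s, attribution_fn).ravel() for s in train_samples])
    distances = np.linalg.norm(projected - q[None, :], axis=1)
    return np.argsort(distances)[:k]  # indices of the k most similar training samples
```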
Info
In the original paper, the authors focused on Multi-Layer Perceptrons (MLP) and several attribution methods (Hadamard, LRP, Integrated Gradients, and DeepLift). We decided to implement a COLE method that generalizes to a broader range of neural networks and attribution methods (see the API Attributions documentation for the list of available methods).
Tips
The original paper showed that the Hadamard product between the latent space and the gradient was the best method, hence we optimized the code for it. Setting the attribution_method argument to "gradient" will run much faster.
Example
from xplique.example_based import Cole
# load the training dataset and the model
cases_dataset = ... # load the training dataset
model = ... # load the model
# load the test samples
test_samples = ... # load the test samples to search for
# parameters
k = 3
case_returns = "all" # elements returned by the explain function
distance = "euclidean"
attribution_method = "gradient"
latent_layer = "last_conv" # where to split your model for the projection
# instantiate the Cole object
cole = Cole(
cases_dataset=cases_dataset,
model=model,
k=k,
attribution_method=attribution_method,
latent_layer=latent_layer,
case_returns=case_returns,
distance=distance,
)
# search the most similar samples with the COLE method
similar_samples = cole.explain(
inputs=test_samples,
targets=None, # not necessary with default operator, they are computed internally
)
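Because `case_returns` is set to `"all"` in the example above, `explain` returns a dictionary rather than a bare tensor of examples. The snippet below is a hedged usage sketch: the exact key names come from the class's `_returns_possibilities` attribute, and `"examples"` and `"distances"` are assumed here.

```python
# similar_samples is a dictionary since case_returns="all" was requested
examples = similar_samples["examples"]    # retrieved neighbors (assumed key name)
distances = similar_samples["distances"]  # distances in the projection space (assumed key name)
print(examples.shape, distances.shape)
```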
Cole
Cole is a similar-examples method that gives the most similar examples
to a query in a specific projection space.
Cole uses the model to build a search space so that distances are meaningful for the model.
It uses attribution methods to weight inputs.
Those attributions may be computed in the latent space for high-dimensional data such as images.
__init__(self,
cases_dataset: ~DatasetOrTensor,
model: Union[keras.src.engine.training.Model, ForwardRef('torch.nn.Module')],
labels_dataset: Optional[~DatasetOrTensor] = None,
targets_dataset: Optional[~DatasetOrTensor] = None,
k: int = 1,
distance: Union[str, Callable] = 'euclidean',
case_returns: Union[List[str], str, None] = 'examples',
batch_size: Optional[int] = None,
latent_layer: Union[str, int, None] = None,
attribution_method: Union[str, Type[xplique.attributions.base.BlackBoxExplainer]] = 'gradient',
**attribution_kwargs)
Parameters

- cases_dataset : ~DatasetOrTensor
  The dataset used to train the model; examples are extracted from this dataset.
  All datasets (cases, labels, and targets) should be of the same type.
  Supported types are: tf.data.Dataset, torch.utils.data.DataLoader, tf.Tensor, np.ndarray, and torch.Tensor.
  For datasets with multiple columns, the first column is assumed to be the cases, the second the labels, and the third the targets.
  Warning: datasets tend to reshuffle at each iteration; ensure the datasets are not reshuffled, as elements are accessed by index.
- labels_dataset : Optional[~DatasetOrTensor] = None
  Labels associated with the examples in the cases_dataset.
  It should have the same type as cases_dataset.
- targets_dataset : Optional[~DatasetOrTensor] = None
  Targets associated with the cases_dataset for dataset projection, oftentimes the one-hot encoding of a model's predictions. See projection for details.
  It should have the same type as cases_dataset.
  It is not necessary for all projections; projections that require it compute it internally by default.
- k : int = 1
  The number of examples to retrieve per input.
- distance : Union[str, Callable] = 'euclidean'
  Distance function for the examples search. It can be an integer, a string in {"manhattan", "euclidean", "cosine", "chebyshev", "inf"}, or a Callable; by default "euclidean" (see the sketch after this parameter list for a custom Callable).
- case_returns : Union[List[str], str, None] = 'examples'
  String or list of strings with the elements to return in self.explain().
  See the base class returns property for details.
- batch_size : Optional[int] = None
  Number of samples treated simultaneously for projection and search.
  Ignored if cases_dataset is a batched tf.data.Dataset or a batched torch.utils.data.DataLoader.
- latent_layer : Union[str, int, None] = None
  Layer used to split the model: the first part is used for projection and the second to compute the attributions. By default, the model is not split.
  For such a split, the model should be a tf.keras.Model.
  If an int is provided, it is interpreted as a layer index; if a string is provided, the layer is looked up by name.
  The method as described in the paper applies the split at the last convolutional layer; the "last_conv" value will extract it. Otherwise, -1 can be used for the last layer before softmax.
- attribution_method : Union[str, Type[xplique.attributions.base.BlackBoxExplainer]] = 'gradient'
  Class of the attribution method to use for projection.
  It should inherit from xplique.attributions.base.BlackBoxExplainer.
  It can also be "gradient" to use the Hadamard product between the latent representation and the gradient, which was deemed the best method in the original paper and which we optimized for speed.
  By default, it is set to "gradient". A sketch using an attribution class follows this parameter list.
- attribution_kwargs : **attribution_kwargs
  Parameters to be passed for the construction of the attribution_method.
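To illustrate the distance, latent_layer, and attribution_method parameters beyond the string shortcuts, here is a hedged sketch passing a custom distance Callable and an attribution class. It assumes the Callable takes two tensors and returns a scalar, and uses `Saliency` only as an example of a `BlackBoxExplainer` subclass; `cases_dataset` and `model` are the same placeholders as in the example above.

```python
import tensorflow as tf
from xplique.attributions import Saliency
from xplique.example_based import Cole

# Assumed custom distance: cosine distance between two flattened projected samples
def cosine_distance(x1, x2):
    x1 = tf.nn.l2_normalize(tf.reshape(x1, [-1]), axis=0)
    x2 = tf.nn.l2_normalize(tf.reshape(x2, [-1]), axis=0)
    return 1.0 - tf.reduce_sum(x1 * x2)

cole = Cole(
    cases_dataset=cases_dataset,    # training dataset, as in the example above
    model=model,
    k=3,
    distance=cosine_distance,       # Callable instead of a string
    latent_layer=-1,                # split before the last layer instead of "last_conv"
    attribution_method=Saliency,    # an attribution class instead of "gradient"
)
```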
explain(self,
inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)
Return the relevant examples to explain the (inputs, targets).
It projects inputs with self.projection into the search space
and finds examples with self.search_method.
Parameters

- inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]
  Tensor or Array. Input samples to be explained.
  Expected shape among (N, W), (N, T, W), (N, W, H, C).
  More information in the documentation.
- targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None
  Targets associated with the inputs for projection.
  Shape: (n, nb_classes), where n is the number of samples and nb_classes is the number of classes.
  They are used by the projection, but the projection can compute them internally. A sketch of building them explicitly follows this list.
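As noted above, with the default projection the targets are optional and computed internally; the sketch below shows one assumed way to build them explicitly as the one-hot encoding of the model's predictions, matching the (n, nb_classes) shape.

```python
import tensorflow as tf

# Assumed construction of explicit targets: one-hot encoding of the predicted classes
predictions = model(test_samples)  # shape (n, nb_classes)
targets = tf.one_hot(tf.argmax(predictions, axis=-1), depth=predictions.shape[-1])

similar_samples = cole.explain(inputs=test_samples, targets=targets)
```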
Return

- return_dict
  Dictionary with the elements listed in self.returns.
  The elements that can be returned are defined with the _returns_possibilities static attribute of the class.