LIME

The LIME method uses an interpretable model to provide an explanation. More specifically, you map inputs (\(x \in R^d\)) to an interpretable space (e.g. super-pixels) of size num_interpretable_features. From there you generate perturbed interpretable samples (\(z' \in \{0,1\}^{num\_interpretable\_features}\), where \(1\) means that the corresponding interpretable feature is kept).

Once you have your interpretable samples, you can map them back to the original space (the perturbed samples \(z \in R^d\)) and obtain your model's prediction for each perturbed sample.

In the LIME method you define a similarity kernel which computes the similarity between an input and its perturbed representations (either in the original input space or in the interpretable space): \(\pi_x(z',z)\).
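For reference, the original LIME paper uses an exponential kernel built on a distance function \(D\) (e.g. the euclidean or cosine distance) with width \(\sigma\). Treating this as the form behind the default kernel here, with \(\sigma\) playing the role of kernel_width, is an assumption rather than something stated on this page:

\[
\pi_x(z) = \exp\left(-\frac{D(x, z)^2}{\sigma^2}\right)
\]

With this form, a perturbed sample identical to the input gets weight \(1\), and the weight decays towards \(0\) as the distance grows relative to \(\sigma\).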

Finally, you train one interpretable model per input on the interpretable samples and the corresponding model predictions, with each sample weighted by the similarity kernel. You thus obtain an interpretable explanation (i.e. in the interpretable space), which can then be broadcast to the original space through the mapping you used.
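The steps above can be sketched end to end with NumPy. This is a toy illustration and not Xplique's implementation: the quadrant mapping, the `toy_model`, the exponential kernel and the closed-form weighted ridge fit are all assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 8x8 "image" and a toy model that only looks at the top-left quadrant.
x = rng.uniform(0.5, 1.0, size=(8, 8))

def toy_model(batch):
    return batch[:, :4, :4].sum(axis=(1, 2))

# 1. Map each pixel to an interpretable feature: four 4x4 quadrants (ids 0..3).
rows, cols = np.indices(x.shape)
mapping = (rows // 4) * 2 + (cols // 4)
num_features = int(mapping.max()) + 1

# 2. Draw perturbed interpretable samples z' in {0,1}^num_features (1 = keep).
nb_samples = 200
z_interp = rng.integers(0, 2, size=(nb_samples, num_features))

# 3. Map them back to the input space: dropped features take a reference value.
ref_value = 0.0
keep = z_interp[:, mapping].astype(bool)            # (nb_samples, 8, 8)
z = np.where(keep, x, ref_value)
y = toy_model(z)                                    # model output per sample

# 4. Weight each sample by its similarity to x (assumed exponential kernel).
kernel_width = 5.0
dist = np.linalg.norm((z - x).reshape(nb_samples, -1), axis=1)
weights = np.exp(-(dist ** 2) / kernel_width ** 2)

# 5. Fit a weighted ridge regression on (z', y); the last column is the intercept.
Z = np.hstack([z_interp, np.ones((nb_samples, 1))])
penalty = 1e-3 * np.eye(num_features + 1)
penalty[-1, -1] = 0.0                               # do not penalise the intercept
gram = Z.T @ (weights[:, None] * Z) + penalty
coef = np.linalg.solve(gram, Z.T @ (weights * y))[:num_features]

# 6. Broadcast the coefficients back to the input space through the mapping.
explanation = coef[mapping]                         # (8, 8)
```

Because the toy model only depends on the top-left quadrant, the coefficient of feature 0 dominates and the other three stay close to zero.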

Quote

The overall goal of LIME is to identify an interpretable model over the interpretable representation that is locally faithful to the classifier.

-- "Why Should I Trust You?": Explaining the Predictions of Any Classifier. [1]

Example

from xplique.attributions import Lime

# load images, labels and model
# define a custom map_to_interpret_space function
# ...

method = Lime(model, map_to_interpret_space=custom_map)
explanations = method.explain(images, labels)

The choice of the interpretable model and of the map function strongly affects the quality of the explanation. By default, the map function for image inputs uses the quickshift segmentation from scikit-image.

Lime

Used to compute the LIME method.

__init__(self,
         model: Callable,
         batch_size: Optional[int] = None,
         operator: Optional[Callable[[keras.engine.training.Model, tf.Tensor, tf.Tensor], float]] = None,
         interpretable_model: Any = Ridge(alpha=2),
         similarity_kernel: Optional[Callable[[tf.Tensor, tf.Tensor, tf.Tensor], tf.Tensor]] = None,
         pertub_func: Optional[Callable[[Union[int, tf.Tensor], int], tf.Tensor]] = None,
         map_to_interpret_space: Optional[Callable[[tf.Tensor], tf.Tensor]] = None,
         ref_value: Optional[numpy.ndarray] = None,
         nb_samples: int = 150,
         distance_mode: str = 'euclidean',
         kernel_width: float = 45.0,
         prob: float = 0.5)

Parameters

  • model : Callable

    • The model from which we want to obtain explanations

  • batch_size : Optional[int] = None

    • Number of perturbed samples to process at once, mandatory when nb_samples is huge.

      Note that this differs from WhiteBox explainers, which batch the inputs.

      Here inputs are processed one by one.

  • operator : Optional[Callable[[keras.engine.training.Model, tf.Tensor, tf.Tensor], float]] = None

    • Function g to explain. g takes 3 parameters (f, x, y) and should return a scalar, with f the model, x the inputs and y the targets. If None, use the standard operator g(f, x, y) = f(x)[y].

  • interpretable_model : Any = Ridge(alpha=2)

    • Model object used as the interpretable model.

      See the documentation for more information.

  • similarity_kernel : Optional[Callable[[tf.Tensor, tf.Tensor, tf.Tensor], tf.Tensor]] = None

    • Function which, given an input, perturbed instances of this input and the interpretable version of those perturbed samples, computes the similarities between the input and the perturbed samples.

      See the documentation for more information.

  • pertub_func : Optional[Callable[[Union[int, tf.Tensor], int], tf.Tensor]] = None

    • Function which generates perturbed interpretable samples in the interpretable space from the number of interpretable features (e.g. the number of super-pixels) and the number of perturbed samples you want per original input.

      See the documentation for more information.

  • ref_value : Optional[numpy.ndarray] = None

    • Reference value which replaces each feature when the corresponding interpretable feature is set to 0.

      It should be provided as an ndarray of shape (1,) if there are no channels in your input and (C,) otherwise. The default ref value is set to (0.5, 0.5, 0.5) for inputs with 3 channels (corresponding to a grey pixel when inputs are normalized by 255) and to 0 otherwise.

  • map_to_interpret_space : Optional[Callable[[tf.Tensor], tf.Tensor]] = None

    • Function which groups the features of an input that correspond to the same interpretable feature (e.g. super-pixel).

      It makes it possible to go from the original input space to the interpretable space and back.

      See the documentation for more information.

  • nb_samples : int = 150

    • The number of perturbed samples you want to generate for each input sample.

      Defaults to 150.

  • prob : float = 0.5

    • The probability of keeping each interpretable feature in the default pertub function.

  • distance_mode : str = 'euclidean'

    • The distance mode used in the default similarity kernel, you can choose either "euclidean" or "cosine" (will compute cosine similarity).

      Default value set to "euclidean".

  • kernel_width : float = 45.0

    • Width of your kernel. It is important to adapt it to the size of your inputs, otherwise all similarities will be close to 0, leading to poor performance or NaN values.

      Defaults to 45 (i.e. adapted for RGB images).
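The kernel_width advice above can be checked numerically. The sketch below assumes an exponential kernel of the form \(\exp(-d^2 / width^2)\) on the euclidean distance, which may not match the default kernel term for term; `exp_kernel` and the random image are hypothetical.

```python
import numpy as np

def exp_kernel(distance, kernel_width):
    # Assumed kernel form: similarity decays with the squared distance.
    return np.exp(-(distance ** 2) / kernel_width ** 2)

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(224, 224, 3))

# Replace a random half of the pixels with a grey reference value of 0.5.
mask = rng.random(image.shape[:2]) < 0.5
perturbed = np.where(mask[..., None], 0.5, image)
distance = np.linalg.norm(perturbed - image)

# A width that is tiny compared to the distance makes the similarity underflow.
similarities = {width: exp_kernel(distance, width) for width in (1.0, 45.0, 200.0)}
for width, similarity in similarities.items():
    print(f"kernel_width={width}: similarity={similarity:.3e}")
```

With a width of 1 the similarity underflows to exactly 0 for an input of this size, so every perturbed sample would receive a null weight. Larger widths keep the weights usable.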

explain(self,
        inputs: Union[tf.Dataset, tf.Tensor, numpy.ndarray],
        targets: Union[tf.Tensor, numpy.ndarray, None] = None) -> tf.Tensor

This method attributes the output of the model with given targets to the inputs of the model using the approach described above, training an interpretable model and returning a representation of the interpretable model.

Parameters

  • inputs : Union[tf.Dataset, tf.Tensor, numpy.ndarray]

    • Dataset, Tensor or Array. Input samples to be explained.

      If Dataset, targets should not be provided (included in Dataset).

      Expected shape is one of (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tf.Tensor, numpy.ndarray, None] = None

    • Tensor or Array. One-hot encoding of the model's output from which an explanation is desired. One encoding per input and only one output at a time. Therefore, the expected shape is (N, output_size).

      More information in the documentation.

Return

  • explanations : tf.Tensor

    • Interpretable coefficients, same shape as the inputs, except for the channels.

      Coefficients of the interpretable model. These coefficients have the size of the interpretable space; when broadcast to the input shape, all features that were grouped together (e.g. belonging to the same super-pixel) receive the same coefficient.
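The broadcast described in the return value can be pictured with plain NumPy indexing. The mapping and coefficients below are made-up values for a 4x4 input split into four 2x2 super-pixels. This is an illustration only, not the library's code.

```python
import numpy as np

# Hypothetical mapping: each entry is the super-pixel id of the pixel.
mapping = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [2, 2, 3, 3],
                    [2, 2, 3, 3]])

# Hypothetical coefficients of the fitted interpretable model, one per super-pixel.
coefficients = np.array([0.7, -0.1, 0.0, 0.3])

# Every pixel receives the coefficient of the super-pixel it belongs to.
explanation = coefficients[mapping]
```

The result has the spatial shape of the input, and pixels from the same super-pixel share a single value.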


Warning

The computation time might be very long depending on the hyperparameter settings. A huge number of perturbed samples and a fine-grained mapping may lead to better results, but they take a long time to compute.

Parameters in-depth

interpretable_model:

A Model object providing a fit method that trains the model with the following inputs:

  • interpretable_inputs: 2D ndarray of shape (\(nb\_samples\) x \(num\_interp\_features\)),
  • expected_outputs: 1D ndarray of shape (\(nb\_samples\)),
  • weights: 1D ndarray of shape (\(nb\_samples\))

The model object should also provide a predict method.

It should also have a coef_ attribute (the interpretable explanations) once fit has been called.

As an interpretable model, you can use linear models from scikit-learn.
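If you prefer not to depend on scikit-learn, a model exposing this interface can be sketched as below. `WeightedRidge` is a hypothetical class, not part of the library. Naming the third argument of fit `sample_weight` follows the scikit-learn convention and is an assumption about how the explainer passes the weights.

```python
import numpy as np

class WeightedRidge:
    """Hypothetical minimal interpretable model: weighted ridge regression."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.coef_ = None
        self.intercept_ = 0.0

    def fit(self, interpretable_inputs, expected_outputs, sample_weight):
        X = np.asarray(interpretable_inputs, dtype=float)
        y = np.asarray(expected_outputs, dtype=float)
        w = np.asarray(sample_weight, dtype=float)
        # Centre with weighted means so that the intercept is not penalised.
        x_mean = np.average(X, axis=0, weights=w)
        y_mean = np.average(y, weights=w)
        Xc, yc = X - x_mean, y - y_mean
        gram = Xc.T @ (w[:, None] * Xc) + self.alpha * np.eye(X.shape[1])
        self.coef_ = np.linalg.solve(gram, Xc.T @ (w * yc))
        self.intercept_ = y_mean - x_mean @ self.coef_
        return self

    def predict(self, interpretable_inputs):
        X = np.asarray(interpretable_inputs, dtype=float)
        return X @ self.coef_ + self.intercept_

# Usage on synthetic interpretable samples with known coefficients.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(150, 5))
true_coef = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ true_coef + 0.3
model = WeightedRidge(alpha=1e-6).fit(X, y, np.ones(150))
```

After fit, `model.coef_` holds one coefficient per interpretable feature, which is what the explainer reads as the explanation.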

Warning

Note that here nb_samples doesn't indicate the length of inputs but the number of perturbed samples we want to generate for each input.

similarity_kernel:

Function which, given an input, perturbed instances of this input and the interpretable version of those perturbed samples, computes the similarities between the input and the perturbed samples.

Info

The similarities can be computed in the original input space or in the interpretable space.

You can provide a custom function. To do so, follow this scheme:

def custom_similarity(
    original_input: tf.Tensor, interpret_samples: tf.Tensor, perturbed_samples: tf.Tensor
) -> tf.Tensor:
    # returns a tensor of shape (nb_samples,) and dtype tf.float32
    # ** some tf actions **
    return similarities

where:

  • original_input has one of the shapes \((W)\), \((W, H)\), \((W, H, C)\)
  • interpret_samples is a tf.Tensor of shape \((nb\_samples, num\_interp\_features)\)
  • perturbed_samples is a tf.Tensor of shape \((nb\_samples, *original\_input.shape)\)

If possible, add the @tf.function decorator.

Warning

Note that here nb_samples doesn't indicate the length of inputs but the number of perturbed samples we want to generate for each input.

Info

The default similarity kernel uses the euclidean distance between the original input and the perturbed samples in the input space.
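As an illustration of the scheme, here is a minimal sketch of a kernel computed in the interpretable space instead of the input space. The function name, the `KERNEL_WIDTH` constant and the exponential form are assumptions made for the example.

```python
import tensorflow as tf

KERNEL_WIDTH = 2.0  # hypothetical value, to be tuned to the number of features

@tf.function
def interpretable_space_similarity(original_input, interpret_samples, perturbed_samples):
    # original_input and perturbed_samples are unused: the kernel only looks at
    # the interpretable samples.
    samples = tf.cast(interpret_samples, tf.float32)
    # In the interpretable space the original input is the all-ones vector, so
    # the squared euclidean distance is the number of dropped features.
    squared_distances = tf.reduce_sum((1.0 - samples) ** 2, axis=1)
    return tf.exp(-squared_distances / KERNEL_WIDTH ** 2)

# Three interpretable samples over three features: all kept, none kept, one dropped.
example_similarities = interpretable_space_similarity(
    tf.zeros((4, 4, 3)),
    tf.constant([[1, 1, 1], [0, 0, 0], [1, 0, 1]]),
    tf.zeros((3, 4, 4, 3)),
)
```

Such a function would then be passed as similarity_kernel when building the explainer. The sample that keeps every feature gets a similarity of 1, and the similarity decreases with each dropped feature.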

pertub_func:

Function which generates perturbed interpretable samples in the interpretable space from the number of interpretable features (e.g. the number of super-pixels) and the number of perturbed samples you want per original input.

The generated interp_samples belong to \(\{0,1\}^{num\_features}\), where \(1\) indicates that we keep the corresponding feature (e.g. super-pixel) in the mapping.

To use your own custom pertub function, follow this scheme:

@tf.function
def custom_pertub_function(num_features, nb_samples) -> tf.Tensor:
    # returns a tensor of shape (nb_samples, num_features) and dtype tf.int32
    # ** some tf actions **
    return perturbed_samples

Info

The default pertub function keeps a feature (e.g. super-pixel) with probability 0.5. To change it, set the prob value when initializing the explainer or define your own function.
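A custom function is mostly useful when a single keep probability is not enough. The sketch below is a hypothetical variant in which every perturbed sample draws its own keep probability, so the samples cover both lightly and heavily masked inputs. The function name and the sampling strategy are assumptions for illustration.

```python
import tensorflow as tf

@tf.function
def varied_sparsity_pertub_function(num_features, nb_samples):
    nb_samples = tf.cast(nb_samples, tf.int32)
    num_features = tf.cast(num_features, tf.int32)
    # One keep probability per perturbed sample, drawn uniformly in [0, 1).
    keep_prob = tf.random.uniform(tf.stack([nb_samples, 1]))
    # One uniform draw per (sample, feature) pair; keep the feature when the
    # draw falls below the keep probability of that sample.
    draws = tf.random.uniform(tf.stack([nb_samples, num_features]))
    return tf.cast(draws < keep_prob, tf.int32)

example_samples = varied_sparsity_pertub_function(10, 500)
```

The returned tensor has shape (nb_samples, num_features) and contains only 0 and 1, as the scheme requires.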

map_to_interpret_space:

Function which groups the features of an input that correspond to the same interpretable feature (e.g. super-pixel).

It makes it possible to go from the original input space to the interpretable space and back.

The default mappings are:

  • the quickshift segmentation algorithm for inputs with \((N, W, H, C)\) shape. We assume such a shape is used to represent \((W, H, C)\) images.
  • the felzenszwalb segmentation algorithm for inputs with \((N, W, H)\) shape. We assume such a shape is used to represent \((W, H)\) images.
  • an identity mapping if inputs have shape \((N, W)\). We assume such inputs are tabular data.

To use your own custom map function, follow this scheme:

def custom_map_to_interpret_space(single_inp: tf.Tensor) -> tf.Tensor:
    # ** some grouping techniques **
    return mapping

mapping should have the same dimensions as a single input, except for the channels.

For instance, you can use the scikit-image library (as we did for the quickshift algorithm) to define super-pixels on your images.
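When a segmentation algorithm is not needed, a regular grid is a simple alternative. The following sketch assumes a single image whose first two dimensions are spatial. `PATCH_SIZE` and the function name are hypothetical.

```python
import tensorflow as tf

PATCH_SIZE = 16  # hypothetical patch side, in pixels

def grid_map_to_interpret_space(single_inp: tf.Tensor) -> tf.Tensor:
    dim0, dim1 = single_inp.shape[0], single_inp.shape[1]
    # Number of patches along the second spatial dimension (rounded up).
    patches_per_line = (dim1 + PATCH_SIZE - 1) // PATCH_SIZE
    ids0 = tf.range(dim0) // PATCH_SIZE
    ids1 = tf.range(dim1) // PATCH_SIZE
    # Each pixel gets the id of the square patch it falls in.
    return ids0[:, tf.newaxis] * patches_per_line + ids1[tf.newaxis, :]

example_mapping = grid_map_to_interpret_space(tf.zeros((64, 48, 3)))
```

For a 64x48 image this yields 12 interpretable features. A smaller `PATCH_SIZE` gives a finer explanation at the cost of more features to fit.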

Info

The quality of your explanation relies strongly on this mapping.

Warning

Depending on the mapping, you might have a huge number of interpretable features (e.g. if you map pixels 2 by 2 on a 299x299 image). The computation time might then be very long!

Danger

As you may have noticed, time series are not handled by default. Consequently, a custom mapping should be implemented, either to assign each feature to a different group or to group consecutive features together (by groups of 4 timesteps, for example). The second option tries to capture patterns. An example of the first option is provided below.

import tensorflow as tf

def map_time_series(single_input: tf.Tensor) -> tf.Tensor:
    # single_input has shape (T, W): each (timestep, feature) pair gets its own group
    time_dim = single_input.shape[0]
    feat_dim = single_input.shape[1]
    mapping = tf.range(time_dim*feat_dim)
    mapping = tf.reshape(mapping, (time_dim, feat_dim))
    return mapping
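The other option mentioned above, grouping consecutive timesteps, could look like the sketch below. It assumes that groups are built separately for each feature. `GROUP_SIZE` and the function name are hypothetical.

```python
import tensorflow as tf

GROUP_SIZE = 4  # hypothetical number of consecutive timesteps per group

def map_time_series_by_groups(single_input: tf.Tensor) -> tf.Tensor:
    # single_input has shape (T, W)
    time_dim, feat_dim = single_input.shape[0], single_input.shape[1]
    # Index of the block of GROUP_SIZE consecutive timesteps for each timestep.
    time_groups = tf.range(time_dim) // GROUP_SIZE
    # One interpretable feature per (block of timesteps, feature) pair.
    return time_groups[:, tf.newaxis] * feat_dim + tf.range(feat_dim)[tf.newaxis, :]

example_groups = map_time_series_by_groups(tf.zeros((10, 3)))
```

With 10 timesteps and 3 features this produces 9 groups instead of 30, which shortens the computation and lets each coefficient describe a short temporal pattern.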