Similar-Examples

Similar Examples designates all methods that, given an input sample, search for the most similar training samples according to a distance function distance. The search space can additionally be defined with a projection function (see Projections). This function should map an input sample to a search space where the distance function is defined and meaningful (e.g. the latent space of a Convolutional Neural Network). A K-Nearest Neighbors (KNN) search is then performed in the search space to find the most similar samples.
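The project-then-search idea described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not Xplique's implementation: `knn_in_projected_space` and `toy_projection` are hypothetical names, and the toy projection (keeping the first two features) merely stands in for a model's latent space.

```python
import numpy as np

def knn_in_projected_space(inputs, cases, projection, k=5):
    """Return indices and distances of the k nearest cases for each input."""
    # Project both the queries and the training cases into the search space.
    proj_inputs = np.asarray(projection(inputs), dtype=float)
    proj_cases = np.asarray(projection(cases), dtype=float)
    # Pairwise Euclidean distances, shape (n_inputs, n_cases).
    diffs = proj_inputs[:, None, :] - proj_cases[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Indices of the k smallest distances per input, closest first.
    idx = np.argsort(dists, axis=1)[:, :k]
    return idx, np.take_along_axis(dists, idx, axis=1)

# Hypothetical projection standing in for a latent space: keep two features.
def toy_projection(x):
    return np.asarray(x)[:, :2]

cases = np.array([[0.0, 0.0, 9.0], [1.0, 0.0, 0.0], [5.0, 5.0, 0.0]])
queries = np.array([[0.9, 0.1, 100.0]])
indices, distances = knn_in_projected_space(queries, cases, toy_projection, k=2)
```

Note that the third feature of the query is very different from every case, yet it does not affect the result because the projection discards it. This is the reason the choice of projection matters as much as the choice of distance.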

Example

from typing import Optional, Union

import numpy as np
import tensorflow as tf

from xplique.example_based import SimilarExamples

cases_dataset = ... # load the training dataset
targets = ... # load the one-hot encoding of predicted labels of the training dataset

# parameters
k = 5
distance = "euclidean"
case_returns = ["examples", "nuns"]

# define the projection function
def custom_projection(
    inputs: Union[tf.Tensor, np.ndarray],
    targets: Optional[Union[tf.Tensor, np.ndarray]] = None,
):
    '''
    Example of projection,
    inputs are the elements to project.
    targets are optional parameters to orient the projection.
    '''
    projected_inputs = ...  # do some magic on inputs, it should use the model
    return projected_inputs

# instantiate the SimilarExamples object
sim_ex = SimilarExamples(
    cases_dataset=cases_dataset,
    targets_dataset=targets,
    k=k,
    projection=custom_projection,
    distance=distance,
)

# load the test samples and targets
test_samples = ... # load the test samples to search for
test_targets = ... # load the one-hot encoding of the test samples' predictions

# search for the most similar samples with the SimilarExamples method
similar_samples = sim_ex.explain(test_samples, test_targets)
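The `targets` and `test_targets` placeholders above are described as one-hot encodings of the model's predictions. A minimal sketch of how such an array could be built from class scores follows; `one_hot_predictions` is a hypothetical helper and `model_outputs` stands in for whatever the model returns on a batch (for example probabilities of shape `(n, nb_classes)`).

```python
import numpy as np

def one_hot_predictions(scores):
    """One-hot encode the arg-max class of each row of model scores."""
    scores = np.asarray(scores)
    nb_classes = scores.shape[1]
    # Index of the predicted class for each sample.
    predicted = scores.argmax(axis=1)
    # Pick the matching rows of the identity matrix: shape (n, nb_classes).
    return np.eye(nb_classes, dtype=np.float32)[predicted]

# Hypothetical model outputs for two samples and three classes.
model_outputs = np.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]])
one_hot_targets = one_hot_predictions(model_outputs)
```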

SimilarExamples

Class for the similar-examples method. It searches for the k nearest neighbors of an input in the projected space (defined by the projection) using the provided distance.

__init__(self,
         cases_dataset: ~DatasetOrTensor,
         labels_dataset: Optional[~DatasetOrTensor] = None,
         targets_dataset: Optional[~DatasetOrTensor] = None,
         k: int = 1,
         projection: Union[xplique.example_based.projections.base.Projection, Callable] = None,
         case_returns: Union[List[str], str] = 'examples',
         batch_size: Optional[int] = None,
         distance: Union[int, str, Callable] = 'euclidean')

Parameters

  • cases_dataset : ~DatasetOrTensor

    • The dataset used to train the model; examples are extracted from this dataset.

      All datasets (cases, labels, and targets) should be of the same type.

      Supported types are: tf.data.Dataset, torch.utils.data.DataLoader, tf.Tensor, np.ndarray, torch.Tensor.

      For datasets with multiple columns, the first column is assumed to be the cases, the second the labels, and the third the targets.

      Warning: datasets tend to reshuffle at each iteration. Make sure the datasets are not reshuffled, because examples are retrieved by their index in the dataset.

  • labels_dataset : Optional[~DatasetOrTensor] = None

    • Labels associated with the examples in the cases_dataset.

      It should have the same type as cases_dataset.

  • targets_dataset : Optional[~DatasetOrTensor] = None

    • Targets associated with the cases_dataset for dataset projection, often the one-hot encoding of the model's predictions. See projection for details.

      It should have the same type as cases_dataset.

      It is not necessary for all projections.

      Furthermore, projections that require it compute it internally by default.

  • k : int = 1

    • The number of examples to retrieve per input.

  • projection : Union[xplique.example_based.projections.base.Projection, Callable] = None

    • Projection or Callable that projects samples from the input space to the search space.

      The search space should be a space where distances are relevant for the model.

      It should not be None; otherwise the model is not involved in the search and is therefore not explained.

      Example of Callable:

      def custom_projection(
          inputs: Union[tf.Tensor, np.ndarray],
          targets: Optional[Union[tf.Tensor, np.ndarray]] = None,
      ):
          '''
          Example of projection,
          inputs are the elements to project.
          targets are optional parameters to orient the projection.
          '''
          projected_inputs = ...  # do some magic on inputs, it should use the model
          return projected_inputs

  • case_returns : Union[List[str], str] = 'examples'

    • String or list of strings naming the elements to return in self.explain().

      See the base class returns property for more details.

  • batch_size : Optional[int] = None

    • Number of samples processed simultaneously for projection and search.

      Ignored if cases_dataset is a batched tf.data.Dataset or a batched torch.utils.data.DataLoader.

  • distance : Union[int, str, Callable] = 'euclidean'

    • Distance for the KNN search method. It can be an integer, a string in {"manhattan", "euclidean", "cosine", "chebyshev", "inf"}, or a Callable. Defaults to "euclidean".
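For reference, the named distances have standard textbook definitions, sketched below in NumPy. This is not Xplique's internal code. Two points are assumptions that the text above does not state: that "inf" denotes the same L-infinity distance as "chebyshev", and that an integer value is read as the order p of a Minkowski norm. The `DISTANCES` dictionary and `minkowski` helper are hypothetical names.

```python
import numpy as np

# Textbook definitions between two 1-D vectors a and b.
DISTANCES = {
    "manhattan": lambda a, b: np.sum(np.abs(a - b)),
    "euclidean": lambda a, b: np.sqrt(np.sum((a - b) ** 2)),
    "chebyshev": lambda a, b: np.max(np.abs(a - b)),
    # Cosine distance is one minus the cosine similarity.
    "cosine": lambda a, b: 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)),
}
# Assumption: "inf" is the L-infinity distance, i.e. the same as "chebyshev".
DISTANCES["inf"] = DISTANCES["chebyshev"]

def minkowski(a, b, p):
    """Minkowski distance of order p (assumed meaning of an integer distance)."""
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
results = {name: float(fn(a, b)) for name, fn in DISTANCES.items()}
```

With p = 1 the Minkowski distance coincides with "manhattan" and with p = 2 with "euclidean", which is why the two named options are often seen as special cases of the integer form.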

explain(self,
        inputs: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray],
        targets: Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None)

Return the relevant examples to explain the (inputs, targets). It projects the inputs into the search space with self.projection and finds examples with self.search_method.

Parameters

  • inputs : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray]

    • Tensor or Array. Input samples to be explained.

      Expected shape among (N, W), (N, T, W), (N, W, H, C).

      More information in the documentation.

  • targets : Union[tensorflow.python.framework.tensor.Tensor, numpy.ndarray, None] = None

    • Targets associated with the inputs for projection.

      Shape: (n, nb_classes) where n is the number of samples and nb_classes is the number of classes.

      It is used in the projection, but the projection can also compute it internally.
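The shape constraints listed above can be checked before calling explain. The helper below is a hypothetical pre-check written for illustration, not part of Xplique; it only verifies the number of dimensions of the inputs and that the targets, when given, form an (n, nb_classes) array with one row per input.

```python
import numpy as np

def check_explain_shapes(inputs, targets=None):
    """Hypothetical pre-check mirroring the documented shapes."""
    inputs = np.asarray(inputs)
    # (N, W), (N, T, W) and (N, W, H, C) have 2, 3 and 4 dimensions.
    if inputs.ndim not in (2, 3, 4):
        raise ValueError(
            f"inputs must be (N, W), (N, T, W) or (N, W, H, C), got {inputs.shape}"
        )
    if targets is not None:
        targets = np.asarray(targets)
        # Targets are expected as (n, nb_classes) with n matching the inputs.
        if targets.ndim != 2 or targets.shape[0] != inputs.shape[0]:
            raise ValueError(
                f"targets must be (n, nb_classes) with n={inputs.shape[0]}, "
                f"got {targets.shape}"
            )
    return True

# Four images of shape 8x8x3 with ten-class one-hot targets.
ok = check_explain_shapes(np.zeros((4, 8, 8, 3)), np.zeros((4, 10)))
```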

Return

  • return_dict

    • Dictionary with the elements listed in self.returns.

      The elements that can be returned are defined with the _returns_possibilities static attribute of the class.