Getting started: Maximum Logit Score on MNIST¶
This notebook introduces the core features of oodeel, including:
- Instantiation of OODDataset to load a dataset from the tensorflow_datasets or torchvision.datasets catalog and to organize in-distribution and out-of-distribution data.
- Preparation of a tf.data.Dataset or a torch.DataLoader ready for scoring and/or training.
- A simple utility to train neural nets (useful when the in-distribution data is not a standard dataset, such as a subset of classes from a dataset).
- Instantiation of an OODBaseDetector based on an already trained model, which is used for scoring data.
- Some evaluation metrics to assess the quality of OOD detection.
First, some required imports.
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch
from IPython.display import clear_output
from oodeel.methods import MLS
from oodeel.eval.metrics import bench_metrics
from oodeel.eval.plots import plot_ood_scores
from oodeel.datasets import OODDataset
from oodeel.utils.tf_training_tools import train_tf_model
from oodeel.utils.torch_training_tools import train_torch_model
from sklearn.metrics import accuracy_score, roc_auc_score
import warnings
warnings.filterwarnings("ignore")
cuda_idx = 0 if torch.cuda.is_available() else None
clear_output()
By default, models are saved at ~/.oodeel/saved_models and data is expected at ~/.oodeel/datasets. Edit the following cell to use custom paths.
model_path = os.path.expanduser("~/") + ".oodeel/saved_models"
data_path = os.path.expanduser("~/") + ".oodeel/datasets"
os.makedirs(model_path, exist_ok=True)
os.makedirs(data_path, exist_ok=True)
MNIST vs Fashion MNIST¶
- In-distribution data: MNIST
- Out-of-distribution data: Fashion MNIST
Load and prepare the datasets¶
This is performed using the OODDataset class. First, load the datasets.
Before loading, choose your backend (tensorflow or torch). You only have to pass it to OODDataset; the rest of the code is exactly the same regardless of the library (except for model loading).
When a str is given as argument, OODDataset automatically searches the tensorflow_datasets or torchvision.datasets catalog. Be careful with the str you pass: it has to exactly match the name of the dataset in tensorflow_datasets or torchvision.datasets (it is case-sensitive).
Loading from existing data objects
In this notebook, the datasets are loaded from dataset catalogs, but you can also instantiate OODDataset with your own tf.data.Dataset, tf.Tensor, torch.Dataset or np.arrays.
Warning
- We denote In-Distribution (ID) data with _in and Out-Of-Distribution (OOD) data with _out to avoid confusion with OOD detection, which is the name of the task and is therefore used to name core classes such as OODDataset and OODBaseDetector.
- When loading from tensorflow_datasets or torchvision.datasets, do not forget to check which load_kwargs to add. They can differ between datasets, even within the same platform.
In PyTorch, from the torchvision.datasets catalog:
backend = "torch"
oods_in = OODDataset(
    'MNIST',
    load_kwargs={"train": False, "root": data_path, "download": True},
    backend=backend,
)
oods_out = OODDataset(
    'FashionMNIST',
    load_kwargs={"train": False, "root": data_path, "download": True},
    backend=backend,
)
oods_train = OODDataset(
    'MNIST',
    load_kwargs={"train": True, "root": data_path, "download": True},
    backend=backend,
)
In TensorFlow, from the tensorflow_datasets catalog:
backend = "tensorflow"
oods_in = OODDataset(
    'mnist',
    load_kwargs={"split": "test"},
    backend=backend,
)
oods_out = OODDataset(
    'fashion_mnist',
    load_kwargs={"split": "test"},
    backend=backend,
)
oods_train = OODDataset(
    'mnist',
    load_kwargs={"split": "train"},
    backend=backend,
)
clear_output()
Note
When backend is "torch", the datasets can still be loaded from tensorflow_datasets; they are converted on the fly and loaded on VRAM as torch tensors. This can be useful because the tensorflow_datasets catalog is much larger than the torchvision.datasets one. In that case, however, the loaded dataset cannot be used for training with train_torch_model.
Then, prepare the datasets for scoring and/or training using the .prepare method.
def preprocess_fn(*inputs):
    # Rescale pixel values from [0, 255] to [0, 1]; other elements (e.g. labels) are kept as-is
    x = inputs[0] / 255
    return tuple([x] + list(inputs[1:]))
batch_size = 128
ds_in = oods_in.prepare(batch_size=batch_size, preprocess_fn=preprocess_fn)
ds_out = oods_out.prepare(batch_size=batch_size, preprocess_fn=preprocess_fn)
ds_train = oods_train.prepare(batch_size=batch_size, preprocess_fn=preprocess_fn, shuffle=True)
ds_in, ds_out and ds_train are regular tf.data.Dataset or torch.DataLoader objects that you can use as usual!
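To see what preprocess_fn does to each element of a dataset, here is a minimal sketch that calls it on plain Python scalars standing in for an image tensor and its label. The scalar values are illustrative assumptions, not data from the notebook. Only the first element is rescaled; any remaining elements, such as labels, pass through unchanged.

```python
def preprocess_fn(*inputs):
    # Rescale the first element (the image) from [0, 255] to [0, 1]
    x = inputs[0] / 255
    # Keep every other element (e.g. the label) untouched
    return tuple([x] + list(inputs[1:]))


# Scalars stand in for tensors here: one pixel value and one class label
pixel, label = 255.0, 7
scaled = preprocess_fn(pixel, label)
print(scaled)  # (1.0, 7)
```

Because the function accepts a variable number of inputs, the same definition works whether the dataset yields only images or (image, label) pairs.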
Train or load a model on in-distribution data (MNIST).
... a Keras model if backend="tensorflow":
model_path_mnist = os.path.join(model_path, "mnist_model_tensorflow.h5")
try:
    # Reuse a previously trained model if one was saved
    model_tf = tf.keras.models.load_model(model_path_mnist)
except OSError:
    train_config = {
        "model": "toy_convnet",
        "input_shape": (28, 28, 1),
        "num_classes": 10,
        "epochs": 5,
        "save_dir": model_path_mnist,
        "validation_data": ds_in,  # ds_in is actually the test set of MNIST
    }
    model_tf = train_tf_model(ds_train, **train_config)
... or a pytorch one, if backend="torch"
model_path_mnist = os.path.join(model_path, "mnist_model_torch")
try:
    # Reuse a previously trained model if one was saved
    model_torch = torch.load(os.path.join(model_path_mnist, "best.pt"))
except OSError:
    train_config = {
        "model": "toy_convnet",
        "num_classes": 10,
        "epochs": 5,
        "save_dir": model_path_mnist,
        "validation_data": ds_in,  # ds_in is actually the test set of MNIST
        "cuda_idx": cuda_idx,
    }
    model_torch = train_torch_model(ds_train, **train_config)
Maximum Logit Score (MLS)¶
Return an OOD score based on the maximum value of the output logits. Introduced in Open-Set Recognition: a Good Closed-Set Classifier is All You Need?, ICLR 2022.
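As a rough illustration of the idea, the sketch below computes a maximum-logit score by hand for a few made-up logit vectors. It assumes the score is the negated maximum logit, so that larger scores mean "more likely OOD"; this sign convention is inferred from the negative thresholds used later in this notebook. The helper name max_logit_score is hypothetical and not part of oodeel.

```python
def max_logit_score(logits):
    """Negated maximum logit: a higher score means a less confident model (assumed convention)."""
    return -max(logits)


# Made-up logits, for illustration only
confident = [0.5, 12.0, -3.0]  # one class clearly dominates
uncertain = [1.0, 1.5, 0.8]    # no class stands out

print(max_logit_score(confident))  # -12.0
print(max_logit_score(uncertain))  # -1.5
```

Under this convention, the input with flat logits receives the higher score and would be flagged as OOD before the confidently classified one.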
In this section, we use model_tf because backend="tensorflow".
if backend == "tensorflow":
    model = model_tf
elif backend == "torch":
    model = model_torch
Simply fit the OODBaseDetector to the model and then score the datasets.
mls = MLS()
mls.fit(model)
scores_in, _ = mls.score(ds_in)
scores_out, _ = mls.score(ds_out)
Tip
You can also obtain the predictions of the underlying model using the .score method. Just name a variable instead of _. For instance, in scores_in, info_in = mls.score(ds_in), info_in is a dict containing the predictions of the model on ds_in, and the labels if they are found in ds_in. Here, we use _ because we do not need the predictions.
Compute the evaluation metrics based on the scores of the test data, and visualize the scores histogram.
metrics = bench_metrics(
    (scores_in, scores_out),
    metrics=["auroc", "fpr95tpr", accuracy_score],
    threshold=-7.5,  # visually chosen based on the plot
)
plt.figure(figsize=(13, 5))
plot_ood_scores(scores_in, scores_out)
plt.show()
metrics = pd.Series(metrics)
print(metrics)
auroc             0.992269
fpr95tpr          0.027700
accuracy_score    0.873150
dtype: float64
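To make the two threshold-free metrics above more concrete, here is a minimal pure-Python sketch of how AUROC and FPR at 95% TPR can be computed from two lists of scores. It assumes OOD is the positive class and that higher scores mean "more OOD"; the function names and toy scores are hypothetical, and oodeel's own implementation may differ in details such as tie handling or interpolation.

```python
import math


def auroc(scores_in, scores_out):
    """Probability that a random OOD score exceeds a random ID score (ties count 0.5)."""
    wins = 0.0
    for s_out in scores_out:
        for s_in in scores_in:
            if s_out > s_in:
                wins += 1.0
            elif s_out == s_in:
                wins += 0.5
    return wins / (len(scores_in) * len(scores_out))


def fpr_at_95_tpr(scores_in, scores_out):
    """Fraction of ID samples flagged as OOD when at least 95% of OOD samples are flagged."""
    ranked = sorted(scores_out, reverse=True)
    # Smallest number of top-ranked OOD samples that covers 95% of them
    k = math.ceil(0.95 * len(ranked))
    threshold = ranked[k - 1]
    flagged = sum(1 for s in scores_in if s >= threshold)
    return flagged / len(scores_in)


# Toy scores, for illustration only
toy_in = [-9.0, -8.5, -8.0, -7.0, -3.0]
toy_out = [-4.0, -3.5, -2.0, -1.0]

print(auroc(toy_in, toy_out))          # 0.9
print(fpr_at_95_tpr(toy_in, toy_out))  # 0.2
```

The pairwise loop is quadratic and only meant for small examples; on real score arrays a rank-based or library implementation is preferable.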
Maximum Softmax Probability (MSP)¶
It is possible to do the same after the softmax activation. Introduced in A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks, ICLR 2017.
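The sketch below illustrates the difference with MLS on made-up logits: the logits are first passed through a softmax, and the score is the negated maximum probability. As before, the negation is an assumption inferred from the negative threshold used in this notebook, and the helper names are hypothetical rather than oodeel functions.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_score(logits):
    """Negated maximum softmax probability (sign convention assumed)."""
    return -max(softmax(logits))


# Made-up logits, for illustration only
confident = [0.5, 12.0, -3.0]
uniform = [1.0, 1.0, 1.0, 1.0]

print(max_softmax_score(uniform))  # -0.25
print(max_softmax_score(confident) < -0.99)  # True
```

With K classes, such a score is confined to the interval [-1, -1/K], which is why a threshold close to -1 is a natural choice for MSP.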
msp = MLS(output_activation="softmax")
msp.fit(model)
scores_in, _ = msp.score(ds_in)
scores_out, _ = msp.score(ds_out)
metrics = bench_metrics(
    (scores_in, scores_out),
    metrics=["auroc", "fpr95tpr", accuracy_score, roc_auc_score],
    threshold=-0.95,  # visually chosen based on the plot
)
# Normalize scores for a better hist visualization
minim = np.min([np.min(scores_in), np.min(scores_out)])
scores_in_ = scores_in - 2 * minim + np.min(scores_in[np.where(scores_in != minim)])
scores_out_ = scores_out - 2 * minim + np.min(scores_in[np.where(scores_in != minim)])
plt.figure(figsize=(13, 5))
plot_ood_scores(scores_in_, scores_out_, log_scale=True)
plt.show()
metrics = pd.Series(metrics)
print(metrics)
auroc             0.985023
fpr95tpr          0.057000
accuracy_score    0.946000
roc_auc_score     0.985023
dtype: float64
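The normalization step used before plotting adds the same constant to both score arrays so that every value becomes strictly positive, which a logarithmic scale requires. A minimal pure-Python sketch of that shift, using made-up scores and a hypothetical helper name, is given here; it assumes at least one in-distribution score differs from the global minimum.

```python
def shift_scores(scores_in, scores_out):
    """Shift both score lists by the same offset so that every value is positive."""
    minim = min(min(scores_in), min(scores_out))
    # Smallest in-distribution score that differs from the global minimum
    second = min(s for s in scores_in if s != minim)
    offset = -2 * minim + second
    return [s + offset for s in scores_in], [s + offset for s in scores_out]


# Toy MSP-like scores, for illustration only
toy_in = [-1.0, -0.999, -0.97]
toy_out = [-0.9, -0.6, -0.4]

shifted_in, shifted_out = shift_scores(toy_in, toy_out)
print(min(shifted_in) > 0, min(shifted_out) > 0)  # True True
```

After the shift, the smallest value equals the gap between the global minimum and the next distinct in-distribution score, so it is positive, and the relative order of all scores is preserved.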
MNIST (0-4) vs MNIST (5-9)¶
- In-distribution data: MNIST (0-4)
- Out-of-distribution data: MNIST (5-9)
We can repeat the procedure in an open-set recognition or semantic OOD setting by considering a subset of MNIST classes as in-distribution and another subset as out-of-distribution. The datasets are constructed using the OODDataset method .split_by_class. First, load and split the dataset.
In Pytorch:
backend = "torch"
oods_test = OODDataset(
    'MNIST',
    load_kwargs={"train": False, "root": data_path, "download": True},
    backend=backend,
)
oods_train = OODDataset(
    'MNIST',
    load_kwargs={"train": True, "root": data_path, "download": True},
    backend=backend,
)
or in TensorFlow:
backend = "tensorflow"
oods_test = OODDataset(
    'mnist',
    load_kwargs={"split": "test"},
    backend=backend,
)
oods_train = OODDataset(
    'mnist',
    load_kwargs={"split": "train"},
    backend=backend,
)
clear_output()
Then prepare the datasets for scoring and/or training.
batch_size = 128
inc_labels = [0, 1, 2, 3, 4]
oods_train, _ = oods_train.split_by_class(in_labels=inc_labels)
oods_in, oods_out = oods_test.split_by_class(in_labels=inc_labels)
def preprocess_fn(*inputs):
    # Rescale pixel values from [0, 255] to [0, 1]; other elements (e.g. labels) are kept as-is
    x = inputs[0] / 255
    return tuple([x] + list(inputs[1:]))
ds_train = oods_train.prepare(batch_size=batch_size, preprocess_fn=preprocess_fn, shuffle=True)
ds_in = oods_in.prepare(batch_size=batch_size, with_ood_labels=False, preprocess_fn=preprocess_fn)
ds_out = oods_out.prepare(batch_size=batch_size, with_ood_labels=False, preprocess_fn=preprocess_fn)
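Conceptually, the class-based split used above partitions labeled samples into those whose label belongs to in_labels and all the others. The following sketch reproduces that logic on a toy list of (input, label) pairs; the function split_samples_by_class and the sample data are hypothetical stand-ins, not oodeel's implementation.

```python
def split_samples_by_class(samples, in_labels):
    """Partition (x, y) samples by whether the label y is in in_labels."""
    in_set = set(in_labels)
    kept_in = [(x, y) for x, y in samples if y in in_set]
    kept_out = [(x, y) for x, y in samples if y not in in_set]
    return kept_in, kept_out


# Toy samples: strings stand in for images, integers are digit labels
samples = [("img_a", 0), ("img_b", 7), ("img_c", 3), ("img_d", 9), ("img_e", 4)]
toy_in, toy_out = split_samples_by_class(samples, in_labels=[0, 1, 2, 3, 4])

print([y for _, y in toy_in])   # [0, 3, 4]
print([y for _, y in toy_out])  # [7, 9]
```

For the training set only the in-distribution part is kept, which is why the second return value is discarded with _ in the cell above.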
Train or load the model.
In Tensorflow:
model_path_mnist_04 = os.path.join(model_path, "mnist_model_0-4.h5")
try:
    # Reuse a previously trained model if one was saved
    model_tf = tf.keras.models.load_model(model_path_mnist_04)
except OSError:
    train_config = {
        "model": "toy_convnet",
        "input_shape": (28, 28, 1),
        "num_classes": 5,
        "epochs": 5,
        "save_dir": model_path_mnist_04,
        "validation_data": ds_in,  # ds_in is the MNIST test set restricted to classes 0-4
    }
    model_tf = train_tf_model(ds_train, **train_config)
In Pytorch:
model_path_mnist = os.path.join(model_path, "mnist_model_torch_0-4")
try:
    # Reuse a previously trained model if one was saved
    model_torch = torch.load(os.path.join(model_path_mnist, "best.pt"))
except OSError:
    train_config = {
        "model": "toy_convnet",
        "num_classes": 5,
        "epochs": 5,
        "save_dir": model_path_mnist,
        "validation_data": ds_in,  # ds_in is the MNIST test set restricted to classes 0-4
        "cuda_idx": cuda_idx,
    }
    model_torch = train_torch_model(ds_train, **train_config)
Maximum Logit Score (MLS)¶
if backend == "tensorflow":
    model = model_tf
elif backend == "torch":
    model = model_torch
mls = MLS()
mls.fit(model)
scores_in, _ = mls.score(ds_in)
scores_out, _ = mls.score(ds_out)
metrics = bench_metrics(
    (scores_in, scores_out),
    metrics=["auroc", "fpr95tpr", accuracy_score, roc_auc_score],
    threshold=-7.5,  # visually chosen based on the plot
)
plt.figure(figsize=(13, 5))
plot_ood_scores(scores_in, scores_out)
plt.show()
metrics = pd.Series(metrics)
print(metrics)
auroc             0.923506
fpr95tpr          0.395797
accuracy_score    0.855000
roc_auc_score     0.923506
dtype: float64
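Unlike AUROC, a metric such as accuracy_score needs binary decisions, which is what the threshold argument provides. The sketch below shows one plausible way a threshold turns scores into decisions, assuming a sample is predicted OOD when its score is greater than or equal to the threshold, with ID labeled 0 and OOD labeled 1. The comparison rule, the function name and the toy scores are assumptions made for illustration.

```python
def threshold_accuracy(scores_in, scores_out, threshold):
    """Accuracy of the rule 'predict OOD when score >= threshold'."""
    # ID samples are classified correctly when they fall below the threshold
    correct_in = sum(1 for s in scores_in if s < threshold)
    # OOD samples are classified correctly when they reach the threshold
    correct_out = sum(1 for s in scores_out if s >= threshold)
    return (correct_in + correct_out) / (len(scores_in) + len(scores_out))


# Toy MLS-like scores, for illustration only
toy_in = [-12.0, -9.5, -8.0, -6.0]
toy_out = [-7.0, -5.0, -3.0, -9.0]

print(threshold_accuracy(toy_in, toy_out, threshold=-7.5))  # 0.75
```

Because the result depends on where the threshold is placed, threshold-based metrics are best read alongside the score histogram, as done in this notebook.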
Maximum Softmax Probability (MSP)¶
msp = MLS(output_activation="softmax")
msp.fit(model)
scores_in, _ = msp.score(ds_in)
scores_out, _ = msp.score(ds_out)
metrics = bench_metrics(
    (scores_in, scores_out),
    metrics=["auroc", "fpr95tpr", accuracy_score, roc_auc_score],
    threshold=-0.95,  # visually chosen based on the plot
)
# Normalize scores for a better hist visualization
minim = np.min([np.min(scores_in), np.min(scores_out)])
scores_in_ = scores_in - 2 * minim + np.min(scores_in[np.where(scores_in != minim)])
scores_out_ = scores_out - 2 * minim + np.min(scores_in[np.where(scores_in != minim)])
plt.figure(figsize=(13, 5))
plot_ood_scores(scores_in_, scores_out_, log_scale=True)
plt.show()
metrics = pd.Series(metrics)
print(metrics)
auroc             0.935694
fpr95tpr          0.374003
accuracy_score    0.835600
roc_auc_score     0.935693
dtype: float64