Demo 3: HKR classifier on MNIST dataset
This notebook demonstrates learning a binary classification task on MNIST, restricted to the digits 0 and 8.
# pip install deel-lip -qqq
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Flatten
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import binary_accuracy
from tensorflow.keras.models import Sequential
from deel.lip.layers import (
SpectralConv2D,
SpectralDense,
FrobeniusDense,
ScaledL2NormPooling2D,
)
from deel.lip.activations import MaxMin, GroupSort, GroupSort2, FullSort
from deel.lip.losses import HKR, KR, HingeMargin
Data preparation
For this task we select two classes: 0 and 8. Labels are changed to {-1, 1}, which is compatible with the hinge term used in the loss.
from tensorflow.keras.datasets import mnist
# first we select the two classes
selected_classes = [0, 8] # must be two classes as we perform binary classification
def prepare_data(x, y, class_a=0, class_b=8):
    """
    Convert the MNIST data to make it suitable for our binary classification
    setup.
    """
    # select only the items belonging to class_a or class_b
    mask = (y == class_a) | (y == class_b)
    x = x[mask]
    y = y[mask]
    x = x.astype("float32")
    y = y.astype("float32")
    # convert from integer range [0, 255] to float32 range [0, 1]
    x /= 255
    x = x.reshape((-1, 28, 28, 1))
    # change labels to the binary classification setup {-1, 1}
    y[y == class_a] = 1.0
    y[y == class_b] = -1.0
    return x, y
# now we load the dataset
(x_train, y_train_ord), (x_test, y_test_ord) = mnist.load_data()
# prepare the data
x_train, y_train = prepare_data(
    x_train, y_train_ord, selected_classes[0], selected_classes[1]
)
x_test, y_test = prepare_data(
    x_test, y_test_ord, selected_classes[0], selected_classes[1]
)
# display info about the dataset
print(
    "train set size: %i samples, proportion of positive class: %.3f percent"
    % (y_train.shape[0], 100 * y_train[y_train == 1].sum() / y_train.shape[0])
)
print(
    "test set size: %i samples, proportion of positive class: %.3f percent"
    % (y_test.shape[0], 100 * y_test[y_test == 1].sum() / y_test.shape[0])
)
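Before building the model, here is a quick sanity check that prepare_data behaved as expected. This is a minimal sketch assuming NumPy is available as np; the variable names are those defined above.
import numpy as np

# labels should only take the two values -1.0 and +1.0
assert set(np.unique(y_train)) == {-1.0, 1.0}
# pixel values should lie in the float32 range [0, 1]
assert x_train.min() >= 0.0 and x_train.max() <= 1.0
print("train shapes:", x_train.shape, y_train.shape)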
Build the Lipschitz model
Let's first specify the parameters of this experiment.
# training parameters
epochs = 10
batch_size = 128
# network parameters
activation = GroupSort  # alternatives: ReLU, MaxMin, GroupSort2
# loss parameters
min_margin = 1.0
alpha = 10.0
Now we can build the network. Here the experiment is done with an MLP, but deel-lip also provides state-of-the-art 1-Lipschitz convolutions; a sketch of a convolutional variant is shown after the MLP below.
K.clear_session()
# build the 1-Lipschitz MLP
wass = Sequential(
    layers=[
        Input((28, 28, 1)),
        Flatten(),
        SpectralDense(32, GroupSort2(), use_bias=True),
        SpectralDense(16, GroupSort2(), use_bias=True),
        FrobeniusDense(1, activation=None, use_bias=False),
    ],
    name="lipModel",
)
wass.summary()
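For illustration, here is a minimal sketch of a convolutional variant built from the SpectralConv2D and ScaledL2NormPooling2D layers imported at the top. The filter counts and kernel sizes are illustrative assumptions, not the settings used in this demo.
# sketch of a 1-Lipschitz convolutional network (illustrative sizes)
conv_model = Sequential(
    layers=[
        Input((28, 28, 1)),
        SpectralConv2D(16, (3, 3), activation=GroupSort2(), use_bias=True),
        ScaledL2NormPooling2D(pool_size=(2, 2)),  # norm-preserving pooling
        SpectralConv2D(32, (3, 3), activation=GroupSort2(), use_bias=True),
        ScaledL2NormPooling2D(pool_size=(2, 2)),
        Flatten(),
        SpectralDense(32, GroupSort2(), use_bias=True),
        FrobeniusDense(1, activation=None, use_bias=False),
    ],
    name="lipConvModel",
)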
optimizer = Adam(learning_rate=0.001)
# the classifier outputs a raw real score whose sign encodes the class, so binary accuracy must be redefined
def HKR_binary_accuracy(y_true, y_pred):
    S_true = tf.dtypes.cast(tf.greater_equal(y_true[:, 0], 0), dtype=tf.float32)
    S_pred = tf.dtypes.cast(tf.greater_equal(y_pred[:, 0], 0), dtype=tf.float32)
    return binary_accuracy(S_true, S_pred)
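As a quick sanity check of this metric, here is a toy example with hand-picked values, chosen so the signs of labels and scores agree for 3 of the 4 samples:
# toy check (illustrative values): signs agree for 3 of 4 samples
y_t = tf.constant([[1.0], [-1.0], [1.0], [-1.0]])
y_p = tf.constant([[0.5], [-2.0], [-0.1], [-0.3]])
print(HKR_binary_accuracy(y_t, y_p).numpy())  # expected: 0.75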
wass.compile(
    loss=HKR(
        alpha=alpha, min_margin=min_margin
    ),  # HKR stands for the hinge regularized KR loss
    metrics=[
        KR,  # shows the KR term of the loss
        HingeMargin(min_margin=min_margin),  # shows the hinge term of the loss
        HKR_binary_accuracy,  # shows the classification accuracy
    ],
    optimizer=optimizer,
)
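To make the loss more concrete, here is a schematic NumPy sketch of the two terms the metrics above report. This is an illustration of the idea, not deel-lip's exact implementation: the KR term is the Kantorovich-Rubinstein dual (mean score on the positive class minus mean score on the negative class), and the hinge term penalizes samples whose signed score falls inside the margin.
import numpy as np

def kr_term(y_true, y_pred):
    # Kantorovich-Rubinstein dual: separation between the two class means
    return y_pred[y_true == 1].mean() - y_pred[y_true == -1].mean()

def hinge_term(y_true, y_pred, min_margin=1.0):
    # penalize samples whose signed score is inside the margin
    return np.maximum(0.0, min_margin - y_true * y_pred).mean()

# schematic HKR objective: maximize KR while keeping samples out of the margin
# hkr = alpha * hinge_term(y, f_x) - kr_term(y, f_x)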
Learn classification on MNIST
Now that the model is built, we can learn the task.
wass.fit(
    x=x_train,
    y=y_train,
    validation_data=(x_test, y_test),
    batch_size=batch_size,
    shuffle=True,
    epochs=epochs,
    verbose=1,
)
As we can see, the model reaches a very decent accuracy on this task.
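Beyond accuracy, a useful property of a 1-Lipschitz classifier is that the magnitude of its output score lower-bounds the L2 distance from an input to the decision boundary. Here is a minimal sketch of how one might inspect these certified radii on the test set; the variable names are those defined above.
import numpy as np

# raw real-valued scores; the sign gives the predicted class
scores = wass.predict(x_test, batch_size=batch_size).flatten()
# for a 1-Lipschitz f, |f(x)| is a certified L2 robustness radius
certificates = np.abs(scores)
print("mean certified radius: %.3f" % certificates.mean())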