deel.lip.layers

The submodule deel.lip.layers contains all the custom Keras layers used to build Lipschitz-constrained neural networks. They all inherit from keras.layers.Layer from the Keras API.

activations

This module contains extra activation functions that respect the Lipschitz constant. Each of them can be added as a layer or passed as the "activation" parameter of other layers.

FullSort

FullSort(**kwargs)

Bases: GroupSort

FullSort activation. Special case of GroupSort where the entire input is sorted.

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.
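
As a rough illustration of what sorting the entire input means, the pure-Python sketch below sorts a flat feature vector and checks two properties: the output is a permutation of the input, so its Euclidean norm is unchanged, and the distance between two sorted vectors does not exceed the distance between the originals. The helper names full_sort and l2 are hypothetical, and ascending order is an assumption; this is not the library implementation.

```python
import math

def full_sort(x):
    """Sort the whole feature vector (ascending order is an assumption)."""
    return sorted(x)

def l2(u, v):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

x = [3.0, -1.0, 2.0, 0.5]
y = [0.0, 4.0, -2.0, 1.0]

sx, sy = full_sort(x), full_sort(y)
zero = [0.0] * len(x)

# Sorting only permutes the entries, so the norm is unchanged.
norm_preserved = math.isclose(l2(sx, zero), l2(x, zero))
# Sorting both vectors never moves them further apart.
distance_not_increased = l2(sx, sy) <= l2(x, y) + 1e-12
```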

Source code in deel/lip/layers/activations.py
def __init__(self, **kwargs):
    """
    FullSort activation. Special case of GroupSort where the entire input is sorted.

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    kwargs["n"] = None
    super().__init__(**kwargs)

GroupSort

GroupSort(
    n=None,
    data_format="channels_last",
    k_coef_lip=1.0,
    **kwargs
)

Bases: Layer, LipschitzLayer

GroupSort activation

PARAMETER DESCRIPTION
n

group size used when sorting. When None, the group size is set to the input size (FullSort behavior)

TYPE: int DEFAULT: None

data_format

either channels_first or channels_last

TYPE: str DEFAULT: 'channels_last'

k_coef_lip

the lipschitz coefficient to be enforced

TYPE: float DEFAULT: 1.0

**kwargs

params passed to layers (named fashion)

DEFAULT: {}

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.
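
The role of the group size n can be sketched on a flat channel vector as follows. This is a minimal pure-Python illustration, not the library code: the function name group_sort, the ascending order inside each group, and the error raised when the channel count is not a multiple of n are all assumptions.

```python
def group_sort(x, n=None):
    """Sort consecutive groups of size n along a flat channel vector.

    n=None sorts the whole vector (the FullSort behavior).
    Ascending order within each group is an assumption of this sketch.
    """
    if not x:
        return []
    size = len(x) if n is None else n
    if len(x) % size != 0:
        raise ValueError("number of channels must be a multiple of the group size")
    out = []
    for start in range(0, len(x), size):
        out.extend(sorted(x[start:start + size]))
    return out

channels = [4.0, 1.0, 3.0, 2.0, 0.0, 5.0]
pairs = group_sort(channels, n=2)     # groups of 2, as in GroupSort2
triples = group_sort(channels, n=3)   # groups of 3
full = group_sort(channels)           # a single group covering everything
```

Each group is only permuted, so the output has the same size and the same Euclidean norm as the input.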

Source code in deel/lip/layers/activations.py
def __init__(self, n=None, data_format="channels_last", k_coef_lip=1.0, **kwargs):
    """
    GroupSort activation

    Args:
        n (int): group size used when sorting. When None group size is set to input
            size (fullSort behavior)
        data_format (str): either channels_first or channels_last
        k_coef_lip (float): the lipschitz coefficient to be enforced
        **kwargs: params passed to layers (named fashion)

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    self.set_klip_factor(k_coef_lip)
    super(GroupSort, self).__init__(**kwargs)
    if data_format == "channels_last":
        self.channel_axis = -1
    elif data_format == "channels_first":
        raise RuntimeError(
            "channels_first not implemented for GroupSort activation"
        )
    else:
        raise RuntimeError("data format not understood")
    self.n = n
    self.data_format = data_format

GroupSort2

GroupSort2(**kwargs)

Bases: GroupSort

GroupSort2 activation. Special case of GroupSort with group of size 2.

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.

Source code in deel/lip/layers/activations.py
def __init__(self, **kwargs):
    """
    GroupSort2 activation. Special case of GroupSort with group of size 2.

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    kwargs["n"] = 2
    super().__init__(**kwargs)

Householder

Householder(
    data_format="channels_last",
    k_coef_lip=1.0,
    theta_initializer=None,
    **kwargs
)

Bases: Layer, LipschitzLayer

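Householder activation, described in this review (https://openreview.net/pdf?id=tD7eCtaSkR) and taken from this repository (https://github.com/singlasahil14/SOC).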

PARAMETER DESCRIPTION
data_format

either channels_first or channels_last. Only channels_last is supported.

TYPE: str DEFAULT: 'channels_last'

k_coef_lip

The lipschitz coefficient to be enforced.

TYPE: float DEFAULT: 1.0

theta_initializer

initializer for the angle theta of reflection. Defaults to pi/2, which corresponds to GroupSort2.

DEFAULT: None

**kwargs

parameters passed to the tf.keras.layers.Layer.

DEFAULT: {}

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.
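
To make the role of the angle theta more concrete, here is a minimal sketch of a Householder activation applied to a single pair of channels, following the order-1 formula from the review cited above. The function name householder_pair is hypothetical, and how the layer pairs channels and stores theta is not shown; treat this as an illustration under those assumptions.

```python
import math

def householder_pair(z1, z2, theta=math.pi / 2):
    """Apply a Householder activation of angle theta to one channel pair.

    The pair is kept as is on one side of the hyperplane orthogonal to
    v = (sin(theta/2), -cos(theta/2)) and reflected across it otherwise.
    """
    if z1 * math.sin(theta / 2) - z2 * math.cos(theta / 2) > 0:
        return z1, z2
    # Reflection (I - 2 v v^T) applied to (z1, z2).
    return (
        z1 * math.cos(theta) + z2 * math.sin(theta),
        z1 * math.sin(theta) - z2 * math.cos(theta),
    )

# With theta = pi/2 the reflection swaps the pair, so the output is ordered.
a, b = householder_pair(1.0, 3.0)
c, d = householder_pair(3.0, 1.0)

# For any theta the map is the identity or a reflection, so the norm is kept.
e, f = householder_pair(1.0, 3.0, theta=0.7)
```

With theta = pi/2 both calls return the pair in the same order, which is the sense in which the default matches sorting in groups of two, up to the ordering convention.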

Source code in deel/lip/layers/activations.py
def __init__(
    self,
    data_format="channels_last",
    k_coef_lip=1.0,
    theta_initializer=None,
    **kwargs,
):
    """
    Householder activation:
    [this review](https://openreview.net/pdf?id=tD7eCtaSkR)
    From [this repository](https://github.com/singlasahil14/SOC)

    Args:
        data_format (str): either channels_first or channels_last. Only
            channels_last is supported.
        k_coef_lip (float): The lipschitz coefficient to be enforced.
        theta_initializer: initializer for the angle theta of reflection. Defaults
            to pi/2, which corresponds to GroupSort2.
        **kwargs: parameters passed to the `tf.keras.layers.Layer`.

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    if data_format != "channels_last":
        raise RuntimeError("Only 'channels_last' data format is supported")

    self.data_format = data_format
    self.set_klip_factor(k_coef_lip)
    self.theta_initializer = theta_initializer
    super().__init__(**kwargs)

MaxMin

MaxMin(
    data_format="channels_last", k_coef_lip=1.0, **kwargs
)

Bases: Layer, LipschitzLayer

MaxMin activation: [ReLU(x), ReLU(-x)]

PARAMETER DESCRIPTION
data_format

either channels_first or channels_last

TYPE: str DEFAULT: 'channels_last'

k_coef_lip

the lipschitz coefficient to be enforced

TYPE: float DEFAULT: 1.0

**kwargs

params passed to layers (named fashion)

DEFAULT: {}

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same as input, with the channel dimension doubled.

References

M. Blot, M. Cord, and N. Thome, "Max-min convolutional neural networks for image classification," in 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 2016, pp. 3678-3682.
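
The [ReLU(x), ReLU(-x)] formula and the doubled channel size can be sketched on a flat channel vector as below. This is a pure-Python illustration with a hypothetical helper name max_min; the ordering of the two halves in the output is an assumption.

```python
def max_min(x):
    """Return [ReLU(x), ReLU(-x)] concatenated along the channel axis."""
    positive = [max(v, 0.0) for v in x]
    negative = [max(-v, 0.0) for v in x]
    return positive + negative

features = [1.5, -2.0, 0.0]
activated = max_min(features)

# The input can be recovered as ReLU(x) - ReLU(-x): no information is lost.
half = len(features)
recovered = [activated[i] - activated[half + i] for i in range(half)]
```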

Source code in deel/lip/layers/activations.py
def __init__(self, data_format="channels_last", k_coef_lip=1.0, **kwargs):
    """
    MaxMin activation: [ReLU(x), ReLU(-x)]

    Args:
        data_format (str): either channels_first or channels_last
        k_coef_lip (float): the lipschitz coefficient to be enforced
        **kwargs: params passed to layers (named fashion)

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Double channel size as input.

    References:
        M. Blot, M. Cord, and N. Thome, "Max-min convolutional neural networks
        for image classification," in 2016 IEEE International Conference on Image
        Processing (ICIP), Phoenix, AZ, USA, 2016, pp. 3678-3682.

    """
    self.set_klip_factor(k_coef_lip)
    super(MaxMin, self).__init__(**kwargs)
    if data_format == "channels_last":
        self.channel_axis = -1
    elif data_format == "channels_first":
        self.channel_axis = 1
    else:
        raise RuntimeError("data format not understood")
    self.data_format = data_format

PReLUlip

PReLUlip(k_coef_lip=1.0)

PReLU activation, with Lipschitz constraint.

PARAMETER DESCRIPTION
k_coef_lip

lipschitz coefficient to be enforced

TYPE: float DEFAULT: 1.0

Source code in deel/lip/layers/activations.py
@register_keras_serializable("deel-lip", "PReLUlip")
def PReLUlip(k_coef_lip=1.0):
    """
    PReLU activation, with Lipschitz constraint.

    Args:
        k_coef_lip (float): lipschitz coefficient to be enforced
    """
    return PReLU(
        alpha_constraint=MinMaxNorm(min_value=-k_coef_lip, max_value=k_coef_lip)
    )

base_layer

This module extends the original Keras layers in order to add a k-Lipschitz constraint via reparametrization. The following layers are currently implemented:

* Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
* Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
* AveragePooling: as ScaledAveragePooling
* GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

Condensable

Bases: ABC

Some layers don't optimize the kernel directly, which means that the kernel stored in the layer is not the kernel used to make predictions (called W_bar). To address this, these layers can implement the condense() function, which makes self.kernel equal to W_bar. This operation also allows turning the Lipschitz layer into its Keras equivalent, e.g. a Dense layer that makes the same predictions as the trained SpectralDense.
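
The idea behind condense() can be sketched with a toy layer whose effective kernel W_bar is the stored kernel divided by its norm. Everything in this block (CondensableSketch, ToyNormalizedLayer, w_bar, predict) is hypothetical and only mirrors the description above; the actual layers compute W_bar differently.

```python
import abc
import math

class CondensableSketch(abc.ABC):
    """Hypothetical stand-in for the Condensable interface."""

    @abc.abstractmethod
    def condense(self):
        """Overwrite the stored kernel with the kernel used for predictions."""

class ToyNormalizedLayer(CondensableSketch):
    """Toy single-output layer predicting with kernel / ||kernel||."""

    def __init__(self, kernel):
        self.kernel = list(kernel)

    def w_bar(self):
        # Kernel actually used for predictions, recomputed from self.kernel.
        norm = math.sqrt(sum(w * w for w in self.kernel))
        return [w / norm for w in self.kernel]

    def predict(self, x):
        return sum(w * v for w, v in zip(self.w_bar(), x))

    def condense(self):
        # After this call the stored kernel is the effective kernel.
        self.kernel = self.w_bar()

layer = ToyNormalizedLayer([3.0, 4.0])
before = layer.predict([1.0, 2.0])
layer.condense()
after = layer.predict([1.0, 2.0])
```

Predictions are unchanged by condense(), but the stored kernel can now be copied into a plain layer that applies no reparametrization.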

condense abstractmethod

condense()

The condense operation overwrites the kernel and ensures that the other variables remain consistent. Returns: None

Source code in deel/lip/layers/base_layer.py
@abc.abstractmethod
def condense(self):
    """
    The condense operation allows to overwrite the kernel and ensure that other
    variables are still consistent.
    Returns:
        None
    """
    pass

vanilla_export abstractmethod

vanilla_export()

This operation turns this layer into its super type, easing storage and serving. Returns: self as super type

Source code in deel/lip/layers/base_layer.py
@abc.abstractmethod
def vanilla_export(self):
    """
    This operation allows to turn this Layer to its super type, easing storage and
    serving.
    Returns:
         self as super type
    """
    pass

LipschitzLayer

Bases: ABC

This class allows setting the Lipschitz factor of a layer. Lipschitz layers must inherit from this class so that the user can set the Lipschitz factor. Warning: this class only groups functions that are useful when developing new Lipschitz layers; it does not ensure any property of the layer. In particular, inheriting from this class does not guarantee anything about the Lipschitz constant.

coef_lip class-attribute instance-attribute

coef_lip = None

Correction coefficient (i.e. Lipschitz bound) of the layer; the output of the layer is multiplied by this constant.

k_coef_lip class-attribute instance-attribute

k_coef_lip = 1.0

variable used to store the lipschitz factor

set_klip_factor

set_klip_factor(klip_factor)

Sets the Lipschitz factor of a layer. Args: klip_factor (float): the Lipschitz factor the user wants to ensure. Returns: None
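
The warning on this class can be illustrated with a small sketch: storing a Lipschitz factor does not by itself constrain what the layer computes. The classes LipschitzLayerSketch and UnconstrainedScale are hypothetical and only mirror the attributes documented here.

```python
class LipschitzLayerSketch:
    """Hypothetical mixin mirroring the documented attributes."""

    k_coef_lip = 1.0   # Lipschitz factor requested by the user
    coef_lip = None    # correction coefficient, left to the concrete layer

    def set_klip_factor(self, klip_factor):
        self.k_coef_lip = klip_factor

class UnconstrainedScale(LipschitzLayerSketch):
    """Toy layer that stores a factor but never uses it."""

    def __init__(self, scale, k_coef_lip=1.0):
        self.scale = scale
        self.set_klip_factor(k_coef_lip)

    def __call__(self, x):
        return self.scale * x

layer = UnconstrainedScale(scale=10.0, k_coef_lip=1.0)
# The requested factor is 1, yet the layer stretches distances by 10.
observed_ratio = abs(layer(1.0) - layer(0.0)) / abs(1.0 - 0.0)
```

A concrete Lipschitz layer is expected to use k_coef_lip when computing its output; the mixin only provides the storage.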

Source code in deel/lip/layers/base_layer.py
def set_klip_factor(self, klip_factor):
    """
    Allow to set the Lipschitz factor of a layer.
    Args:
        klip_factor (float): the Lipschitz factor the user want to ensure.
    Returns:
        None
    """
    self.k_coef_lip = klip_factor

convolutional

This module extends the original Keras layers in order to add a k-Lipschitz constraint via reparametrization. The following layers are currently implemented:

* Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
* Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
* AveragePooling: as ScaledAveragePooling
* GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

FrobeniusConv2D

FrobeniusConv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    **kwargs
)

Bases: Conv2D, LipschitzLayer, Condensable

Same as SpectralConv2D but in the case of a single output.
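
One way to read the single-output case: when the kernel is flattened into a single row, its only singular value equals its Frobenius norm, so dividing by that norm is enough and no iterative procedure is needed. The sketch below shows this normalization in pure Python; the helper name frobenius_normalize and the example kernel are hypothetical, and any convolution-specific correction the layer may apply is not shown.

```python
import math

def frobenius_normalize(kernel):
    """Divide every entry of a flattened kernel by its Frobenius norm."""
    norm = math.sqrt(sum(w * w for w in kernel))
    return [w / norm for w in kernel]

# Hypothetical 2x2 single-filter kernel, flattened row by row.
kernel = [1.0, -2.0, 2.0, 4.0]
w_bar = frobenius_normalize(kernel)

# The normalized kernel has unit norm.
unit_norm = math.sqrt(sum(w * w for w in w_bar))
```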

Source code in deel/lip/layers/convolutional.py
def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    **kwargs,
):
    if strides not in ((1, 1), [1, 1], 1):
        raise RuntimeError("FrobeniusConv2D does not support strides")
    if dilation_rate not in ((1, 1), [1, 1], 1):
        raise RuntimeError("FrobeniusConv2D does not support dilation rate")
    if padding != "same":
        raise RuntimeError("FrobeniusConv2D only supports padding='same'")
    if not (
        (kernel_constraint is None)
        or isinstance(kernel_constraint, SpectralConstraint)
    ):
        raise RuntimeError(
            "only deellip constraints are allowed as other constraints could break"
            " 1 lipschitz condition"
        )
    super(FrobeniusConv2D, self).__init__(
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs,
    )
    self.set_klip_factor(k_coef_lip)
    self.wbar = None
    self._kwargs = kwargs

SpectralConv2D

SpectralConv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
)

Bases: Conv2D, LipschitzLayer, Condensable

This class is a Conv2D layer constrained such that all singular values of its kernel are 1. The computation is based on the Björck algorithm. As this is not enough to ensure 1-Lipschitzity, a corrective coefficient is applied to the output. The computation is done in three steps:

  1. reduce the largest singular value to 1, using iterated power method.
  2. increase other singular values to 1, using Bjorck algorithm.
  3. divide the output by the Lipschitz bound to ensure k Lipschitzity.
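
Steps 1 and 2 above can be sketched on a small dense matrix in pure Python. This is an illustration only: the helper names, the fixed iteration counts (used here instead of the eps_* stopping criteria), and beta = 0.5 are assumptions, and the reshaping of a convolution kernel and the final scaling of step 3 are left out.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def largest_singular_value(w, n_iter=100):
    """Step 1 helper: power iteration on W^T W."""
    wtw = matmul(transpose(w), w)
    u = [[1.0] for _ in range(len(wtw))]
    for _ in range(n_iter):
        u = matmul(wtw, u)
        norm = math.sqrt(sum(row[0] ** 2 for row in u))
        u = [[row[0] / norm] for row in u]
    wu = matmul(w, u)
    return math.sqrt(sum(row[0] ** 2 for row in wu))

def bjorck(w, beta=0.5, n_iter=25):
    """Step 2 helper: repeat W <- (1 + beta) W - beta W W^T W."""
    for _ in range(n_iter):
        wwtw = matmul(matmul(w, transpose(w)), w)
        w = [[(1 + beta) * w[i][j] - beta * wwtw[i][j]
              for j in range(len(w[0]))] for i in range(len(w))]
    return w

kernel = [[3.0, 1.0], [1.0, 2.0]]
sigma = largest_singular_value(kernel)
# Step 1: after this division the largest singular value is 1.
w_spectral = [[v / sigma for v in row] for row in kernel]
# Step 2: the remaining singular values are pushed up to 1.
w_bar = bjorck(w_spectral)
# W_bar^T W_bar is then close to the identity matrix.
gram = matmul(transpose(w_bar), w_bar)
```

Dividing by the largest singular value first matters: the Björck update only converges when the singular values start in a bounded range, which step 1 guarantees.
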
PARAMETER DESCRIPTION
filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

kernel_size

An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.

strides

An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.

DEFAULT: (1, 1)

padding

only "same" padding is supported in this Lipschitz layer; any other value raises a RuntimeError.

DEFAULT: 'same'

data_format

A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

DEFAULT: None

dilation_rate

an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. This Lipschitz layer does not support dilation rate != 1.

DEFAULT: (1, 1)

activation

Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).

DEFAULT: None

use_bias

Boolean, whether the layer uses a bias vector.

DEFAULT: True

kernel_initializer

Initializer for the kernel weights matrix.

DEFAULT: SpectralInitializer()

bias_initializer

Initializer for the bias vector.

DEFAULT: 'zeros'

kernel_regularizer

Regularizer function applied to the kernel weights matrix.

DEFAULT: None

bias_regularizer

Regularizer function applied to the bias vector.

DEFAULT: None

activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

DEFAULT: None

kernel_constraint

Constraint function applied to the kernel matrix.

DEFAULT: None

bias_constraint

Constraint function applied to the bias vector.

DEFAULT: None

k_coef_lip

lipschitz constant to ensure

DEFAULT: 1.0

eps_spectral

stopping criterion for the iterative power algorithm.

DEFAULT: DEFAULT_EPS_SPECTRAL

eps_bjorck

stopping criterion for the Björck algorithm.

DEFAULT: DEFAULT_EPS_BJORCK

beta_bjorck

beta parameter in bjorck algorithm.

DEFAULT: DEFAULT_BETA_BJORCK

maxiter_spectral

maximum number of iterations for the power iteration.

DEFAULT: DEFAULT_MAXITER_SPECTRAL

maxiter_bjorck

maximum number of iterations for bjorck algorithm.

DEFAULT: DEFAULT_MAXITER_BJORCK

This documentation reuses the body of the original keras.layers.Conv2D doc.

Source code in deel/lip/layers/convolutional.py
def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs,
):
    """
    This class is a Conv2D layer constrained such that all singular values of
    its kernel are 1. The computation is based on the Björck algorithm. As this
    is not enough to ensure 1-Lipschitzity, a corrective coefficient is applied
    to the output.
    The computation is done in three steps:

    1. reduce the largest singular value to 1, using iterated power method.
    2. increase other singular values to 1, using Bjorck algorithm.
    3. divide the output by the Lipschitz bound to ensure k Lipschitzity.

    Args:
        filters: Integer, the dimensionality of the output space
            (i.e. the number of output filters in the convolution).
        kernel_size: An integer or tuple/list of 2 integers, specifying the
            height and width of the 2D convolution window.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the height and width.
            Can be a single integer to specify the same value for
            all spatial dimensions.
            Specifying any stride value != 1 is incompatible with specifying
            any `dilation_rate` value != 1.
        padding: only `"same"` padding is supported in this Lipschitz layer.
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        dilation_rate: an integer or tuple/list of 2 integers, specifying
            the dilation rate to use for dilated convolution.
            Can be a single integer to specify the same value for
            all spatial dimensions.
            This Lipschitz layer does not support dilation rate != 1.
        activation: Activation function to use.
            If you don't specify anything, no activation is applied
            (ie. "linear" activation: `a(x) = x`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix.
        bias_initializer: Initializer for the bias vector.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix.
        bias_regularizer: Regularizer function applied to the bias vector.
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation").
        kernel_constraint: Constraint function applied to the kernel matrix.
        bias_constraint: Constraint function applied to the bias vector.
        k_coef_lip: lipschitz constant to ensure
        eps_spectral: stopping criterion for the iterative power algorithm.
        eps_bjorck: stopping criterion for the Björck algorithm.
        beta_bjorck: beta parameter in the Björck algorithm.
        maxiter_spectral: maximum number of iterations for the power iteration.
        maxiter_bjorck: maximum number of iterations for the Björck algorithm.

    This documentation reuses the body of the original keras.layers.Conv2D doc.
    """
    if dilation_rate not in ((1, 1), [1, 1], 1):
        raise RuntimeError("SpectralConv2D does not support dilation rate")
    if padding != "same":
        raise RuntimeError("SpectralConv2D only supports padding='same'")
    super(SpectralConv2D, self).__init__(
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs,
    )
    self._kwargs = kwargs
    self.set_klip_factor(k_coef_lip)
    self.u = None
    self.sig = None
    self.wbar = None
    _check_RKO_params(eps_spectral, eps_bjorck, beta_bjorck)
    self.eps_spectral = eps_spectral
    self.eps_bjorck = eps_bjorck
    self.beta_bjorck = beta_bjorck
    self.maxiter_bjorck = maxiter_bjorck
    self.maxiter_spectral = maxiter_spectral

SpectralConv2DTranspose

SpectralConv2DTranspose(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    output_padding=None,
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
)

Bases: Conv2DTranspose, LipschitzLayer, Condensable

This class is a Conv2DTranspose layer constrained such that all singular values of its kernel are 1. The computation is based on the Björck orthogonalization algorithm.

The computation is done in three steps:

  1. reduce the largest singular value to 1, using iterated power method.
  2. increase other singular values to 1, using Björck algorithm.
  3. divide the output by the Lipschitz target K to ensure K-Lipschitzity.
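
Why step 2 increases the other singular values to 1 can be seen by following a single singular value s through the Björck update W <- (1 + beta) W - beta W W^T W, under which s becomes (1 + beta) s - beta s^3. The sketch below iterates this scalar map; the function name, beta = 0.5 and the iteration count are assumptions made for illustration.

```python
def bjorck_on_singular_value(s, beta=0.5, n_iter=20):
    """Track one singular value under W <- (1 + beta) W - beta W W^T W."""
    history = [s]
    for _ in range(n_iter):
        s = (1 + beta) * s - beta * s ** 3
        history.append(s)
    return history

# A singular value of 0.3 (after step 1, all values lie in (0, 1]).
history = bjorck_on_singular_value(0.3)
```

Starting anywhere in (0, 1], the sequence increases towards 1 and stays there, which is why step 1 is applied first to bring the largest singular value down to 1.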

This documentation reuses the body of the original tf.keras.layers.Conv2DTranspose doc.

PARAMETER DESCRIPTION
filters

Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

kernel_size

An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.

strides

An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions.

DEFAULT: (1, 1)

padding

only "same" padding is supported in this Lipschitz layer (case-insensitive).

DEFAULT: 'same'

output_padding

if set to None (default), the output shape is inferred. Only None value is supported in this Lipschitz layer.

DEFAULT: None

data_format

A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

DEFAULT: None

dilation_rate

an integer, specifying the dilation rate for all spatial dimensions for dilated convolution. This Lipschitz layer does not support dilation rate != 1.

DEFAULT: (1, 1)

activation

Activation function to use. If you don't specify anything, no activation is applied (see keras.activations).

DEFAULT: None

use_bias

Boolean, whether the layer uses a bias vector.

DEFAULT: True

kernel_initializer

Initializer for the kernel weights matrix (see keras.initializers). Defaults to SpectralInitializer.

DEFAULT: SpectralInitializer()

bias_initializer

Initializer for the bias vector (see keras.initializers). Defaults to 'zeros'.

DEFAULT: 'zeros'

kernel_regularizer

Regularizer function applied to the kernel weights matrix (see keras.regularizers).

DEFAULT: None

bias_regularizer

Regularizer function applied to the bias vector (see keras.regularizers).

DEFAULT: None

activity_regularizer

Regularizer function applied to the output of the layer (its "activation") (see keras.regularizers).

DEFAULT: None

kernel_constraint

Constraint function applied to the kernel matrix (see keras.constraints).

DEFAULT: None

bias_constraint

Constraint function applied to the bias vector (see keras.constraints).

DEFAULT: None

k_coef_lip

Lipschitz constant to ensure

DEFAULT: 1.0

eps_spectral

stopping criterion for the iterative power algorithm.

DEFAULT: DEFAULT_EPS_SPECTRAL

eps_bjorck

stopping criterion for the Björck algorithm.

DEFAULT: DEFAULT_EPS_BJORCK

beta_bjorck

beta parameter in Björck algorithm.

DEFAULT: DEFAULT_BETA_BJORCK

maxiter_spectral

maximum number of iterations for the power iteration.

DEFAULT: DEFAULT_MAXITER_SPECTRAL

maxiter_bjorck

maximum number of iterations for the Björck algorithm.

DEFAULT: DEFAULT_MAXITER_BJORCK

Source code in deel/lip/layers/convolutional.py
def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    output_padding=None,
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs,
):
    """
    This class is a Conv2DTranspose layer constrained such that all singular values
    of its kernel are 1. The computation is based on Björck orthogonalization
    algorithm.

    The computation is done in three steps:
    1. reduce the largest singular value to 1, using iterated power method.
    2. increase other singular values to 1, using Björck algorithm.
    3. divide the output by the Lipschitz target K to ensure K-Lipschitzity.

    This documentation reuses the body of the original
    `tf.keras.layers.Conv2DTranspose` doc.

    Args:
        filters: Integer, the dimensionality of the output space
            (i.e. the number of output filters in the convolution).
        kernel_size: An integer or tuple/list of 2 integers, specifying the
            height and width of the 2D convolution window.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the height and width.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        padding: only `"same"` padding is supported in this Lipschitz layer
            (case-insensitive).
        output_padding: if set to `None` (default), the output shape is inferred.
            Only `None` value is supported in this Lipschitz layer.
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        dilation_rate: an integer, specifying the dilation rate for all spatial
            dimensions for dilated convolution. This Lipschitz layer does not
            support dilation rate != 1.
        activation: Activation function to use.
            If you don't specify anything, no activation is applied
            (see `keras.activations`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix
            (see `keras.initializers`). Defaults to `SpectralInitializer`.
        bias_initializer: Initializer for the bias vector
            (see `keras.initializers`). Defaults to 'zeros'.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix (see `keras.regularizers`).
        bias_regularizer: Regularizer function applied to the bias vector
            (see `keras.regularizers`).
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation") (see `keras.regularizers`).
        kernel_constraint: Constraint function applied to the kernel matrix
            (see `keras.constraints`).
        bias_constraint: Constraint function applied to the bias vector
            (see `keras.constraints`).
        k_coef_lip: Lipschitz constant to ensure
        eps_spectral: stopping criterion for the iterative power algorithm.
        eps_bjorck: stopping criterion Björck algorithm.
        beta_bjorck: beta parameter in Björck algorithm.
        maxiter_spectral: maximum number of iterations for the power iteration.
        maxiter_bjorck: maximum number of iterations for bjorck algorithm.
    """
    super().__init__(
        filters,
        kernel_size,
        strides,
        padding,
        output_padding,
        data_format,
        dilation_rate,
        activation,
        use_bias,
        kernel_initializer,
        bias_initializer,
        kernel_regularizer,
        bias_regularizer,
        activity_regularizer,
        kernel_constraint,
        bias_constraint,
        **kwargs,
    )

    if self.dilation_rate != (1, 1):
        raise ValueError("SpectralConv2DTranspose does not support dilation rate")
    if self.padding != "same":
        raise ValueError("SpectralConv2DTranspose only supports padding='same'")
    if self.output_padding is not None:
        raise ValueError(
            "SpectralConv2DTranspose only supports output_padding=None"
        )
    self.set_klip_factor(k_coef_lip)
    self.u = None
    self.sig = None
    self.wbar = None
    _check_RKO_params(eps_spectral, eps_bjorck, beta_bjorck)
    self.eps_spectral = eps_spectral
    self.eps_bjorck = eps_bjorck
    self.beta_bjorck = beta_bjorck
    self.maxiter_bjorck = maxiter_bjorck
    self.maxiter_spectral = maxiter_spectral
    self._kwargs = kwargs

dense

This module extends the original Keras layers to add a k-Lipschitz constraint via reparametrization. Currently implemented:
  • Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
  • Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
  • AveragePooling: as ScaledAveragePooling
  • GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

FrobeniusDense

FrobeniusDense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    disjoint_neurons=True,
    k_coef_lip=1.0,
    **kwargs
)

Bases: Dense, LipschitzLayer, Condensable

Identical to a SpectralDense, and faster, in the case of a single output. In the multi-neuron setting, this layer can be used:
  • as a classical Frobenius Dense normalization (disjoint_neurons=False)
  • as a stack of independent 1-Lipschitz neurons (each output is 1-Lipschitz, but no orthogonality is enforced between outputs) (disjoint_neurons=True).

Warning

default is disjoint_neurons = True
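
The difference between the two modes can be illustrated with a minimal pure-Python sketch. It assumes the kernel has shape (input_dim, units) and that the normalization divides by an L2 norm, taken per output column when disjoint_neurons=True and over the whole matrix otherwise. The helper name `frobenius_normalize` is hypothetical and is not part of the library.

```python
import math


def frobenius_normalize(kernel, disjoint_neurons=True, k_coef_lip=1.0):
    """Rescale a kernel given as a list of rows, shape (input_dim, units).

    Hypothetical helper for illustration only, not the library code.
    """
    n_in, n_out = len(kernel), len(kernel[0])
    if disjoint_neurons:
        # One L2 norm per output column: each neuron is normalized on its own.
        norms = [
            math.sqrt(sum(kernel[i][j] ** 2 for i in range(n_in)))
            for j in range(n_out)
        ]
    else:
        # A single Frobenius norm shared by the whole matrix.
        fro = math.sqrt(sum(v * v for row in kernel for v in row))
        norms = [fro] * n_out
    return [
        [k_coef_lip * kernel[i][j] / norms[j] for j in range(n_out)]
        for i in range(n_in)
    ]


kernel = [[3.0, 0.0], [4.0, 2.0]]
per_neuron = frobenius_normalize(kernel, disjoint_neurons=True)
whole = frobenius_normalize(kernel, disjoint_neurons=False)
```

With disjoint_neurons=True every column of `per_neuron` has unit L2 norm, whereas with disjoint_neurons=False only the Frobenius norm of the full matrix `whole` equals 1.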

Source code in deel/lip/layers/dense.py
def __init__(
    self,
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    disjoint_neurons=True,
    k_coef_lip=1.0,
    **kwargs
):
    super().__init__(
        units=units,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs
    )
    self.set_klip_factor(k_coef_lip)
    self.disjoint_neurons = disjoint_neurons
    self.axis_norm = None
    self.wbar = None
    if self.disjoint_neurons:
        self.axis_norm = 0
    self._kwargs = kwargs

SpectralDense

SpectralDense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
)

Bases: Dense, LipschitzLayer, Condensable

This class is a Dense layer constrained such that all singular values of its kernel are 1. The computation is based on the Björck algorithm and is done in two steps:

  1. reduce the largest singular value to 1, using the iterated power method.
  2. increase the other singular values to 1, using the Björck algorithm.
PARAMETER DESCRIPTION
units

Positive integer, dimensionality of the output space.

activation

Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).

DEFAULT: None

use_bias

Boolean, whether the layer uses a bias vector.

DEFAULT: True

kernel_initializer

Initializer for the kernel weights matrix.

DEFAULT: SpectralInitializer()

bias_initializer

Initializer for the bias vector.

DEFAULT: 'zeros'

kernel_regularizer

Regularizer function applied to the kernel weights matrix.

DEFAULT: None

bias_regularizer

Regularizer function applied to the bias vector.

DEFAULT: None

activity_regularizer

Regularizer function applied to the output of the layer (its "activation").

DEFAULT: None

kernel_constraint

Constraint function applied to the kernel weights matrix.

DEFAULT: None

bias_constraint

Constraint function applied to the bias vector.

DEFAULT: None

k_coef_lip

lipschitz constant to ensure

DEFAULT: 1.0

eps_spectral

stopping criterion for the iterative power algorithm.

DEFAULT: DEFAULT_EPS_SPECTRAL

eps_bjorck

stopping criterion Bjorck algorithm.

DEFAULT: DEFAULT_EPS_BJORCK

beta_bjorck

beta parameter in bjorck algorithm.

DEFAULT: DEFAULT_BETA_BJORCK

maxiter_spectral

maximum number of iterations for the power iteration.

DEFAULT: DEFAULT_MAXITER_SPECTRAL

maxiter_bjorck

maximum number of iterations for bjorck algorithm.

DEFAULT: DEFAULT_MAXITER_BJORCK

Input shape

N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

Output shape

N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).

This documentation reuses the body of the original keras.layers.Dense doc.
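
The two steps described for this layer (power iteration, then Björck orthogonalization) can be sketched in pure Python on a small matrix. This is an illustrative sketch under stated assumptions, not the library implementation: the helper names are hypothetical, the update rule is the first-order Björck iteration `w <- (1 + beta) * w - beta * w @ w.T @ w`, and the stopping rules are simplified (the tolerance plays the role of eps_spectral, and the Björck loop runs for a fixed number of iterations).

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def spectral_norm(w, maxiter=100, eps=1e-9):
    """Estimate the largest singular value of w by power iteration on w^T w."""
    wt = transpose(w)
    u = [[1.0] for _ in range(len(w[0]))]  # column vector used as starting point
    sigma = 0.0
    for _ in range(maxiter):
        v = matmul(wt, matmul(w, u))
        norm = math.sqrt(sum(x[0] ** 2 for x in v))
        u = [[x[0] / norm] for x in v]
        new_sigma = math.sqrt(sum(x[0] ** 2 for x in matmul(w, u)))
        converged = abs(new_sigma - sigma) < eps
        sigma = new_sigma
        if converged:
            break
    return sigma


def bjorck(w, beta=0.5, maxiter=50):
    """Push every singular value of w toward 1 (requires singular values <= 1)."""
    for _ in range(maxiter):
        wwtw = matmul(w, matmul(transpose(w), w))
        w = [
            [(1 + beta) * w[i][j] - beta * wwtw[i][j] for j in range(len(w[0]))]
            for i in range(len(w))
        ]
    return w


kernel = [[2.0, 0.5], [0.0, 1.0]]
sigma = spectral_norm(kernel)                          # step 1: largest singular value
scaled = [[v / sigma for v in row] for row in kernel]  # now at most 1
ortho = bjorck(scaled)                                 # step 2: all singular values -> 1
gram = matmul(transpose(ortho), ortho)                 # close to the identity matrix
```

Step 1 matters because the Björck iteration only converges when the singular values start in a suitable range, which the spectral rescaling guarantees.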

Source code in deel/lip/layers/dense.py
def __init__(
    self,
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
):
    """
    This class is a Dense Layer constrained such that all singular values of its
    kernel are 1. The computation is based on the Björck algorithm.
    The computation is done in two steps:

    1. reduce the largest singular value to 1, using iterated power method.
    2. increase other singular values to 1, using Björck algorithm.

    Args:
        units: Positive integer, dimensionality of the output space.
        activation: Activation function to use.
            If you don't specify anything, no activation is applied
            (ie. "linear" activation: `a(x) = x`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix.
        bias_initializer: Initializer for the bias vector.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix.
        bias_regularizer: Regularizer function applied to the bias vector.
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation").
        kernel_constraint: Constraint function applied to
            the `kernel` weights matrix.
        bias_constraint: Constraint function applied to the bias vector.
        k_coef_lip: lipschitz constant to ensure
        eps_spectral: stopping criterion for the iterative power algorithm.
        eps_bjorck: stopping criterion Bjorck algorithm.
        beta_bjorck: beta parameter in bjorck algorithm.
        maxiter_spectral: maximum number of iterations for the power iteration.
        maxiter_bjorck: maximum number of iterations for bjorck algorithm.

    Input shape:
        N-D tensor with shape: `(batch_size, ..., input_dim)`.
        The most common situation would be
        a 2D input with shape `(batch_size, input_dim)`.

    Output shape:
        N-D tensor with shape: `(batch_size, ..., units)`.
        For instance, for a 2D input with shape `(batch_size, input_dim)`,
        the output would have shape `(batch_size, units)`.

    This documentation reuses the body of the original keras.layers.Dense doc.
    """
    super(SpectralDense, self).__init__(
        units=units,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs
    )
    self._kwargs = kwargs
    self.set_klip_factor(k_coef_lip)
    _check_RKO_params(eps_spectral, eps_bjorck, beta_bjorck)
    self.eps_spectral = eps_spectral
    self.eps_bjorck = eps_bjorck
    self.beta_bjorck = beta_bjorck
    self.maxiter_bjorck = maxiter_bjorck
    self.maxiter_spectral = maxiter_spectral
    self.u = None
    self.sig = None
    self.wbar = None
    self.built = False

pooling

This module extends the original Keras layers to add a k-Lipschitz constraint via reparametrization. Currently implemented:
  • Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
  • Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
  • AveragePooling: as ScaledAveragePooling
  • GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

InvertibleDownSampling

InvertibleDownSampling(
    pool_size,
    data_format="channels_last",
    name=None,
    dtype=None,
    **kwargs
)

Bases: Layer

This pooling layer performs a reshape on the spatial dimensions: it takes a (bs, h, w, c) tensor (if channels_last) and reshapes it to a (bs, h/p_h, w/p_w, c*p_w*p_h) tensor, where p_h and p_w are the height and width of the pool. By doing this the image size is reduced while the number of channels is increased.

References

Anil et al. paper

Note

The image shape must be divisible by the pool shape.

PARAMETER DESCRIPTION
pool_size

tuple describing the pool shape

data_format

can either be channels_last or channels_first

DEFAULT: 'channels_last'

name

name of the layer

DEFAULT: None

dtype

dtype of the layer

DEFAULT: None

**kwargs

params passed to the Layers constructor

DEFAULT: {}
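
The reshape described above can be sketched in pure Python for a single channels_last image without the batch axis. This is a minimal sketch, assuming that the channels of each p_h x p_w block are concatenated in row-major order; the actual channel ordering used by the layer may differ, and the function name `invertible_downsample` is hypothetical.

```python
def invertible_downsample(image, pool_size):
    """Space-to-depth on one channels_last image given as nested lists [h][w][c].

    Hypothetical helper for illustration only, not the library code.
    """
    p_h, p_w = pool_size
    h, w = len(image), len(image[0])
    if h % p_h or w % p_w:
        raise ValueError("image shape must be divisible by the pool shape")
    out = []
    for i in range(h // p_h):
        row = []
        for j in range(w // p_w):
            # Concatenate the channels of every pixel in the p_h x p_w block.
            block = []
            for di in range(p_h):
                for dj in range(p_w):
                    block.extend(image[i * p_h + di][j * p_w + dj])
            row.append(block)
        out.append(row)
    return out


# 4 x 4 image with a single channel holding the values 0..15
image = [[[4 * r + c] for c in range(4)] for r in range(4)]
down = invertible_downsample(image, (2, 2))  # shape (2, 2, 4)
```

No value is dropped or averaged: the output holds exactly the same 16 numbers as the input, only rearranged, which is what makes the operation invertible.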

Source code in deel/lip/layers/pooling.py
def __init__(
    self, pool_size, data_format="channels_last", name=None, dtype=None, **kwargs
):
    """

    This pooling layer performs a reshape on the spatial dimensions: it takes a
    (bs, h, w, c) tensor (if channels_last) and reshapes it to a
    (bs, h/p_h, w/p_w, c*p_w*p_h) tensor, where p_h and p_w are the height and
    width of the pool. By doing this the image size is reduced while the number
    of channels is increased.

    References:
        Anil et al. [paper](https://arxiv.org/abs/1911.00937)

    Note:
        The image shape must be divisible by the pool shape.

    Args:
        pool_size: tuple describing the pool shape
        data_format: can either be `channels_last` or `channels_first`
        name: name of the layer
        dtype: dtype of the layer
        **kwargs: params passed to the Layers constructor
    """
    super(InvertibleDownSampling, self).__init__(name=name, dtype=dtype, **kwargs)
    self.pool_size = pool_size
    self.data_format = data_format

InvertibleUpSampling

InvertibleUpSampling(
    pool_size,
    data_format="channels_last",
    name=None,
    dtype=None,
    **kwargs
)

Bases: Layer

This layer is the inverse of the InvertibleDownSampling layer. It takes a (bs, h, w, c) tensor (if channels_last) and reshapes it to a (bs, h*p_h, w*p_w, c/(p_w*p_h)) tensor, where p_h and p_w are the height and width of the pool. By doing this the image size is increased while the number of channels is reduced.

References

Anil et al. paper

Note

The input number of channels must be divisible by p_w*p_h.

PARAMETER DESCRIPTION
pool_size

tuple describing the pool shape (p_h, p_w)

data_format

can either be channels_last or channels_first

DEFAULT: 'channels_last'

name

name of the layer

DEFAULT: None

dtype

dtype of the layer

DEFAULT: None

**kwargs

params passed to the Layers constructor

DEFAULT: {}
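
As an illustration, a depth-to-space rearrangement for a single channels_last image (no batch axis) could look like the sketch below. It assumes the channels are laid out block by block in row-major order, which may not match the ordering used by the layer, and the function name `invertible_upsample` is hypothetical.

```python
def invertible_upsample(image, pool_size):
    """Depth-to-space on one channels_last image given as nested lists [h][w][c].

    Hypothetical helper for illustration only, not the library code.
    """
    p_h, p_w = pool_size
    h, w, c = len(image), len(image[0]), len(image[0][0])
    if c % (p_h * p_w):
        raise ValueError("channel count must be divisible by p_h * p_w")
    c_out = c // (p_h * p_w)
    out = [[None] * (w * p_w) for _ in range(h * p_h)]
    for i in range(h):
        for j in range(w):
            for di in range(p_h):
                for dj in range(p_w):
                    # Each slice of c_out channels becomes one output pixel.
                    start = (di * p_w + dj) * c_out
                    out[i * p_h + di][j * p_w + dj] = image[i][j][start:start + c_out]
    return out


# 2 x 2 image with 4 channels, unpacked into a 4 x 4 image with 1 channel
packed = [
    [[0, 1, 4, 5], [2, 3, 6, 7]],
    [[8, 9, 12, 13], [10, 11, 14, 15]],
]
up = invertible_upsample(packed, (2, 2))
```

Here the spatial size doubles along each axis while the channel count is divided by 4, and the values 0..15 end up in row-major order.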

Source code in deel/lip/layers/pooling.py
def __init__(
    self, pool_size, data_format="channels_last", name=None, dtype=None, **kwargs
):
    """

    This layer is the inverse of the InvertibleDownSampling layer. It takes a
    (bs, h, w, c) tensor (if channels_last) and reshapes it to a
    (bs, h*p_h, w*p_w, c/(p_w*p_h)) tensor, where p_h and p_w are the height and
    width of the pool. By doing this the image size is increased while the number
    of channels is reduced.

    References:
        Anil et al. [paper](https://arxiv.org/abs/1911.00937)

    Note:
        The input number of channels must be divisible by `p_w*p_h`.


    Args:
        pool_size: tuple describing the pool shape (p_h, p_w)
        data_format: can either be `channels_last` or `channels_first`
        name: name of the layer
        dtype: dtype of the layer
        **kwargs: params passed to the Layers constructor
    """
    super(InvertibleUpSampling, self).__init__(name=name, dtype=dtype, **kwargs)
    self.pool_size = pool_size
    self.data_format = data_format

ScaledAveragePooling2D

ScaledAveragePooling2D(
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    **kwargs
)

Bases: AveragePooling2D, LipschitzLayer

Average pooling operation for spatial data, but with a lipschitz bound.

PARAMETER DESCRIPTION
pool_size

integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions.

DEFAULT: (2, 2)

strides

Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size.

DEFAULT: None

padding

One of "valid" or "same" (case-insensitive).

DEFAULT: 'valid'

data_format

A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

DEFAULT: None

k_coef_lip

the lipschitz factor to ensure

DEFAULT: 1.0

Input shape
  • If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
  • If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).
Output shape
  • If data_format='channels_last': 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).
  • If data_format='channels_first': 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols).

This documentation reuses the body of the original keras.layers.AveragePooling2D doc.
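
To see why a rescaling is needed at all: averaging n values has a Lipschitz constant of 1/sqrt(n) with respect to the L2 norm, so multiplying the average by sqrt(n) yields a 1-Lipschitz operation. The sketch below works on a single-channel image and assumes the scaling factor is sqrt(p_h * p_w) times k_coef_lip; this is an assumption about the scaling, not a copy of the library code, and the helper names are hypothetical.

```python
import math


def scaled_average_pool(image, pool_size, k_coef_lip=1.0):
    """Non-overlapping average pooling on a single-channel image [h][w],
    rescaled by sqrt(p_h * p_w) (assumed factor) to be k_coef_lip-Lipschitz."""
    p_h, p_w = pool_size
    scale = k_coef_lip * math.sqrt(p_h * p_w)
    out = []
    for i in range(0, len(image) - p_h + 1, p_h):
        row = []
        for j in range(0, len(image[0]) - p_w + 1, p_w):
            window = [image[i + di][j + dj] for di in range(p_h) for dj in range(p_w)]
            row.append(scale * sum(window) / len(window))
        out.append(row)
    return out


def l2(values):
    return math.sqrt(sum(v * v for v in values))


# A constant perturbation is the worst case for averaging.
x = [[0.0, 0.0], [0.0, 0.0]]
y = [[1.0, 1.0], [1.0, 1.0]]
in_dist = l2([b - a for ra, rb in zip(x, y) for a, b in zip(ra, rb)])
out_dist = abs(scaled_average_pool(y, (2, 2))[0][0] - scaled_average_pool(x, (2, 2))[0][0])
```

For this worst-case pair the output distance equals the input distance, so the bound of 1 is attained; without the rescaling the output distance would be half as large.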

Source code in deel/lip/layers/pooling.py
def __init__(
    self,
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    **kwargs,
):
    """
    Average pooling operation for spatial data, but with a lipschitz bound.

    Arguments:
        pool_size: integer or tuple of 2 integers,
            factors by which to downscale (vertical, horizontal).
            `(2, 2)` will halve the input in both spatial dimension.
            If only one integer is specified, the same window length
            will be used for both dimensions.
        strides: Integer, tuple of 2 integers, or None.
            Strides values.
            If None, it will default to `pool_size`.
        padding: One of `"valid"` or `"same"` (case-insensitive).
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        k_coef_lip: the lipschitz factor to ensure

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, pooled_rows, pooled_cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, pooled_rows, pooled_cols)`.

    This documentation reuses the body of the original keras.layers.AveragePooling2D
    doc.
    """
    if not ((strides == pool_size) or (strides is None)):
        raise RuntimeError("stride must be equal to pool_size")
    if padding != "valid":
        raise RuntimeError("ScaledAveragePooling2D only supports padding='valid'")
    super(ScaledAveragePooling2D, self).__init__(
        pool_size=pool_size,
        strides=pool_size,
        padding=padding,
        data_format=data_format,
        **kwargs,
    )
    self.set_klip_factor(k_coef_lip)
    self._kwargs = kwargs

ScaledGlobalAveragePooling2D

ScaledGlobalAveragePooling2D(
    data_format=None, k_coef_lip=1.0, **kwargs
)

Bases: GlobalAveragePooling2D, LipschitzLayer

Global average pooling operation for spatial data with Lipschitz bound.

PARAMETER DESCRIPTION
data_format

A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

DEFAULT: None

Input shape
  • If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
  • If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).

Output shape: 2D tensor with shape (batch_size, channels).

This documentation reuses the body of the original keras.layers.GlobalAveragePooling doc.

Source code in deel/lip/layers/pooling.py
def __init__(self, data_format=None, k_coef_lip=1.0, **kwargs):
    """Global average pooling operation for spatial data with Lipschitz bound.

    Arguments:
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
    2D tensor with shape `(batch_size, channels)`.

    This documentation reuses the body of the original
    keras.layers.GlobalAveragePooling doc.
    """
    super(ScaledGlobalAveragePooling2D, self).__init__(
        data_format=data_format, **kwargs
    )
    self.set_klip_factor(k_coef_lip)
    self._kwargs = kwargs

ScaledGlobalL2NormPooling2D

ScaledGlobalL2NormPooling2D(
    data_format=None,
    k_coef_lip=1.0,
    eps_grad_sqrt=1e-06,
    **kwargs
)

Bases: GlobalAveragePooling2D, LipschitzLayer

Global L2-norm pooling operation for spatial data, with a Lipschitz bound. This pooling operation is norm preserving (i.e. gradient norm equal to 1 almost everywhere).

[1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of Feature Pooling in Visual Recognition", p. 8.

PARAMETER DESCRIPTION
data_format

A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

DEFAULT: None

k_coef_lip

the lipschitz factor to ensure

DEFAULT: 1.0

eps_grad_sqrt

Epsilon value to avoid numerical instability due to non-defined gradient at 0 in the sqrt function

DEFAULT: 1e-06

Input shape
  • If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
  • If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).
Output shape
  • If data_format='channels_last': 2D tensor with shape (batch_size, channels).
  • If data_format='channels_first': 2D tensor with shape (batch_size, channels).
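
A minimal sketch of global L2-norm pooling for one channels_last image (no batch axis) is given below. It assumes that eps_grad_sqrt is added inside the square root and that the result is multiplied by k_coef_lip; both are assumptions made for illustration, and the function name `global_l2norm_pool` is hypothetical.

```python
import math


def global_l2norm_pool(image, k_coef_lip=1.0, eps_grad_sqrt=1e-6):
    """Reduce a channels_last image [h][w][c] to c values, one L2 norm per channel.

    Hypothetical helper for illustration only, not the library code.
    """
    channels = len(image[0][0])
    out = []
    for ch in range(channels):
        sq_sum = sum(px[ch] ** 2 for row in image for px in row)
        # eps keeps the square root differentiable when the channel is all zeros.
        out.append(k_coef_lip * math.sqrt(sq_sum + eps_grad_sqrt))
    return out


# 2 x 2 image with 2 channels
image = [
    [[3.0, 0.0], [0.0, 0.0]],
    [[4.0, 0.0], [0.0, 1.0]],
]
pooled = global_l2norm_pool(image, eps_grad_sqrt=0.0)
```

Each output value is the L2 norm of one channel over all spatial positions, so the spatial axes disappear and a single vector of length `channels` remains per image.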
Source code in deel/lip/layers/pooling.py
def __init__(self, data_format=None, k_coef_lip=1.0, eps_grad_sqrt=1e-6, **kwargs):
    """
    Global L2-norm pooling operation for spatial data, with a Lipschitz bound.
    This pooling operation is norm preserving (i.e. gradient norm equal to 1
    almost everywhere).

    [1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of Feature
    Pooling in Visual Recognition", p. 8.

    Arguments:
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        k_coef_lip: the lipschitz factor to ensure
        eps_grad_sqrt: Epsilon value to avoid numerical instability
            due to non-defined gradient at 0 in the sqrt function

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
        - If `data_format='channels_last'`:
            2D tensor with shape `(batch_size, channels)`.
        - If `data_format='channels_first'`:
            2D tensor with shape `(batch_size, channels)`.
    """
    if eps_grad_sqrt < 0.0:
        raise RuntimeError("eps_grad_sqrt must be positive")
    super(ScaledGlobalL2NormPooling2D, self).__init__(
        data_format=data_format, **kwargs
    )
    self.set_klip_factor(k_coef_lip)
    self.eps_grad_sqrt = eps_grad_sqrt
    self._kwargs = kwargs
    if self.data_format == "channels_last":
        self.axes = [1, 2]
    else:
        self.axes = [2, 3]

ScaledL2NormPooling2D

ScaledL2NormPooling2D(
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    eps_grad_sqrt=1e-06,
    **kwargs
)

Bases: AveragePooling2D, LipschitzLayer

L2-norm pooling operation for spatial data, with a Lipschitz bound. This pooling operation is norm preserving (i.e. gradient norm equal to 1 almost everywhere).

[1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of Feature Pooling in Visual Recognition", p. 8.

PARAMETER DESCRIPTION
pool_size

integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions.

DEFAULT: (2, 2)

strides

Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size.

DEFAULT: None

padding

One of "valid" or "same" (case-insensitive).

DEFAULT: 'valid'

data_format

A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

DEFAULT: None

k_coef_lip

the lipschitz factor to ensure

DEFAULT: 1.0

eps_grad_sqrt

Epsilon value to avoid numerical instability due to non-defined gradient at 0 in the sqrt function

DEFAULT: 1e-06

Input shape
  • If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
  • If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).
Output shape
  • If data_format='channels_last': 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).
  • If data_format='channels_first': 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols).
Source code in deel/lip/layers/pooling.py
def __init__(
    self,
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    eps_grad_sqrt=1e-6,
    **kwargs,
):
    """
    L2-norm pooling operation for spatial data, with a Lipschitz bound. This
    pooling operation is norm preserving (i.e. gradient norm equal to 1 almost
    everywhere).

    [1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of Feature
    Pooling in Visual Recognition", p. 8.

    Arguments:
        pool_size: integer or tuple of 2 integers,
            factors by which to downscale (vertical, horizontal).
            `(2, 2)` will halve the input in both spatial dimension.
            If only one integer is specified, the same window length
            will be used for both dimensions.
        strides: Integer, tuple of 2 integers, or None.
            Strides values.
            If None, it will default to `pool_size`.
        padding: One of `"valid"` or `"same"` (case-insensitive).
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        k_coef_lip: the Lipschitz factor to enforce
        eps_grad_sqrt: Epsilon value to avoid numerical instability
            due to non-defined gradient at 0 in the sqrt function

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, pooled_rows, pooled_cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, pooled_rows, pooled_cols)`.
    """
    if not ((strides == pool_size) or (strides is None)):
        raise RuntimeError("strides must be equal to pool_size")
    if padding != "valid":
        raise RuntimeError("ScaledL2NormPooling2D only supports padding='valid'")
    if eps_grad_sqrt < 0.0:
        raise RuntimeError("eps_grad_sqrt must be non-negative")
    super(ScaledL2NormPooling2D, self).__init__(
        pool_size=pool_size,
        strides=pool_size,
        padding=padding,
        data_format=data_format,
        **kwargs,
    )
    self.set_klip_factor(k_coef_lip)
    self.eps_grad_sqrt = eps_grad_sqrt
    self._kwargs = kwargs
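To make the norm-preserving claim concrete, here is a minimal pure-Python sketch of L2-norm pooling over non-overlapping windows on a single-channel image. It is an assumption about what the layer computes, not the library's implementation: the function name `l2_norm_pool_2d` is hypothetical, and the exact place where the layer adds eps_grad_sqrt may differ from where `eps` is added here.

```python
import math


def l2_norm_pool_2d(image, pool_size=(2, 2), k_coef_lip=1.0, eps=0.0):
    """Hypothetical sketch: L2 norm of each non-overlapping window of a 2D list."""
    ph, pw = pool_size
    out_rows = len(image) // ph
    out_cols = len(image[0]) // pw
    pooled = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Sum of squares over the (ph x pw) window starting at (i*ph, j*pw).
            sq_sum = sum(
                image[i * ph + di][j * pw + dj] ** 2
                for di in range(ph)
                for dj in range(pw)
            )
            # eps keeps the sqrt gradient finite when a window is all zeros.
            row.append(k_coef_lip * math.sqrt(sq_sum + eps))
        pooled.append(row)
    return pooled


image = [
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
]
pooled = l2_norm_pool_2d(image)
print(pooled)  # [[5.0, 1.0], [2.0, 4.0]]

# With eps=0 and k_coef_lip=1, the squared L2 norm is unchanged by pooling.
norm_in = sum(v ** 2 for r in image for v in r)
norm_out = sum(v ** 2 for r in pooled for v in r)
print(norm_in, norm_out)  # 46.0 46.0
```

Because the windows do not overlap, each input value contributes to exactly one output, which is why the constructor requires strides equal to pool_size.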

unconstrained

This module contains custom Keras unconstrained layers.

Unlike the other modules in the layers folder, the layers defined here are not Lipschitz-constrained. They serve as base classes for more advanced layers and should not be used as is.

PadConv2D

PadConv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)

Bases: Conv2D, Condensable

This class is a Conv2D layer with parameterized padding. Since the Conv2D layer only supports "same" and "valid" padding, this layer enables other types of padding, such as "constant", "symmetric", "reflect" or "circular".

Warning

PadConv2D is not a Lipschitz layer and must not be used directly. Use it as a base class to create a Lipschitz layer with padding.

All arguments are the same as the original Conv2D except padding, which is defined as follows:

PARAMETER DESCRIPTION
padding

one of "same", "valid", "constant", "symmetric", "reflect" or "circular" (case-insensitive).

DEFAULT: 'same'
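The difference between the extra padding modes is easiest to see on a one-dimensional example. The sketch below is a pure-Python illustration, not the layer's code: `pad_1d` is a hypothetical helper, and the pad width of kernel_size // 2 per side mirrors what the constructor uses for these modes.

```python
def pad_1d(values, size, mode):
    """Hypothetical helper: pad a list by `size` items on each side."""
    values = list(values)
    if size == 0:
        return values
    mode = mode.lower()
    if mode == "constant":
        # Fill with zeros.
        left, right = [0] * size, [0] * size
    elif mode == "reflect":
        # Mirror around the edge value, without repeating it.
        left, right = values[size:0:-1], values[-2:-size - 2:-1]
    elif mode == "symmetric":
        # Mirror around the border, repeating the edge value.
        left, right = values[size - 1::-1], values[:-size - 1:-1]
    elif mode == "circular":
        # Wrap around: the end of the signal precedes its start.
        left, right = values[-size:], values[:size]
    else:
        raise ValueError(f"unsupported padding mode: {mode}")
    return left + values + right


signal = [1, 2, 3, 4]
kernel_size = 5
size = kernel_size // 2  # 2 items on each side

print(pad_1d(signal, size, "constant"))   # [0, 0, 1, 2, 3, 4, 0, 0]
print(pad_1d(signal, size, "reflect"))    # [3, 2, 1, 2, 3, 4, 3, 2]
print(pad_1d(signal, size, "symmetric"))  # [2, 1, 1, 2, 3, 4, 4, 3]
print(pad_1d(signal, size, "circular"))   # [3, 4, 1, 2, 3, 4, 1, 2]

# A "valid" convolution on the padded signal keeps the input length for odd kernels.
padded_len = len(signal) + 2 * size
print(padded_len - kernel_size + 1)  # 4
```

For an even kernel_size, the same arithmetic gives an output one element longer than the input, so odd kernels are the natural fit for these modes.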

Source code in deel/lip/layers/unconstrained.py
def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
):
    """
    This class is a Conv2D layer with parameterized padding.
    Since the Conv2D layer only supports `"same"` and `"valid"` padding, this layer
    enables other types of padding, such as `"constant"`, `"symmetric"`, `"reflect"`
    or `"circular"`.

    Warning:
        PadConv2D is not a Lipschitz layer and must not be used directly. Use it
        as a base class to create a Lipschitz layer with padding.

    All arguments are the same as the original `Conv2D` except `padding`,
    which is defined as follows:

    Args:
        padding: one of `"same"`, `"valid"`, `"constant"`, `"symmetric"`,
            `"reflect"` or `"circular"` (case-insensitive).
    """
    self.pad = lambda x: x
    self.old_padding = padding
    self.internal_input_shape = None
    if padding.lower() != "same":  # same is directly processed in Conv2D
        padding = "valid"
    super(PadConv2D, self).__init__(
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs
    )
    self._kwargs = kwargs
    # "same" and "valid" are handled by Conv2D itself: no extra padding step
    if self.old_padding.lower() in ["same", "valid"]:
        self.pad = lambda x: x
        self.padding_size = [0, 0]
    # these modes are delegated to tf.pad, with half the kernel size on each side
    if self.old_padding.lower() in ["constant", "reflect", "symmetric"]:
        self.padding_size = [self.kernel_size[0] // 2, self.kernel_size[1] // 2]
        # pad axes 1 and 2 only (the spatial axes of channels_last inputs)
        paddings = [
            [0, 0],
            [self.padding_size[0], self.padding_size[0]],
            [self.padding_size[1], self.padding_size[1]],
            [0, 0],
        ]
        self.pad = lambda t: tf.pad(t, paddings, self.old_padding)
    # circular padding relies on the module's _padding_circular helper
    if self.old_padding.lower() == "circular":
        self.padding_size = [self.kernel_size[0] // 2, self.kernel_size[1] // 2]
        self.pad = lambda t: _padding_circular(t, self.padding_size)