deel.lip.layers

The submodule deel.lip.layers contains all the custom Keras layers needed to build Lipschitz-constrained neural networks. They all inherit from keras.layers.Layer.

activations ¶

This module contains extra activation functions that respect the Lipschitz constant. They can be added as layers, or passed through the "activation" parameter of other layers.

FullSort ¶

FullSort(**kwargs)

Bases: GroupSort

FullSort activation. Special case of GroupSort where the entire input is sorted.

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.
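Sorting only permutes the entries of a vector, which is why it fits in a Lipschitz-constrained network. The sketch below is a framework-free illustration in plain Python, not the library's implementation. The helper names `full_sort` and `l2_distance` are hypothetical, and the ascending order is an assumption.

```python
import math
import random


def full_sort(x):
    """Return the entries of x sorted in ascending order (assumed order)."""
    return sorted(x)


def l2_distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(8)]
y = [random.uniform(-1.0, 1.0) for _ in range(8)]
zero = [0.0] * 8

# Sorting only permutes entries, so the norm of each vector is unchanged...
norm_preserved = math.isclose(
    l2_distance(full_sort(x), zero), l2_distance(x, zero)
)
# ...and sorted vectors are never further apart than the original ones.
is_contraction = (
    l2_distance(full_sort(x), full_sort(y)) <= l2_distance(x, y) + 1e-12
)
print(norm_preserved, is_contraction)
```

Both checks are expected to hold for any pair of inputs, since sorting is a 1-Lipschitz map for the Euclidean norm.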

Source code in deel/lip/layers/activations.py

def __init__(self, **kwargs):
    """
    FullSort activation. Special case of GroupSort where the entire input is sorted.

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    kwargs["n"] = None
    super().__init__(**kwargs)

GroupSort ¶

GroupSort(
    n=None,
    data_format="channels_last",
    k_coef_lip=1.0,
    **kwargs
)

Bases: Layer, LipschitzLayer

GroupSort activation

PARAMETER	DESCRIPTION
`n`	group size used when sorting. When None, the group size is set to the input size (FullSort behavior). TYPE: `int` DEFAULT: `None`
`data_format`	either channels_first or channels_last TYPE: `str` DEFAULT: `'channels_last'`
`k_coef_lip`	the lipschitz coefficient to be enforced TYPE: `float` DEFAULT: `1.0`
`**kwargs`	params passed to layers (named fashion) DEFAULT: `{}`

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.
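The effect of the `n` parameter can be pictured on a flat vector. This is a minimal sketch in plain Python, assuming that groups are made of consecutive entries and sorted in ascending order. The function `group_sort` is a hypothetical helper, not part of deel.lip.

```python
def group_sort(x, n=None):
    """Sort consecutive groups of n entries of a flat list x.

    n=None sorts the whole vector (FullSort behavior); ascending order
    inside each group is an assumption of this sketch.
    """
    size = len(x) if n is None else n
    if size <= 0 or len(x) % size != 0:
        raise ValueError("the input size must be a multiple of the group size")
    out = []
    for start in range(0, len(x), size):
        out.extend(sorted(x[start:start + size]))
    return out


x = [4.0, 1.0, 3.0, 2.0, 0.0, 5.0]
print(group_sort(x, n=2))  # pairs sorted independently
print(group_sort(x, n=3))  # triples sorted independently
print(group_sort(x))       # whole vector sorted
```

With `n=2` this reproduces the GroupSort2 special case, and with `n=None` the FullSort one.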

Source code in deel/lip/layers/activations.py

def __init__(self, n=None, data_format="channels_last", k_coef_lip=1.0, **kwargs):
    """
    GroupSort activation

    Args:
        n (int): group size used when sorting. When None group size is set to input
            size (fullSort behavior)
        data_format (str): either channels_first or channels_last
        k_coef_lip (float): the lipschitz coefficient to be enforced
        **kwargs: params passed to layers (named fashion)

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    self.set_klip_factor(k_coef_lip)
    super(GroupSort, self).__init__(**kwargs)
    if data_format == "channels_last":
        self.channel_axis = -1
    elif data_format == "channels_first":
        raise RuntimeError(
            "channels_first not implemented for GroupSort activation"
        )
    else:
        raise RuntimeError("data format not understood")
    self.n = n
    self.data_format = data_format

GroupSort2 ¶

GroupSort2(**kwargs)

Bases: GroupSort

GroupSort2 activation. Special case of GroupSort with group of size 2.

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.

Source code in deel/lip/layers/activations.py

def __init__(self, **kwargs):
    """
    GroupSort2 activation. Special case of GroupSort with group of size 2.

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    kwargs["n"] = 2
    super().__init__(**kwargs)

Householder ¶

Householder(
    data_format="channels_last",
    k_coef_lip=1.0,
    theta_initializer=None,
    **kwargs
)

Bases: Layer, LipschitzLayer

Householder activation: see this review (https://openreview.net/pdf?id=tD7eCtaSkR). From this repository (https://github.com/singlasahil14/SOC).

PARAMETER	DESCRIPTION
`data_format`	either channels_first or channels_last. Only channels_last is supported. TYPE: `str` DEFAULT: `'channels_last'`
`k_coef_lip`	The Lipschitz coefficient to be enforced. TYPE: `float` DEFAULT: `1.0`
`theta_initializer`	initializer for the angle theta of reflection. Defaults to pi/2, which corresponds to GroupSort2. DEFAULT: `None`
`**kwargs`	parameters passed to the `tf.keras.layers.Layer`. DEFAULT: `{}`

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same size as input.
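To see why the default angle pi/2 relates to GroupSort2, the activation can be written out for a single pair of values. This sketch assumes the formulation where the reflection vector is v = (sin(theta/2), -cos(theta/2)). The function `householder_pair` is a hypothetical helper, and how the library pairs channels is not shown here.

```python
import math


def householder_pair(z1, z2, theta=math.pi / 2):
    """Apply a Householder activation to one pair (z1, z2).

    Assumed formulation: with v = (sin(theta/2), -cos(theta/2)), the pair is
    kept when v . z > 0 and reflected by (I - 2 v v^T) otherwise.
    """
    if z1 * math.sin(theta / 2) - z2 * math.cos(theta / 2) > 0:
        return (z1, z2)
    # I - 2 v v^T = [[cos(theta), sin(theta)], [sin(theta), -cos(theta)]]
    return (
        z1 * math.cos(theta) + z2 * math.sin(theta),
        z1 * math.sin(theta) - z2 * math.cos(theta),
    )


a, b = householder_pair(1.0, 3.0)  # reflected: approximately (3.0, 1.0)
c, d = householder_pair(3.0, 1.0)  # kept unchanged: (3.0, 1.0)
print((a, b), (c, d))
```

At theta = pi/2 the reflection swaps the two entries, so under these assumptions each pair comes out ordered, which matches GroupSort2 up to the ordering convention. A reflection also preserves the norm of the pair.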

Source code in deel/lip/layers/activations.py

def __init__(
    self,
    data_format="channels_last",
    k_coef_lip=1.0,
    theta_initializer=None,
    **kwargs,
):
    """
    Householder activation:
    [this review](https://openreview.net/pdf?id=tD7eCtaSkR)
    From [this repository](https://github.com/singlasahil14/SOC)

    Args:
        data_format (str): either channels_first or channels_last. Only
            channels_last is supported.
        k_coef_lip (float): The lipschitz coefficient to be enforced.
        theta_initializer: initializer for the angle theta of reflection. Defaults
            to pi/2, which corresponds to GroupSort2.
        **kwargs: parameters passed to the `tf.keras.layers.Layer`.

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Same size as input.

    """
    if data_format != "channels_last":
        raise RuntimeError("Only 'channels_last' data format is supported")

    self.data_format = data_format
    self.set_klip_factor(k_coef_lip)
    self.theta_initializer = theta_initializer
    super().__init__(**kwargs)

MaxMin ¶

MaxMin(
    data_format="channels_last", k_coef_lip=1.0, **kwargs
)

Bases: Layer, LipschitzLayer

MaxMin activation: [ReLU(x), ReLU(-x)]

PARAMETER	DESCRIPTION
`data_format`	either channels_first or channels_last TYPE: `str` DEFAULT: `'channels_last'`
`k_coef_lip`	the lipschitz coefficient to be enforced TYPE: `float` DEFAULT: `1.0`
`**kwargs`	params passed to layers (named fashion) DEFAULT: `{}`

Input shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output shape

Same as input, except that the channel dimension is doubled.

References

M. Blot, M. Cord, and N. Thome, "Max-min convolutional neural networks for image classification", in 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 2016, pp. 3678-3682.
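The doubling of the channel dimension follows directly from the definition. Below is a minimal plain-Python sketch on a flat list of channel values. The function `max_min` is a hypothetical helper, and placing the positive parts before the negative parts is an assumption.

```python
def max_min(x):
    """Return [ReLU(x), ReLU(-x)] for a flat list of channel values."""
    positive = [max(v, 0.0) for v in x]
    negative = [max(-v, 0.0) for v in x]
    return positive + negative


x = [1.5, -2.0, 0.5]
y = max_min(x)
print(y)  # twice as many entries as x
```

Each input value ends up in exactly one of the two halves, so no information about the sign is lost.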

Source code in deel/lip/layers/activations.py

def __init__(self, data_format="channels_last", k_coef_lip=1.0, **kwargs):
    """
    MaxMin activation [Relu(x),reLU(-x)]

    Args:
        data_format (str): either channels_first or channels_last
        k_coef_lip (float): the lipschitz coefficient to be enforced
        **kwargs: params passed to layers (named fashion)

    Input shape:
        Arbitrary. Use the keyword argument `input_shape` (tuple of integers, does
        not include the samples axis) when using this layer as the first layer in a
        model.

    Output shape:
        Double channel size as input.

    References:
        ([M. Blot, M. Cord, et N. Thome, « Max-min convolutional neural networks
        for image classification », in 2016 IEEE International Conference on Image
        Processing (ICIP), Phoenix, AZ, USA, 2016, p. 3678‑3682.)

    """
    self.set_klip_factor(k_coef_lip)
    super(MaxMin, self).__init__(**kwargs)
    if data_format == "channels_last":
        self.channel_axis = -1
    elif data_format == "channels_first":
        self.channel_axis = 1
    else:
        raise RuntimeError("data format not understood")
    self.data_format = data_format

PReLUlip ¶

PReLUlip(k_coef_lip=1.0)

PReLU activation, with Lipschitz constraint.

PARAMETER	DESCRIPTION
`k_coef_lip`	lipschitz coefficient to be enforced TYPE: `float` DEFAULT: `1.0`
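The intent of bounding the slope can be pictured on a scalar. This is a simplified sketch that models the weight constraint as a clamp of alpha to [-k_coef_lip, k_coef_lip], which is an assumption about its net effect rather than a description of how the Keras constraint computes it. The function `prelu_lip` is a hypothetical helper.

```python
def prelu_lip(x, alpha, k_coef_lip=1.0):
    """PReLU whose negative slope is kept inside [-k_coef_lip, k_coef_lip]."""
    # Clamping stands in for the weight constraint (assumption of this sketch).
    alpha = max(-k_coef_lip, min(k_coef_lip, alpha))
    return x if x >= 0 else alpha * x


print(prelu_lip(2.0, alpha=3.0))    # positive inputs pass through
print(prelu_lip(-2.0, alpha=3.0))   # slope 3.0 is clamped to 1.0
print(prelu_lip(-2.0, alpha=0.25))  # slope already within bounds
```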

Source code in deel/lip/layers/activations.py

@register_keras_serializable("deel-lip", "PReLUlip")
def PReLUlip(k_coef_lip=1.0):
    """
    PreLu activation, with Lipschitz constraint.

    Args:
        k_coef_lip (float): lipschitz coefficient to be enforced
    """
    return PReLU(
        alpha_constraint=MinMaxNorm(min_value=-k_coef_lip, max_value=k_coef_lip)
    )

base_layer ¶

This module extends the original Keras layers in order to add a k-Lipschitz constraint via reparametrization. The following layers are currently implemented:

* Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
* Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
* AveragePooling: as ScaledAveragePooling
* GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

Condensable ¶

Bases: ABC

Some layers do not optimize the kernel directly, which means that the kernel stored in the layer is not the kernel used to make predictions (called W_bar). To address this, these layers can implement the condense() function, which makes self.kernel equal to W_bar. This operation also makes it possible to turn the Lipschitz layer into its Keras equivalent, e.g. a Dense layer that gives the same predictions as the trained SpectralDense.
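The relationship between the stored kernel and W_bar can be illustrated with a toy example. Everything here is hypothetical and framework-free: `ToyCondensable`, `PlainLinear` and `NormalizedLinear` are stand-ins invented for this sketch, and the normalization used for W_bar is just one simple choice of reparametrization.

```python
import abc
import math


class ToyCondensable(abc.ABC):
    """Hypothetical stand-in for the Condensable interface."""

    @abc.abstractmethod
    def condense(self):
        """Overwrite the stored kernel with the kernel used for predictions."""

    @abc.abstractmethod
    def vanilla_export(self):
        """Return an unconstrained equivalent of the layer."""


class PlainLinear:
    """Unconstrained single-output linear map (plays the plain layer role)."""

    def __init__(self, kernel):
        self.kernel = list(kernel)

    def __call__(self, x):
        return sum(w * v for w, v in zip(self.kernel, x))


class NormalizedLinear(PlainLinear, ToyCondensable):
    """Predicts with w_bar = kernel / ||kernel|| instead of the raw kernel."""

    def w_bar(self):
        norm = math.sqrt(sum(w * w for w in self.kernel))
        return [w / norm for w in self.kernel]

    def __call__(self, x):
        return sum(w * v for w, v in zip(self.w_bar(), x))

    def condense(self):
        # After this call the stored kernel is the one used for predictions.
        self.kernel = self.w_bar()

    def vanilla_export(self):
        self.condense()
        return PlainLinear(self.kernel)


layer = NormalizedLinear([3.0, 4.0])
exported = layer.vanilla_export()
x = [1.0, 2.0]
print(layer(x), exported(x))  # both use the same effective kernel
```

The exported object no longer carries any constraint logic, which is the point of exporting: it can be stored and served like an ordinary layer.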

condense `abstractmethod` ¶

condense()

The condense operation overwrites the kernel and ensures that the other variables remain consistent. Returns: None

Source code in deel/lip/layers/base_layer.py

@abc.abstractmethod
def condense(self):
    """
    The condense operation allows to overwrite the kernel and ensure that other
    variables are still consistent.
    Returns:
        None
    """
    pass

vanilla_export `abstractmethod` ¶

vanilla_export()

This operation converts this layer to its super type, easing storage and serving. Returns: self as super type

Source code in deel/lip/layers/base_layer.py

@abc.abstractmethod
def vanilla_export(self):
    """
    This operation allows to turn this Layer to its super type, easing storage and
    serving.
    Returns:
         self as super type
    """
    pass

LipschitzLayer ¶

Bases: ABC

This class allows the Lipschitz factor of a layer to be set. Lipschitz layers must inherit from this class so that the user can set the Lipschitz factor. Warning: this class only groups functions that are useful when developing new Lipschitz layers; it does not enforce any property of the layer. In particular, inheriting from this class guarantees nothing about the Lipschitz constant.

coef_lip `class-attribute` `instance-attribute` ¶

coef_lip = None

Defines the correction coefficient (i.e. the Lipschitz bound) of the layer; the output of the layer is multiplied by this constant.

k_coef_lip `class-attribute` `instance-attribute` ¶

k_coef_lip = 1.0

variable used to store the lipschitz factor

set_klip_factor ¶

set_klip_factor(klip_factor)

Sets the Lipschitz factor of a layer. Args: klip_factor (float): the Lipschitz factor the user wants to enforce. Returns: None

Source code in deel/lip/layers/base_layer.py

def set_klip_factor(self, klip_factor):
    """
    Allow to set the Lipschitz factor of a layer.
    Args:
        klip_factor (float): the Lipschitz factor the user want to ensure.
    Returns:
        None
    """
    self.k_coef_lip = klip_factor
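The warning attached to LipschitzLayer (storing the factor does not enforce it) can be made concrete with a toy mixin. The classes below are hypothetical illustrations in plain Python and do not reproduce the library's code beyond the `set_klip_factor` setter shown above.

```python
class ToyLipschitzLayer:
    """Hypothetical mixin that only stores the requested Lipschitz factor."""

    k_coef_lip = 1.0

    def set_klip_factor(self, klip_factor):
        self.k_coef_lip = klip_factor


class ScaledIdentity(ToyLipschitzLayer):
    """Identity map scaled by k_coef_lip, hence k_coef_lip-Lipschitz."""

    def __init__(self, k_coef_lip=1.0):
        self.set_klip_factor(k_coef_lip)

    def __call__(self, x):
        return [self.k_coef_lip * v for v in x]


class BrokenLayer(ToyLipschitzLayer):
    """Inherits the mixin but ignores the factor: nothing is enforced."""

    def __call__(self, x):
        return [10.0 * v for v in x]


print(ScaledIdentity(k_coef_lip=2.0)([1.0, -1.0]))  # output scaled by 2
print(BrokenLayer()([1.0, -1.0]))                   # scaled by 10 regardless
```

`BrokenLayer` reports `k_coef_lip = 1.0` while being 10-Lipschitz, which is exactly the situation the warning describes: the layer's own computation has to honor the stored factor.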

convolutional ¶

This module extends the original Keras layers in order to add a k-Lipschitz constraint via reparametrization. The following layers are currently implemented:

* Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
* Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
* AveragePooling: as ScaledAveragePooling
* GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

FrobeniusConv2D ¶

FrobeniusConv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    **kwargs
)

Bases: Conv2D, LipschitzLayer, Condensable

Same as SpectralConv2D but in the case of a single output.
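The idea behind Frobenius normalization in the single-output case can be shown on a plain dot product. This sketch deliberately leaves aside everything that is specific to convolutions and is not the library's implementation. The function `frobenius_normalize` is a hypothetical helper.

```python
import math


def frobenius_normalize(kernel, k_coef_lip=1.0):
    """Scale a flat kernel so that its Frobenius (L2) norm equals k_coef_lip."""
    norm = math.sqrt(sum(w * w for w in kernel))
    return [k_coef_lip * w / norm for w in kernel]


kernel = [3.0, 0.0, 4.0]  # weights feeding one output unit
w_bar = frobenius_normalize(kernel)
# For a single output, |w_bar . x - w_bar . y| <= ||w_bar|| * ||x - y||
# (Cauchy-Schwarz), so a unit-norm kernel gives a 1-Lipschitz dot product.
print(w_bar)
```

With a single output the kernel is a vector, so its largest singular value equals its L2 norm and no iterative estimation is needed.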

Source code in deel/lip/layers/convolutional.py

def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    **kwargs,
):
    if strides not in ((1, 1), [1, 1], 1):
        raise RuntimeError("FrobeniusConv2D does not support strides")
    if dilation_rate not in ((1, 1), [1, 1], 1):
        raise RuntimeError("FrobeniusConv2D does not support dilation rate")
    if padding != "same":
        raise RuntimeError("FrobeniusConv2D only supports padding='same'")
    if not (
        (kernel_constraint is None)
        or isinstance(kernel_constraint, SpectralConstraint)
    ):
        raise RuntimeError(
            "only deellip constraints are allowed as other constraints could break"
            " 1 lipschitz condition"
        )
    super(FrobeniusConv2D, self).__init__(
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs,
    )
    self.set_klip_factor(k_coef_lip)
    self.wbar = None
    self._kwargs = kwargs

SpectralConv2D ¶

SpectralConv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
)

Bases: Conv2D, LipschitzLayer, Condensable

This class is a Conv2D layer constrained such that all singular values of its kernel are 1. The computation is based on the Björck algorithm. As this is not enough to ensure 1-Lipschitzity, a corrective coefficient is applied to the output. The computation is done in three steps:

1. reduce the largest singular value to 1, using iterated power method.
2. increase other singular values to 1, using Bjorck algorithm.
3. divide the output by the Lipschitz bound to ensure k Lipschitzity.
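The first step (reducing the largest singular value to 1) can be sketched with a power iteration on a small matrix. This is a plain-Python illustration, not the library's implementation. The helper names, the fixed start vector and the values of `max_iter` and `eps` are assumptions chosen for the example.

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def spectral_norm(w, max_iter=100, eps=1e-12):
    """Estimate the largest singular value of w by power iteration on w^T w."""
    wt = transpose(w)
    u = [1.0] * len(w[0])  # fixed start vector keeps the sketch deterministic
    sigma = 0.0
    for _ in range(max_iter):
        v = matvec(wt, matvec(w, u))  # one step of (w^T w) u
        norm = math.sqrt(sum(a * a for a in v))
        u = [a / norm for a in v]
        new_sigma = math.sqrt(sum(a * a for a in matvec(w, u)))
        if abs(new_sigma - sigma) < eps:  # stopping criterion
            break
        sigma = new_sigma
    return new_sigma


w = [[3.0, 0.0], [0.0, 1.0]]  # singular values 3 and 1
sigma = spectral_norm(w)
w_normalized = [[a / sigma for a in row] for row in w]
print(sigma)  # close to 3.0
```

After the division, the largest singular value of `w_normalized` is close to 1 while the smaller one is still below 1, which is what the second step then addresses.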

PARAMETER	DESCRIPTION
`filters`	Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
`kernel_size`	An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
`strides`	An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any `dilation_rate` value != 1. DEFAULT: `(1, 1)`
`padding`	one of `"valid"` or `"same"` (case-insensitive). DEFAULT: `'same'`
`data_format`	A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`. It defaults to the `image_data_format` value found in your Keras config file at `~/.keras/keras.json`. If you never set it, then it will be "channels_last". DEFAULT: `None`
`dilation_rate`	an integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any `dilation_rate` value != 1 is incompatible with specifying any stride value != 1. DEFAULT: `(1, 1)`
`activation`	Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: `a(x) = x`). DEFAULT: `None`
`use_bias`	Boolean, whether the layer uses a bias vector. DEFAULT: `True`
`kernel_initializer`	Initializer for the `kernel` weights matrix. DEFAULT: `SpectralInitializer()`
`bias_initializer`	Initializer for the bias vector. DEFAULT: `'zeros'`
`kernel_regularizer`	Regularizer function applied to the `kernel` weights matrix. DEFAULT: `None`
`bias_regularizer`	Regularizer function applied to the bias vector. DEFAULT: `None`
`activity_regularizer`	Regularizer function applied to the output of the layer (its "activation"). DEFAULT: `None`
`kernel_constraint`	Constraint function applied to the kernel matrix. DEFAULT: `None`
`bias_constraint`	Constraint function applied to the bias vector. DEFAULT: `None`
`k_coef_lip`	Lipschitz constant to ensure. DEFAULT: `1.0`
`eps_spectral`	stopping criterion for the iterative power algorithm. DEFAULT: `DEFAULT_EPS_SPECTRAL`
`eps_bjorck`	stopping criterion for the Björck algorithm. DEFAULT: `DEFAULT_EPS_BJORCK`
`beta_bjorck`	beta parameter of the Björck algorithm. DEFAULT: `DEFAULT_BETA_BJORCK`
`maxiter_spectral`	maximum number of iterations for the power iteration. DEFAULT: `DEFAULT_MAXITER_SPECTRAL`
`maxiter_bjorck`	maximum number of iterations for the Björck algorithm. DEFAULT: `DEFAULT_MAXITER_BJORCK`

This documentation reuses the body of the original keras.layers.Conv2D doc.

Source code in deel/lip/layers/convolutional.py

def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs,
):
    """
    This class is a Conv2D layer constrained such that all singular values of its
    kernel are 1. The computation is based on the Björck algorithm. As this is not
    enough to ensure 1-Lipschitzity, a corrective coefficient is applied to the
    output.
    The computation is done in three steps:

    1. reduce the largest singular value to 1, using iterated power method.
    2. increase other singular values to 1, using Bjorck algorithm.
    3. divide the output by the Lipschitz bound to ensure k Lipschitzity.

    Args:
        filters: Integer, the dimensionality of the output space
            (i.e. the number of output filters in the convolution).
        kernel_size: An integer or tuple/list of 2 integers, specifying the
            height and width of the 2D convolution window.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the height and width.
            Can be a single integer to specify the same value for
            all spatial dimensions.
            Specifying any stride value != 1 is incompatible with specifying
            any `dilation_rate` value != 1.
        padding: one of `"valid"` or `"same"` (case-insensitive).
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        dilation_rate: an integer or tuple/list of 2 integers, specifying
            the dilation rate to use for dilated convolution.
            Can be a single integer to specify the same value for
            all spatial dimensions.
            Currently, specifying any `dilation_rate` value != 1 is
            incompatible with specifying any stride value != 1.
        activation: Activation function to use.
            If you don't specify anything, no activation is applied
            (ie. "linear" activation: `a(x) = x`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix.
        bias_initializer: Initializer for the bias vector.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix.
        bias_regularizer: Regularizer function applied to the bias vector.
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation").
        kernel_constraint: Constraint function applied to the kernel matrix.
        bias_constraint: Constraint function applied to the bias vector.
        k_coef_lip: lipschitz constant to ensure
        eps_spectral: stopping criterion for the iterative power algorithm.
        eps_bjorck: stopping criterion for the Björck algorithm.
        beta_bjorck: beta parameter in bjorck algorithm.
        maxiter_spectral: maximum number of iterations for the power iteration.
        maxiter_bjorck: maximum number of iterations for bjorck algorithm.

    This documentation reuses the body of the original keras.layers.Conv2D doc.
    """
    if dilation_rate not in ((1, 1), [1, 1], 1):
        raise RuntimeError("SpectralConv2D does not support dilation rate")
    if padding != "same":
        raise RuntimeError("SpectralConv2D only supports padding='same'")
    super(SpectralConv2D, self).__init__(
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs,
    )
    self._kwargs = kwargs
    self.set_klip_factor(k_coef_lip)
    self.u = None
    self.sig = None
    self.wbar = None
    _check_RKO_params(eps_spectral, eps_bjorck, beta_bjorck)
    self.eps_spectral = eps_spectral
    self.eps_bjorck = eps_bjorck
    self.beta_bjorck = beta_bjorck
    self.maxiter_bjorck = maxiter_bjorck
    self.maxiter_spectral = maxiter_spectral

SpectralConv2DTranspose ¶

SpectralConv2DTranspose(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    output_padding=None,
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
)

Bases: Conv2DTranspose, LipschitzLayer, Condensable

This class is a Conv2DTranspose layer constrained such that all singular values of its kernel are 1. The computation is based on Björck orthogonalization algorithm.

The computation is done in three steps:

1. reduce the largest singular value to 1, using iterated power method.
2. increase other singular values to 1, using Björck algorithm.
3. divide the output by the Lipschitz target K to ensure K-Lipschitzity.
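The second step (increasing the other singular values to 1) can be sketched with a Björck iteration on a small matrix whose largest singular value is already 1. This is a plain-Python illustration under stated assumptions: the update rule w <- (1 + beta) w - beta w w^T w, the value beta = 0.5 and the fixed iteration count `n_iter` are choices made for the example, not a description of the library's defaults.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def bjorck(w, beta=0.5, n_iter=30):
    """Bjorck iteration: w <- (1 + beta) * w - beta * w @ w^T @ w."""
    for _ in range(n_iter):
        wwtw = matmul(matmul(w, transpose(w)), w)
        w = [
            [(1 + beta) * a - beta * b for a, b in zip(row_w, row_c)]
            for row_w, row_c in zip(w, wwtw)
        ]
    return w


# Input already scaled so that its largest singular value is 1 (step 1).
w = [[1.0, 0.0], [0.0, 0.4]]
w_orth = bjorck(w)
gram = matmul(transpose(w_orth), w_orth)  # close to the identity matrix
print(gram)
```

Each iteration maps a singular value s to 1.5 s - 0.5 s^3 when beta = 0.5, so 1 is a fixed point and smaller values are pushed up toward it. This is why the spectral normalization has to come first.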

This documentation reuses the body of the original tf.keras.layers.Conv2DTranspose doc.

PARAMETER	DESCRIPTION
`filters`	Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
`kernel_size`	An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
`strides`	An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. DEFAULT: `(1, 1)`
`padding`	only `"same"` padding is supported in this Lipschitz layer (case-insensitive). DEFAULT: `'same'`
`output_padding`	if set to `None` (default), the output shape is inferred. Only `None` value is supported in this Lipschitz layer. DEFAULT: `None`
`data_format`	A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`. It defaults to the `image_data_format` value found in your Keras config file at `~/.keras/keras.json`. If you never set it, then it will be "channels_last". DEFAULT: `None`
`dilation_rate`	an integer, specifying the dilation rate for all spatial dimensions for dilated convolution. This Lipschitz layer does not support dilation rate != 1. DEFAULT: `(1, 1)`
`activation`	Activation function to use. If you don't specify anything, no activation is applied (see `keras.activations`). DEFAULT: `None`
`use_bias`	Boolean, whether the layer uses a bias vector. DEFAULT: `True`
`kernel_initializer`	Initializer for the `kernel` weights matrix (see `keras.initializers`). Defaults to `SpectralInitializer`. DEFAULT: `SpectralInitializer()`
`bias_initializer`	Initializer for the bias vector (see `keras.initializers`). Defaults to 'zeros'. DEFAULT: `'zeros'`
`kernel_regularizer`	Regularizer function applied to the `kernel` weights matrix (see `keras.regularizers`). DEFAULT: `None`
`bias_regularizer`	Regularizer function applied to the bias vector (see `keras.regularizers`). DEFAULT: `None`
`activity_regularizer`	Regularizer function applied to the output of the layer (its "activation") (see `keras.regularizers`). DEFAULT: `None`
`kernel_constraint`	Constraint function applied to the kernel matrix (see `keras.constraints`). DEFAULT: `None`
`bias_constraint`	Constraint function applied to the bias vector (see `keras.constraints`). DEFAULT: `None`
`k_coef_lip`	Lipschitz constant to ensure DEFAULT: `1.0`
`eps_spectral`	stopping criterion for the iterative power algorithm. DEFAULT: `DEFAULT_EPS_SPECTRAL`
`eps_bjorck`	stopping criterion for the Björck algorithm. DEFAULT: `DEFAULT_EPS_BJORCK`
`beta_bjorck`	beta parameter of the Björck algorithm. DEFAULT: `DEFAULT_BETA_BJORCK`
`maxiter_spectral`	maximum number of iterations for the power iteration. DEFAULT: `DEFAULT_MAXITER_SPECTRAL`
`maxiter_bjorck`	maximum number of iterations for the Björck algorithm. DEFAULT: `DEFAULT_MAXITER_BJORCK`

Source code in deel/lip/layers/convolutional.py

def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    output_padding=None,
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs,
):
    """
    This class is a Conv2DTranspose layer constrained such that all singular values
    of its kernel are 1. The computation is based on Björck orthogonalization
    algorithm.

    The computation is done in three steps:
    1. reduce the largest singular value to 1, using iterated power method.
    2. increase other singular values to 1, using Björck algorithm.
    3. divide the output by the Lipschitz target K to ensure K-Lipschitzity.

    This documentation reuses the body of the original
    `tf.keras.layers.Conv2DTranspose` doc.

    Args:
        filters: Integer, the dimensionality of the output space
            (i.e. the number of output filters in the convolution).
        kernel_size: An integer or tuple/list of 2 integers, specifying the
            height and width of the 2D convolution window.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the height and width.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        padding: only `"same"` padding is supported in this Lipschitz layer
            (case-insensitive).
        output_padding: if set to `None` (default), the output shape is inferred.
            Only `None` value is supported in this Lipschitz layer.
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        dilation_rate: an integer, specifying the dilation rate for all spatial
            dimensions for dilated convolution. This Lipschitz layer does not
            support dilation rate != 1.
        activation: Activation function to use.
            If you don't specify anything, no activation is applied
            (see `keras.activations`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix
            (see `keras.initializers`). Defaults to `SpectralInitializer`.
        bias_initializer: Initializer for the bias vector
            (see `keras.initializers`). Defaults to 'zeros'.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix (see `keras.regularizers`).
        bias_regularizer: Regularizer function applied to the bias vector
            (see `keras.regularizers`).
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation") (see `keras.regularizers`).
        kernel_constraint: Constraint function applied to the kernel matrix
            (see `keras.constraints`).
        bias_constraint: Constraint function applied to the bias vector
            (see `keras.constraints`).
        k_coef_lip: Lipschitz constant to ensure
        eps_spectral: stopping criterion for the iterative power algorithm.
        eps_bjorck: stopping criterion for the Björck algorithm.
        beta_bjorck: beta parameter of the Björck algorithm.
        maxiter_spectral: maximum number of iterations for the power iteration.
        maxiter_bjorck: maximum number of iterations for the Björck algorithm.
    """
    super().__init__(
        filters,
        kernel_size,
        strides,
        padding,
        output_padding,
        data_format,
        dilation_rate,
        activation,
        use_bias,
        kernel_initializer,
        bias_initializer,
        kernel_regularizer,
        bias_regularizer,
        activity_regularizer,
        kernel_constraint,
        bias_constraint,
        **kwargs,
    )

    if self.dilation_rate != (1, 1):
        raise ValueError("SpectralConv2DTranspose does not support dilation rate")
    if self.padding != "same":
        raise ValueError("SpectralConv2DTranspose only supports padding='same'")
    if self.output_padding is not None:
        raise ValueError(
            "SpectralConv2DTranspose only supports output_padding=None"
        )
    self.set_klip_factor(k_coef_lip)
    self.u = None
    self.sig = None
    self.wbar = None
    _check_RKO_params(eps_spectral, eps_bjorck, beta_bjorck)
    self.eps_spectral = eps_spectral
    self.eps_bjorck = eps_bjorck
    self.beta_bjorck = beta_bjorck
    self.maxiter_bjorck = maxiter_bjorck
    self.maxiter_spectral = maxiter_spectral
    self._kwargs = kwargs

dense ¶

This module extends the original Keras layers to add a k-Lipschitz constraint via reparametrization. The following layers are currently implemented:

Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
AveragePooling2D: as ScaledAveragePooling2D
GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

FrobeniusDense ¶

FrobeniusDense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    disjoint_neurons=True,
    k_coef_lip=1.0,
    **kwargs
)

Bases: Dense, LipschitzLayer, Condensable

Identical to, and faster than, a SpectralDense in the case of a single output. In the multi-neuron setting, this layer can be used:

as a classical Frobenius Dense normalization (disjoint_neurons=False)
as a stack of independent 1-Lipschitz neurons (disjoint_neurons=True): each output is 1-Lipschitz, but no orthogonality is enforced between outputs

Warning

The default is disjoint_neurons=True.
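To make the two modes concrete, here is a minimal NumPy sketch; it is not the library's implementation. It assumes a kernel of shape (input_dim, units), so that normalizing along axis 0 (the axis_norm value set when disjoint_neurons=True) rescales each output neuron separately. The helper name frobenius_normalize and the eps guard are hypothetical.

```python
import numpy as np


def frobenius_normalize(kernel, disjoint_neurons=True, eps=1e-12):
    """Rescale a (input_dim, units) kernel (illustrative helper).

    disjoint_neurons=True: each column (one output neuron) gets unit L2 norm.
    disjoint_neurons=False: the whole kernel gets unit Frobenius norm.
    """
    if disjoint_neurons:
        norms = np.linalg.norm(kernel, axis=0, keepdims=True)
    else:
        norms = np.linalg.norm(kernel)
    # eps only guards against division by zero (assumption of this sketch).
    return kernel / (norms + eps)


rng = np.random.default_rng(0)
w = rng.normal(size=(5, 3))

w_disjoint = frobenius_normalize(w, disjoint_neurons=True)
w_global = frobenius_normalize(w, disjoint_neurons=False)

col_norms = np.linalg.norm(w_disjoint, axis=0)    # one norm per output neuron
fro_norm = np.linalg.norm(w_global)               # Frobenius norm of the kernel
spectral_norm_global = np.linalg.norm(w_global, 2)
```

Since the spectral norm never exceeds the Frobenius norm, the globally normalized kernel is at most 1-Lipschitz as a whole, whereas the column-wise variant only bounds each output individually.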

Source code in deel/lip/layers/dense.py

def __init__(
    self,
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    disjoint_neurons=True,
    k_coef_lip=1.0,
    **kwargs
):
    super().__init__(
        units=units,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs
    )
    self.set_klip_factor(k_coef_lip)
    self.disjoint_neurons = disjoint_neurons
    self.axis_norm = None
    self.wbar = None
    if self.disjoint_neurons:
        self.axis_norm = 0
    self._kwargs = kwargs

SpectralDense ¶

SpectralDense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
)

Bases: Dense, LipschitzLayer, Condensable

This class is a Dense layer constrained such that all singular values of its kernel are 1. The computation is based on the Björck algorithm and is done in two steps:

reduce the largest singular value to 1, using the iterated power method.
increase the other singular values to 1, using the Björck algorithm.
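The two steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code used by SpectralDense: it runs a fixed number of iterations instead of the eps_spectral and eps_bjorck stopping criteria, it uses the first-order Björck update with an assumed beta of 0.5, and the function names are hypothetical.

```python
import numpy as np


def spectral_normalize(w, n_iter=100, seed=0):
    """Step 1: divide w by its largest singular value (power iteration)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=w.shape[1])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        v = w @ u
        v /= np.linalg.norm(v)
        u = w.T @ v
        u /= np.linalg.norm(u)
    sigma = v @ w @ u  # estimate of the largest singular value
    return w / sigma


def bjorck_orthonormalize(w, beta=0.5, n_iter=50):
    """Step 2: push the remaining singular values towards 1."""
    for _ in range(n_iter):
        w = (1 + beta) * w - beta * w @ w.T @ w
    return w


rng = np.random.default_rng(42)
kernel = rng.normal(size=(6, 4))

w_bar = bjorck_orthonormalize(spectral_normalize(kernel))
singular_values = np.linalg.svd(w_bar, compute_uv=False)
```

Step 1 matters for step 2: the Björck update only converges when the singular values are small enough, which the spectral normalization guarantees by bringing them into (0, 1] up to the accuracy of the power iteration.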

PARAMETER	DESCRIPTION
`units`	Positive integer, dimensionality of the output space.
`activation`	Activation function to use. If you don't specify anything, no activation is applied (ie. "linear" activation: `a(x) = x`). DEFAULT: `None`
`use_bias`	Boolean, whether the layer uses a bias vector. DEFAULT: `True`
`kernel_initializer`	Initializer for the `kernel` weights matrix. DEFAULT: `SpectralInitializer()`
`bias_initializer`	Initializer for the bias vector. DEFAULT: `'zeros'`
`kernel_regularizer`	Regularizer function applied to the `kernel` weights matrix. DEFAULT: `None`
`bias_regularizer`	Regularizer function applied to the bias vector. DEFAULT: `None`
`activity_regularizer`	Regularizer function applied to the output of the layer (its "activation"). DEFAULT: `None`
`kernel_constraint`	Constraint function applied to the `kernel` weights matrix. DEFAULT: `None`
`bias_constraint`	Constraint function applied to the bias vector. DEFAULT: `None`
`k_coef_lip`	Lipschitz constant to enforce. DEFAULT: `1.0`
`eps_spectral`	stopping criterion for the iterative power algorithm. DEFAULT: `DEFAULT_EPS_SPECTRAL`
`eps_bjorck`	stopping criterion for the Björck algorithm. DEFAULT: `DEFAULT_EPS_BJORCK`
`beta_bjorck`	beta parameter of the Björck algorithm. DEFAULT: `DEFAULT_BETA_BJORCK`
`maxiter_spectral`	maximum number of iterations for the power iteration. DEFAULT: `DEFAULT_MAXITER_SPECTRAL`
`maxiter_bjorck`	maximum number of iterations for the Björck algorithm. DEFAULT: `DEFAULT_MAXITER_BJORCK`

Input shape

N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

Output shape

N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).

This documentation reuses the body of the original keras.layers.Dense doc.

Source code in deel/lip/layers/dense.py

def __init__(
    self,
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=SpectralInitializer(),
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    k_coef_lip=1.0,
    eps_spectral=DEFAULT_EPS_SPECTRAL,
    eps_bjorck=DEFAULT_EPS_BJORCK,
    beta_bjorck=DEFAULT_BETA_BJORCK,
    maxiter_spectral=DEFAULT_MAXITER_SPECTRAL,
    maxiter_bjorck=DEFAULT_MAXITER_BJORCK,
    **kwargs
):
    """
    This class is a Dense layer constrained such that all singular values of its
    kernel are 1. The computation is based on the Björck algorithm and is done
    in two steps:

    1. reduce the largest singular value to 1, using the iterated power method.
    2. increase the other singular values to 1, using the Björck algorithm.

    Args:
        units: Positive integer, dimensionality of the output space.
        activation: Activation function to use.
            If you don't specify anything, no activation is applied
            (ie. "linear" activation: `a(x) = x`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix.
        bias_initializer: Initializer for the bias vector.
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix.
        bias_regularizer: Regularizer function applied to the bias vector.
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation").
        kernel_constraint: Constraint function applied to
            the `kernel` weights matrix.
        bias_constraint: Constraint function applied to the bias vector.
        k_coef_lip: Lipschitz constant to enforce.
        eps_spectral: stopping criterion for the iterative power algorithm.
        eps_bjorck: stopping criterion for the Björck algorithm.
        beta_bjorck: beta parameter of the Björck algorithm.
        maxiter_spectral: maximum number of iterations for the power iteration.
        maxiter_bjorck: maximum number of iterations for the Björck algorithm.

    Input shape:
        N-D tensor with shape: `(batch_size, ..., input_dim)`.
        The most common situation would be
        a 2D input with shape `(batch_size, input_dim)`.

    Output shape:
        N-D tensor with shape: `(batch_size, ..., units)`.
        For instance, for a 2D input with shape `(batch_size, input_dim)`,
        the output would have shape `(batch_size, units)`.

    This documentation reuses the body of the original keras.layers.Dense doc.
    """
    super(SpectralDense, self).__init__(
        units=units,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs
    )
    self._kwargs = kwargs
    self.set_klip_factor(k_coef_lip)
    _check_RKO_params(eps_spectral, eps_bjorck, beta_bjorck)
    self.eps_spectral = eps_spectral
    self.eps_bjorck = eps_bjorck
    self.beta_bjorck = beta_bjorck
    self.maxiter_bjorck = maxiter_bjorck
    self.maxiter_spectral = maxiter_spectral
    self.u = None
    self.sig = None
    self.wbar = None
    self.built = False

pooling ¶

This module extends the original Keras layers to add a k-Lipschitz constraint via reparametrization. The following layers are currently implemented:

Dense layer: as SpectralDense (and as FrobeniusDense when the layer has a single output)
Conv2D layer: as SpectralConv2D (and as FrobeniusConv2D when the layer has a single output)
AveragePooling2D: as ScaledAveragePooling2D
GlobalAveragePooling2D: as ScaledGlobalAveragePooling2D

By default the layers are 1-Lipschitz almost everywhere, which is efficient for Wasserstein distance estimation. However, for other problems (such as adversarial robustness) the user may want layers that are at most 1-Lipschitz; this can be done by setting the parameter eps_bjorck=None.

InvertibleDownSampling ¶

InvertibleDownSampling(
    pool_size,
    data_format="channels_last",
    name=None,
    dtype=None,
    **kwargs
)

Bases: Layer

This pooling layer performs a reshape on the spatial dimensions: it takes a (bs, h, w, c) tensor (if channels_last) and reshapes it to (bs, h/p_h, w/p_w, c*p_w*p_h), where p_w and p_h are the shape of the pool. This reduces the image size while increasing the number of channels.

References

Anil et al. paper (https://arxiv.org/abs/1911.00937)

Note

The image shape must be divisible by the pool shape.
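The reshape described above can be sketched with NumPy for the channels_last case. This is only an illustration: the function name is hypothetical, and the order in which the pool positions are packed into the channel axis is an assumption of this sketch (the library may order them differently).

```python
import numpy as np


def invertible_downsample(x, pool_size):
    """Move each (p_h, p_w) spatial block of a (bs, h, w, c) array to channels."""
    p_h, p_w = pool_size
    bs, h, w, c = x.shape
    # Split each spatial axis into (number of blocks, position inside the block).
    x = x.reshape(bs, h // p_h, p_h, w // p_w, p_w, c)
    # Bring the two "position inside the block" axes next to the channel axis.
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(bs, h // p_h, w // p_w, p_h * p_w * c)


x = np.arange(2 * 4 * 6 * 3, dtype=float).reshape(2, 4, 6, 3)
y = invertible_downsample(x, pool_size=(2, 2))
# y has shape (2, 2, 3, 12): half the height and width, four times the channels.
```

No value is dropped or combined, only moved, which is why the operation is invertible and leaves the L2 norm of the input unchanged.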

PARAMETER	DESCRIPTION
`pool_size`	tuple describing the pool shape
`data_format`	can either be `channels_last` or `channels_first` DEFAULT: `'channels_last'`
`name`	name of the layer DEFAULT: `None`
`dtype`	dtype of the layer DEFAULT: `None`
`**kwargs`	params passed to the Layers constructor DEFAULT: `{}`

Source code in deel/lip/layers/pooling.py

def __init__(
    self, pool_size, data_format="channels_last", name=None, dtype=None, **kwargs
):
    """

    This pooling layer performs a reshape on the spatial dimensions: it takes a
    (bs, h, w, c) tensor (if channels_last) and reshapes it to
    (bs, h/p_h, w/p_w, c*p_w*p_h), where p_w and p_h are the shape of the pool.
    This reduces the image size while increasing the number of channels.

    References:
        Anil et al. [paper](https://arxiv.org/abs/1911.00937)

    Note:
        The image shape must be divisible by the pool shape.

    Args:
        pool_size: tuple describing the pool shape
        data_format: can either be `channels_last` or `channels_first`
        name: name of the layer
        dtype: dtype of the layer
        **kwargs: params passed to the Layers constructor
    """
    super(InvertibleDownSampling, self).__init__(name=name, dtype=dtype, **kwargs)
    self.pool_size = pool_size
    self.data_format = data_format

InvertibleUpSampling ¶

InvertibleUpSampling(
    pool_size,
    data_format="channels_last",
    name=None,
    dtype=None,
    **kwargs
)

Bases: Layer

This layer is the inverse of the InvertibleDownSampling layer. It takes a (bs, h, w, c) tensor (if channels_last) and reshapes it to (bs, h*p_h, w*p_w, c/(p_w*p_h)), where p_w and p_h are the shape of the pool. This increases the image size while reducing the number of channels.

References

Anil et al. paper (https://arxiv.org/abs/1911.00937)

Note

The number of input channels must be divisible by p_w*p_h.
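A NumPy sketch of this channel-to-space reshape for channels_last inputs is shown here. The function name is hypothetical and the channel layout (pool row, then pool column, then original channel) is an assumption of this sketch, not a documented property of the layer.

```python
import numpy as np


def invertible_upsample(y, pool_size):
    """Spread groups of p_h * p_w channels of a (bs, h, w, c) array over space."""
    p_h, p_w = pool_size
    bs, h, w, c = y.shape
    c_out = c // (p_h * p_w)
    # Split the channel axis into (pool row, pool column, output channel).
    y = y.reshape(bs, h, w, p_h, p_w, c_out)
    # Interleave the pool axes with the spatial axes they belong to.
    y = y.transpose(0, 1, 3, 2, 4, 5)
    return y.reshape(bs, h * p_h, w * p_w, c_out)


y = np.arange(2 * 2 * 3 * 12, dtype=float).reshape(2, 2, 3, 12)
x = invertible_upsample(y, pool_size=(2, 2))
# x has shape (2, 4, 6, 3): twice the height and width, a quarter of the channels.

# Index mapping used by this sketch:
# x[b, i * p_h + a, j * p_w + d, ch] == y[b, i, j, (a * p_w + d) * c_out + ch]
sample_out = x[1, 3, 4, 2]   # i=1, a=1, j=2, d=0, ch=2
sample_in = y[1, 1, 2, 8]    # (1 * 2 + 0) * 3 + 2 == 8
```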

PARAMETER	DESCRIPTION
`pool_size`	tuple describing the pool shape (p_h, p_w)
`data_format`	can either be `channels_last` or `channels_first` DEFAULT: `'channels_last'`
`name`	name of the layer DEFAULT: `None`
`dtype`	dtype of the layer DEFAULT: `None`
`**kwargs`	params passed to the Layers constructor DEFAULT: `{}`

Source code in deel/lip/layers/pooling.py

def __init__(
    self, pool_size, data_format="channels_last", name=None, dtype=None, **kwargs
):
    """

    This layer is the inverse of the InvertibleDownSampling layer. It takes a
    (bs, h, w, c) tensor (if channels_last) and reshapes it to
    (bs, h*p_h, w*p_w, c/(p_w*p_h)), where p_w and p_h are the shape of the
    pool. This increases the image size while reducing the number of
    channels.

    References:
        Anil et al. [paper](https://arxiv.org/abs/1911.00937)

    Note:
        The number of input channels must be divisible by `p_w*p_h`.


    Args:
        pool_size: tuple describing the pool shape (p_h, p_w)
        data_format: can either be `channels_last` or `channels_first`
        name: name of the layer
        dtype: dtype of the layer
        **kwargs: params passed to the Layers constructor
    """
    super(InvertibleUpSampling, self).__init__(name=name, dtype=dtype, **kwargs)
    self.pool_size = pool_size
    self.data_format = data_format

ScaledAveragePooling2D ¶

ScaledAveragePooling2D(
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    **kwargs
)

Bases: AveragePooling2D, LipschitzLayer

Average pooling operation for spatial data, but with a lipschitz bound.

PARAMETER	DESCRIPTION
`pool_size`	integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). `(2, 2)` will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions. DEFAULT: `(2, 2)`
`strides`	Integer, tuple of 2 integers, or None. Strides values. If None, it will default to `pool_size`. DEFAULT: `None`
`padding`	One of `"valid"` or `"same"` (case-insensitive). DEFAULT: `'valid'`
`data_format`	A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`. It defaults to the `image_data_format` value found in your Keras config file at `~/.keras/keras.json`. If you never set it, then it will be "channels_last". DEFAULT: `None`
`k_coef_lip`	the lipschitz factor to ensure DEFAULT: `1.0`

Input shape

If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).

Output shape

If data_format='channels_last': 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).
If data_format='channels_first': 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols).

This documentation reuses the body of the original keras.layers.AveragePooling2D doc.
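To illustrate what a Lipschitz bound means for average pooling, the sketch below rescales a plain non-overlapping average pooling by sqrt(p_h * p_w). That factor is an assumption derived from the inequality |mean(d)| <= ||d|| / sqrt(n) for a window of n values; the page does not state the factor the layer uses, and the function name is hypothetical. Only channels_last inputs are handled.

```python
import numpy as np


def scaled_average_pool(x, pool_size=(2, 2), k_coef_lip=1.0):
    """Average pooling over non-overlapping windows, rescaled to be k-Lipschitz."""
    p_h, p_w = pool_size
    bs, h, w, c = x.shape
    windows = x.reshape(bs, h // p_h, p_h, w // p_w, p_w, c)
    mean = windows.mean(axis=(2, 4))
    # Assumed correction factor: the plain mean is only 1/sqrt(n)-Lipschitz.
    return k_coef_lip * np.sqrt(p_h * p_w) * mean


rng = np.random.default_rng(0)
x1 = rng.normal(size=(3, 4, 4, 2))
x2 = rng.normal(size=(3, 4, 4, 2))

# Empirical Lipschitz ratio on a random pair: should not exceed 1.
ratio = np.linalg.norm(
    scaled_average_pool(x1) - scaled_average_pool(x2)
) / np.linalg.norm(x1 - x2)

# The bound is tight for inputs that are constant inside each window.
ones = np.ones((3, 4, 4, 2))
pooled_ones = scaled_average_pool(ones)
```

The constructor shown in the source listing requires strides equal to pool_size and padding="valid", which matches the non-overlapping windows assumed here.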

Source code in deel/lip/layers/pooling.py

def __init__(
    self,
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    **kwargs,
):
    """
    Average pooling operation for spatial data, but with a lipschitz bound.

    Arguments:
        pool_size: integer or tuple of 2 integers,
            factors by which to downscale (vertical, horizontal).
            `(2, 2)` will halve the input in both spatial dimension.
            If only one integer is specified, the same window length
            will be used for both dimensions.
        strides: Integer, tuple of 2 integers, or None.
            Strides values.
            If None, it will default to `pool_size`.
        padding: One of `"valid"` or `"same"` (case-insensitive).
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        k_coef_lip: the lipschitz factor to ensure

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, pooled_rows, pooled_cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, pooled_rows, pooled_cols)`.

    This documentation reuses the body of the original
    keras.layers.AveragePooling2D doc.
    """
    if not ((strides == pool_size) or (strides is None)):
        raise RuntimeError("stride must be equal to pool_size")
    if padding != "valid":
        raise RuntimeError("ScaledAveragePooling2D only supports padding='valid'")
    super(ScaledAveragePooling2D, self).__init__(
        pool_size=pool_size,
        strides=pool_size,
        padding=padding,
        data_format=data_format,
        **kwargs,
    )
    self.set_klip_factor(k_coef_lip)
    self._kwargs = kwargs

ScaledGlobalAveragePooling2D ¶

ScaledGlobalAveragePooling2D(
    data_format=None, k_coef_lip=1.0, **kwargs
)

Bases: GlobalAveragePooling2D, LipschitzLayer

Global average pooling operation for spatial data with Lipschitz bound.

PARAMETER	DESCRIPTION
`data_format`	A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`. It defaults to the `image_data_format` value found in your Keras config file at `~/.keras/keras.json`. If you never set it, then it will be "channels_last". DEFAULT: `None`

Input shape

If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).

Output shape: 2D tensor with shape (batch_size, channels).

This documentation reuses the body of the original keras.layers.GlobalAveragePooling2D doc.

Source code in deel/lip/layers/pooling.py

def __init__(self, data_format=None, k_coef_lip=1.0, **kwargs):
    """Global average pooling operation for spatial data with Lipschitz bound.

    Arguments:
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
    2D tensor with shape `(batch_size, channels)`.

    This documentation reuses the body of the original
    keras.layers.GlobalAveragePooling2D doc.
    """
    super(ScaledGlobalAveragePooling2D, self).__init__(
        data_format=data_format, **kwargs
    )
    self.set_klip_factor(k_coef_lip)
    self._kwargs = kwargs

ScaledGlobalL2NormPooling2D ¶

ScaledGlobalL2NormPooling2D(
    data_format=None,
    k_coef_lip=1.0,
    eps_grad_sqrt=1e-06,
    **kwargs
)

Bases: GlobalAveragePooling2D, LipschitzLayer

Average pooling operation for spatial data, with a lipschitz bound. This pooling operation is norm preserving (aka gradient=1 almost everywhere).

[1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of Feature Pooling in Visual Recognition", p. 8.

PARAMETER	DESCRIPTION
`data_format`	A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`. It defaults to the `image_data_format` value found in your Keras config file at `~/.keras/keras.json`. If you never set it, then it will be "channels_last". DEFAULT: `None`
`k_coef_lip`	the lipschitz factor to ensure DEFAULT: `1.0`
`eps_grad_sqrt`	Epsilon value to avoid numerical instability due to non-defined gradient at 0 in the sqrt function DEFAULT: `1e-06`

Input shape

If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).

Output shape

If data_format='channels_last': 2D tensor with shape (batch_size, channels).
If data_format='channels_first': 2D tensor with shape (batch_size, channels).
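The norm-preserving behaviour can be checked with a small NumPy sketch. It reduces over the same spatial axes as the constructor in the source listing ([1, 2] for channels_last, [2, 3] otherwise); the function name and the exact place where eps_grad_sqrt enters the formula are assumptions of this sketch.

```python
import numpy as np


def global_l2_norm_pool(
    x, data_format="channels_last", k_coef_lip=1.0, eps_grad_sqrt=1e-6
):
    """L2 norm over the spatial axes, one value per sample and channel."""
    axes = (1, 2) if data_format == "channels_last" else (2, 3)
    return k_coef_lip * np.sqrt(eps_grad_sqrt + np.sum(np.square(x), axis=axes))


rng = np.random.default_rng(0)
x_last = rng.normal(size=(2, 4, 5, 3))       # (batch, rows, cols, channels)
x_first = x_last.transpose(0, 3, 1, 2)       # (batch, channels, rows, cols)

out_last = global_l2_norm_pool(x_last, "channels_last", eps_grad_sqrt=0.0)
out_first = global_l2_norm_pool(x_first, "channels_first", eps_grad_sqrt=0.0)
# Both outputs have shape (2, 3), i.e. (batch_size, channels).
```

With eps_grad_sqrt set to 0, the L2 norm of the output equals the L2 norm of the input, which is what "norm preserving" refers to.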

Source code in deel/lip/layers/pooling.py

def __init__(self, data_format=None, k_coef_lip=1.0, eps_grad_sqrt=1e-6, **kwargs):
    """
    Average pooling operation for spatial data, with a lipschitz bound. This
    pooling operation is norm preserving (aka gradient=1 almost everywhere).

    [1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of
    Feature Pooling in Visual Recognition", p. 8.

    Arguments:
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        k_coef_lip: the lipschitz factor to ensure
        eps_grad_sqrt: Epsilon value to avoid numerical instability
            due to non-defined gradient at 0 in the sqrt function

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
        - If `data_format='channels_last'`:
            2D tensor with shape `(batch_size, channels)`.
        - If `data_format='channels_first'`:
            2D tensor with shape `(batch_size, channels)`.
    """
    if eps_grad_sqrt < 0.0:
        raise RuntimeError("eps_grad_sqrt must be positive")
    super(ScaledGlobalL2NormPooling2D, self).__init__(
        data_format=data_format, **kwargs
    )
    self.set_klip_factor(k_coef_lip)
    self.eps_grad_sqrt = eps_grad_sqrt
    self._kwargs = kwargs
    if self.data_format == "channels_last":
        self.axes = [1, 2]
    else:
        self.axes = [2, 3]

ScaledL2NormPooling2D ¶

ScaledL2NormPooling2D(
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    eps_grad_sqrt=1e-06,
    **kwargs
)

Bases: AveragePooling2D, LipschitzLayer

Average pooling operation for spatial data, with a lipschitz bound. This pooling operation is norm preserving (aka gradient=1 almost everywhere).

[1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of Feature Pooling in Visual Recognition", p. 8.

PARAMETER	DESCRIPTION
`pool_size`	integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). `(2, 2)` will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions. DEFAULT: `(2, 2)`
`strides`	Integer, tuple of 2 integers, or None. Strides values. If None, it will default to `pool_size`. DEFAULT: `None`
`padding`	One of `"valid"` or `"same"` (case-insensitive). DEFAULT: `'valid'`
`data_format`	A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`. It defaults to the `image_data_format` value found in your Keras config file at `~/.keras/keras.json`. If you never set it, then it will be "channels_last". DEFAULT: `None`
`k_coef_lip`	the lipschitz factor to ensure DEFAULT: `1.0`
`eps_grad_sqrt`	Epsilon value to avoid numerical instability due to non-defined gradient at 0 in the sqrt function DEFAULT: `1e-06`

Input shape

If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels).
If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols).

Output shape

If data_format='channels_last': 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).
If data_format='channels_first': 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols).
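The following NumPy sketch shows a per-window L2 norm pooling for channels_last inputs and why a small eps_grad_sqrt helps: the gradient of the square root is undefined for an all-zero window. The function names are hypothetical, and adding the epsilon inside the square root is an assumption of this sketch; the layer may handle it differently.

```python
import numpy as np


def l2_norm_pool(x, pool_size=(2, 2), k_coef_lip=1.0, eps_grad_sqrt=1e-6):
    """L2 norm of each non-overlapping window of a (bs, h, w, c) array."""
    p_h, p_w = pool_size
    bs, h, w, c = x.shape
    windows = x.reshape(bs, h // p_h, p_h, w // p_w, p_w, c)
    sq_sum = np.sum(np.square(windows), axis=(2, 4))
    return k_coef_lip * np.sqrt(eps_grad_sqrt + sq_sum)


def l2_norm_grad(window, eps_grad_sqrt):
    """Analytic gradient of sqrt(eps + sum(window ** 2)) w.r.t. the window."""
    with np.errstate(invalid="ignore", divide="ignore"):
        return window / np.sqrt(eps_grad_sqrt + np.sum(np.square(window)))


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 6, 3))
y = l2_norm_pool(x, eps_grad_sqrt=0.0)   # shape (2, 2, 3, 3), same L2 norm as x

zero_window = np.zeros((2, 2))
grad_no_eps = l2_norm_grad(zero_window, 0.0)      # 0 / 0: undefined (NaN)
grad_with_eps = l2_norm_grad(zero_window, 1e-6)   # finite: all zeros
```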

Source code in deel/lip/layers/pooling.py

def __init__(
    self,
    pool_size=(2, 2),
    strides=None,
    padding="valid",
    data_format=None,
    k_coef_lip=1.0,
    eps_grad_sqrt=1e-6,
    **kwargs,
):
    """
    Average pooling operation for spatial data, with a lipschitz bound. This
    pooling operation is norm preserving (aka gradient=1 almost everywhere).

    [1] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A Theoretical Analysis of
    Feature Pooling in Visual Recognition", p. 8.

    Arguments:
        pool_size: integer or tuple of 2 integers,
            factors by which to downscale (vertical, horizontal).
            `(2, 2)` will halve the input in both spatial dimension.
            If only one integer is specified, the same window length
            will be used for both dimensions.
        strides: Integer, tuple of 2 integers, or None.
            Strides values.
            If None, it will default to `pool_size`.
        padding: One of `"valid"` or `"same"` (case-insensitive).
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        k_coef_lip: the lipschitz factor to ensure
        eps_grad_sqrt: Epsilon value to avoid numerical instability
            due to non-defined gradient at 0 in the sqrt function

    Input shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, rows, cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, rows, cols)`.

    Output shape:
        - If `data_format='channels_last'`:
            4D tensor with shape `(batch_size, pooled_rows, pooled_cols, channels)`.
        - If `data_format='channels_first'`:
            4D tensor with shape `(batch_size, channels, pooled_rows, pooled_cols)`.
    """
    if not ((strides == pool_size) or (strides is None)):
        raise RuntimeError("stride must be equal to pool_size")
    if padding != "valid":
        raise RuntimeError("ScaledL2NormPooling2D only supports padding='valid'")
    if eps_grad_sqrt < 0.0:
        raise RuntimeError("eps_grad_sqrt must be positive")
    super(ScaledL2NormPooling2D, self).__init__(
        pool_size=pool_size,
        strides=pool_size,
        padding=padding,
        data_format=data_format,
        **kwargs,
    )
    self.set_klip_factor(k_coef_lip)
    self.eps_grad_sqrt = eps_grad_sqrt
    self._kwargs = kwargs

unconstrained ¶

This module contains custom Keras unconstrained layers.

Compared to other files in layers folder, the layers defined here are not Lipschitz-constrained. They are base classes for more advanced layers. Do not use these layers as is, since they are not Lipschitz constrained.

PadConv2D ¶

PadConv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)

Bases: Conv2D, Condensable

This class is a Conv2D layer with parameterized padding. Since the Conv2D layer only supports "same" and "valid" padding, this layer enables other types of padding, such as "constant", "symmetric", "reflect" or "circular".

Warning

The PadConv2D is not a Lipschitz layer and must not be directly used. This must be used as a base class to create a Lipschitz layer with padding.

All arguments are the same as the original Conv2D except the padding which is defined as following:

PARAMETER	DESCRIPTION
`padding`	one of `"same"`, `"valid"`, `"constant"`, `"symmetric"`, `"reflect"` or `"circular"` (case-insensitive). DEFAULT: `'same'`
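
To make the four extra padding modes concrete, here is a small pure-Python sketch on a 1-D sequence. The helper `pad_1d` is hypothetical and not part of deel.lip; it assumes that "constant" pads with zeros, that "reflect" and "symmetric" follow the usual mirror conventions (without and with the edge value repeated, respectively), and that "circular" means wrap-around. The real layer pads the two spatial axes of a 4D tensor.

```python
def pad_1d(values, pad, mode="constant"):
    """Pad a 1-D sequence with `pad` elements on each side (hypothetical helper)."""
    values = list(values)
    n = len(values)
    mode = mode.lower()  # mode names are treated case-insensitively
    if pad == 0:
        return values
    if mode == "constant":
        # Assumption: the constant fill value is zero.
        left, right = [0] * pad, [0] * pad
    elif mode == "reflect":
        # Mirror around the edge without repeating the edge value.
        left = [values[i] for i in range(pad, 0, -1)]
        right = [values[n - 1 - i] for i in range(1, pad + 1)]
    elif mode == "symmetric":
        # Mirror around the edge, repeating the edge value.
        left = [values[i] for i in range(pad - 1, -1, -1)]
        right = [values[n - 1 - i] for i in range(pad)]
    elif mode == "circular":
        # Wrap around: the left pad comes from the end, the right pad from the start.
        left, right = values[n - pad:], values[:pad]
    else:
        raise ValueError(f"unsupported padding mode: {mode!r}")
    return left + values + right


row = [1, 2, 3, 4]
padded = {m: pad_1d(row, 2, m) for m in ("constant", "reflect", "symmetric", "circular")}
```

With `row = [1, 2, 3, 4]` and a pad of 2, "reflect" yields `[3, 2, 1, 2, 3, 4, 3, 2]`, "symmetric" yields `[2, 1, 1, 2, 3, 4, 4, 3]`, and "circular" yields `[3, 4, 1, 2, 3, 4, 1, 2]`.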

Source code in deel/lip/layers/unconstrained.py

def __init__(
    self,
    filters,
    kernel_size,
    strides=(1, 1),
    padding="same",
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
):
    """
    This class is a Conv2D layer with parameterized padding.
    Since the Conv2D layer only supports `"same"` and `"valid"` padding, this layer
    enables other types of padding, such as `"constant"`, `"symmetric"`, `"reflect"`
    or `"circular"`.

    Warning:
        PadConv2D is not a Lipschitz layer and must not be used directly. It is
        meant to be used as a base class to create a Lipschitz layer with padding.

    All arguments are the same as in the original `Conv2D`, except `padding`,
    which is defined as follows:

    Args:
        padding: one of `"same"`, `"valid"`, `"constant"`, `"symmetric"`,
            `"reflect"` or `"circular"` (case-insensitive).
    """
    self.pad = lambda x: x
    self.old_padding = padding
    self.internal_input_shape = None
    if padding.lower() != "same":  # same is directly processed in Conv2D
        padding = "valid"
    super(PadConv2D, self).__init__(
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        kernel_initializer=kernel_initializer,
        bias_initializer=bias_initializer,
        kernel_regularizer=kernel_regularizer,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        kernel_constraint=kernel_constraint,
        bias_constraint=bias_constraint,
        **kwargs
    )
    self._kwargs = kwargs
    if self.old_padding.lower() in ["same", "valid"]:
        self.pad = lambda x: x
        self.padding_size = [0, 0]
    if self.old_padding.lower() in ["constant", "reflect", "symmetric"]:
        self.padding_size = [self.kernel_size[0] // 2, self.kernel_size[1] // 2]
        paddings = [
            [0, 0],
            [self.padding_size[0], self.padding_size[0]],
            [self.padding_size[1], self.padding_size[1]],
            [0, 0],
        ]
        self.pad = lambda t: tf.pad(t, paddings, self.old_padding)
    if self.old_padding.lower() == "circular":
        self.padding_size = [self.kernel_size[0] // 2, self.kernel_size[1] // 2]
        self.pad = lambda t: _padding_circular(t, self.padding_size)
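
The constructor above pads each spatial side by `kernel_size // 2` and then runs the convolution with "valid" padding. The sketch below works out the resulting output size along one axis, assuming a stride of 1 and a dilation rate of 1. The function name `padded_valid_output_size` is hypothetical and only illustrates the arithmetic.

```python
def padded_valid_output_size(input_size, kernel_size):
    """Output size along one axis after padding by kernel_size // 2 on each
    side and applying a "valid" convolution (assumes stride 1, dilation 1)."""
    pad = kernel_size // 2
    padded_size = input_size + 2 * pad
    # A "valid" convolution with stride 1 yields padded_size - kernel_size + 1 outputs.
    return padded_size - kernel_size + 1


size_k3 = padded_valid_output_size(32, 3)  # odd kernel: size is preserved
size_k5 = padded_valid_output_size(32, 5)  # odd kernel: size is preserved
size_k4 = padded_valid_output_size(32, 4)  # even kernel: one extra output
```

Under these assumptions, odd kernel sizes preserve the spatial size, while an even kernel size produces an output that is one element larger than the input.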
