Convolutions¶
AdaptiveOrthoConv2d(in_channels, out_channels, kernel_size, stride=1, padding='same', dilation=1, groups=1, bias=True, padding_mode='circular', ortho_params=OrthoParams())
¶
Factory function to create an orthogonal convolutional layer, selecting the appropriate class based on kernel size and stride. This is the implementation of the Adaptive Orthogonal Convolution scheme [1]. It aims to be scalable to large networks and large image sizes while enforcing orthogonality in the convolutional layers. This layer is also intended to be compatible with all the features of the `nn.Conv2d` class (e.g., striding, dilation, grouping, etc.). This method has an explicit kernel, which means that the forward operation is equivalent to a standard convolutional layer, but the weights are constrained to be orthogonal.
Key Features:¶
- Enforces orthogonality, preserving gradient norms.
- Supports native striding, dilation, grouped convolutions, and flexible padding.
Behavior:¶
- When kernel_size == stride, the layer is an `RKOConv2d`.
- When stride == 1, the layer is a `FastBlockConv2d`.
- Otherwise, the layer is a `BcopRkoConv2d`.
Note
- This implementation also works under zero padding: its Lipschitz constant remains tight, but orthogonality is lost on the image border.
- Unit testing validated a tolerance of 1e-4 under various orthogonalization schemes (see reparametrizers); only Cholesky-based methods required a looser tolerance of 5e-2.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`in_channels` | `int` | Number of input channels. | required |
`out_channels` | `int` | Number of output channels. | required |
`kernel_size` | `_size_2_t` | Size of the convolution kernel. | required |
`stride` | `_size_2_t` | Stride of the convolution. | `1` |
`padding` | `str` or `_size_2_t` | Padding mode or size. | `'same'` |
`dilation` | `_size_2_t` | Dilation rate. | `1` |
`groups` | `int` | Number of blocked connections from input to output channels. | `1` |
`bias` | `bool` | Whether to include a learnable bias. | `True` |
`padding_mode` | `str` | Padding mode. | `'circular'` |
`ortho_params` | `OrthoParams` | Parameters to control orthogonality. | `OrthoParams()` |
Returns:

Type | Description |
---|---|
`Conv2d` | A configured instance of `RKOConv2d`, `FastBlockConv2d`, or `BcopRkoConv2d` (see Behavior above). |
Raises:

Type | Description |
---|---|
`ValueError` | If `kernel_size < stride`, as orthogonality cannot be enforced. |
References
- [1] Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures. https://arxiv.org/abs/2501.07930
Source code in orthogonium\layers\conv\AOC\ortho_conv.py
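A minimal usage sketch (the import path below is assumed from the source location shown above; the package may also re-export this factory elsewhere):

```python
import torch

from orthogonium.layers.conv.AOC.ortho_conv import AdaptiveOrthoConv2d

# stride == 1, so per the Behavior section the factory returns a FastBlockConv2d
conv = AdaptiveOrthoConv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1)

x = torch.randn(8, 64, 32, 32)
y = conv(x)

# With cin == cout, stride == 1, and circular padding, the layer is orthogonal,
# so input and output norms should match up to the documented tolerance (~1e-4).
print(x.norm().item(), y.norm().item())
```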
AdaptiveOrthoConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', ortho_params=OrthoParams())
¶
Factory function to create an orthogonal transposed convolutional layer, selecting the appropriate class based on kernel size and stride. This is the implementation of the Adaptive Orthogonal Convolution scheme [1]. It aims to be scalable to large networks and large image sizes while enforcing orthogonality in the convolutional layers. This layer is also intended to be compatible with all the features of the `nn.ConvTranspose2d` class (e.g., striding, dilation, grouping, etc.). This method has an explicit kernel, which means that the forward operation is equivalent to a standard transposed convolutional layer, but the weights are constrained to be orthogonal.
Key Features:¶
- Ensures orthogonality in transpose convolutions for stable gradient propagation.
- Supports dilation, grouped operations, and efficient kernel construction.
Behavior:¶
- When kernel_size == stride, the layer is an `RkoConvTranspose2d`.
- When stride == 1, the layer is a `FastBlockConvTranspose2D`.
- Otherwise, the layer is a `BcopRkoConvTranspose2d`.
Note
- This implementation also works under zero padding: its Lipschitz constant remains tight, but orthogonality is lost on the image border.
- The current implementation of `torch.nn.ConvTranspose2d` does not support circular padding. One can implement it manually by adding a padding layer before this one and setting `padding = (0, 0)`.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`in_channels` | `int` | Number of input channels. | required |
`out_channels` | `int` | Number of output channels. | required |
`kernel_size` | `_size_2_t` | Size of the convolution kernel. | required |
`stride` | `_size_2_t` | Stride of the transpose convolution. | `1` |
`padding` | `_size_2_t` | Padding size. | `0` |
`output_padding` | `_size_2_t` | Additional size for output. | `0` |
`groups` | `int` | Number of groups. | `1` |
`bias` | `bool` | Whether to include a learnable bias. | `True` |
`dilation` | `_size_2_t` | Dilation rate. | `1` |
`padding_mode` | `str` | Padding mode. | `'zeros'` |
`ortho_params` | `OrthoParams` | Parameters to control orthogonality. | `OrthoParams()` |
Returns:

Type | Description |
---|---|
`ConvTranspose2d` | A configured instance of `RkoConvTranspose2d`, `FastBlockConvTranspose2D`, or `BcopRkoConvTranspose2d` (see Behavior above). |
Raises:

Type | Description |
---|---|
`ValueError` | If `kernel_size < stride`, as orthogonality cannot be enforced. |
References
- [1] Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures. https://arxiv.org/abs/2501.07930
Source code in orthogonium\layers\conv\AOC\ortho_conv.py
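A brief sketch of the transposed variant (import path assumed from the source location shown above); since `torch.nn.ConvTranspose2d` does not support circular padding, this example keeps the default `padding_mode='zeros'`:

```python
import torch

from orthogonium.layers.conv.AOC.ortho_conv import AdaptiveOrthoConvTranspose2d

# kernel_size == stride, so per the Behavior section this is an RkoConvTranspose2d
up = AdaptiveOrthoConvTranspose2d(in_channels=64, out_channels=32, kernel_size=2, stride=2)

x = torch.randn(4, 64, 16, 16)
y = up(x)
print(y.shape)  # expected: torch.Size([4, 32, 32, 32]), spatial upsampling by 2
```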
SLL-derived 1-Lipschitz Layers¶
This module implements several 1-Lipschitz residual blocks, inspired by and extending the SDP-based Lipschitz Layers (SLL) from [1]. Specifically:
- `SDPBasedLipschitzResBlock`: the original version of the 1-Lipschitz convolutional residual block. It enforces Lipschitz constraints by rescaling activation outputs according to an estimate of the operator norm.
- `SLLxAOCLipschitzResBlock`: an extended version of the SLL approach described in [1], combined with additional orthogonal convolutions to handle stride, kernel-size, or channel-dimension changes. It fuses multiple convolutions via the block convolution, thereby preserving the 1-Lipschitz property while enabling strided downsampling or modifying input/output channels.
- `AOCLipschitzResBlock`: a variant of the original Lipschitz block where the core convolution is replaced by an `AdaptiveOrthoConv2d`. It maintains the 1-Lipschitz property with orthogonal weight parameterization while providing efficient convolution implementations.
References¶
- [1] Araujo, A., Havens, A. J., Delattre, B., Allauzen, A., & Hu, B. (2023). A Unified Algebraic Perspective on Lipschitz Neural Networks. In The Eleventh International Conference on Learning Representations.
- [2] Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures.
Notes on the SLL approach¶
In [1], the SLL layer for convolutions is a 1-Lipschitz residual operation defined approximately as:

\[ \mathbf{y} = \mathbf{x} - 2\,\mathbf{K}^\top \mathrm{diag}(\mathbf{t})\,\mathrm{ReLU}(\mathbf{K}\mathbf{x} + \mathbf{b}) \]

where \(\mathbf{K}\) is a Toeplitz (convolution) matrix that represents a 1-Lipschitz operator. In practice, this is done by computing a normalization vector \(\mathbf{t}\) and rescaling the activation outputs by \(\mathbf{t}\).
By default, the SLL formulation does not allow strides or changes in the number of channels.
To address these issues, `SLLxAOCLipschitzResBlock` adds extra orthogonal convolutions before and/or after the main SLL operation. These additional convolutions can be merged via block convolution (Proposition 1 in [2]) to maintain 1-Lipschitz behavior while enabling stride and/or channel changes.
When \(\mathbf{K}\), \(\mathbf{K}_{pre}\), and \(\mathbf{K}_{post}\) each correspond to 2×2 convolutions, the resulting block effectively contains two 3×3 convolutions in one branch and a single 4×4 stride-2 convolution in the skip branch—quite similar to typical ResNet blocks.
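To make this concrete, here is a toy dense (matrix) version of the SLL residual update; this is an illustrative sketch using one standard choice of \(\mathbf{t}\), not the library's implementation:

```python
import torch

n, d = 8, 16
W = torch.randn(d, d) / d**0.5  # plays the role of the convolution matrix K
b = torch.randn(d)

# Normalization vector t: a standard SLL choice (all weights q_i = 1) is
# t_i = 1 / sum_j |W W^T|_ij, which makes the residual update 1-Lipschitz.
t = 1.0 / (W @ W.t()).abs().sum(dim=1)

x = torch.randn(n, d)
y = x - 2.0 * (t * torch.relu(x @ W.t() + b)) @ W  # y = x - 2 W^T diag(t) ReLU(Wx + b)

# Empirical Lipschitz check on a pair of inputs (ratio should not exceed 1):
x2 = torch.randn(n, d)
y2 = x2 - 2.0 * (t * torch.relu(x2 @ W.t() + b)) @ W
print(((y - y2).norm() / (x - x2).norm()).item())
```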
AOCLipschitzResBlock
¶
Bases: Module
Source code in orthogonium\layers\conv\SLL\sll_layer.py
__init__(in_channels, inner_dim_factor, kernel_size, dilation=1, groups=1, bias=True, padding_mode='circular', ortho_params=OrthoParams())
¶
A Lipschitz residual block in which the main convolution is replaced by `AdaptiveOrthoConv2d` (AOC). This preserves 1-Lipschitz (or lower) behavior through an orthogonal parameterization, without explicitly computing a scaling factor `t`.

Args:
- `in_channels` (int): Number of input channels.
- `inner_dim_factor` (int): Multiplier for internal representation size.
- `kernel_size` (_size_2_t): Convolution kernel size.
- `dilation` (_size_2_t, optional): Default is 1.
- `groups` (int, optional): Default is 1.
- `bias` (bool, optional): If True, adds a learnable bias. Default is True.
- `padding_mode` (str, optional): `'circular'` or `'zeros'`. Default is `'circular'`.
- `ortho_params` (OrthoParams, optional): Orthogonal parameterization settings. Default is `OrthoParams()`.
References
- [1] Araujo, A., Havens, A. J., Delattre, B., Allauzen, A., & Hu, B. A Unified Algebraic Perspective on Lipschitz Neural Networks. In The Eleventh International Conference on Learning Representations. https://arxiv.org/abs/2303.03169
- [2] Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures. https://arxiv.org/abs/2501.07930
Source code in orthogonium\layers\conv\SLL\sll_layer.py
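A short usage sketch (import path assumed from the source location shown above):

```python
import torch

from orthogonium.layers.conv.SLL.sll_layer import AOCLipschitzResBlock

block = AOCLipschitzResBlock(in_channels=64, inner_dim_factor=2, kernel_size=3)

x = torch.randn(4, 64, 32, 32)
y = block(x)  # residual block: input and output shapes match
print(y.shape)
```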
SDPBasedLipschitzDense
¶
Bases: Module
Source code in orthogonium\layers\conv\SLL\sll_layer.py
__init__(in_features, out_features, inner_dim, **kwargs)
¶
A 1-Lipschitz fully-connected layer (dense version). Similar to the convolutional SLL approach, but operates on vectors:

\[ \mathbf{y} = \mathbf{x} - 2\,\mathbf{W}^\top \mathrm{diag}(\mathbf{t})\,\mathrm{ReLU}(\mathbf{W}\mathbf{x} + \mathbf{b}) \]

Args:
- `in_features` (int): Input size.
- `out_features` (int): Output size (must match `in_features` to remain 1-Lipschitz).
- `inner_dim` (int): The internal dimension used for the transform.
References
- Araujo, A., Havens, A. J., Delattre, B., Allauzen, A., & Hu, B. A Unified Algebraic Perspective on Lipschitz Neural Networks. In The Eleventh International Conference on Learning Representations. https://arxiv.org/abs/2303.03169
Source code in orthogonium\layers\conv\SLL\sll_layer.py
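A short usage sketch with an empirical 1-Lipschitz check (import path assumed from the source location shown above):

```python
import torch

from orthogonium.layers.conv.SLL.sll_layer import SDPBasedLipschitzDense

dense = SDPBasedLipschitzDense(in_features=128, out_features=128, inner_dim=256)

x1, x2 = torch.randn(32, 128), torch.randn(32, 128)
ratio = (dense(x1) - dense(x2)).norm() / (x1 - x2).norm()
print(ratio.item())  # should not exceed 1, up to numerical tolerance
```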
SDPBasedLipschitzResBlock
¶
Bases: Module
Source code in orthogonium\layers\conv\SLL\sll_layer.py
__init__(cin, inner_dim_factor, kernel_size=3, groups=1, **kwargs)
¶
Original 1-Lipschitz convolutional residual block, based on the SDP-based Lipschitz layer (SLL) approach [1]. It has a structure akin to:

out = x - 2 * ConvTranspose( t * ReLU(Conv(x) + bias) )

where `t` is a channel-wise scaling factor ensuring a Lipschitz constant ≤ 1.

Note
By default, `SDPBasedLipschitzResBlock` assumes `cin == cout` and does not handle stride changes outside the skip connection (i.e., it is typically used with `stride = 1 or 2` for downsampling in a standard residual architecture).

Args:
- `cin` (int): Number of input channels.
- `inner_dim_factor` (float): Multiplier for the intermediate dimensionality.
- `kernel_size` (int, optional): Size of the convolution kernel. Default is 3.
- `groups` (int, optional): Number of groups for the convolution. Default is 1.
- `**kwargs`: Additional keyword arguments (unused).
References
- Araujo, A., Havens, A. J., Delattre, B., Allauzen, A., & Hu, B. A Unified Algebraic Perspective on Lipschitz Neural Networks. In The Eleventh International Conference on Learning Representations. https://arxiv.org/abs/2303.03169
Source code in orthogonium\layers\conv\SLL\sll_layer.py
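A short usage sketch (import path assumed from the source location shown above):

```python
import torch

from orthogonium.layers.conv.SLL.sll_layer import SDPBasedLipschitzResBlock

block = SDPBasedLipschitzResBlock(cin=32, inner_dim_factor=2, kernel_size=3)

x = torch.randn(4, 32, 16, 16)
y = block(x)  # cin == cout, so the shape is preserved
print(y.shape)
```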
SLLxAOCLipschitzResBlock
¶
Bases: Module
Source code in orthogonium\layers\conv\SLL\sll_layer.py
__init__(cin, cout, inner_dim_factor, kernel_size=3, stride=2, groups=1, **kwargs)
¶
Extended SLL-based convolutional residual block. Supports arbitrary kernel sizes, strides, and changes in the number of channels by integrating additional orthogonal convolutions and fusing them via block convolution [1].

The forward pass follows:

\[ \mathbf{y} = \mathbf{K}_{post}\big(\mathbf{z} - 2\,\mathbf{K}^\top \mathrm{diag}(\mathbf{t})\,\mathrm{ReLU}(\mathbf{K}\mathbf{z} + \mathbf{b})\big), \qquad \mathbf{z} = \mathbf{K}_{pre}\,\mathbf{x} \]

where \(\mathbf{K}_{pre}\) and \(\mathbf{K}_{post}\) are obtained with AOC, and the kernel \(\mathbf{K}\) may effectively be expanded by the pre/post AOC layers to handle stride and channel changes. This approach is described in "Improving SDP-based Lipschitz Layers" of [1].

Args:
- `cin` (int): Number of input channels.
- `cout` (int): Number of output channels.
- `inner_dim_factor` (float): Multiplier for the internal channel dimension.
- `kernel_size` (int, optional): Base kernel size for the SLL portion. Default is 3.
- `stride` (int, optional): Stride for the skip connection. Default is 2.
- `groups` (int, optional): Number of groups for the convolution. Default is 1.
- `**kwargs`: Additional options (unused).
References
- Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures. https://arxiv.org/abs/2501.07930
Source code in orthogonium\layers\conv\SLL\sll_layer.py
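A short usage sketch showing strided downsampling with a channel change (import path assumed from the source location shown above):

```python
import torch

from orthogonium.layers.conv.SLL.sll_layer import SLLxAOCLipschitzResBlock

block = SLLxAOCLipschitzResBlock(cin=32, cout=64, inner_dim_factor=2, kernel_size=3, stride=2)

x = torch.randn(4, 32, 32, 32)
y = block(x)
print(y.shape)  # expected: torch.Size([4, 64, 16, 16])
```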
AOLConv2D
¶
Bases: Conv2d
Source code in orthogonium\layers\conv\AOL\aol.py
__init__(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None, niter=1)
¶
Almost-Orthogonal (AOL) convolution layer. This layer implements the method proposed in [1] to enforce almost-orthogonality. While orthogonality is not strictly enforced, the Lipschitz constant of the layer is guaranteed to be at most 1.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`in_channels` | `int` | Number of input channels. | required |
`out_channels` | `int` | Number of output channels. | required |
`kernel_size` | `int` or `tuple` | Size of the convolution kernel. | required |
`stride` | `int` or `tuple` | Stride of the convolution. | `1` |
`padding` | `int` or `tuple` | Padding size. | `0` |
`dilation` | `int` or `tuple` | Dilation rate. | `1` |
`groups` | `int` | Number of groups. | `1` |
`bias` | `bool` | Whether to include a learnable bias. | `True` |
`padding_mode` | `str` | Padding mode. | `'zeros'` |
`device` | `device` | Device to store the layer parameters. | `None` |
`dtype` | `dtype` | Data type to store the layer parameters. | `None` |
References
- [1] Prach, B., & Lampert, C. H. (2022). Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks. ECCV. https://arxiv.org/abs/2208.03160
Source code in orthogonium\layers\conv\AOL\aol.py
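A short usage sketch with an empirical Lipschitz check (import path assumed from the source location shown above):

```python
import torch

from orthogonium.layers.conv.AOL.aol import AOLConv2D

conv = AOLConv2D(in_channels=16, out_channels=32, kernel_size=3, padding=1)

# AOL guarantees a Lipschitz constant of at most 1 (not exact orthogonality),
# so the difference quotient between any two inputs should not exceed 1.
x1, x2 = torch.randn(2, 16, 8, 8), torch.randn(2, 16, 8, 8)
ratio = (conv(x1) - conv(x2)).norm() / (x1 - x2).norm()
print(ratio.item())
```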
reset_parameters()
¶
Resets parameters of the module. This includes the weight and bias parameters, if they are used.
Source code in orthogonium\layers\conv\AOL\aol.py
AOLConvTranspose2D
¶
Bases: ConvTranspose2d
Source code in orthogonium\layers\conv\AOL\aol.py
__init__(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', device=None, dtype=None, niter=1)
¶
Almost-Orthogonal (AOL) transposed convolution layer. This layer implements the method proposed in [1] to enforce almost-orthogonality. While orthogonality is not strictly enforced, the Lipschitz constant of the layer is guaranteed to be at most 1.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`in_channels` | `int` | Number of input channels. | required |
`out_channels` | `int` | Number of output channels. | required |
`kernel_size` | `int` or `tuple` | Size of the convolution kernel. | required |
`stride` | `int` or `tuple` | Stride of the convolution. | `1` |
`padding` | `int` or `tuple` | Padding size. | `0` |
`output_padding` | `int` or `tuple` | Additional size added to the output shape. | `0` |
`groups` | `int` | Number of groups. | `1` |
`bias` | `bool` | Whether to include a learnable bias. | `True` |
`dilation` | `int` or `tuple` | Dilation rate. | `1` |
`padding_mode` | `str` | Padding mode. | `'zeros'` |
`device` | `device` | Device to store the layer parameters. | `None` |
`dtype` | `dtype` | Data type to store the layer parameters. | `None` |
References
- [1] Prach, B., & Lampert, C. H. (2022). Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks. ECCV. https://arxiv.org/abs/2208.03160
Source code in orthogonium\layers\conv\AOL\aol.py
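A short usage sketch of the transposed variant (import path assumed from the source location shown above):

```python
import torch

from orthogonium.layers.conv.AOL.aol import AOLConvTranspose2D

up = AOLConvTranspose2D(in_channels=32, out_channels=16, kernel_size=2, stride=2)

x = torch.randn(2, 32, 8, 8)
y = up(x)
print(y.shape)  # expected: torch.Size([2, 16, 16, 16]), spatial upsampling by 2
```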
MultiStepAOLReparametrizer
¶
Bases: Module
Source code in orthogonium\layers\conv\AOL\aol.py
reset_parameters()
¶
Resets the parameters of the reparametrizer.
Source code in orthogonium\layers\conv\AOL\aol.py
AdaptiveSOCConv2d(in_channels, out_channels, kernel_size, stride=1, padding='same', dilation=1, groups=1, bias=True, padding_mode='circular', ortho_params=OrthoParams())
¶
Factory function to create an orthogonal convolutional layer, selecting the appropriate class based on kernel size and stride. This is a modified implementation of the Skew Orthogonal Convolution (SOC) [1], with significant modifications from the original paper:
- This implementation provides an explicit kernel (which is larger than the original kernel size), so the forward pass is done in a single iteration, as described in [2].
- This implementation avoids the use of channel padding to handle the case where cin != cout. Similarly, stride is handled natively using the adaptive scheme.
- The "Fantastic Four" method is replaced by AOL, which reduces the number of iterations required to converge.

It aims to be more scalable to large networks and large image sizes, while enforcing orthogonality in the convolutional layers. This layer is also intended to be compatible with all the features of the `nn.Conv2d` class (e.g., striding, dilation, grouping, etc.). This method has an explicit kernel, which means that the forward operation is equivalent to a standard convolutional layer, but the weights are constrained to be orthogonal.

Note
- This implementation changes the size of the kernel, which also changes the padding semantics. Please adjust the padding according to the kernel size and the number of iterations.
- Current unit testing uses a tolerance of 8e-2, so this layer can be expected to be 1.08-Lipschitz. Similarly, the stable rank is evaluated loosely (it must be greater than 0.5).
Key Features:¶
- Enforces orthogonality, preserving gradient norms.
- Supports native striding, dilation, grouped convolutions, and flexible padding.
Behavior:¶
- When kernel_size == stride, the layer is an `RKOConv2d`.
- When stride == 1, the layer is a `FastBlockConv2d`.
- Otherwise, the layer is a `BcopRkoConv2d`.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`in_channels` | `int` | Number of input channels. | required |
`out_channels` | `int` | Number of output channels. | required |
`kernel_size` | `_size_2_t` | Size of the convolution kernel. | required |
`stride` | `_size_2_t` | Stride of the convolution. | `1` |
`padding` | `str` or `_size_2_t` | Padding mode or size. | `'same'` |
`dilation` | `_size_2_t` | Dilation rate. | `1` |
`groups` | `int` | Number of blocked connections from input to output channels. | `1` |
`bias` | `bool` | Whether to include a learnable bias. | `True` |
`padding_mode` | `str` | Padding mode. | `'circular'` |
`ortho_params` | `OrthoParams` | Parameters to control orthogonality. | `OrthoParams()` |
Returns:

Type | Description |
---|---|
`Conv2d` | A configured instance of the appropriate orthogonal convolution class (see Behavior above). |
Raises:

Type | Description |
---|---|
`ValueError` | If `kernel_size < stride`, as orthogonality cannot be enforced. |
References
- [1] Singla, S., & Feizi, S. (2021, July). Skew Orthogonal Convolutions. In International Conference on Machine Learning (pp. 9756-9766). PMLR. https://arxiv.org/abs/2105.11417
- [2] Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures. https://arxiv.org/abs/2501.07930
Source code in orthogonium\layers\conv\adaptiveSOC\ortho_conv.py
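A short usage sketch (import path assumed from the source location shown above). Since the effective kernel is larger than the requested `kernel_size`, output spatial sizes under explicit padding may differ from a vanilla `nn.Conv2d`:

```python
import torch

from orthogonium.layers.conv.adaptiveSOC.ortho_conv import AdaptiveSOCConv2d

conv = AdaptiveSOCConv2d(in_channels=16, out_channels=16, kernel_size=3, stride=1)

x = torch.randn(2, 16, 32, 32)
y = conv(x)
print(y.shape)  # check the spatial size: padding semantics follow the enlarged kernel
```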
AdaptiveSOCConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', ortho_params=OrthoParams())
¶
Factory function to create an orthogonal transposed convolutional layer, selecting the appropriate class based on kernel size and stride. This is a modified implementation of the Skew Orthogonal Convolution (SOC) [1], with significant modifications from the original paper:
- This implementation provides an explicit kernel (which is larger than the original kernel size), so the forward pass is done in a single iteration, as described in [2].
- This implementation avoids the use of channel padding to handle the case where cin != cout. Similarly, stride is handled natively using the adaptive scheme.
- The "Fantastic Four" method is replaced by AOL, which reduces the number of iterations required to converge.

It aims to be more scalable to large networks and large image sizes, while enforcing orthogonality in the convolutional layers. This layer is also intended to be compatible with all the features of the `nn.ConvTranspose2d` class (e.g., striding, dilation, grouping, etc.). This method has an explicit kernel, which means that the forward operation is equivalent to a standard transposed convolutional layer, but the weights are constrained to be orthogonal.

Note
- This implementation changes the size of the kernel, which also changes the padding semantics. Please adjust the padding according to the kernel size and the number of iterations.
- Current unit testing uses a tolerance of 8e-2, so this layer can be expected to be 1.08-Lipschitz. Similarly, the stable rank is evaluated loosely (it must be greater than 0.5).
Key Features:¶
- Enforces orthogonality, preserving gradient norms.
- Supports native striding, dilation, grouped convolutions, and flexible padding.
Behavior:¶
- When kernel_size == stride, the layer is an `RKOConv2d`.
- When stride == 1, the layer is a `FastBlockConv2d`.
- Otherwise, the layer is a `BcopRkoConv2d`.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`in_channels` | `int` | Number of input channels. | required |
`out_channels` | `int` | Number of output channels. | required |
`kernel_size` | `_size_2_t` | Size of the convolution kernel. | required |
`stride` | `_size_2_t` | Stride of the transpose convolution. | `1` |
`padding` | `_size_2_t` | Padding size. | `0` |
`output_padding` | `_size_2_t` | Additional size for output. | `0` |
`dilation` | `_size_2_t` | Dilation rate. | `1` |
`groups` | `int` | Number of blocked connections from input to output channels. | `1` |
`bias` | `bool` | Whether to include a learnable bias. | `True` |
`padding_mode` | `str` | Padding mode. | `'zeros'` |
`ortho_params` | `OrthoParams` | Parameters to control orthogonality. | `OrthoParams()` |
Returns:

Type | Description |
---|---|
`ConvTranspose2d` | A configured instance of the appropriate orthogonal transposed convolution class (see Behavior above). |
Raises:

Type | Description |
---|---|
`ValueError` | If `kernel_size < stride`, as orthogonality cannot be enforced. |
References
- [1] Singla, S., & Feizi, S. (2021, July). Skew Orthogonal Convolutions. In International Conference on Machine Learning (pp. 9756-9766). PMLR. https://arxiv.org/abs/2105.11417
- [2] Boissin, T., Mamalet, F., Fel, T., Picard, A. M., Massena, T., & Serrurier, M. (2025). An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures. https://arxiv.org/abs/2501.07930
Source code in orthogonium\layers\conv\adaptiveSOC\ortho_conv.py
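A short usage sketch of the transposed variant (import path assumed from the source location shown above); the same enlarged-kernel caveat on padding applies:

```python
import torch

from orthogonium.layers.conv.adaptiveSOC.ortho_conv import AdaptiveSOCConvTranspose2d

up = AdaptiveSOCConvTranspose2d(in_channels=16, out_channels=8, kernel_size=2, stride=2)

x = torch.randn(2, 16, 16, 16)
y = up(x)
print(y.shape)  # spatial upsampling by the stride; exact size depends on the enlarged kernel
```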