
Friday, August 30, 2024

Fractal Dimension of Objects in Python

Target audience: Beginner
Estimated reading time: 5'

Are you finding it challenging to configure a convolutional neural network (CNN) for modeling 3D objects?

The complexity of a 3D object can pose a considerable challenge when tuning the parameters of a 3D convolutional neural network. Fractal analysis [ref 1] offers a way to measure the complexity of key features, volumes, and boundaries within the object, providing valuable insights that can help data scientists fine-tune their models for better performance.



What you will learn: How to evaluate the complexity of a 3-dimensional object using the fractal dimension index.

Notes

  • This article is a follow-up to Fractal Dimension of Images in Python
  • Environments: Python  3.11,  Matplotlib 3.9.
  • Source code is available at  Github.com/patnicolas/Data_Exploration/fractal
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.

Introduction

As described in a previous article, Fractal Dimension of Images in Python - Overview, a fractal dimension is a measure used to describe the complexity of fractal patterns or sets by quantifying the ratio of change in detail relative to the change in scale [ref 2].
Among the various approaches to estimating the fractal dimension (variation, structure function, root mean square, and R/S analysis methods), we selected the box counting method for its simplicity and visualization capability.

Point cloud
To evaluate our method for calculating the fractal dimension index of an object, it's necessary to simulate or represent a 3D object. This is achieved by generating a cluster of random data points across the x, y, and z axes, commonly referred to as a point cloud.
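
As an illustration, here is a minimal sketch that generates such a point cloud with NumPy; the number of points, grid size, and Gaussian parameters are arbitrary choices for this example.

import numpy as np

# Hypothetical parameters for illustration
num_points = 500
grid_size = 64

# Gaussian cluster of 3D points centered in the middle of the grid
rng = np.random.default_rng(42)
points = rng.normal(loc=grid_size // 2, scale=grid_size // 8, size=(num_points, 3))
x, y, z = points[:, 0], points[:, 1], points[:, 2]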

Box counting method
The box counting method [ref 3] is described in a previous article, Box-counting method.

If N is the number of measurement units for our counting boxes or cubes and eps the related scaling factor, the fractal dimension index is computed as \[ D=- \displaystyle \lim_{\epsilon  \to 0} \frac{log(N)}{log(\epsilon)} \simeq - \frac{log(N)}{log(eps)} \ \ with \ N=r^3  \] We will use the edge r of the cubic boxes as our measurement unit for 3D objects.
The fractal dimension index varies from 2 for very simple objects to 3 for objects with complex patterns.
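
As a sanity check, consider a fully dense cube covered with boxes whose edge is 1/16th of the object: the count is N = 16^3 = 4096 boxes at scaling factor eps = 1/16, hence \[ D = -\frac{log(4096)}{log(1/16)} = \frac{log(4096)}{log(16)} = 3 \] which matches the expected dimension of a solid volume.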

Implementation

Let's define a class, FractalDimObject, that encapsulates the calculation of fractal dimension and the creation of wrapping boxes. 

This class provides two constructors:
  1. The default constructor, __init__, which accepts a 3D array xyz and a threshold value near 1.
  2. An alternative constructor, build, which generates a 3D array of shape (size, size, size) to simulate a 3-dimensional object.
class FractalDimObject(object):
    def __init__(self, xyz: np.array, threshold: float) -> None:
        self.xyz = xyz
        self.threshold = threshold


    @classmethod
    def build(cls, size: int, threshold: float) -> Self:
        _xyz = np.zeros((size, size, size))

        # Create a 3D fractal-like structure such as cube
        for x in range(size):            # Width
            for y in range(size):        # Depth
                for z in range(size):    # Height
                    if (x // 2 + y // 2) % 2 == 0:  # Condition for non-zero values
                        _xyz[x, y, z] = random.gauss(size//2, size)

        return cls(_xyz, threshold)

The alternative constructor generates a test array for evaluation purposes. It starts by initializing the array with zeros, then assigns random Gaussian-distributed values to a specific subset of the array.

The point cloud used to define the 3D object is visualized below.

Fig. 1 Visualization of the point cloud representing 3D object


The __call__ method performs the fractal dimension calculation and tracks the relationship between box counts and sizes in three stages:
  1. It determines the box sizes for non-zero elements in the array.
  2. It counts the number of boxes for each size.
  3. It applies a linear regression to the logarithms of the box sizes and their corresponding counts.
def __call__(self) -> (np.array, List[int], List[int]):
     # Step 1: Extract the candidate box sizes
     sizes = self.__extract_sizes()
     sizes_list = list(sizes)
     sizes_list.reverse()

     # Step 2: Count the number of boxes of each size
     counts = [self.__count_boxes(int(size)) for size in sizes_list]

     # Step 3: Fit the points to a line log(counts) = a.log(sizes) + b
     # The counts were collected over sizes_list, so the regression uses the same ordering
     coefficients = np.polyfit(np.log(sizes_list), np.log(counts), 1)
     return -coefficients[0], sizes_list, counts

The __extract_sizes method is responsible for generating the box sizes, as detailed in the Appendix. The implementation of __count_boxes, which counts the wrapping boxes for a given size, follows a similar approach to the method used in calculating the fractal dimension of images.
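
The exact implementation of __extract_sizes resides in the repository; the following is a minimal sketch, assuming the candidate box sizes are decreasing powers of 2 bounded by half the smallest grid dimension.

def __extract_sizes(self) -> np.array:
     # Largest power of 2 not exceeding half of the smallest dimension
     max_dim = min(self.xyz.shape)
     max_exponent = int(np.log2(max_dim // 2))

     # Candidate box sizes in decreasing order: 2^max_exponent, ..., 4, 2
     return 2 ** np.arange(max_exponent, 0, -1)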

def __count_boxes(self, box_size: int) -> int:
     sx, sy, sz = self.xyz.shape
     count = 0

     for i in range(0, sx, box_size):
         for j in range(0, sy, box_size):
             for k in range(0, sz, box_size):
                 # Wraps the non-zero values (object) with boxes
                 data = self.xyz[i:i+box_size, j:j+box_size, k:k+box_size]

                 if np.any(data):    # For non-zero values (inside object)
                     count += 1
     return count


Evaluation

Let's compute the fractal dimension of the array representing a 3D object with an initial 3D sampling grid of 1024 x 1024 x 1024.

grid_size = 1024      # Size of the 3D sampling grid
threshold = 0.92

fractal_dim_object = FractalDimObject.build(grid_size, threshold)
coefficient, sizes, counts = fractal_dim_object()
print(coefficient)
      

Output: 2.7456

Finally, let's plot the profile of box sizes vs. box counts.


Fig. 2 Plot of box sizes vs. box counts

The plot reflects the linear regression of log(sizes) against log(counts).
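
A minimal Matplotlib sketch to reproduce this profile; the variable names follow the evaluation snippet above.

import matplotlib.pyplot as plt

# Log-log plot of box counts against box sizes
plt.scatter(np.log(sizes), np.log(counts))
plt.xlabel('log(box size)')
plt.ylabel('log(box count)')
plt.title('Box counting profile for the 3D point cloud')
plt.show()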




Wednesday, July 6, 2022

Fractal Dimension of Images in Python

Target audience: Expert
Estimated reading time: 8'
Configuring the parameters of a 2D convolutional neural network, such as kernel size and padding, can be challenging because it largely depends on the complexity of an image or its specific sections. Fractals help quantify the complexity of important features and boundaries within an image and ultimately guide the data scientist in optimizing his/her model.



Original image and image section

What you will learn: How to evaluate the complexity of an image using fractal dimension index.

Notes

  • Environments: Python  3.11,  Matplotlib 3.9.
  • Source code is available at  Github.com/patnicolas/Data_Exploration/fractal
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.

Overview

Fractal dimension

A fractal dimension is a measure used to describe the complexity of fractal patterns or sets by quantifying the ratio of change in detail relative to the change in scale [ref 1].

Initially, fractal dimensions were used to characterize intricate geometric forms where detailed patterns were more significant than the overall shape. For ordinary geometric shapes, the fractal dimension theoretically matches the familiar Euclidean or topological dimension.

However, the fractal dimension can take non-integer values. If a set's fractal dimension exceeds its topological dimension, it is considered to exhibit fractal geometry [ref 2].

There are many approaches to compute the fractal dimension [ref 1]  of an image, among them:
  • Variation method
  • Structure function method
  • Root mean square method
  • R/S analysis method
  • Box counting method
This article describes the concept and implementation of the box counting method in Python.

Box counting method

The box counting method is similar to the perimeter measuring technique we applied to coastlines. However, instead of measuring length, we overlay the image with a grid and count how many squares in the grid cover any part of the image. We then repeat this process with progressively finer grids, each with smaller squares [ref 3]. By continually reducing the grid size, we capture the pattern's structure with greater precision.
Fig. 1 Illustration of the box counting method for the Koch snowflake

If N is the number of measurement units (yardstick in 1D, square in 2D, cube in 3D, ...) and eps the related scaling factor, the fractal dimension index is computed as \[ D=- \displaystyle \lim_{\epsilon  \to 0} \frac{log(N)}{log(\epsilon)} \simeq - \frac{log(N)}{log(eps)} \ \ with \ N=r^2  \ \ (1) \] We will use the height r of the square box as our measurement unit for images.

Implementation

First, let's define the parameters of the (square) boxes:
  • eps: Scaling factor for resizing the boxes
  • r: Height or width of the square boxes
@dataclass
class BoxParameter:
    eps: float
    r: int

    # Denominator of the Fractal Dimension
    def log_inv_eps(self) -> float:
        return -np.log(self.eps)

    # Numerator of the Fractal Dimension
    def log_num_r(self) -> float:
        return np.log(self.r)

The two methods log_inv_eps and log_num_r implement the denominator and the numerator of formula (1), respectively.
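
For example, a box of height 16 collected at a scaling factor of 0.25 contributes the following terms to the regression (values rounded):

box_param = BoxParameter(eps=0.25, r=16)
box_param.log_inv_eps()    # -log(0.25) ~ 1.386
box_param.log_num_r()      # log(16)    ~ 2.773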


The class FractalDimImage encapsulates the computation of fractal dimension of a given image.
The two class (static) members are
  • num_grey_levels: Default number of grey scales
  • max_plateau_count: Number of attempts to exit a saddle point.
class FractalDimImage(object):
    # Default number of grey values
    num_grey_levels: int = 256
    # Convergence criterion
    max_plateau_count = 3

    def __init__(self, image_path: AnyStr) -> None:
        raw_image: np.array = self.__load_image(image_path)

        # If the image is an RGB (color) image, convert it to a grey scale image
        self.image = FractalDimImage.rgb_to_grey(raw_image) \
            if raw_image.ndim == 3 and raw_image.shape[2] == 3 \
            else raw_image


We cannot assume that the image is defined with a single grey channel. Therefore, if the 3rd value of the shape is 3, the image is converted into a grey scale array.
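
The conversion method rgb_to_grey is not listed in this post; a plausible implementation, assuming the standard ITU-R BT.601 luminosity weights, is sketched below.

@staticmethod
def rgb_to_grey(image: np.array) -> np.array:
    # Weighted sum of the R, G, B channels (luma coefficients)
    return np.dot(image[..., :3], [0.2989, 0.5870, 0.1140])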

The following private method, __load_image, loads the image from a given path and converts it into a NumPy array.

@staticmethod
def __load_image(image_path: AnyStr) -> np.array:
     from PIL import Image
     from numpy import asarray

     this_image = Image.open(mode="r", fp=image_path)
     return asarray(this_image)



The computation of fractal dimension is implemented by the method __call__. The method returns a tuple:
  • fractal dimension index
  • trace/history of the box parameters collected during execution.
The symmetrical nature of fractals allows us to iterate over half the size of the image [1]. The number of boxes N created at each iteration i takes the grey scale into account: N = (256 / num_pixels) * i [2].

The method populates each box with pixels/grey scale values (method __create_boxes) [3], then computes the sum of least squares (__sum_least_squares) [4]. The last statement [5] implements formula (1). The source code for the private methods __create_boxes and __sum_least_squares is included in the appendix for reference.

def __call__(self) -> (float, List[BoxParameter]):
   image_pixels = self.image.shape[0]
   plateau_count = 0
   prev_num_r = -1

   trace = []
   max_iters = (image_pixels // 2) + 1   # [1]

   for iter in range(2, max_iters):
       num_boxes = FractalDimImage.num_grey_levels // (image_pixels // iter)  # [2]
       n_boxes = max(1, num_boxes)
       num_r = 0     # Number of squares

       eps = iter / image_pixels
       for i in range(0, image_pixels, iter):
           boxes = self.__create_boxes(i, iter, n_boxes)    # [3]
           num_r += FractalDimImage.__sum_least_squares(boxes, n_boxes)  # [4]

       # Detect if the number of measurements r has not changed
       if num_r == prev_num_r:
           plateau_count += 1
       prev_num_r = num_r
       trace.append(BoxParameter(eps, num_r))

       # Break from the iteration if the computation is stuck
       # with the same number of measurements
       if plateau_count > FractalDimImage.max_plateau_count:
           break

   # Compute the fractal dimension from the trace [5]
   return FractalDimImage.__compute_fractal_dim(trace), trace



The implementation of the formula for the fractal dimension fits a first-degree polynomial to the denominator and numerator terms, and returns the first-order coefficient (the slope).

@staticmethod
def __compute_fractal_dim(trace: List[BoxParameter]) -> float:
   from numpy.polynomial.polynomial import polyfit

   _x = np.array([box_param.log_inv_eps() for box_param in trace])
   _y = np.array([box_param.log_num_r() for box_param in trace])
   fitted = polyfit(x=_x, y=_y, deg=1, full=False)

   return float(fitted[1])


Evaluation


We compute the fractal dimension for an entire image, then for a region that contains the key features of the image.

image_path_name = '../images/fractal_test_image.jpg'
fractal_dim_image = FractalDimImage(image_path_name)
fractal_dim, trace = fractal_dim_image()


Original image

The original RGB image has 542 x 880 pixels and is converted into a grey scale image.

Fig. 2 Original grey scale image 


Output: fractal_dim = 2.54

Fig. 3 Trace for the squared box measurement during iteration

The size of the box converges very quickly after 8 iterations.

Image region

We select the following region of 395 x 378 pixels

Fig. 4 Key region of the original grey scale image 


Output: fractal_dim = 2.63
The region has a similar fractal dimension to the original image. This outcome should not be surprising: the pixels outside the selected region consist of background without features of any significance.

Fig. 5 Trace for the squared box measurement during iteration

The convergence pattern for calculating the fractal dimension of the region is comparable to that of the original image, reaching convergence after 6 iterations.

References


------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning", Packt Publishing ISBN 978-1-78712-238-3 and Geometric Learning in Python Newsletter on LinkedIn.

Appendix

Source code for initializing the square boxes

def __create_boxes(self, i: int, iter: int, n_boxes: int) -> List[List[np.array]]:
   # Independent empty boxes (a multiplied list [[]] * n would share a single reference)
   num_boxes = (FractalDimImage.num_grey_levels + n_boxes - 1) // n_boxes
   boxes = [[] for _ in range(num_boxes)]
   i_lim = i + iter

   # Distribute the pixels of the sub-image into boxes indexed by grey level
   for img_row in self.image[i: i_lim]:
      for pixel in img_row[i: i_lim]:
          height = int(pixel // n_boxes)
          boxes[height].append(pixel)

   return boxes

Computation of the sum of least squares for the boxes extracted from an image.

@staticmethod
def __sum_least_squares(boxes: List[List[float]], n_boxes: int) -> float:
   # Standard deviation of each box (boxes may have different lengths)
   stddev_box = np.array([np.sqrt(np.var(box)) if len(box) > 0 else np.nan for box in boxes])
   # Filter out NAN values from empty boxes
   stddev = stddev_box[~np.isnan(stddev_box)]

   nBox_r = 2 * (stddev // n_boxes) + 1
   return sum(nBox_r)


Saturday, September 11, 2021

Automate GAN Configuration in PyTorch

Target audience: Advanced
Estimated reading time: 5'

Working with the architecture and deployment of Generative Adversarial Networks (GANs) often involves complex details that can be difficult to understand and resolve. Consider the advantage of identifying neural components that are reusable and can be utilized in both the generator and discriminator parts of the network.





Notes:
  • This post steers clear of the intricate technicalities of generative adversarial networks and convolutional neural networks. Instead, it focuses on automating the setup process for certain neural components.
  • Readers are expected to have a foundational knowledge of neural networks and familiarity with the PyTorch library.
  • Environments: Python 3.9, PyTorch 1.9.1

The challenge

This article is focused on streamlining the development of Deep Convolutional Generative Adversarial Networks (DCGANs) [ref 1]. We achieve this by configuring the generator in relation to the setup of the discriminator. The main area of our study is the well-known application of using GANs to differentiate between real and fake images.

For those unfamiliar with GANs...

Generative Adversarial Networks (GANs) [ref 2] are a type of unsupervised learning model that identify patterns within data and utilize these patterns for data augmentation, creating new samples that closely resemble the original dataset. GANs belong to the family of generative models, which also includes variational auto-encoders and maximum likelihood estimation (MLE) models. The unique aspect of GANs is that they convert the problem into a form of supervised learning by employing two competing networks:
  • The Generator model, which is trained to produce new data samples.
  • The Discriminator model, which aims to differentiate between real samples (from the original dataset) and fake ones (created by the Generator).
Crafting and setting up components like the generator and discriminator in a Generative Adversarial Network (GAN), or the encoder and decoder layers in a Variational Convolutional Auto-Encoder (VAE), can often be a repetitive and laborious process.

In fact, some aspects of this process can be entirely automated. For instance, the generative network in a convolutional GAN can be designed as the inverse of the discriminator using a de-convolutional network. Similarly, the decoder in a VAE can be automatically configured based on the structure of its encoder.

Functional representation of a simple deep convolutional GAN


Neural component reusability is key to generate a de-convolutional network from a convolutional network. To this purpose we break down a neural network into computational blocks.

Convolutional networks

In its most basic form, a Generative Adversarial Network (GAN) consists of two distinct neural networks: a generator and a discriminator.

Neural blocks

Each of these networks is further subdivided into neural blocks or groups of PyTorch modules, which include elements like hidden layers, batch normalization, regularization, pooling modes, and activation functions. Take, for example, a discriminator that is structured using a convolutional neural network [ref 3] followed by a fully connected (restricted Boltzmann machine) network. The PyTorch modules corresponding to each layer are organized into what we call a neural block class.

The PyTorch modules of the convolutional neural block [ref 4] are:
  • Conv2d: Convolutional layer with input, output channels, kernel, stride and padding
  • Dropout: Drop-out regularization layer
  • BatchNorm2d: Batch normalization module
  • MaxPool2d: Pooling layer
  • ReLU, Sigmoid, ...: Activation functions
Representation of a convolutional neural block with PyTorch modules

The constructor of the neural block is designed to initialize all its parameters and modules in the correct sequence. For simplicity, we are not including regularization elements like drop-out (which essentially involves bagging of sub-networks) in this setup.

Important note: Each step of the algorithm makes reference to comments in the code (i.e.  The first step [1] is to initialize the number of input and output channels refers to  # [1] - initialize the input and output channels).

class ConvNeuralBlock(nn.Module):

  def __init__(self,
      in_channels: int,
      out_channels: int,
      kernel_size: int,
      stride: int,
      padding: int,
      batch_norm: bool,
      max_pooling_kernel: int,
      activation: nn.Module,
      bias: bool,
      is_spectral: bool = False):
    
   super(ConvNeuralBlock, self).__init__()
        
   # Assertions are omitted
   # [1] - initialize the input and output channels
   self.in_channels = in_channels
   self.out_channels = out_channels
   self.is_spectral = is_spectral
   modules = []
   
   # [2] - create a 2 dimension convolution layer
   conv_module = nn.Conv2d(   
       self.in_channels,
       self.out_channels,
       kernel_size=kernel_size,
       stride=stride,
       padding=padding,
       bias=bias)

   # [6] - Apply the spectral norm regularization, if specified
   if self.is_spectral:
      conv_module = nn.utils.spectral_norm(conv_module)
   modules.append(conv_module)
        
   # [3] - Batch normalization
   if batch_norm:               
      modules.append(nn.BatchNorm2d(self.out_channels))
      
   # [4] - Activation function
   if activation is not None: 
      modules.append(activation)
         
   # [5] - Pooling module
   if max_pooling_kernel > 0:   
      modules.append(nn.MaxPool2d(max_pooling_kernel))
   
   self.modules = tuple(modules)

We consider the case of a generative model for images. The first step [#1] is to initialize the number of input and output channels, then create the 2-dimension convolution [#2], a batch normalization module [#3], an activation function [#4] and finally a max pooling module [#5]. The spectral norm regularization term [#6] is optional.
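
A usage sketch with arbitrary, illustrative parameter values: a block mapping 3 input channels to 64 feature maps with a 4x4 kernel, batch normalization, 2x2 max pooling and a leaky ReLU activation.

conv_block = ConvNeuralBlock(
    in_channels=3,
    out_channels=64,
    kernel_size=4,
    stride=2,
    padding=1,
    batch_norm=True,
    max_pooling_kernel=2,
    activation=nn.LeakyReLU(0.2),
    bias=False)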
The convolutional neural network is assembled from convolutional and feed-forward neural blocks in the following build method.

class ConvModel(NeuralModel):

  def __init__(self,                    
       model_id: str,
       # [1] - Number of input and output unites
       input_size: int,
       output_size: int,
       # [2] - PyTorch convolutional modules
       conv_model: nn.Sequential,
       dff_model_input_size: int = -1,
       # [3] - PyTorch fully connected
       dff_model: nn.Sequential = None):
        
   super(ConvModel, self).__init__(model_id)
   self.input_size = input_size
   self.output_size = output_size
   self.conv_model = conv_model
   self.dff_model_input_size = dff_model_input_size
   self.dff_model = dff_model
   
  @classmethod
  def build(cls,
      model_id: str,
      conv_neural_blocks: list,  
      dff_neural_blocks: list) -> NeuralModel:
            
   # [4] - Initialize the input and output size 
   #        for the convolutional layer
   input_size = conv_neural_blocks[0].in_channels
   output_size = conv_neural_blocks[len(conv_neural_blocks) - 1].out_channels

   # [5] - Generate the model from the sequence 
   #        of conv. neural blocks
   conv_modules = [conv_module for conv_block in conv_neural_blocks
         for conv_module in conv_block.modules]
   conv_model = nn.Sequential(*conv_modules)

    # [6] - If a fully connected RBM is included in the model ..
    if dff_neural_blocks is not None:
       dff_modules = [dff_module for dff_block in dff_neural_blocks
           for dff_module in dff_block.modules]

       dff_model_input_size = dff_neural_blocks[0].output_size
       dff_model = nn.Sequential(*tuple(dff_modules))
    else:
       dff_model_input_size = -1
       dff_model = None

    return cls(
       model_id,
       input_size,
       output_size,
       conv_model,
       dff_model_input_size,
       dff_model)

The standard constructor [#1] sets up the count of input/output channels, along with the PyTorch modules for the convolutional layers [#2] and the fully connected layers [#3].
The class method, build, creates the convolutional model using convolutional neural blocks and feed-forward neural blocks. It determines the dimensions of the input and output layers based on the first and last neural blocks [#4], and then produces the PyTorch convolutional modules [#5] and modules for fully-connected layers [#6] from these neural blocks.
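
A hedged usage sketch: assembling a small discriminator from two convolutional blocks; the channel counts and layer parameters are arbitrary choices for illustration.

disc_blocks = [
    ConvNeuralBlock(3, 32, 4, 2, 1, True, 0, nn.LeakyReLU(0.2), False),
    ConvNeuralBlock(32, 64, 4, 2, 1, True, 0, nn.LeakyReLU(0.2), False)
]
discriminator = ConvModel.build(
    model_id='discriminator',
    conv_neural_blocks=disc_blocks,
    dff_neural_blocks=None)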

Following this, we proceed to construct the de-convolutional neural network utilizing the convolutional blocks.

Inverting a convolutional block

To build a GAN, one must follow these steps:
  1. Select and specify the PyTorch modules that will constitute each convolutional layer.
  2. Assemble these chosen modules into a single convolutional neural block.
  3. Construct the generator and discriminator of the GAN by integrating these neural blocks.
  4. Link the generator and discriminator to create a fully functional GAN.
The aim here is to create a builder capable of producing the de-convolutional network. This network will act as the GAN's generator, formulated on the basis of the convolutional network described in the preceding section.

The process begins with the extraction of the de-convolutional block from an already established convolutional block.
Conceptual automated generation of de-convolutional block

The standard constructor for the neural block in a de-convolutional network sets up all the essential parameters required for the network, with the exception of the pooling module (which is not necessary). The code example below demonstrates how to create a de-convolutional neural block, using convolution parameters such as the number of input and output channels, kernel size, stride, padding, batch normalization, and the activation function.


class DeConvNeuralBlock(nn.Module):

  def __init__(self,
       in_channels: int,
       out_channels: int,
       kernel_size: int,
       stride: int,
       padding: int,
       batch_norm: bool,
       activation: nn.Module,
       bias: bool) -> object:
    super(DeConvNeuralBlock, self).__init__()
    self.in_channels = in_channels
    self.out_channels = out_channels
    modules = []

    # Two dimension de-convolution layer
    de_conv = nn.ConvTranspose2d(
       self.in_channels,
       self.out_channels,
       kernel_size=kernel_size,
       stride=stride,
       padding=padding,
       bias=bias)

    # Add the de-convolution layer
    modules.append(de_conv)

    # Add the batch normalization, if defined
    if batch_norm:
       modules.append(nn.BatchNorm2d(self.out_channels))

    # Add activation
    modules.append(activation)
    self.modules = modules

Be aware that the de-convolution block lacks pooling capabilities. The class method named auto_build accepts a convolutional neural block, the number of input and output channels, and an optional activation function to create a de-convolutional neural block of the DeConvNeuralBlock type. The calculation of the number of input and output channels for the resulting deconvolution layer is handled by the private method __resize.


@classmethod
def auto_build(cls,
    conv_block: ConvNeuralBlock,
    in_channels: int,
    out_channels: int = None,
    activation: nn.Module = None) -> nn.Module:

  # Extract the parameters of the source convolutional block
  kernel_size, stride, padding, batch_norm, activation = \
      DeConvNeuralBlock.__resize(conv_block, activation)

  # Override the number of input_tensor channels
  # for this block if defined
  next_block_in_channels = in_channels \
      if in_channels is not None \
      else conv_block.out_channels

  # Override the number of output channels for
  # this block if specified
  next_block_out_channels = out_channels \
      if out_channels is not None \
      else conv_block.in_channels

  return cls(
        next_block_in_channels,
        next_block_out_channels,
        kernel_size,
        stride,
        padding,
        batch_norm,
        activation,
        False)

Sizing de-convolutional layers

The next task consists of computing the size of the component of the de-convolutional block from the original convolutional block. 

@staticmethod
def __resize(
  conv_block: ConvNeuralBlock,
  updated_activation: nn.Module) -> (int, int, int, bool, nn.Module):
  conv_modules = list(conv_block.modules)

  # [1] - Extract the various components of the
  #        convolutional neural block
  _, batch_norm, activation = DeConvNeuralBlock.__de_conv_modules(conv_modules)

  # [2] - Override the activation function for the
  #        output layer, if necessary
  if updated_activation is not None:
     activation = updated_activation

  # [3] - Compute the parameters for the de-convolutional
  #        layer, from the conv. block
  kernel_size, _ = conv_modules[0].kernel_size
  stride, _ = conv_modules[0].stride
  padding = conv_modules[0].padding

  return kernel_size, stride, padding, batch_norm, activation


The __resize method performs several functions: it retrieves the PyTorch modules for the de-convolutional layers from the initial convolutional block [#1], incorporates the activation function into the block [#2], and ultimately sets up the parameters for the de-convolutional layer [#3].

Additionally, there's a utility method named __de_conv_modules. This method is responsible for extracting the PyTorch modules associated with the convolutional layer, the batch normalization module, and the activation function for the de-convolution, all from the convolution's PyTorch modules.

@staticmethod
def __de_conv_modules(conv_modules: list) -> \
        (torch.nn.Module, torch.nn.Module, torch.nn.Module):

  activation_function = None
  deconv_layer = None
  batch_norm_module = None

  # 4- Extract the PyTorch de-convolutional modules 
  #     from the convolutional ones
  for conv_module in conv_modules:
     if DeConvNeuralBlock.__is_conv(conv_module):
         deconv_layer = conv_module
     elif DeConvNeuralBlock.__is_batch_norm(conv_module):
          batch_norm_module = conv_module
     elif DeConvNeuralBlock.__is_activation(conv_module):
        activation_function = conv_module

  return deconv_layer, batch_norm_module, activation_function



Convolutional layers

Given a kernel of size k, stride s and padding p (ignoring dilation), the width of the two dimension output data is \[ w_{out} = \left \lfloor \frac{w_{in}+2p-k}{s} \right \rfloor + 1 \] and the height of the two dimension output data is \[ h_{out} = \left \lfloor \frac{h_{in}+2p-k}{s} \right \rfloor + 1 \]

De-convolutional layers

As expected, the formula to compute the size of the output of a de-convolutional layer is the mirror image of the formula for the output size of the convolutional layer: \[ w_{out} = s(w_{in}-1)+k-2p \] and \[ h_{out} = s(h_{in}-1)+k-2p \]

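A quick numeric check of the two formulas, with illustrative values k=4, s=2, p=1 and an input height of 64:

h_in, k, s, p = 64, 4, 2, 1

# Convolution: h_out = floor((h_in + 2p - k)/s) + 1
h_conv = (h_in + 2*p - k) // s + 1         # 32

# De-convolution: h_out = s*(h_in - 1) + k - 2p
h_deconv = s*(h_conv - 1) + k - 2*p        # 64: the original size is restored
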
Assembling de-convolutional network

Finally, a de-convolutional model, categorized as DeConvModel, is constructed using a sequence of PyTorch modules, referred to as de_conv_model. The default constructor [#1] is used once more to establish the dimensions of the input layer [#2] and the output layer [#3]. It also loads the PyTorch modules, named de_conv_modules, for all the de-convolutional layers.

class DeConvModel(NeuralModel, ConvSizeParams):

  def __init__(self,            # [1] - Default constructor
           model_id: str,
           input_size: int,     # [2] - Size first layer
           output_size: int,    # [3] - Size output layer
           de_conv_modules: torch.nn.Sequential):
    super(DeConvModel, self).__init__(model_id)
    self.input_size = input_size
    self.output_size = output_size
    self.de_conv_modules = de_conv_modules


  @classmethod
  def build(cls,
      model_id: str,
      conv_neural_blocks: list,  # [4] - Input to the builder
      in_channels: int,
      out_channels: int = None,
      last_block_activation: torch.nn.Module = None) -> NeuralModel:

    de_conv_neural_blocks = []

    # [5] - Need to reverse the order of convolutional neural blocks
    conv_neural_blocks.reverse()

    # [6] - Traverse the list of convolutional neural blocks
    for idx in range(len(conv_neural_blocks)):
       conv_neural_block = conv_neural_blocks[idx]
       new_in_channels = None
       activation = None
       last_out_channels = None

       # [7] - Update num. input channels for the first
       # de-convolutional layer
       if idx == 0:
           new_in_channels = in_channels

       # [8] - Define, if necessary, the activation
       # function for the last layer
       elif idx == len(conv_neural_blocks) - 1:
          if last_block_activation is not None:
             activation = last_block_activation
          if out_channels is not None:
             last_out_channels = out_channels

       # [9] - Apply transposition to the convolutional block
       de_conv_neural_block = DeConvNeuralBlock.auto_build(
           conv_neural_block,
           new_in_channels,
           last_out_channels,
           activation)
       de_conv_neural_blocks.append(de_conv_neural_block)

    # [10] - Instantiate the de-convolutional network
    # from its neural blocks
    de_conv_model = DeConvModel.assemble(
        model_id,
        de_conv_neural_blocks)

    del de_conv_neural_blocks
    return de_conv_model


The alternative constructor named build is designed to generate and set up the de-convolutional model using the convolutional blocks, referred to as conv_neural_blocks [#4].

To align the order of de-convolutional layers correctly, it's necessary to reverse the sequence of convolutional blocks [#5]. For every block within the convolutional network [#6], this method adjusts the number of input channels to match the number of input channels in the first layer [#7].

It then updates the activation function for the final output layer [#8] and systematically integrates the de-convolutional blocks [#9]. Ultimately, the de-convolutional neural network is composed using these blocks [#10].

@classmethod
def assemble(cls, model_id: str, de_conv_neural_blocks: list):
   input_size = de_conv_neural_blocks[0].in_channels
   output_size = de_conv_neural_blocks[len(de_conv_neural_blocks)-1].out_channels

   # [11] - Generate the PyTorch de-convolutional modules used by the default constructor
   conv_modules = tuple([conv_module for conv_block in de_conv_neural_blocks
                         for conv_module in conv_block.modules
                         if conv_module is not None])
   de_conv_model = torch.nn.Sequential(*conv_modules)

   return cls(model_id, input_size, output_size, de_conv_model)

The assemble method is responsible for building the complete de-convolutional neural network. It does this by compiling the PyTorch modules from each of the blocks in de_conv_neural_blocks into a cohesive unit [#11].
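
An end-to-end usage sketch: generating the GAN generator from the discriminator blocks defined earlier; the latent dimension of 100 and the Tanh output activation are conventional but arbitrary choices.

generator = DeConvModel.build(
    model_id='generator',
    conv_neural_blocks=disc_blocks,
    in_channels=100,                   # Size of the latent noise vector
    out_channels=3,                    # Restore the 3 RGB channels
    last_block_activation=nn.Tanh())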

Thank you for reading this article. For more information ...

References

[2] A Gentle Introduction to Generative Adversarial Networks
[3] Deep Learning, Chap 9: Convolutional Networks, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, Cambridge MA, 2017


---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3