Autoencoders

POULAMI BAKSHI
5 min readApr 20, 2021


A Beginner’s Guide to Autoencoders


Artificial Intelligence encompasses a wide range of technologies and techniques that enable computer systems to solve problems such as data compression, which is used in computer vision, computer networks, computer architecture, and many other fields. Autoencoders are unsupervised neural networks that use machine learning to perform this compression for us. This autoencoders tutorial will give you a complete insight into autoencoders in the following sequence:

  • What are Autoencoders?
  • The need for Autoencoders
  • Applications of Autoencoders
  • Architecture of Autoencoders
  • Types of Autoencoders

What are Autoencoders?

An autoencoder is an unsupervised neural network that applies backpropagation with the target values set equal to the inputs. Autoencoders are used to compress our inputs into a smaller representation; if the original data is needed, it can be reconstructed from that compressed representation.
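As a minimal sketch of this idea, here is a toy autoencoder trained with backpropagation in NumPy, with the target set equal to the input. The data, layer sizes, and learning rate are illustrative assumptions, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples of 8-dimensional inputs (hypothetical example data).
X = rng.normal(size=(200, 8))

# One hidden layer of 3 units acts as the compressed representation.
W1 = rng.normal(scale=0.1, size=(8, 3))
b1 = np.zeros(3)
W2 = rng.normal(scale=0.1, size=(3, 8))
b2 = np.zeros(8)

lr = 0.05
for epoch in range(500):
    # Forward pass: encode, then decode.
    H = np.tanh(X @ W1 + b1)        # the code (latent representation)
    X_hat = H @ W2 + b2             # the reconstruction
    err = X_hat - X                 # target values equal the inputs

    # Backpropagate the mean-squared reconstruction error.
    dW2 = H.T @ err / len(X)
    db2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1 - H ** 2)  # tanh derivative
    dW1 = X.T @ dH / len(X)
    db1 = dH.mean(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

mse = np.mean((X_hat - X) ** 2)
```

After training, the reconstruction error should be noticeably lower than predicting zero for every input, even though the 8-dimensional data was squeezed through a 3-unit code.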

A classical machine learning technique, Principal Component Analysis (PCA), performs a similar task. So you might be wondering: why do we need autoencoders at all?

The need for Autoencoders:

Autoencoders are preferred over PCA because:

  • An autoencoder can learn non-linear transformations with a non-linear activation function and multiple layers.
  • It doesn’t have to rely on dense layers alone; it can use convolutional layers, which are better suited to video, image, and time-series data.
  • It is more efficient to learn several layers with an autoencoder rather than learn one huge transformation with PCA.
  • An autoencoder provides a representation of each layer as the output.
  • It can make use of pre-trained layers from another model to apply transfer learning to enhance the encoder/decoder.

Applications of Autoencoders

Image Coloring

Autoencoders can be used to convert black-and-white pictures into colored images. Based on what is in the picture, the network can infer what the colors should be.

Feature variation

It extracts only the required features of an image and generates the output after removing noise and other unnecessary information.

Dimensionality Reduction

The reconstructed image resembles the input but with reduced dimensionality, yielding a similar image represented with far fewer values.

Denoising Image

The input seen by the autoencoder is not the raw input but a stochastically corrupted version. A denoising autoencoder is thus trained to reconstruct the original input from the noisy version.
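The corruption step can be sketched in a few lines; both the noise levels and the data below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 16))   # clean inputs (hypothetical data)

# Stochastic corruption, option 1: additive Gaussian noise.
X_noisy = X + rng.normal(scale=0.2, size=X.shape)

# Option 2: masking noise, zeroing out ~30% of the entries at random.
mask = rng.random(X.shape) > 0.3
X_masked = X * mask

# The denoising autoencoder takes the corrupted version as input but is
# trained to reconstruct the clean X, e.g.:
#   loss = mean((decoder(encoder(X_noisy)) - X) ** 2)
```

Because the target is the clean input rather than the corrupted one, the network cannot simply learn the identity mapping; it must learn structure that survives the noise.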

Watermark Removal

It is also used for removing watermarks from images or to remove any object while filming a video or a movie.

Now that you have an idea of the different industrial applications of Autoencoders, let’s continue our Autoencoders Tutorial Blog and understand the complex architecture of Autoencoders.

Architecture of Autoencoders

An autoencoder consists of three layers:

  1. Encoder
  2. Code
  3. Decoder
  • Encoder: This part of the network compresses the input into a latent space representation. The encoder layer encodes the input image as a compressed representation in a reduced dimension. The compressed image is the distorted version of the original image.
  • Code: This part of the network represents the compressed input that is fed to the decoder.
  • Decoder: This layer decodes the encoded image back to the original dimension. The decoded image is a lossy reconstruction of the original image and it is reconstructed from the latent space representation.

The layer between the encoder and decoder, i.e., the code, is also known as the bottleneck. It is a well-designed way of deciding which aspects of the observed data are relevant information and which can be discarded. It does this by balancing two criteria:

  • The compactness of the representation, measured by its compressibility.
  • The retention of behaviorally relevant variables from the input.
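A single forward pass through the three parts can be sketched as follows; the dimensions and random weights here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

input_dim, code_dim = 784, 32   # e.g. a flattened 28x28 image (hypothetical sizes)
x = rng.random(input_dim)

# Encoder: compress the input into the bottleneck (the "code").
W_enc = rng.normal(scale=0.01, size=(input_dim, code_dim))
code = np.tanh(x @ W_enc)

# Decoder: expand the code back to the original dimension;
# the result is a lossy reconstruction of the input.
W_dec = rng.normal(scale=0.01, size=(code_dim, input_dim))
x_hat = code @ W_dec
```

The code is much smaller than the input (32 values versus 784), which is exactly where the compression happens.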

Types of Autoencoders

Convolution Autoencoders

Autoencoders in their traditional formulation do not take into account the fact that a signal can be seen as a sum of other signals. Convolutional autoencoders use the convolution operator to exploit this observation. They learn to encode the input as a set of simple signals and then try to reconstruct the input from them, modifying the geometry or the reflectance of the image.

Use cases of CAE:

  • Image Reconstruction
  • Image Colorization
  • Latent space clustering
  • Generating higher-resolution images

Sparse Autoencoders

Sparse autoencoders offer us an alternative method for introducing an information bottleneck without requiring a reduction in the number of nodes at our hidden layers. Instead, we’ll construct our loss function such that we penalize activations within a layer.
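One common way to penalize activations is an L1 term added to the reconstruction loss. A minimal sketch, assuming a mean-squared reconstruction error and a hypothetical penalty weight `lam`:

```python
import numpy as np

def sparse_loss(x, x_hat, h, lam=1e-3):
    """Reconstruction error plus an L1 penalty on the code activations.

    The penalty pushes most hidden units toward zero, so the bottleneck
    comes from sparsity rather than from a narrow layer.
    """
    reconstruction = np.mean((x_hat - x) ** 2)
    sparsity = lam * np.mean(np.abs(h))
    return reconstruction + sparsity

# Hypothetical activations from a wide (64-unit) hidden layer.
rng = np.random.default_rng(3)
x = rng.random(10)
h = rng.random(64)

# With a perfect reconstruction, only the sparsity penalty remains.
loss = sparse_loss(x, x, h)
```

Because the penalty applies per activation, the hidden layer can be much wider than the input while still forming an information bottleneck.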

Deep Autoencoders

The extension of the simple Autoencoder is the Deep Autoencoder. The first layer of the Deep Autoencoder is used for first-order features in the raw input. The second layer is used for second-order features corresponding to patterns in the appearance of first-order features. Deeper layers of the Deep Autoencoder tend to learn even higher-order features.

A deep autoencoder is composed of two symmetrical deep-belief networks:

  1. The first four or five shallow layers represent the encoding half of the net.
  2. The second set of four or five layers makes up the decoding half.
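The symmetric layout can be expressed as a simple list of layer widths; the specific sizes below are hypothetical, chosen only to illustrate the mirror-image structure:

```python
# Four encoding layers narrowing down to a small code, mirrored by the decoder.
encoder_dims = [784, 1000, 500, 250, 30]   # 30-unit code (hypothetical sizes)
decoder_dims = encoder_dims[::-1]          # mirror image: 30 -> ... -> 784

# Full stack, sharing the code layer between the two halves.
layer_dims = encoder_dims + decoder_dims[1:]
```

Note how the narrowest layer sits exactly in the middle, and the input and output widths match so the reconstruction has the same shape as the input.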

Use cases of Deep Autoencoders

  • Image Search
  • Data Compression
  • Topic Modeling & Information Retrieval (IR)

Contractive Autoencoders

A contractive autoencoder is an unsupervised deep learning technique that helps a neural network encode unlabeled training data. This is accomplished by constructing a loss term that penalizes large derivatives of our hidden layer activations with respect to the input training examples, essentially penalizing instances where a small change in the input leads to a large change in the encoding space.
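For a sigmoid hidden layer h = sigmoid(Wx + b), this penalty is the squared Frobenius norm of the Jacobian of h with respect to x, which factorizes into a simple closed form. A minimal sketch, with illustrative sizes and random weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contractive_penalty(x, W, b):
    """Squared Frobenius norm of the Jacobian dh/dx for h = sigmoid(W @ x + b).

    For a sigmoid layer the Jacobian is diag(h * (1 - h)) @ W, so its
    squared Frobenius norm reduces to the weighted row sums below.
    """
    h = sigmoid(W @ x + b)
    return np.sum((h * (1 - h)) ** 2 * np.sum(W ** 2, axis=1))

rng = np.random.default_rng(4)
W = rng.normal(size=(5, 10))   # 10-dim input, 5 hidden units (hypothetical sizes)
b = np.zeros(5)
x = rng.random(10)

penalty = contractive_penalty(x, W, b)
```

Adding this term to the reconstruction loss discourages encodings that change sharply when the input is perturbed slightly.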

Now that you have an idea of what autoencoders are, their different types, and their properties, let’s move ahead with our tutorial and walk through a simple implementation using TensorFlow in Python.
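A minimal version of such an implementation might look like the following Keras sketch. It assumes TensorFlow is installed; the layer sizes and the random training data are illustrative assumptions, not the article's own code:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

input_dim, code_dim = 64, 8   # hypothetical sizes for a small demo

# Encoder: compress the input down to the bottleneck code.
encoder = keras.Sequential([
    keras.Input(shape=(input_dim,)),
    keras.layers.Dense(code_dim, activation="relu"),
])

# Decoder: reconstruct the input from the code.
decoder = keras.Sequential([
    keras.Input(shape=(code_dim,)),
    keras.layers.Dense(input_dim, activation="linear"),
])

autoencoder = keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")

# Train with the target set equal to the input.
X = np.random.rand(256, input_dim).astype("float32")
autoencoder.fit(X, X, epochs=2, batch_size=32, verbose=0)

X_hat = autoencoder.predict(X, verbose=0)
```

Swapping the `Dense` layers for `Conv2D`/`Conv2DTranspose` layers would turn this into a convolutional autoencoder along the lines described earlier.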
