Using convolutional autoencoders to reduce noise in images

Preface

I was busy at work this week and couldn't find the time to write the Attention translation I had planned; I will get to it later. This week, let's look at some simple, practical autoencoder code instead. I won't go into autoencoder theory in detail here, as plenty of material is available online. The simplest autoencoder just reproduces its input through an encoder and a decoder: we feed an image into the network, the encoder compresses it into a compact representation, and the decoder then decodes that representation to reproduce the original image.

An autoencoder is trained by minimizing the difference between its output and its input, that is, by making the output layer reproduce the original information as faithfully as possible. Because its basic form is so simple, many variants exist, including DAE, SDAE, and VAE; if you are interested, you can look them up online.

This article implements two demos. The first part builds a simple input-hidden-output autoencoder; the second part builds on it with a convolutional autoencoder for image denoising.

Tool Description

  • TensorFlow 1.0

  • Jupyter Notebook

  • Data: MNIST handwriting dataset

  • Complete code address: NELSONZHAO/zhihu

Part 1

First, we will implement the simplest AutoEncoder with the structure shown above.

Loading data

Here, we use the MNIST handwriting dataset for experiments. First, we need to import the data. TensorFlow has encapsulated this experimental dataset, so it is also very simple for us to use.

If you want to display the data in grayscale, use the code plt.imshow(img.reshape((28,28)), cmap='Greys_r').

We can load the dataset through input_data. If you already have the MNIST dataset (four compressed archives) locally, put them in the MNIST_data directory so that TensorFlow reads them directly instead of downloading them again. We can use imshow to view any image; since the loaded data has already been flattened into 784-dimensional vectors, it must be reshaped before display.

Build the model

After loading the data, we can start with the simplest model. Before that, let's get the size of the input data. Each loaded image is 28 x 28 pixels, which TensorFlow has already flattened into a 784-dimensional vector for us. We also need to specify the size of the hidden layer.

Here I set it to 64. The smaller hidden_units is, the more information is lost; you can try other sizes and compare the results.

AutoEncoder contains three layers: input, hidden and output.

In the hidden layer, we use ReLU as the activation function.

At this point, a simple AutoEncoder has been constructed. Next, we can start the TensorFlow graph for training.

Visualization of training results

After the above steps, we have constructed a simple AutoEncoder. Now we will visualize the results to see its performance.

Here, I selected 5 samples from the test dataset for visualization. Similarly, if you want to observe grayscale images, specify the cmap parameter as 'Greys_r'. The first row above is the original image in the test dataset, and the second row is the image after being reproduced by AutoEncoder. You can clearly see the loss of pixel information.

Similarly, we can also visualize the hidden layer compressed data, the results are as follows:

These five images are the hidden-layer compressed representations of the five test images above.

Through the simple example above, we understand the basic working principle of AutoEncoder. Next, we will further improve our model and convert the hidden layer into a convolutional layer to perform image denoising.

Some code is omitted in the above process. Please go to my GitHub to view the complete code.

Part 2

Based on the understanding of the working principle of AutoEncoder above, in this section we will add multiple convolutional layers to AutoEncoder to perform image denoising.

We still use the MNIST dataset for the experiment. The steps of data import are not described here. Please download the code to view it. Before we start, let's take a look at our entire model structure through a picture:

We input a noisy image into the model and give the model a noise-free image at the output, allowing the model to learn the denoising process through the convolutional autoencoder.

Input Layer

The input layer here differs from the one in the previous part because we are going to use convolution operations. The input should therefore be an image of shape height x width x depth. A typical RGB image has a depth of 3; the MNIST images here have a depth of only 1.

Encoder Convolutional Layer

On the Encoder side, we set up three convolutional layers, each followed by a pooling layer, to process the image.

In the first convolutional layer, we use 64 filters of size 3 x 3. The stride defaults to 1, and with padding set to 'same' the height and width are unchanged. So after the first convolutional layer, the data goes from the original 28 x 28 x 1 to 28 x 28 x 64.

Next, the convolution result is subjected to max pooling. Here, I set the size and stride to 2 x 2. The pooling operation does not change the depth of the convolution result, so the size after pooling is 14 x 14 x 64.

I will not go into detail about other convolutional layers. The activation function of all convolutional layers is ReLU.

After three layers of convolution and pooling operations, the conv3 we get is actually equivalent to the hidden layer of AutoEncoder in the previous part. The data of this layer has been compressed to a size of 4 x 4 x 32.

At this point, we have completed the convolution operation on the Encoder side, and the data dimension has changed from the initial 28 x 28 x 1 to 4 x 4 x 32.
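The shrinking spatial size can be checked with a little arithmetic: a 'same'-padded convolution preserves height and width, while each 2 x 2 pool with stride 2 halves them, rounding up (which is why 7 pools down to 4, not 3):

```python
import math

def pooled(size, stride=2):
    # 'same'-padded pooling output size: ceil(size / stride)
    return math.ceil(size / stride)

size = 28
for _ in range(3):          # conv keeps the size; each pooling halves it
    size = pooled(size)     # 28 -> 14 -> 7 -> 4
print(size)  # 4
```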

Decoder Convolutional Layer

Next, we move on to the convolutions on the Decoder side. Before that, some readers may ask: since the Encoder has already compressed the image down to 4 x 4 x 32, if we keep convolving in the Decoder, won't the data just keep getting smaller? That is why, on the Decoder side, we do not simply perform convolutions, but use a combination of Upsample (upsampling) plus convolution.

We know that a convolution operation scans the image patch by patch with a filter, takes a weighted sum of the pixels in each patch, and applies a nonlinearity. For example, if a patch in the original image is 3 x 3 (in plain terms, we take a 3 x 3 block of pixels from the picture) and we process it with a 3 x 3 filter, that patch becomes a single pixel after convolution. In Deconvolution (or transposed convolution) this process is reversed: a single pixel is expanded into a 3 x 3 block of pixels.

However, Deconvolution has a drawback: it causes checkerboard artifacts in the image, because the filters overlap heavily during the Deconvolution process. To solve this problem, it has been proposed to use Upsample plus convolutional layers instead.

There are two common ways of upsampling, one is nearest neighbor interpolation and the other is bilinear interpolation.
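Nearest neighbor interpolation is easy to see in plain numpy: every pixel simply becomes a factor x factor block of identical values (this helper is my own illustration, not the code used in the model):

```python
import numpy as np

def nearest_upsample(x, factor=2):
    # repeat rows, then columns: each pixel becomes a factor x factor block
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

a = np.array([[1, 2],
              [3, 4]])
print(nearest_upsample(a))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```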

This article will also use the Upsample plus convolution method to perform processing on the Decoder side.

TensorFlow also provides the Upsample operation. We use resize_nearest_neighbor to resize the Encoder's convolution result and then convolve again. After three Upsample steps, we obtain data of size 28 x 28 x 64. Finally, we apply one more convolution to bring this result back to the size of the original image.

Define loss and optimizer

We use cross-entropy as the loss function, and the optimizer's learning rate is 0.001.

Constructing Noise Data

Through the above steps, we have constructed the entire convolutional autoencoder model. Since we want to use this model to reduce the noise of the image, we also need to construct our noise data based on the original data before training.

Let's look at how to add noise through a simple example. We take an image img (a 784-dimensional vector) and add a noise factor multiplied by random numbers, which perturbs the pixels. Then, since each MNIST pixel value has been processed into a number between 0 and 1, we use numpy.clip to clip the noisy image so that every pixel stays between 0 and 1.

The operation of np.random.randn(*img.shape) is equivalent to np.random.randn(img.shape[0], img.shape[1])
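The noise-adding step can be sketched as follows (the noise factor of 0.5 and the random stand-in image are my choices for illustration):

```python
import numpy as np

noise_factor = 0.5
img = np.random.rand(784)   # stand-in for one flattened MNIST image in [0, 1]

# add scaled Gaussian noise, then clip every pixel back into [0, 1]
noisy_img = img + noise_factor * np.random.randn(*img.shape)
noisy_img = np.clip(noisy_img, 0., 1.)
```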

Let's take a look at the image comparison before and after adding noise.

Training the model

After introducing model construction and noise processing, we can then train our model.

When training the model, the input becomes the noise-added data, while the target is the original noise-free data. The main point is to reshape the original data into the same shape as inputs_. Because of the depth of the convolutional model, training is a bit slow, so a GPU is recommended.
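The reshape in question turns a batch of flat 784-vectors into the 28 x 28 x 1 format that the convolutional input placeholder expects (a dummy batch of 2 images here):

```python
import numpy as np

batch = np.random.rand(2, 784)          # stand-in for mnist.train.next_batch(...)
imgs = batch.reshape((-1, 28, 28, 1))   # -1 lets numpy infer the batch dimension
print(imgs.shape)  # (2, 28, 28, 1)
```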

Remember to close the session with sess.close() when training is done.

Results Visualization

After the long training above, our model is finally trained. Next, let's take a look at the effect of the model through visualization.

It can be seen that through the convolutional autoencoder, our noise reduction effect is still very good. The final generated image looks very smooth and the noise is almost invisible.

Some of you may think that we can also use the basic input-hidden-output structure of AutoEncoder to achieve noise reduction. Therefore, I also implemented a model that uses the simplest input-hidden-output structure for noise reduction training (the code is on my GitHub). Let's take a look at its results:

It can be seen that, compared with the convolutional autoencoder, its noise reduction effect is worse, and some noise shadows can still be seen in the reconstructed images.

Conclusion

So far, we have completed the basic version of the AutoEncoder model and added convolutional layers to perform image denoising. I hope you now have a preliminary understanding of AutoEncoders.

The complete code is available on my GitHub (NELSONZHAO/zhihu), which contains six files:

  • BasicAE, the basic version of AutoEncoder (including jupyter notebook and html files)

  • EasyDAE, a basic version of the denoising AutoEncoder (including jupyter notebook and html files)

  • ConvDAE, Convolutional Denoising AutoEncoder (including jupyter notebook and html files)

If you think it’s good, you can give my GitHub a star!
