Time for action – using dilation and erosion to refine ROIs

Since holiday photographs are a common target for image enhancement applications, we'll use one for our example. It shows three large rocks in the sea, and our goal is to come up with a mask that includes just them. Let's start with our usual steps:

  1. As always, we'll need to load our image into MATLAB, only now we will also have to convert it to grayscale:
    >> img = imread('3Rocks.jpg');
    >> img = rgb2gray(img);
  2. Now that our image is loaded and transformed to grayscale, let's show it to get a better idea of our goal:
    >> imshow(img);
  3. Let's now have a go at thresholding the image. Let's set our threshold to 30, since the rocks are dark. This time, the threshold denotes the maximum value kept, meaning we will ask MATLAB to make a mask containing only the pixels with values below 30, that is, set the pixels of the image with values below 30 equal to 1 (white), and the rest to 0 (black):
    >> img_bin = img < 30;
    >> figure,imshow(img_bin)
  4. We can see that we have two problems; one is the inclusion of other dark objects in the scene (such as people's heads) and the other is the suboptimal selection of the rocks. First, let's take advantage of the fact that most of the unwanted areas lie at the bottom of the image. Using the data cursor, we can see that row 705 can be used as a lower limit for the mask. So, we can set all pixels under that row to 0:
    >> img_bin(706:end,:) = 0;
    >> imshow(img_bin);
  5. Now, we must do something to eliminate some sparse white dots that shouldn't be included in the mask. A possible solution is to perform binary erosion using a small rectangular structuring element. Let's do that, applying a 2x2 structuring element with all pixels set to 1:
    >> img_bin_clean = imerode(img_bin,ones(2));
  6. Finally, we will perform dilation with a 70x70 structuring element, with all pixels set to 1 and show the final mask:
    >> mask = imdilate(img_bin_clean, ones(70));
    >> figure,
    >> subplot(1,2,1)
    >> imshow(img_bin_clean);title('Image after erosion');
    >> subplot(1,2,2),imshow(mask);title('Image after dilation');
  7. Now, let's try to erase the rocks. The result will not be optimal, but it will be interesting for comprehending what masking is. We will be using the color of the sky, so we should use the data cursor on the sky to get some sample values of the brightness. A better idea is to use our imtool, to observe entire neighborhoods. Let's do that:
    >> imtool(img);
  8. We observe that a good choice could be 147, since it is a value repeated a lot near the left rock.
  9. Having decided the value we want under our mask, let's try our disappearing act:
    >> img_proc = img;
    >> img_proc(mask) = 147;
    >> subplot(1,2,1),imshow(img),title('Original image')
    >> subplot(1,2,2),imshow(img_proc),title('Processed image')
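
If 3Rocks.jpg is not at hand, the same sequence of steps can be tried on a small synthetic image. The following is only a sketch under stated assumptions: the 40x40 scene, the row limit of 35, and the 5x5 dilation element are made-up values chosen to fit the toy image, not the ones used in the steps above.

```matlab
% Synthetic grayscale scene (all values are assumptions for illustration):
% bright "sky" of 147, one dark "rock", one noisy dark pixel, and a dark
% band at the bottom standing in for the unwanted objects.
img = uint8(147 * ones(40));
img(15:24, 10:19) = 10;     % the "rock": a 10x10 dark block
img(5, 30) = 10;            % an isolated dark pixel (noise)
img(38:40, :) = 5;          % dark band at the bottom of the image

img_bin = img < 30;                          % keep only the dark pixels
img_bin(36:end, :) = 0;                      % discard everything below row 35
img_bin_clean = imerode(img_bin, ones(2));   % removes the isolated pixel
mask = imdilate(img_bin_clean, ones(5));     % regrow the region over the rock

img_proc = img;
img_proc(mask) = 147;                        % paint the masked area with the sky value

subplot(1,2,1), imshow(img), title('Original image')
subplot(1,2,2), imshow(img_proc), title('Processed image')
```

In this toy case, the rock disappears completely, while the isolated dark pixel and the bottom band stay in the processed image because they are not part of the final mask.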

What just happened?

This example covered both dilation and erosion, combining them with techniques learned earlier. We used a user-defined threshold to acquire a first mask for our image (after we converted it from color to grayscale). Then, we cleaned the mask of unwanted spots, first taking advantage of their distinct location and then wrapping up the cleaning process with an erosion step to eliminate small white spots. To complete the ROIs covering the three rocks, we then performed image dilation with a square structuring element sized 70x70 pixels, all equal to 1. The structuring elements were created using MATLAB's ones function, which returns a matrix with all elements equal to 1. When the function is called with only one input, N, the output is a square matrix with NxN elements. To better understand this, let's see the result of a 3x3 matrix generated this way:

>> ones(3)

The output of the previous command is as follows:

ans =
     1     1     1
     1     1     1
     1     1     1

After creating our mask, we applied a patching-up process like the one described in the previous section. This time, our goal was to erase the rocks from the picture, replacing their pixels' values with one that is descriptive of the sky. Of course, using just one brightness value for such big areas gives a flat result, which is less subtle than we would like. However, the main goal of erasing the rocks was achieved to a good extent.

Tip

The use of imerode to eliminate small objects from our mask is not always a good idea, since it affects all binary objects in the image. For this example, we used it in conjunction with imdilate. A better choice for such tasks would be to use the bwareaopen function, which removes connected objects that have fewer pixels than a given threshold and leaves the remaining objects untouched. In the preceding example, to eliminate objects smaller than 6 pixels, we would replace the step img_bin_clean = imerode(img_bin, ones(2)); with img_bin_clean = bwareaopen(img_bin, 6);.
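
To see the difference between the two approaches, we can compare them on a tiny binary image. This is a minimal sketch; the 20x20 test mask and its three objects are made up for illustration.

```matlab
% Hypothetical test mask: one large object and two small specks.
bw = false(20);
bw(5:12, 5:12) = true;      % 8x8 object we want to keep (64 pixels)
bw(2, 17) = true;           % isolated pixel
bw(17, 3:4) = true;         % 2-pixel speck

eroded = imerode(bw, ones(2));   % removes the specks, but also shrinks the object
opened = bwareaopen(bw, 6);      % removes only objects with fewer than 6 pixels

fprintf('original: %d, after imerode: %d, after bwareaopen: %d\n', ...
        nnz(bw), nnz(eroded), nnz(opened));
```

Both functions get rid of the specks, but imerode also trims the large object from 64 to 49 pixels, whereas bwareaopen keeps all 64 of them.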

Choosing a structuring element

We mentioned earlier the usage of structuring elements and the two rules we must follow when choosing them. However, in our example of dilation and erosion, we used a rather simple rectangular structuring element, consisting of instances of 1. Is there a better choice? The answer is yes. The objects we want to mask are not rectangular, so the best choice is definitely not a rectangular structuring element. However, we can observe that the three rocks are not similar. The two rocks at the sides could be thought of as similar, but they have opposite orientations (that is, they look like they are mirrored). The shape of the small rock in the middle does not resemble the others. All these facts lead us to the conclusion that more than one structuring element should be used. However, we fall right into the next problem; how will we use different structuring elements for different areas? For this, we will recollect a technique we used in the previous chapter.

But first things first; we should start by choosing the ideal structuring element for each rock. As you may already have understood from the results of the previous example, the sides of the rocks that are attached to the left and right image borders remain almost untouched. Their only alteration after imdilate is that they have been expanded at the top and bottom. The middle rock has expanded in all directions after dilation.

To make this more obvious, let's use a basic technique in binary image processing, which is image subtraction. If we subtract two binary images and observe the result, we will see which pixels have a different value in the two images. In our example, we will see which pixels were set to 1 after the dilation process, if we subtract the mask before the dilation from the final mask (the code below uses the minus operator rather than the imsubtract function) and show the pixels that are positive:

>> Z = mask - img_bin;
>> figure,imshow(Z)
>> subplot(1,3,1),imshow(img_bin),title('Mask before dilation')
>> subplot(1,3,2),imshow(mask),title('Mask after dilation')
>> subplot(1,3,3),imshow(Z),title('Pixels set to 1 after dilation')
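
A detail worth keeping in mind is that subtracting two logical masks with the minus operator returns a matrix of type double, whose values can be -1, 0, or 1. The following sketch, which uses two small made-up masks, shows this and also an equivalent way to get the newly added pixels with logical operators.

```matlab
% Two hypothetical 3x4 masks standing in for the masks before and after dilation.
before = logical([0 0 0 0; 0 1 1 0; 0 0 0 1]);
after  = logical([1 1 1 1; 1 1 1 1; 0 0 0 0]);

Z = after - before;          % double matrix with values -1, 0, or 1
added   = after & ~before;   % pixels that are 1 only in the second mask
removed = before & ~after;   % pixels that are 1 only in the first mask

disp(class(Z))               % class of the subtraction result
disp(isequal(added, Z > 0))  % the positive part of Z matches 'added'
```

When Z is passed to imshow, the negative values are displayed as black, so only the pixels that were added appear white.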

To eliminate unwanted dilation in a specific direction, we should be more careful about the structuring element we will use. The goal is to produce a structuring element that only expands our ROI in the desired directions. To achieve this, the structuring element should have instances of 1 in the pixels facing in the desired directions and instances of 0 in the rest of the pixels. One way to achieve this is by manually initializing the pixels of a matrix to fit our needs. Another way is to use a structuring element provided by MATLAB's strel function as a starting point and alter it to fit our needs.
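
As a rough illustration of the first approach, the sketch below builds a small column-shaped structuring element by hand, so that dilation grows an object upward only. The 9x9 test image and the element itself are made up for this example, and the sketch assumes that imdilate places the origin at the center element of the neighborhood, which is its documented default.

```matlab
% Hypothetical test image: a short horizontal bar on row 5.
bw = false(9);
bw(5, 4:6) = true;

% 5x1 element; the origin is the middle (third) entry. The ones sit at the
% origin and above it, so the object should expand toward smaller row indices.
se_up = [1; 1; 1; 0; 0];
grown = imdilate(bw, se_up);

[rows, ~] = find(grown);
disp(unique(rows)')          % rows occupied after the dilation
```

Here the bar that originally sat on row 5 ends up covering rows 3 to 5, while nothing below row 5 is switched on.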

Using strel to generate structuring elements

The ready-made strel function, provided by MATLAB's Image Processing Toolbox, offers various types of structuring elements. The supported shapes that can be used in the problem we examine are square, rectangle, disk, octagon, diamond, line, and arbitrary. More information can be obtained by typing help strel in the command line. For the time being, we shall just see some of them, by typing in the following lines:

>> se1 = strel('square',10); % 10x10 square
>> se2 = strel('rectangle',[12,8]); % 12x8 rectangle
>> se3 = strel('line',10,45); % line, length 10, angle 45 degrees
>> se4 = strel('disk',10); % disk, radius 10
>> se5 = strel('octagon',12);  % octagon, size 12 (must be a multiple of 3)
>> se6 = strel('diamond',10); % diamond, size 10
>> subplot(2,3,1),imshow(getnhood(se1)),title('Square')
>> subplot(2,3,2),imshow(getnhood(se2)),title('Rectangle')
>> subplot(2,3,3),imshow(getnhood(se3)),title('Line')
>> subplot(2,3,4),imshow(getnhood(se4)),title('Disk')
>> subplot(2,3,5),imshow(getnhood(se5)),title('Octagon')
>> subplot(2,3,6),imshow(getnhood(se6)),title('Diamond')

As you may have noticed in the Workspace window, the structuring elements are not saved as matrices, but as a special type, called strel. This is why we use the getnhood function: it returns the neighborhood of a structuring element as a matrix, which can then be processed and displayed in the ways presented so far.
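
The matrix returned by getnhood can also serve as a starting point for a custom structuring element. The following is a minimal sketch of that idea, not a recipe for the rocks example: it takes a diamond of size 3, clears the rows below its center, and passes the altered matrix directly to imdilate. The single-pixel test image is made up for illustration.

```matlab
se = strel('diamond', 3);
nhood = getnhood(se);        % 7x7 logical matrix describing the diamond
disp(class(se))              % the structuring element object itself
disp(class(nhood))           % its neighborhood, as returned by getnhood

nhood_up = nhood;
nhood_up(5:end, :) = false;  % keep only the center row and the rows above it

% Dilating a single pixel stamps the neighborhood around it, which makes
% the effect of the altered element easy to inspect.
bw = false(15);
bw(8, 8) = true;
grown = imdilate(bw, nhood_up);

subplot(1,2,1), imshow(nhood), title('Diamond')
subplot(1,2,2), imshow(grown), title('Pixel dilated with upper half')
```

Since the lower rows of the neighborhood were cleared, the dilated pixel grows upward and sideways, but not downward.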