Efficiently Implementing Dilate and Erode Image Functions

2024-01-04 19:59

本文主要是介绍Efficiently Implementing Dilate and Erode Image Functions,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

http://blog.ostermiller.org/dilate-and-erode

Efficiently Implementing Dilate and Erode Image Functions


Dilate by One

Motivation

Dilate

Dilate is a function that accepts a black and white image. It is also known by the names “grow”, “bolden”, and “expand”. It turns on pixels which were near pixels that were on originally, thereby thickening the items in the image.

Erode

Erode is the sister function to dilate. It is also known by the name “shrink”. It turns off pixels which were near pixels that were off originally, thereby eating away at the edges of the items in the image.

Detecting Image “Snow”

Dilate and erode can be used in conjunction do detect and correct a common problem with digital and scanned images: snow. Snow is caused by bad pixels on the CCD of a digital camera or dust that gets onto a scanned image or negative. If we dilate and image, and then erode the image, we can see that holes in the image get filled in:

Original After Dilate After Dilate and Erode
0000000
0000000
0011100
0010100
0011100
0000000
0000000
0000000
0011100
0111110
0111110
0111110
0011100
0000000
0000000
0000000
0011100
0011100
0011100
0000000
0000000

A typical algorithm to correct snow would usually be something like this:

Image correctSnow(Image image, int radiusOfSnow){// Copy the image and convert the copy to black and whiteImage blackAndWhiteImage = threshold(copy(image), THRESHOLD_VALUE);// Copy the threshold and dilate and erode the copyImage dilatedAndEroded = erode(dilate(copy(blackAndWhiteImage), radiusOfSnow), radiusOfSnow);// Figure out which pixels were snow based on the difference// between the thresholded image and the dilated and eroded imageImage snowPixels = diff(blackAndWhiteImage, dilatedAndEroded);// Fix up any pixels in the original that were marked as snow by // filling them with color of surrounding pixelsreturn blendSnowPixels(image, snowPixels);
}

As we will see, dilate and erode can be efficient, no matter how big the snow that you want to remove is.

The Challenge

Write a function that dilates an image. Assume that the image can be of arbitrary size (n by n). An image is represented by a two dimensional array of integers. The image data is binary (zeros and ones), black and white, no color or gray scale. Dilate should turn on any pixel that is touching a pixel in the north, east, south, or west direction (no diagonals) that is already turned on in the input.

This is how dilation is supposed to behave (Flip a couple bits and the press dilate):

First Try

When first presented with this problem most people make a few mistake related to bounds checking (off by one errors and such). Those problems are easily corrected. However the most common incorrect algorithm is that it is easy to step on ones toes.

// Incorrect solution - steps on its own toes
int[][] dilate(int[][] image){for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 1){if (i>0) image[i-1][j] = 1;if (j>0) image[i][j-1] = 1;if (i+1<image.length) image[i+1][j] = 1;if (j+1<image[i].length) image[i][j+1] = 1;}}}return image;
}

Second Try

The easiest way to correct this is to create a copy of the image to work with a copy of the image. This is perfectly correct solution, however it uses O(n^2) extra space. Notice that the copy of the image has to be initialized to all zeros and that the current pixel has to be copied over as well as the surrounding pixels.

// Correct, but creates a copy of the image which is inefficient
int[][] dilate(int[][] image){int[][] imagecopy = new int[image.length][image[0].length];for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 1){imagecopy[i][j] = 1;if (i>0) imagecopy[i-1][j] = 1;if (j>0) imagecopy[i][j-1] = 1;if (i+1<image.length) imagecopy[i+1][j] = 1;if (j+1<image[i].length) imagecopy[i][j+1] = 1;}}}return imagecopy;
}

Best Solution

The best solution is to store information about what is newly turned on (vs was on originally) in the pixels themselves. This solution uses the value of 2 to indicate something was newly turned on. Notice that before a pixel is flipped to a two, it must be checked to ensure that it wasn’t a one first. Also notice that a second pass is made at the end to turn the twos back to ones. This does not effect the magnitude of the running time: The algorithm still runs in O(n^2) time.

// Best dilate by one solution
int[][] dilate(int[][] image){for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 1){if (i>0 && image[i-1][j]==0) image[i-1][j] = 2;if (j>0 && image[i][j-1]==0) image[i][j-1] = 2;if (i+1<image.length && image[i+1][j]==0) image[i+1][j] = 2;if (j+1<image[i].length && image[i][j+1]==0) image[i][j+1] = 2;}}}for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 2){image[i][j] = 1;}}}return image;
}

Dilate by K

Motivation

What we have written is dilate by one. This can be generalized into dilate by k. k will be defined in terms of Manhattan distance. In Manhattan, the distance between any two places is the number of blocks you have to walk to get there. You must travel in a stair step fashion as you cannot cut diagonally through buildings. The dilation by k will turn on all pixels that are within k Manhattan distance of a pixel that was on in the input.

Here is how dilation by k is supposed to work:

Dilate by: 

Nasty Solution

The temptation is to follow a similar algorithm to the one presented by dilate by one. In such a solution we would walk each point in the image. When a point that is on is found, travel k in each direction turning on pixels. While possible to write, such a solution would be fairly messy and run in O(n^2*k^2) time. Since k is bounded by n, that is really a O(n^4) solution. There are both much cleaner ways and much more efficient ways to write the algorithm.

// n^4 (very inefficient) solution for dilate by k
int[][] dilate(int[][] image, int k){for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 1){for (int l=i-k; l<=i+k; l++){int remainingk = k-Math.abs(i-l);for (int m=j-remainingk; m<=j+remainingk; m++){if (l>=0 && m>=0 && l<image.length && m<image.length && image[l][m]==0){image[l][m] = 2;}}}}}}for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 2){image[i][j] = 1;}}}return image;
}
Dilate by: 

Easy solution

If you pressed the dilate button more than once in the dilate by one examples, you may have noticed that dilating by one k times, is equivalent to dilating by k. This suggests the super easy algorithm of calling what was already written k times. Notice that this runs in O(n^2*k) time. Since k is bounded by n, that is really O(n^3) time.

// easy to write (but n^3 inefficient) dilate by k solution
int[][] dilate(int[][] image, int k){for (int i=0; i<k; i++){image = dilate(image);}return image;
}
Dilate by: 

Efficient Solution

To find an O(n^2) solution to this problem, the problem needs to be framed differently. What if there were an oracle that told us (for every pixel) the Manhattan distance to the nearest pixel. Pixels that are on would get a zero from the oracle. Pixels next to them a one, and so on.

The Manhattan distance oracle:

If we had that oracle we could just threshold by the Manhattan distance.

// n^2 solution with Manhattan oracle
int[][] dilate(int[][] image, int k){image = manhattan(image);for (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){image[i][j] = ((image[i][j]<=k)?1:0);}}return image;
}
Dilate by: 

Manhattan Oracle

In One Dimension

So the question just becomes how can we write the Manhattan oracle to run in O(n^2) time. The best way to approach this problem is to take it down a dimension. In a single dimensional array, for each point find the distance to the nearest “on” point.

There are two facts that help:

  1. The maximum distance to a pixel that is on is the length of the array
  2. A pixel is at most one further from a pixel that is on than the pixel before it

Those with eagle eyes will soon discover that if you walk the array, and find a pixel that is on, you can count up until you get halfway to the next pixel that is on. That suggests that a two pass solution: One pass to find the nearest pixel to the left and one pass to find the nearest pixel to the right. The distance to the nearest pixel will be the minimum of the pixel to the left and the pixel to the right. In cases in which the distance is unknown, it is safe to assume the maximum because that value will be corrected later.

// O(n) solution to find the distance to "on" pixels in a single dimension array
int[] distance(int[] arr){// traverse forwardsfor (int i=0; i<arr.length; i++){if (arr[i] == 1){// first pass and pixel was on, it gets a zeroarr[i] = 0;} else {// pixel was off// It is at most the length of array// away from a pixel that is onarr[i] = arr.length;// One more than the pixel to the leftif (i>0) arr[i] = Math.min(arr[i], arr[i-1]+1);}}// traverse backwardsfor (int i=arr.length-1; i>=0; i--){// what we had on the first pass// or one more than the pixel to the rightif (i+1<arr.length) arr[i] = Math.min(arr[i], arr[i+1]+1);}return arr;
}

In Two Dimensions

Generalizing from one dimension, we have a similar solution, but now have pixels on four sides of us instead of two sides to which to pay attention. Again, we can make two passes. On the first pass, look north and west, adding one to each (use the minimum). On the second pass, look south and east, adding one to each (use the minimum). The maximum Manhattan distance that a pixel can be away from a pixel that is “on” is the sum of the width and the height of the image.

// O(n^2) solution to find the Manhattan distance to "on" pixels in a two dimension array
int[][] manhattan(int[][] image){// traverse from top left to bottom rightfor (int i=0; i<image.length; i++){for (int j=0; j<image[i].length; j++){if (image[i][j] == 1){// first pass and pixel was on, it gets a zeroimage[i][j] = 0;} else {// pixel was off// It is at most the sum of the lengths of the array// away from a pixel that is onimage[i][j] = image.length + image[i].length;// or one more than the pixel to the northif (i>0) image[i][j] = Math.min(image[i][j], image[i-1][j]+1);// or one more than the pixel to the westif (j>0) image[i][j] = Math.min(image[i][j], image[i][j-1]+1);}}}// traverse from bottom right to top leftfor (int i=image.length-1; i>=0; i--){for (int j=image[i].length-1; j>=0; j--){// either what we had on the first pass// or one more than the pixel to the southif (i+1<image.length) image[i][j] = Math.min(image[i][j], image[i+1][j]+1);// or one more than the pixel to the eastif (j+1<image[i].length) image[i][j] = Math.min(image[i][j], image[i][j+1]+1);}}return image;
}


Leave a comment

这篇关于Efficiently Implementing Dilate and Erode Image Functions的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/570441

相关文章

lvgl8.3.6 控件垂直布局 label控件在image控件的下方显示

在使用 LVGL 8.3.6 创建一个垂直布局,其中 label 控件位于 image 控件下方,你可以使用 lv_obj_set_flex_flow 来设置布局为垂直,并确保 label 控件在 image 控件后添加。这里是如何步骤性地实现它的一个基本示例: 创建父容器:首先创建一个容器对象,该对象将作为布局的基础。设置容器为垂直布局:使用 lv_obj_set_flex_flow 设置容器

Flink实战案例(二十三):自定义时间和窗口的操作符(四)window functions之增量聚合函数(一)ReduceFunction

实例一 例子: 计算每个传感器15s窗口中的温度最小值 val minTempPerWindow = sensorData.map(r => (r.id, r.temperature)).keyBy(_._1).timeWindow(Time.seconds(15)).reduce((r1, r2) => (r1._1, r1._2.min(r2._2))) 实例二 ReduceFun

IMAGE LIST

   CImageList就是一个容器,用来存储图片资源,方便这些资源被CListBox,CComboBox,CComboBoxEx,CTabCtrl以及CTreeCtrl,CListCtrl等使用。      要使用CImgeList首先要使用它的create函数:      一般用的比较多的是这一个函数,当然,它还有很多重载,自己可以去翻阅msdn.       BOOL

【vscode】vscode paste image插件设置

本文首发于 ❄️慕雪的寒舍 vscode编辑md文件的时候,如果想插入图片,自带的粘贴只会粘贴到当前目录下,也没有文件重命名,很不友好。 在扩展商店里面有mushan的Paste Image插件,相比自带的,更加友好一点。但是它的配置把我弄糊涂了,简单测试了一下才明白处理的逻辑。 注意,本文编写的是对mushan的Paste Image插件的教程。 首先是安装这个插件,这个不多说

pytorch时空数据处理4——图像转文本/字幕Image-Captionning(二)

pytorch时空数据处理4——图像转文本/字幕Image-Captionning(二) pytorch时空数据处理4——图像转文本/字幕Image-Captionning(二)DatasetInputs to modelCaption LengthsData pipelineEncoderAttentionDecoder代码数据集初始化 create_input_files.py训练 tr

Show,Attend and Tell: Neural Image Caption Generation with Visual Attention

简单的翻译阅读了一下 Abstract 受机器翻译和对象检测领域最新工作的启发,我们引入了一种基于注意力的模型,该模型可以自动学习描述图像的内容。我们描述了如何使用标准的反向传播技术,以确定性的方式训练模型,并通过最大化变分下界随机地训练模型。我们还通过可视化展示了模型如何能够自动学习将注视固定在显着对象上,同时在输出序列中生成相应的单词。我们通过三个基准数据集(Flickr9k,Flickr

Docker Image 命令

文章目录 目录 文章目录 1 . Docker镜像是什么? 2 . 镜像命令详解 docker images docker tag docker pull docker rmi  docker save 总结 1 . Docker镜像是什么? Docker image 本质上是一个 read-only 只读文件, 这个文件包含了文件系统、 源码、库文件、依赖、工具等一些

flutter Image

Flutter中,Image是一个用于显示图片的控件,可以显示网络图片、本地图片以及Asset中的图片。Image控件支持多种常见的图片格式,例如PNG、JPEG、GIF等。 const Image({super.key,required this.image,this.frameBuilder,this.loadingBuilder,this.errorBuilder,this.seman

C#Bitmap和Image之间的关系

Image 类 Image 是一个抽象基类,它定义了所有图像类型的共同属性和方法。它提供了图像处理的通用接口,比如获取图像的尺寸、像素格式、帧数等。Image 类本身不能被实例化,它只是提供了一个通用的框架,具体的图像类型(如位图、图标、元文件等)需要通过继承 Image 类来实现。Image 类提供了一些通用的方法,如 Save(保存图像到文件)、GetThumbnailImage(获取图像的

Image Transformation can make Neural Networks more robust against Adversarial Examples

Image Transformation can make Neural Networks more robust against Adversarial Examples 创新点 1.旋转解决误分类 总结 可以说简单粗暴有效