How to detect a Christmas tree?

Updated: 2024-10-08 00:35:23

Problem description

Which image processing techniques could be used to implement an application that detects the Christmas trees displayed in the following images?

I'm searching for solutions that are going to work on all these images. Therefore, approaches that require training haar cascade classifiers or template matching are not very interesting.

I'm looking for something that can be written in any programming language, as long as it uses only Open Source technologies. The solution must be tested with the images that are shared on this question. There are 6 input images and the answer should display the results of processing each of them. Finally, for each output image there must be red lines drawn to surround the detected tree.

How would you go about programmatically detecting the trees in these images?

Solution

I have an approach which I think is interesting and a bit different from the rest. The main difference in my approach, compared to some of the others, is in how the image segmentation step is performed--I used the DBSCAN clustering algorithm from Python's scikit-learn; it's optimized for finding somewhat amorphous shapes that may not necessarily have a single clear centroid.

At the top level, my approach is fairly simple and can be broken down into about 3 steps. First I apply a threshold (or actually, the logical "or" of two separate and distinct thresholds). As with many of the other answers, I assumed that the Christmas tree would be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test; any pixels with values above 220 on a 0-255 scale (where black is 0 and white is 255) are saved to a binary black-and-white image. The second threshold tries to look for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images, and stand out well against the blue-green background which is prevalent in most of the photos. I convert the rgb image to hsv space, and require that the hue is either less than 0.2 on a 0.0-1.0 scale (corresponding roughly to the border between yellow and green) or greater than 0.95 (corresponding to the border between purple and red) and additionally I require bright, saturated colors: saturation and value must both be above 0.7. The results of the two threshold procedures are logically "or"-ed together, and the resulting matrix of black-and-white binary images is shown below:

You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree, plus a few of the images also have some other small clusters corresponding either to lights in the windows of some of the buildings, or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and label each pixel correctly with a cluster membership ID number.
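To make the thresholding step concrete before moving on, here is a minimal per-pixel sketch of the two tests and their logical "or". It is not the code from this answer: it uses the standard-library `colorsys` module instead of numpy, the function name `passes_threshold` is made up for illustration, and the grayscale value is approximated with the ITU-R 601 luma weights that Pillow documents for `convert('L')`.

```python
import colorsys


def passes_threshold(r, g, b, hueleftthr=0.2, huerightthr=0.95,
                     satthr=0.7, valthr=0.7, monothr=220):
    """Return True if an 8-bit RGB pixel passes either threshold test."""
    # Test 1: monochrome brightness above monothr on a 0-255 scale
    # (assumed luma weights: 0.299 R + 0.587 G + 0.114 B).
    gray = (299 * r + 587 * g + 114 * b) / 1000.0
    bright = gray > monothr

    # Test 2: red/yellow hue that is also saturated and bright.
    # colorsys works on floats in the range 0.0-1.0.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    warm = (h < hueleftthr or h > huerightthr) and s > satthr and v > valthr

    # The two tests are combined with a logical "or".
    return bright or warm


# A few hand-picked pixels (illustrative values, not taken from the images)
white_light = passes_threshold(255, 255, 255)   # passes the brightness test
red_light = passes_threshold(255, 0, 0)         # passes the hue test
blue_sky = passes_threshold(0, 0, 255)          # fails both tests
```

A pure red pixel is too dark to pass the brightness test on its own, which is why the second, hue-based test is needed for the colored lights.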

For this task I chose DBSCAN. There is a pretty good visual comparison of how DBSCAN typically behaves, relative to other clustering algorithms, available here. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown here:

There are a few things to be aware of when looking at this result. First is that DBSCAN requires the user to set a "proximity" parameter in order to regulate its behavior, which effectively controls how separated a pair of points must be in order for the algorithm to declare a new separate cluster rather than agglomerating a test point onto an already pre-existing cluster. I set this value to be 0.04 times the size along the diagonal of each image. Since the images vary in size from roughly VGA up to about HD 1080, this type of scale-relative definition is critical.
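As a rough illustration of the proximity parameter, the sketch below converts the 0.04 fraction into pixels and runs a deliberately naive, dependency-free DBSCAN on a toy point set. This is an assumption-laden teaching version, not scikit-learn's implementation: `pixel_eps` and `dbscan` are hypothetical helper names, the neighbor search is a brute-force O(n²) scan, and the point counts a point as its own neighbor when comparing against `min_samples`.

```python
from math import sqrt


def pixel_eps(height, width, proxthresh=0.04):
    """Convert a diagonal-relative proximity threshold into pixels."""
    return proxthresh * sqrt(height ** 2 + width ** 2)


def dbscan(points, eps, min_samples):
    """Label each (x, y) point with a cluster id, or -1 for noise."""
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)      # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1             # provisional noise
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster    # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbors(j)
            if len(nearby) >= min_samples:
                queue.extend(nearby)   # j is also a core point: keep growing
    return labels


# Two well-separated 3x3 blobs plus one isolated point
blob_a = [(x, y) for x in range(3) for y in range(3)]
blob_b = [(x + 20, y + 20) for x in range(3) for y in range(3)]
demo_labels = dbscan(blob_a + blob_b + [(50, 0)], eps=1.5, min_samples=4)
```

With these definitions, a 640x480 image has an 800-pixel diagonal and therefore a proximity threshold of about 32 pixels, while a 1920x1080 image gets roughly 88 pixels.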

Another point worth noting is that the DBSCAN algorithm as it is implemented in scikit-learn has memory limits which are fairly challenging for some of the larger images in this sample. Therefore, for a few of the larger images, I actually had to "decimate" (i.e., retain only every 3rd or 4th pixel and drop the others) each cluster in order to stay within this limit. As a result of this culling process, the remaining individual sparse pixels are difficult to see on some of the larger images. Therefore, for display purposes only, the color-coded pixels in the above images have been effectively "dilated" just slightly so that they stand out better. It's purely a cosmetic operation for the sake of the narrative; although there are comments mentioning this dilation in my code, rest assured that it has nothing to do with any calculations that actually matter.
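The decimation step amounts to a strided slice. The following minimal sketch shows the arithmetic on a plain Python list; `decimate` is a hypothetical helper name, and the default of 5000 points mirrors the `maxpoints` default used in the code further down.

```python
from math import ceil


def decimate(points, maxpoints=5000):
    """Keep every step-th point so that at most maxpoints remain."""
    n = len(points)
    if n <= maxpoints:
        return points
    # Smallest integer stride that brings the count down to maxpoints or fewer
    step = int(ceil(float(n) / maxpoints))
    return points[::step]


# 12000 thresholded pixels need a stride of 3, leaving 4000 points
kept = decimate(list(range(12000)))
```

Because the stride is rounded up, the retained count can fall noticeably below the limit, as in the 12000-point case above.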

Once the clusters are identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure "size" in terms of the total number of member pixels, although one could have just as easily instead used some type of metric that gauges physical extent) and compute the convex hull for that cluster. The convex hull then becomes the tree border. The six convex hulls computed via this method are shown below in red:
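For readers who want to see this last step in isolation, here is a small dependency-free sketch. It is an illustration under stated assumptions rather than the code used for the results: `largest_cluster` and `convex_hull` are hypothetical helpers, noise points (label -1) are ignored when counting, and the hull is computed with Andrew's monotone chain algorithm instead of `scipy.spatial.ConvexHull`.

```python
from collections import Counter


def largest_cluster(labels):
    """Return the label with the most member points, ignoring noise (-1)."""
    counts = Counter(lbl for lbl in labels if lbl != -1)
    return counts.most_common(1)[0][0] if counts else None


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half
    return lower[:-1] + upper[:-1]


# Toy data: cluster 0 is a square with two interior points, cluster 1 is small
pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (9, 9), (9, 8)]
lbls = [0, 0, 0, 0, 0, 0, 1, -1]
best = largest_cluster(lbls)
hull = convex_hull([p for p, lbl in zip(pts, lbls) if lbl == best])
```

Joining consecutive hull vertices, and the last vertex back to the first, gives the closed border that would be drawn in red.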

The source code is written for Python 2.7.6 and it depends on numpy, scipy, matplotlib and scikit-learn. I've divided it into two parts. The first part is responsible for the actual image processing:

```python
from math import ceil, sqrt

import numpy as np
import matplotlib.colors as colors
from PIL import Image
from scipy.spatial import ConvexHull
from sklearn.cluster import DBSCAN


def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7,
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):
    """
    Inputs:
        rgbimg:      [M,N,3] numpy array containing (uint, 0-255) color image
        hueleftthr:  Scalar constant to select maximum allowed hue in the
                     yellow-green region
        huerightthr: Scalar constant to select minimum allowed hue in the
                     purple-red region
        satthr:      Scalar constant to select minimum allowed saturation
        valthr:      Scalar constant to select minimum allowed value
        monothr:     Scalar constant to select minimum allowed monochrome
                     brightness
        maxpoints:   Scalar constant maximum number of pixels to forward to
                     the DBSCAN clustering algorithm
        proxthresh:  Proximity threshold to use for DBSCAN, as a fraction of
                     the diagonal size of the image

    Outputs:
        borderseg:   [K,2,2] Nested list containing K pairs of x- and y-
                     pixel values for drawing the tree border
        X:           [P,2] List of pixels that passed the threshold step
        labels:      [Q] List of cluster labels for points in Xslice
                     (see below)
        Xslice:      [Q,2] Reduced list of pixels to be passed to DBSCAN
    """
    # Convert rgb image to monochrome for the brightness threshold
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float) / 255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
        np.logical_and(
            np.logical_or((hsvimg[:, :, 0] < hueleftthr),
                          (hsvimg[:, :, 0] > huerightthr)),
            (hsvimg[:, :, 1] > satthr)),
        (hsvimg[:, :, 2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[np.where(boolidx)] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[np.where(gryimg > monothr)] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        step = int(ceil(float(nsample) / maxpoints))
        Xslice = X[::step]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull
    unique_labels = set(labels)
    maxclustpt = 0
    borderseg = []  # stays empty if DBSCAN finds no cluster at all
    for k in unique_labels:
        if k == -1:
            # Label -1 marks DBSCAN noise points, which are not a cluster
            continue
        class_members = np.flatnonzero(labels == k)
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex, 0], points[simplex, 1]]
                         for simplex in hull.simplices]

    return borderseg, X, labels, Xslice
```

and the second part is a user-level script which calls the first file and generates all of the plots above:

```python
#!/usr/bin/env python
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16, 7)
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust = plt.figure(figsize=fgsz, facecolor='w')
figcltwo = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
# The window title is set through the figure manager
figthresh.canvas.manager.set_window_title(
    'Thresholded HSV and Monochrome Brightness')
figclust.canvas.manager.set_window_title(
    'DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.manager.set_window_title(
    'DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.manager.set_window_title('Trees with Borders')

for ii, name in enumerate(fname):
    # Open the file and convert to rgb image
    rgbimg = np.asarray(Image.open(name).convert('RGB'))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2, 3, ii + 1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v, h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2, 3, ii + 1)  # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2, 3, ii + 1)  # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)
    unique_labels = set(labels)
    # Generate a unique color for each cluster
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    for lbl, pix in zip(labels, Xslice):
        for col, unqlbl in zip(plcol, unique_labels):
            if lbl == unqlbl:
                # Cluster label of -1 indicates no cluster membership;
                # override default color with black
                if lbl == -1:
                    col = [0.0, 0.0, 0.0, 1.0]
                # Raw version
                for ij in range(3):
                    clustimg[pix[0], pix[1], ij] = col[ij]
                # Dilated just for display
                axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col,
                             markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1] - 1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with red borders around the trees
    axborder = figborder.add_subplot(2, 3, ii + 1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1] - 1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()
```