我刚刚完成实施 kd树做快速近邻搜索。我感兴趣的是比href="en.wikipedia/wiki/Euclidean_distance" rel="nofollow">欧氏距离的
我有两个问题:
您挂绝对可以推广到其他距离度量维基百科的页面上所描述的最近邻搜索过程,为您提供等效的几何取代超球面对象为给定的指标,并测试每个超平面的交叉与此对象。
例如:如果您使用的是曼哈顿距离,而不是(即矢量组件的所有差值的绝对值之和),你的超球会成为(多维)的钻石。 (这是最简单的二维可视化 - 如果您当前的最近的邻居是距离的 X 的从查询点的 P 的,那么后面一个不同的超平面更近的邻居必须相交钻石形状,其具有高度和宽度2倍,并为中心的 P 的)。这可能使超平面交叉测试更难code或运行缓慢,但总的原则仍然适用。
I just finished implementing a kd-tree for doing fast nearest neighbor searches. I'm interested in playing around with different distance metrics other than the Euclidean distance. My understanding of the kd-tree is that the speedy kd-tree search is not guaranteed to give exact searches if the metric is non-Euclidean, which means that I might need to implement a new data structure and search algorithm if I want to try out new metrics for my search.
I have two questions:
解决方案
The nearest-neighbour search procedure described on the Wikipedia page you linked to can certainly be generalised to other distance metrics, provided you replace "hypersphere" with the equivalent geometrical object for the given metric, and test each hyperplane for crossings with this object.
Example: if you are using the Manhattan distance instead (i.e. the sum of the absolute values of all differences in vector components), your hypersphere would become a (multidimensional) diamond. (This is easiest to visualise in 2D -- if your current nearest neighbour is at distance x from the query point p, then any closer neighbour behind a different hyperplane must intersect a diamond shape that has width and height 2x and is centred on p). This might make the hyperplane-crossing test more difficult to code or slower to run, however the general principle still applies.
更多推荐
我可以使用任意的度量标准搜索KD树?
发布评论