二进制搜索效率与fortran中的线性搜索效率

编程入门 行业动态 更新时间:2024-10-15 02:24:47
本文介绍了二进制搜索效率与fortran中的线性搜索效率的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

这个问题是关于线性搜索的效率与二进制搜索在连续存储中预先排序的数组效率的比较......

我有用fortran编写的应用程序(77!)。我的部分代码的一个常见操作是在数组中找到索引,使得 gx(i)<= xin < GX(I + 1)。我目前已经将它作为二元搜索实现 - 抱歉声明标签和 goto - 我已经评论什么等同的语句将使用fortran 90 ...

i = 1 ih = nx / 2 $如果((xin.le.gx(i))。和。(xin.gt.gx(i + 1)))b $ b 201继续!do while(.true。)然后!找到我们想要 ilow = i + 1; ihigh = i s1 =(gx(ihigh)-xin)/(gx(ihigh)-gx(ilow)) s2 = 1.0-s1 返回 endif if(i.ge.ih)then goto 202!exit $ b $ endif if(xin.le.(gx(ih))then!xin位于数组$的后半部分b $ bi = ih ih = nx-(nx-ih)/ 2 else!xin在阵列的前半部分i = i + 1 ih = i +(ih -i)/ 2 endif goto 201!enddo

然而,今天,我在阅读关于二进制搜索的维基百科,并且遇到了这个问题:

二进制搜索可能与内存层次结构 $(即缓存),因为它的随机访问性质,对于的内存中搜索,如果要搜索的跨度很小,线性搜索可能具有优越的性能,因为表现出更好的参考位置

我并不完全理解这种说法 - 我的印象是缓存提取一次收集到大块(ish)块中,所以如果我们从数组的开始开始,我认为大部分这个数组已经存在于缓存中了(至少和线性搜索一样),所以我不认为这是重要的。

所以我的问题是,有什么办法可以告诉哪个算法性能更好(线性或二分搜索?)是否有数组大小的边界?我目前使用的数组大小约为100个元素...

解决方案

对于小型数组,问题不是缓存。您是对的:一个小数组可能会被快速缓存。

问题是分支预测很可能在二分搜索中失败,因为分支被随机取出或跳过以数据依赖的方式。分支预测未命中会阻塞CPU管道。

这种影响可能很严重。您可以在执行单个二进制搜索分支所需的同一时间内线性搜索3到8个元素(并且您需要执行多个二进制搜索分支)。确切的盈亏平衡点需要衡量。

暂停CPU管道非常昂贵。 Core i7可以在每个时钟周期内退出多达4条指令(3 GHz时每秒12千兆指令!)。但是,只有在你没有拖延的情况下。

使用条件移动CPU指令进行二分搜索有无分支算法。这些算法基本上展开了32个搜索步骤,并在每个步骤中使用 CMOV (32步是理论最大值)。它们是无分支的,但不是免费的:每个下一步都取决于前一个步骤,因此CPU不能在指令流中提前收费。它必须等待所有的时间。所以他们不能解决这个问题,只能稍微改进。

This question is about the efficiency of a linear search vs. the efficiency of a binary search for a pre-sorted array in contiguous storage...

I have an application written in fortran (77!). One frequent operation for my part of the code is to find the index in an array such that gx(i) <= xin < gx(i+1). I've currently implemented this as a binary search -- sorry for the statement labels and goto -- I've commented what the equivalent statments would be using fortran 90...

i=1 ih=nx/2 201 continue !do while (.true.) if((xin.le.gx(i)).and.(xin.gt.gx(i+1)))then !found what we want ilow=i+1; ihigh=i s1=(gx(ihigh)-xin)/(gx(ihigh)-gx(ilow)) s2=1.0-s1 return endif if(i.ge.ih)then goto 202 !exit endif if(xin.le.(gx(ih))then !xin is in second half of array i=ih ih=nx-(nx-ih)/2 else !xin is in first half of array i=i+1 ih=i+(ih-i)/2 endif goto 201 !enddo

However, today, I was reading on Wikipedia about binary search and I came across this:

Binary search can interact poorly with the memory hierarchy (i.e. caching), because of its random-access nature. For in-memory searching, if the span to be searched is small, a linear search may have superior performance simply because it exhibits better locality of reference.

I don't completely understand this statement -- my impression was that cache fetches were gathered in large(ish) chunks at a time, so if we start at the beginning of the array, I thought that most of the array would be in cache already (at least as much as it would be for a linear search), so I didn't think that would matter.

So my question is, is there any way to tell which algorithm will perform better (linear or binary search?) Is there an array size boundary? I'm currently using arrays of size around 100 elements...

解决方案

For small arrays, the problem is not cache. You are right: A small array is likely to be cached quickly.

The problem is that branch prediction is likely to fail for binary search because branches are taken or skipped at random in a data-dependent way. Branch prediction misses stall the CPU pipeline.

This effect can be severe. You can easily search 3 to 8 elements linearly in the same time it takes to do a single binary search branch (and you need to do multiple binary search branches). The exact break even point needs to be measured.

Stalling the CPU pipeline is extremely expensive. A Core i7 can retire up to 4 instructions per clock cycle (12 giga-instructions per second at 3 GHz!). But only, if you are not stalling.

There are branch-free algorithms doing binary search by using conditional-move CPU instructions. These algorithms basically unroll 32 search steps and use a CMOV in each step (32 steps are the theoretical maximum). They are branch-free but not stall free: Each next step depends 100% on the previous one so the CPU cannot charge ahead in the instruction stream. It has to wait all the time. So they don't solve this problem, only improve it slightly.

更多推荐

二进制搜索效率与fortran中的线性搜索效率

本文发布于:2023-11-29 11:18:42,感谢您对本站的认可!
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:效率   线性   fortran

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!