Here's my implementation of merge sort in Scala:
```scala
object FuncSort {
  def merge(l: Stream[Int], r: Stream[Int]): Stream[Int] = {
    (l, r) match {
      case (_, Stream.Empty) => l
      case (Stream.Empty, _) => r
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, r) else y #:: merge(l, ys)
    }
  }

  def sort(xs: Stream[Int]): Stream[Int] = {
    if (xs.length <= 1) xs  // <= 1 also handles the empty stream
    else {
      val m = xs.length / 2
      val (l, r) = xs.splitAt(m)
      merge(sort(l), sort(r))
    }
  }
}
```

It works correctly and it seems asymptotically fine as well, but it is way slower (roughly 10 times) than the Java implementation from algs4.cs.princeton.edu/22mergesort/Merge.java.html and it uses a lot of memory. Is there a faster implementation of merge sort that is functional? Obviously it's possible to port the Java version line by line, but that's not what I'm looking for.
UPD: I've changed Stream to List and #:: to ::, and the sorting routine became faster, only three to four times slower than the Java version. But I don't understand why it doesn't crash with a stack overflow: merge isn't tail-recursive and all arguments are strictly evaluated... how is that possible?
Solution

You have raised multiple questions. I'll try to answer them in a logical order:
No stack overflow in the Stream version

You did not really ask this one, but it leads to some interesting observations.
In the Stream version you are using x #:: merge(...) inside the merge function. Usually this would be a recursive call and might lead to a stack overflow for big enough input data. But not in this case. The operator #:: is implemented in class ConsWrapper[A] (there is an implicit conversion) and is a synonym for Stream.cons.apply(hd: A, tl: => Stream[A]): Cons[A]. As you can see, the second argument is passed by name, meaning it is evaluated lazily.
That means merge returns a newly created object of type Cons which will eventually call merge again. In other words: the recursion does not happen on the stack, but on the heap. And usually you have plenty of heap.
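A minimal sketch (my own example, not from the answer) of this by-name behaviour: the recursive call below never runs until a cell is actually demanded, so the "recursion" lives on the heap as a chain of Cons cells rather than on the stack. Note that Stream is deprecated in favour of LazyList since Scala 2.13, but the mechanics are identical.

```scala
// The second argument of #:: is by-name, so `from(n + 1)` is not
// evaluated when this definition "recurses" -- it is stored as a thunk.
def from(n: Int): Stream[Int] = n #:: from(n + 1)

// Only five cells are ever materialized, despite the missing base case.
println(from(1).take(5).toList) // List(1, 2, 3, 4, 5)
```

If the tail were a strict parameter, this definition would loop forever; laziness is exactly what keeps merge from blowing the stack in the Stream version.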
Using the heap for recursion is a nice technique to handle very deep recursion, but it is much slower than using the stack. So you traded speed for recursion depth. This is the main reason why using Stream is so slow.
The second reason is that getting the length of a Stream forces Scala to materialize the whole Stream. But sorting the Stream would materialize every element anyway, so this does not hurt very much.
No stack overflow in the List version

When you change Stream to List, you are indeed using the stack for recursion, so now a stack overflow could happen. But with sorting you usually have a recursion depth of log(size), typically with logarithm base 2. So to sort 4 billion input items, you would need only about 32 stack frames. With a default stack size of at least 320k (on Windows; other systems have larger defaults), this leaves room for a lot of recursion and hence for lots of input data to be sorted.
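A quick sanity check of that depth estimate (the helper name is mine, not from the answer): a top-down merge sort halves the input until single elements remain, i.e. about ceil(log2(n)) levels of recursion.

```scala
// Recursion depth of a top-down merge sort: number of times the input
// can be halved before reaching single-element sublists.
def mergeSortDepth(n: Long): Int =
  math.ceil(math.log(n.toDouble) / math.log(2.0)).toInt

println(mergeSortDepth(4000000000L)) // 32
```

Even with merge itself adding a non-tail-recursive frame per element pair in the worst case, the sort-level recursion stays shallow, which is why the List version survives in practice for moderate inputs.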
Faster functional implementation

It depends :-)
You should use the stack, not the heap, for recursion. And you should choose your strategy depending on the input data:
Don't hit swap, and do use your caches. Use mutable data structures if you can and sort in place. I think functional style and fast sorting do not combine very well. To make sorting really fast, you will have to use stateful operations (e.g. an in-place merge sort on mutable arrays).
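For illustration, here is one way that stateful approach might look in Scala: a top-down merge sort over a mutable Array[Int] with a single auxiliary buffer, structurally similar to the algs4 Java version the question links to. All names here are my own; this is a sketch, not the answerer's code.

```scala
// In-place (array-mutating) merge sort with one shared auxiliary buffer.
def mergeSortInPlace(a: Array[Int]): Unit = {
  val aux = new Array[Int](a.length)

  // Merge the sorted ranges a(lo..mid) and a(mid+1..hi) back into a.
  def merge(lo: Int, mid: Int, hi: Int): Unit = {
    Array.copy(a, lo, aux, lo, hi - lo + 1)
    var i = lo
    var j = mid + 1
    for (k <- lo to hi) {
      if (i > mid)              { a(k) = aux(j); j += 1 } // left half exhausted
      else if (j > hi)          { a(k) = aux(i); i += 1 } // right half exhausted
      else if (aux(j) < aux(i)) { a(k) = aux(j); j += 1 }
      else                      { a(k) = aux(i); i += 1 }
    }
  }

  def sort(lo: Int, hi: Int): Unit =
    if (lo < hi) {
      val mid = lo + (hi - lo) / 2
      sort(lo, mid)
      sort(mid + 1, hi)
      merge(lo, mid, hi)
    }

  if (a.nonEmpty) sort(0, a.length - 1)
}

val xs = Array(5, 3, 8, 1, 9, 2)
mergeSortInPlace(xs)
println(xs.mkString(", ")) // 1, 2, 3, 5, 8, 9
```

The mutation is confined inside the function, so from the caller's perspective you can still wrap this in a pure facade that copies the input, sorts the copy, and returns it.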
I usually try this in all my programs: use a pure functional style as far as possible, but use stateful operations for small parts when sensible (e.g. because it has better performance, or because the code has to deal with lots of state and becomes much more readable when I use vars instead of vals).