优化归并排序比快速排序更快

编程入门行业动态更新时间:2024-10-10 13:14:18

本文介绍了优化归并排序比快速排序更快的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！问题描述

jsperf/optimized-mergesort-versus-quicksort

为什么这种半缓冲区归并排序的工作速度和快速排序一样快?

Why does this half buffer merge sort work as fast as quicksort?

快速排序是:

In-Place 虽然它占用了 log(n) 递归(堆栈空间)

缓存友好

这个半缓冲区归并排序:

This half buffer merge sort:

使用 n/2 缓冲区进行合并.

使用 log(n) 递归.

减少比较.

Uses an n/2 Buffer to do merges.

Uses log(n) recursions.

Makes fewer comparisons.

我的问题是，在这种情况下，为什么半缓冲区归并排序与 QuickSort 的速度匹配?另外，是不是我对 quickSort 做错了什么导致它变慢了?

My question is, why is the half buffer merge sort matching the speed of QuickSort in this scenario? Plus, is there anything I'm doing wrong to the quickSort that makes it slower?

推荐答案

合并排序比快速排序执行的比较少，但移动多.必须调用函数来进行比较会增加比较的开销，从而使快速排序变慢.示例快速排序中的所有 if 语句也减慢了速度.如果比较和交换是内联完成的，那么如果对伪随机整数数组进行排序，则快速排序应该会快一些.

Merge sort does fewer compares, but more moves than quick sort. Having to call a function to do the compares increases the overhead for compares, which makes quick sort slower. All those if statements in the example quick sort is also slowing it down. If the compare and swap are done inline, then quick sort should be a bit faster if sorting an array of pseudo random integers.

如果在具有 16 个寄存器的处理器上运行，例如 64 位模式的 PC，那么使用以寄存器结尾的一堆指针的 4 路归并排序与快速排序差不多.2路归并排序每移动一个元素平均进行1次比较，而4路归并排序每移动一个元素平均进行3次比较，但只需要1/2的pass次数，所以基本操作的次数是一样的，但是比较对缓存更友好，使 4 路归并排序快 15%，与快速排序大致相同.

If running on a processor with 16 registers, such a PC in 64 bit mode, then 4 way merge sort using a bunch of pointers that end up in registers is about as fast as quick sort. A 2 way merge sort averages 1 compare for each element moved, while a 4 way merge sort averages 3 compares for each element moved, but only takes 1/2 the number of passes, so the number of basic operations is the same, but the compares are a bit more cache friendly, making the 4 way merge sort about 15% faster, about the same as quick sort.

我不熟悉 java 脚本，所以我将示例转换为 C++.

I'm not familiar with java script, so I'm converting the examples to C++.

使用java脚本合并排序的转换版本，对1600万个伪随机32位整数进行排序大约需要2.4秒.下面显示的示例快速排序大约需要 1.4 秒，下面显示的示例自下而上合并排序大约需要 1.6 秒.如前所述，在具有 16 个寄存器的处理器上使用一堆指针(或索引)进行 4 路合并也需要大约 1.4 秒.

Using a converted version of the java script merge sort, it takes about 2.4 seconds to sort 16 million pseudo random 32 bit integers. The example quick sort shown below takes about 1.4 seconds, and the example bottom up merge shown below sort about 1.6 seconds. As mentioned, a 4 way merge using a bunch of pointers (or indices) on a processor with 16 registers would also take about 1.4 seconds.

C++ 快速排序示例:

C++ quick sort example:

void QuickSort(int a[], int lo, int hi) { int i = lo, j = hi; int pivot = a[(lo + hi) / 2]; int t; while (i <= j) { // partition while (a[i] < pivot) i++; while (a[j] > pivot) j--; if (i <= j) { t = a[i] a[i] = a[j]; a[j] = t; i++; j--; } } if (lo < j) // recurse QuickSort(a, lo, j); if (i < hi) QuickSort(a, i, hi); }

C++ 自底向上归并排序示例:

C++ bottom up merge sort example:

void BottomUpMergeSort(int a[], int b[], size_t n) { size_t s = 1; // run size if(GetPassCount(n) & 1){ // if odd number of passes for(s = 1; s < n; s += 2) // swap in place for 1st pass if(a[s] < a[s-1]) std::swap(a[s], a[s-1]); s = 2; } while(s < n){ // while not done size_t ee = 0; // reset end index while(ee < n){ // merge pairs of runs size_t ll = ee; // ll = start of left run size_t rr = ll+s; // rr = start of right run if(rr >= n){ // if only left run rr = n; BottomUpCopy(a, b, ll, rr); // copy left run break; // end of pass } ee = rr+s; // ee = end of right run if(ee > n) ee = n; BottomUpMerge(a, b, ll, rr, ee); } std::swap(a, b); // swap a and b s <<= 1; // double the run size } } void BottomUpMerge(int a[], int b[], size_t ll, size_t rr, size_t ee) { size_t o = ll; // b[] index size_t l = ll; // a[] left index size_t r = rr; // a[] right index while(1){ // merge data if(a[l] <= a[r]){ // if a[l] <= a[r] b[o++] = a[l++]; // copy a[l] if(l < rr) // if not end of left run continue; // continue (back to while) while(r < ee) // else copy rest of right run b[o++] = a[r++]; break; // and return } else { // else a[l] > a[r] b[o++] = a[r++]; // copy a[r] if(r < ee) // if not end of right run continue; // continue (back to while) while(l < rr) // else copy rest of left run b[o++] = a[l++]; break; // and return } } } void BottomUpCopy(int a[], int b[], size_t ll, size_t rr) { while(ll < rr){ // copy left run b[ll] = a[ll]; ll++; } } size_t GetPassCount(size_t n) // return # passes { size_t i = 0; for(size_t s = 1; s < n; s <<= 1) i += 1; return(i); }

使用指针的 4 路归并排序的 C++ 示例(goto 用于节省代码空间，它是旧代码).它开始进行 4 路合并，然后当到达运行结束时，它切换到 3 路合并，然后是 2 路合并，然后是剩余运行剩余部分的副本.这类似于用于外部排序的算法，但外部排序逻辑更通用，通常最多可处理 16 路合并.

C++ example of 4 way merge sort using pointers (goto's used to save code space, it's old code). It starts off doing 4 way merge, then when the end of a run is reached, it switches to 3 way merge, then 2 way merge, then a copy of what's left of the remaining run. This is similar to algorithms used for external sorts, but external sort logic is more generalized and often handles up to 16 way merges.

int * BottomUpMergeSort(int a[], int b[], size_t n) { int *p0r; // ptr to run 0 int *p0e; // ptr to end run 0 int *p1r; // ptr to run 1 int *p1e; // ptr to end run 1 int *p2r; // ptr to run 2 int *p2e; // ptr to end run 2 int *p3r; // ptr to run 3 int *p3e; // ptr to end run 3 int *pax; // ptr to set of runs in a int *pbx; // ptr for merged output to b size_t rsz = 1; // run size if(n < 2) return a; if(n == 2){ if(a[0] > a[1])std::swap(a[0],a[1]); return a; } if(n == 3){ if(a[0] > a[2])std::swap(a[0],a[2]); if(a[0] > a[1])std::swap(a[0],a[1]); if(a[1] > a[2])std::swap(a[1],a[2]); return a; } while(rsz < n){ pbx = &b[0]; pax = &a[0]; while(pax < &a[n]){ p0e = rsz + (p0r = pax); if(p0e >= &a[n]){ p0e = &a[n]; goto cpy10;} p1e = rsz + (p1r = p0e); if(p1e >= &a[n]){ p1e = &a[n]; goto mrg201;} p2e = rsz + (p2r = p1e); if(p2e >= &a[n]){ p2e = &a[n]; goto mrg3012;} p3e = rsz + (p3r = p2e); if(p3e >= &a[n]) p3e = &a[n]; // 4 way merge while(1){ if(*p0r <= *p1r){ if(*p2r <= *p3r){ if(*p0r <= *p2r){ mrg40: *pbx++ = *p0r++; // run 0 smallest if(p0r < p0e) // if not end run continue continue; goto mrg3123; // merge 1,2,3 } else { mrg42: *pbx++ = *p2r++; // run 2 smallest if(p2r < p2e) // if not end run continue continue; goto mrg3013; // merge 0,1,3 } } else { if(*p0r <= *p3r){ goto mrg40; // run 0 smallext } else { mrg43: *pbx++ = *p3r++; // run 3 smallest if(p3r < p3e) // if not end run continue continue; goto mrg3012; // merge 0,1,2 } } } else { if(*p2r <= *p3r){ if(*p1r <= *p2r){ mrg41: *pbx++ = *p1r++; // run 1 smallest if(p1r < p1e) // if not end run continue continue; goto mrg3023; // merge 0,2,3 } else { goto mrg42; // run 2 smallest } } else { if(*p1r <= *p3r){ goto mrg41; // run 1 smallest } else { goto mrg43; // run 3 smallest } } } } // 3 way merge mrg3123: p0r = p1r; p0e = p1e; mrg3023: p1r = p2r; p1e = p2e; mrg3013: p2r = p3r; p2e = p3e; mrg3012: while(1){ if(*p0r <= *p1r){ if(*p0r <= *p2r){ *pbx++ = *p0r++; // run 0 smallest if(p0r < p0e) // if not end run continue continue; goto mrg212; // merge 1,2 } else { mrg32: *pbx++ = *p2r++; // run 2 smallest if(p2r < p2e) // if not end run continue continue; goto mrg201; // merge 0,1 } } else { if(*p1r <= *p2r){ *pbx++ = *p1r++; // run 1 smallest if(p1r < p1e) // if not end run continue continue; goto mrg202; // merge 0,2 } else { goto mrg32; // run 2 smallest } } } // 2 way merge mrg212: p0r = p1r; p0e = p1e; mrg202: p1r = p2r; p1e = p2e; mrg201: while(1){ if(*p0r <= *p1r){ *pbx++ = *p0r++; // run 0 smallest if(p0r < p0e) // if not end run continue continue; goto cpy11; } else { *pbx++ = *p1r++; // run 1 smallest if(p1r < p1e) // if not end run continue continue; goto cpy10; } } // 1 way copy cpy11: p0r = p1r; p0e = p1e; cpy10: while (1) { *pbx++ = *p0r++; // copy element if (p0r < p0e) // if not end of run continue continue; break; } pax += rsz << 2; // setup for next set of runs } std::swap(a, b); // swap ptrs rsz <<= 2; // quadruple run size } return a; // return sorted array }

更多推荐

优化归并排序比快速排序更快

本文发布于:2023-11-29 06:57:56，感谢您对本站的认可！

本文链接:https://www.elefans.com/category/jswz/34/1645673.html