I was testing algorithms and ran into this weird behavior: std::accumulate is faster than a simple for loop.
Looking at the generated assembly I'm not much wiser :-) It seems that the for loop is optimized into MMX instructions, while accumulate expands into a plain loop.
This is the code. The behavior shows up at the -O3 optimization level with gcc 4.7.1.
    #include <vector>
    #include <chrono>
    #include <iostream>
    #include <random>
    #include <algorithm>
    #include <numeric>   // std::accumulate lives here, not in <algorithm>
    #include <cstdint>   // uint32_t
    using namespace std;

    int main()
    {
        const size_t vsize = 100*1000*1000;

        // Fill the vector with random values in [0, 10].
        vector<int> x;
        x.reserve(vsize);

        mt19937 rng;
        rng.seed(chrono::system_clock::to_time_t(chrono::system_clock::now()));
        uniform_int_distribution<uint32_t> dist(0,10);

        for (size_t i = 0; i < vsize; i++)
        {
            x.push_back(dist(rng));
        }

        // Untimed warm-up pass over the data.
        long long tmp = 0;
        for (size_t i = 0; i < vsize; i++)
        {
            tmp += x[i];
        }
        cout << "dry run " << tmp << endl;

        // Timed: std::accumulate.
        auto start = chrono::high_resolution_clock::now();
        long long suma = accumulate(x.begin(),x.end(),0);
        auto end = chrono::high_resolution_clock::now();
        cout << "Accumulate runtime " << chrono::duration_cast<chrono::nanoseconds>(end-start).count() << " - " << suma << endl;

        // Timed: manual loop.
        start = chrono::high_resolution_clock::now();
        suma = 0;
        for (size_t i = 0; i < vsize; i++)
        {
            suma += x[i];
        }
        end = chrono::high_resolution_clock::now();
        cout << "Manual sum runtime " << chrono::duration_cast<chrono::nanoseconds>(end-start).count() << " - " << suma << endl;

        return 0;
    }

Recommended answer
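When you pass 0 to accumulate, you make it accumulate using an int instead of a long long. The accumulator type is deduced from the type of the initial value, not from the variable you assign the result to, so the two timed sections are not doing the same work.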
If you code your manual loop like this, it will be equivalent:
    int sumb = 0;
    for (size_t i = 0; i < vsize; i++)
    {
        sumb += x[i];
    }
    suma = sumb;

Or you can call accumulate like this:
    long long suma = accumulate(x.begin(),x.end(),0LL);