Performance comparison of IPP 6, 7 and 8

Updated: 2024-10-28 03:29:15

I own IPP 6 and have just noticed that IPP 8 is already available. Are there any benchmarks comparing IPP 6, 7 and 8 on the newest CPUs? I am particularly interested in 1D basic operations (mul, add, complex), FFT and IIR filtering.

Best answer

You can run the experiments yourself. IPP ships with performance measurement utilities, usually named "ps*.exe" and located in the ipp\tools\perfsys directory. I cannot say for certain how this looked at the time of IPP 6.x, but it should be similar. The "ps*.exe" executables measure the performance of specific IPP functions in clocks per element (lower is better) for different CPU optimizations. The basic options for these performance tests are "-?"; "-e", which shows all functions within the test; "-T", which enables only a specific CPU optimization; and "-r", which saves the output to a CSV file.

Suppose you want to measure the ippsIIR64f_32s_Sfs function with the AVX, SSE4.1 and SSE3 CPU optimizations. You need to run ps_ipps.exe (the performance test for the 1D domain) three times:

ps_ipps.exe -fippsIIR64f_32s_Sfs -B -R -TAVX (you'll get a CSV file with the AVX optimization results)
ps_ipps.exe -fippsIIR64f_32s_Sfs -B -R -TSSE41 (SSE4.1 performance data will be appended to the CSV)
ps_ipps.exe -fippsIIR64f_32s_Sfs -B -R -TSSE3 (SSE3 performance data will be appended)
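If you need to repeat these runs for several functions, it may be convenient to generate the command lines from a script. The following is only a minimal Python sketch: the helper names build_commands and run_all are made up for illustration, the flags are copied verbatim from the commands above, and it assumes ps_ipps.exe is on the PATH or in the current directory.

```python
import subprocess


def build_commands(function, isa_list, exe="ps_ipps.exe"):
    """Return one argument list per CPU optimization.

    The flags (-f, -B, -R, -T) are taken as-is from the commands in the
    answer; this helper does not add or interpret any other option.
    """
    return [[exe, f"-f{function}", "-B", "-R", f"-T{isa}"] for isa in isa_list]


def run_all(commands):
    """Run each command in order (assumes the executable can be found)."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_commands("ippsIIR64f_32s_Sfs", ["AVX", "SSE41", "SSE3"])

# Print the command lines for review; call run_all(commands) on a machine
# where the IPP perfsys tools are installed.
for cmd in commands:
    print(" ".join(cmd))
```

The run order matters if you later identify rows in the CSV by position, since each run appends its results to the same file.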

Then search the CSV file for the required function/argument combination, e.g.

find "ippsIIR64f,32s,Sfs,32768,6,numBq_DF1" ps_ipps.csv

For example, I get

ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=2048,1.30,cpMac,512,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=8,1.56,cpMac,613,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=4,5.61,cpMac,2.21e+003,-

That means 5.61 clocks for SSE3, 1.56 clocks for SSE4.1 and 1.30 clocks for AVX. Your CPU must support the highest instruction set you want to measure. As for IPP 7 and 8, you can download "try-and-buy" versions of Intel products (Composer or Parallel Studio) from the Intel site to run the benchmarks.
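To compare versions or optimizations side by side, the matching rows can also be pulled out of the CSV with a short script. This Python sketch rests on two assumptions: the clocks-per-element value is the field right after the "nLps=" field (which matches the three rows quoted above), and the rows appear in the order the runs were made (AVX, SSE41, SSE3), since the rows themselves carry no instruction-set label. The function names are illustrative only.

```python
import csv
import io

# The three rows quoted in the answer; in practice, read ps_ipps.csv instead.
SAMPLE = """\
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=2048,1.30,cpMac,512,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=8,1.56,cpMac,613,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=4,5.61,cpMac,2.21e+003,-
"""


def clocks_per_element(row):
    """Return the field that follows 'nLps=...' (assumed column layout)."""
    for i, field in enumerate(row[:-1]):
        if field.startswith("nLps="):
            return float(row[i + 1])
    raise ValueError("no nLps= field found in row")


def parse(text, prefix, isa_order):
    """Map each instruction set to its clocks value, matching rows by order."""
    rows = [r for r in csv.reader(io.StringIO(text))
            if ",".join(r).startswith(prefix)]
    return dict(zip(isa_order, (clocks_per_element(r) for r in rows)))


results = parse(SAMPLE, "ippsIIR64f,32s,Sfs,32768,6,numBq_DF1",
                ["AVX", "SSE41", "SSE3"])

# Report each result relative to the slowest optimization measured here.
baseline = results["SSE3"]
for isa, clocks in results.items():
    print(f"{isa}: {clocks:.2f} clocks per element, "
          f"{baseline / clocks:.2f}x relative to SSE3")
```

Running the same script against CSV files produced by the IPP 6, 7 and 8 tools would give the per-version comparison asked about in the question, provided the CSV layout is the same across versions, which I have not verified.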

Published: 2023-07-28 01:09:00
Link: https://www.elefans.com/category/jswz/34/1298208.html