Fast sqrt optimization?

Updated: 2024-10-18 12:21:27

Problem description

If you check this very nice page:

www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi

You'll see this program:

    #define SQRT_MAGIC_F 0x5f3759df
    float sqrt2(const float x)
    {
        const float xhalf = 0.5f*x;

        union // get bits for floating value
        {
            float x;
            int i;
        } u;
        u.x = x;
        u.i = SQRT_MAGIC_F - (u.i >> 1);  // gives initial guess y0
        return x*u.x*(1.5f - xhalf*u.x*u.x); // Newton step, repeating increases accuracy
    }

My question is: is there any particular reason why this isn't implemented as:

    #define SQRT_MAGIC_F 0x5f3759df
    float sqrt2(const float x)
    {
        union // get bits for floating value
        {
            float x;
            int i;
        } u;
        u.x = x;
        u.i = SQRT_MAGIC_F - (u.i >> 1);  // gives initial guess y0

        const float xux = x*u.x;

        return xux*(1.5f - .5f*xux*u.x); // Newton step, repeating increases accuracy
    }

From the disassembly, I see one MUL less. Is there any purpose to having xhalf appear at all?

Recommended answer

It could be that legacy floating-point math, which used 80-bit registers, was more accurate when the multiplications were chained together in the last line, because the intermediate results were kept in 80-bit registers.

The first multiplication in the upper implementation takes place in parallel with the integer math that follows, since they use different execution resources. The second function, on the other hand, looks faster, but it is hard to tell whether it really is, for the reason above. Also, the const float xux = x*u.x; statement rounds the result back to a 32-bit float, which may reduce overall accuracy.

You could test these functions head to head and compare them to the sqrt function in math.h (using double, not float). That way you can see which is faster and which is more accurate.

Published: 2023-11-28 21:09:54
Article link: https://www.elefans.com/category/jswz/34/1643962.html