imull operation on assembly ia32

Updated: 2024-10-28 03:28:55

I want to do an imull operation in assembly and return the result to C.

The signature of my function is 'long long multiplicar(void)' and the code is:

    multiplicar:
        movl op1, %eax
        imull op2, %eax
        adcl $0, %edx
        ret

My op2 is 3. When op1 is 399 it works well (gives 1197). But when op1 is -399 I get 4294966093 and I don't know why. Do I have to use cdc?

My op1 and op2 are long long types. Thanks

Best answer

When given 32-bit operands, the imul instruction performs a signed 32x32-bit multiplication. This yields a result of up to 64 bits; however, in the two- and three-operand forms only the least-significant 32 bits are kept, with overflow indicated through the carry (and overflow) flag.

Note that carry is only a single-bit flag used for error detection and cannot carry the information required to chain several extended-precision multiplications together.
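To make the truncation concrete, here is a minimal C sketch of what the two forms of the instruction compute. The function names are illustrative and do not come from the question. It models the full signed product, the low 32 bits that the two/three-operand form keeps, and the condition under which the flags would report lost bits.

```c
#include <stdint.h>

/* Full signed 32x32 => 64-bit product, which the one-operand form of
   imul leaves in %edx:%eax. */
int64_t full_product(int32_t a, int32_t b) {
    return (int64_t)a * (int64_t)b;
}

/* Only the low 32 bits, which is all the two/three-operand forms keep. */
uint32_t low_word(int32_t a, int32_t b) {
    return (uint32_t)(uint64_t)full_product(a, b);
}

/* The low word read back as a 64-bit value whose upper half is zero. */
int64_t low_word_zero_extended(int32_t a, int32_t b) {
    return (int64_t)low_word(a, b);
}

/* Nonzero when the full product does not fit in a signed 32-bit value,
   i.e. the case the truncating forms signal through the flags. */
int truncation_lost_bits(int32_t a, int32_t b) {
    int64_t full = full_product(a, b);
    return full < INT32_MIN || full > INT32_MAX;
}
```

With these definitions, `full_product(-399, 3)` is -1197, while `low_word_zero_extended(-399, 3)` is 4294966099: a correct low word paired with an upper half that was never sign-extended reads as a large positive number of the same kind as the one reported in the question.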

In this case, after the latest edit, it seems the goal is to multiply two 64-bit variables together and grab the truncated 64-bit result. Achieving this with a 32x32=>64-bit primitive requires chaining together four multiplications by what amounts to the grade-school method. That is, (a<<32|b) * (c<<32|d) = (a*c<<64) + (a*d<<32) + (b*c<<32) + (b*d<<0). The a*c term can be dropped here, however, since we only require the least-significant 64 bits of the result.
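The decomposition above can be sketched in C as follows. This is only an illustration of the arithmetic, with a hypothetical function name, and not the assembly routine itself. Each `(uint64_t)x * y` on 32-bit halves stands for one 32x32=>64 multiply, and the variables a, b, c, d match the formula.

```c
#include <stdint.h>

/* Low 64 bits of a 64x64 product built from 32-bit halves.
   x = (a << 32) | b,  y = (c << 32) | d */
uint64_t mul64_low(uint64_t x, uint64_t y) {
    uint32_t a = (uint32_t)(x >> 32);
    uint32_t b = (uint32_t)x;
    uint32_t c = (uint32_t)(y >> 32);
    uint32_t d = (uint32_t)y;

    /* b*d contributes all 64 of its bits. */
    uint64_t bd = (uint64_t)b * d;

    /* a*d and b*c are shifted left by 32, so only their low 32 bits
       can land inside the 64-bit result. */
    uint32_t cross = (uint32_t)((uint64_t)a * d + (uint64_t)b * c);

    /* a*c would be shifted left by 64 and is dropped entirely. */
    return bd + ((uint64_t)cross << 32);
}
```

Because only the low halves of the cross terms survive, an assembly version can compute them with the truncating two-operand imull and needs the widening multiply only for b*d.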

While this is straightforward in theory, in practice keeping the temporaries and carries straight in assembly language is subtle and error-prone. An added wrinkle is that the operations are signed, for which my suggestion would be to build a basic unsigned multiplication primitive and adjust for the signs separately.
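One way to read the suggestion about signs is sketched below in C. All names are hypothetical, and the unsigned primitive is stubbed with the native `*` operator purely to keep the example short; a hand-written version would be assembled from 32x32=>64 multiplies.

```c
#include <stdint.h>

/* Stand-in for an unsigned 64x64 => low-64 primitive (assumption: a real
   implementation would be built from 32-bit multiplies). */
static uint64_t unsigned_mul_low(uint64_t x, uint64_t y) {
    return x * y;  /* unsigned arithmetic wraps modulo 2^64 */
}

/* Magnitude of a signed value; well defined even for INT64_MIN because
   the negation is done in unsigned arithmetic. */
static uint64_t magnitude(int64_t v) {
    return v < 0 ? 0 - (uint64_t)v : (uint64_t)v;
}

/* Multiply the magnitudes, then restore the sign of the result. */
int64_t signed_mul_low(int64_t lhs, int64_t rhs) {
    int negative = (lhs < 0) != (rhs < 0);
    uint64_t product = unsigned_mul_low(magnitude(lhs), magnitude(rhs));
    if (negative) {
        product = 0 - product;
    }
    /* Converting back is implementation-defined for values above
       INT64_MAX; two's complement targets give the expected wrap. */
    return (int64_t)product;
}
```

On two's complement hardware the low 64 bits of the product are the same for signed and unsigned operands, so the separate sign handling matters mostly when the upper half of the product or overflow detection is also needed.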

Thankfully, the CPU does in fact support 64-bit multiplication natively if we instead use the 8087 floating-point unit. Note that to avoid rounding errors, the floating-point control word must be set to full 64-bit precision (_controlfp(_PC_64,_MCW_PC)), as opposed to the 53 bits which are typically used.

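    # int64_t __cdecl multiply(int64_t lhs, int64_t rhs)
    multiply:
        fildq 4(%esp)        # push lhs onto the FPU stack
        fildq 12(%esp)       # push rhs onto the FPU stack
        fmul                 # multiply the two and pop one operand
        fistpq 4(%esp)       # store the product as a 64-bit integer
        movl 4(%esp),%eax    # low half of the result
        movl 8(%esp),%edx    # high half of the result
        ret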
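For comparison, roughly the same idea can be written in C, under the assumption that `long double` maps to the 80-bit x87 extended format with its 64-bit significand. That holds for GCC and Clang on x86 but not for MSVC, where `long double` is a 64-bit double. The function name is hypothetical.

```c
#include <stdint.h>

/* Multiply through an extended-precision float. The result is exact only
   while the product fits in the significand; with the x87 80-bit format
   (an assumption about the compiler and target) that is 64 bits. */
int64_t multiply_via_long_double(int64_t lhs, int64_t rhs) {
    long double product = (long double)lhs * (long double)rhs;
    /* The cast truncates toward zero, and its behavior is undefined if
       the product lies outside the int64_t range. */
    return (int64_t)product;
}
```

Unlike the truncating integer approach, this does not wrap around on overflow, so it is only a drop-in replacement when the product is known to fit in 64 bits.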

Note, however, that products which overflow 64 bits (and would need up to the full 128-bit precision) will not yield a correctly truncated 64-bit result, and the question does not state how overflow is to be handled.

Published: 2023-07-30 01:24:00
Link: https://www.elefans.com/category/jswz/34/1321356.html