What are some common strategies different compilers use to deal with overflow in numeric conversions?

Updated: 2024-10-23 07:39:55

I understand that, in C++, when I convert a float/double into an int, whereby the floating-point number is beyond the range that the int can hold, the result is not defined as part of the C++ language. The result depends on the implementation/compiler. What are some strategies common compilers use to deal with this?

Converting 7.2E12 to an int can yield the values 1634811904 or 2147483647. For example, does anyone know what the compiler is doing in each of these cases?

Best answer

The compiler generates sequences of instructions that produce the correct result for all inputs that do not cause overflow. This is all it has to worry about (because overflow in the conversion from floating-point to integer is undefined behavior). The compiler does not “deal with” overflows so much as completely ignore them. If the underlying assembly instruction(s) on the platform raise an exception, fine. If they wrap around, fine. If they produce nonsensical results, again, fine.
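Since the language leaves the out-of-range case undefined, a program that needs a predictable result has to test the range itself before converting. The following is a minimal sketch of one such policy (saturation), not something a compiler does on your behalf. The name `to_int_saturating` and the choice of 0 for NaN are assumptions made for illustration.

```c
#include <limits.h>

/* Hypothetical helper: convert a double to int, clamping instead of
   invoking undefined behavior. The conversion (int)d is defined only
   when the truncated value fits in an int, i.e. for
   INT_MIN - 1 < d < INT_MAX + 1. */
int to_int_saturating(double d)
{
    if (d != d)                        /* NaN: arbitrary choice, return 0 */
        return 0;
    if (d >= (double)INT_MAX + 1.0)    /* too large, including +infinity */
        return INT_MAX;
    if (d <= (double)INT_MIN - 1.0)    /* too small, including -infinity */
        return INT_MIN;
    return (int)d;                     /* in range: truncates toward zero */
}
```

With a 32-bit int, `to_int_saturating(7.2E12)` returns 2147483647, which is one of the two values mentioned in the question.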


As an example, constant expressions may be converted to integers at compile-time with rules that differ from the behavior of the assembly instructions generated on the platform. My blog post gives the example:

    int printf(const char *, ...);

    volatile double v = 0;

    int main()
    {
      int i1 = 2147483648.0;      /* conversion can be folded at compile time */
      int i2 = 2147483648.0 + v;  /* volatile operand forces a run-time conversion */
      printf("%d %d\n", i1, i2);
    }

which produces a program that prints two different values for i1 and i2. This is because the conversion in the computation of i1 was applied at compile-time, whereas the conversion in the computation of i2 was applied at run-time.


As another example, in the particular case of the conversion from double to 32-bit unsigned int on the x86-64 platform, the results can be funny:

Before AVX-512, the x86 instruction sets had no instruction to convert from floating-point to unsigned integer.

On Mac OS X for Intel, when compiling a 64-bit program, the conversion from double to 32-bit unsigned int compiles to a single instruction: the instruction for 64-bit conversions, cvttsd2siq, whose destination is a 64-bit register. Only the bottom 32 bits of that register are subsequently used, as the 32-bit unsigned integer they represent:

    $ cat t.c
    #include <stdio.h>
    #include <stdlib.h>
    int main(int c, char **v)
    {
      unsigned int i = 4294967296.0 + strtod(v[1], 0);
      printf("%u\n", i);
    }
    $ gcc -m64 -S -std=c99 -O t.c && cat t.s
    …
            addsd   LCPI1_0(%rip), %xmm0 ; this is the + from the C program
            cvttsd2siq      %xmm0, %rsi  ; one-instruction conversion
    …

This explains how, on that platform, a result modulo 2^32 can be obtained for doubles that are small enough (specifically, small enough to fit in a signed 64-bit integer).
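The same arithmetic can be written in portable C, which makes the wrap-around explicit. This is a sketch that imitates the effect of the 64-bit conversion followed by keeping the low 32 bits; it is not something the compiler is required to do. The name `to_u32_via_i64` and the fallback value 0 are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper imitating "convert to signed 64-bit, keep the low
   32 bits". Both steps are well defined in C: the double-to-int64_t
   conversion is in range thanks to the check, and the int64_t-to-uint32_t
   conversion reduces the value modulo 2^32. */
uint32_t to_u32_via_i64(double d)
{
    /* NaN fails both comparisons and takes the fallback path. */
    if (d >= -9223372036854775808.0 && d < 9223372036854775808.0)
        return (uint32_t)(int64_t)d;
    return 0;   /* arbitrary fallback outside the int64_t range */
}
```

For example, `to_u32_via_i64(4294967296.0 + 123456.0)` gives 123456, and `to_u32_via_i64(7.2E12)` gives 1634811904, which matches the other value mentioned in the question.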

In the old IA-32 instruction set, there is no instruction to convert a double to a 64-bit signed integer (and there is no instruction to convert a double to a 32-bit unsigned int either). The conversion to 32-bit unsigned int has to be done by combining a few of the instructions that do exist, including two cvttsd2si instructions, each of which converts from double to 32-bit signed integer:

    $ gcc -m32 -S -std=c99 -O t.c && cat t.s
    …
            addsd   LCPI1_0-L1$pb(%esi), %xmm0 ; this is the + from the C program
            movsd   LCPI1_1-L1$pb(%esi), %xmm1 ; conversion to unsigned int starts here
            movapd  %xmm0, %xmm2
            subsd   %xmm1, %xmm2
            cvttsd2si       %xmm2, %eax
            xorl    $-2147483648, %eax
            ucomisd %xmm1, %xmm0
            cvttsd2si       %xmm0, %edx
            cmovael %eax, %edx
    …

Two alternative solutions are computed, in %eax and in %edx respectively. Each alternative is correct on a different domain. If the number to convert, in %xmm0, is greater than or equal to the constant 2^31 in %xmm1, the alternative in %eax is chosen; otherwise, the one in %edx is. The high-level algorithm, using only conversions from double to int, would be:

    if (d < 2^31) then (unsigned int)(int)d else (2^31 + (unsigned int)(int)(d - 2^31))
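Written as C, the algorithm could look like the sketch below. The function name is hypothetical, a 32-bit int is assumed, and the result is only meaningful for inputs whose truncated value fits in an unsigned int. Outside that range the inner conversions to int overflow, and the behavior is undefined again.

```c
/* Hypothetical C rendering of the algorithm above; assumes 32-bit int. */
unsigned int double_to_uint_via_int(double d)
{
    const double two31 = 2147483648.0;   /* 2^31 */

    if (d < two31)
        return (unsigned int)(int)d;                       /* low half */
    return 2147483648u + (unsigned int)(int)(d - two31);   /* high half */
}
```

For instance, `double_to_uint_via_int(3000000000.0)` takes the second branch: 3000000000 - 2^31 = 852516352 fits in an int, and adding 2^31 back yields 3000000000.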

The compiler's translation of the C conversion from double to unsigned int gives the same saturating behavior as the 32-bit conversion instruction that it relies on:

    $ gcc -m32 -std=c99 -O t.c && ./a.out 123456
    0
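The 0 can be reproduced by modelling the instruction sequence in C. In the sketch below, `cvttsd2si_model` is a hypothetical stand-in for the 32-bit cvttsd2si instruction, which (with floating-point exceptions masked) returns the bit pattern 0x80000000 when its input is NaN or out of range. Both function names are illustrative and not part of any library.

```c
#include <stdint.h>

/* Model of 32-bit cvttsd2si: truncate toward zero when the result fits
   in a signed 32-bit integer, otherwise return 0x80000000. NaN fails
   both comparisons and also yields 0x80000000. */
uint32_t cvttsd2si_model(double d)
{
    if (d > -2147483649.0 && d < 2147483648.0)
        return (uint32_t)(int32_t)d;
    return 0x80000000u;
}

/* Model of the IA-32 sequence: compute both alternatives, then select. */
uint32_t ia32_double_to_u32_model(double d)
{
    const double two31 = 2147483648.0;                         /* %xmm1 */
    uint32_t eax = cvttsd2si_model(d - two31) ^ 0x80000000u;   /* xorl  */
    uint32_t edx = cvttsd2si_model(d);
    return (d >= two31) ? eax : edx;                           /* cmovael */
}
```

Here `ia32_double_to_u32_model(4294967296.0 + 123456.0)` evaluates to 0: the input is above 2^31, the subtraction leaves 2147607104, which is still too large for a signed 32-bit integer, so the modelled instruction returns 0x80000000 and the xor clears that bit.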

Published: 2023-08-02 02:45:00
Link: https://www.elefans.com/category/jswz/34/1368411.html