Updated: 2024-10-14 02:18:30
Clobber list for rep_movsl

I'm trying out the examples of inline assembly in http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html, but something about clobbering is confusing me:

About the behavior of clobbers: clobbering essentially tells GCC not to trust the values in the specified registers or memory.

"Well, it really helps when optimizing, when GCC can know exactly what you're doing with the registers before and after....It's even smart enough to know that if you tell it to put (x+1) in a register, then if you don't clobber it, and later C code refers to (x+1), and it was able to keep that register free, it will reuse the computation. Whew."

Does this paragraph mean that clobbering will disable common subexpression elimination?

The tutorial also seems inconsistent about the clobber list. It says that registers specified in the input/output lists need not be put in the clobber list, because GCC already knows about them. However, in the example about rep_movsl (or rep_stosl):

    asm ("cld\n\t"
         "rep\n\t"
         "stosl"
         : /* no output registers */
         : "c" (count), "a" (fill_value), "D" (dest)
         : "%ecx", "%edi" );

although "S", "D", and "c" are already listed as input operands, they are listed again as clobbered. I tried a simple snippet in C:

    #include <stdio.h>
    int main()
    {
      int a[] = {2, 4, 6};
      int b[3];
      int n = 3;
      int v = 12;
      asm ("cld\n\t"
           "rep\n\t"
           "movsl"
           :
           : "S" (a), "D" (b), "c" (n)
           : );
        // : "%ecx", "%esi", "%edi" );
      printf("%d\n", b[1]);
    }

If I use the commented-out clobber list, GCC complains:

    a.c:8:3: error: can't find a register in class ‘CREG’ while reloading ‘asm’
    a.c:8:3: error: ‘asm’ operand has impossible constraints

If I use an empty clobber list, it compiles and the output is 4.

Best answer

The document you are quoting appears to be significantly inaccurate. Here's what asm operand constraints actually mean to GCC:

Input: The assembly operation reads from this operand. GCC assumes that all reads happen simultaneously at the very beginning of the assembly operation.

Output: The assembly operation writes to this operand; after it completes, the associated variable will have a meaningful value. (There is no way to tell GCC what that value is.) GCC assumes that all writes happen simultaneously at the very end of the assembly operation.

Clobber: The assembly operation destroys any meaningful value in this operand. Like writes, all clobbers are assumed to happen simultaneously at the end of the operation.

Earlyclobber: Like a clobber, except that it happens at the beginning of the operation. In practice this is the "&" modifier on an output operand: the operand may be written before all inputs have been read, so GCC must not place it in a register that also holds an input.

Furthermore, the current (GCC 4.7) manual includes this critical paragraph:

You may not write a clobber description in a way that overlaps with an input or output operand. For example, you may not have an operand describing a register class with one member if you mention that register in the clobber list. Variables declared to live in specific registers (see Explicit Reg Vars), and used as asm input or output operands must have no part mentioned in the clobber description. There is no way for you to specify that an input operand is modified without also specifying it as an output operand. Note that if all the output operands you specify are for this purpose (and hence unused), you will then also need to specify volatile for the asm construct, as described below, to prevent GCC from deleting the asm statement as unused.

This is why attempting to both input and clobber certain registers is failing for you.

Now, inserting rep movsl is kind of silly nowadays -- just use memcpy and let GCC replace that with an optimal instruction sequence for you -- but nonetheless the correct way to write your example is

    #include <stdio.h>

    int main()
    {
      int a[] = {2, 4, 6};
      int b[3];
      int n = 3;
      int v = 12;
      int *ap = a, *bp = b;
      asm volatile ("rep movsl"
                    : "+S" (ap), "+D" (bp), "+c" (n)
                    :
                    : "memory");
      printf("%d\n", b[1]);
    }

You need the ap and bp intermediate variables because the address of an array is not an lvalue, so it can't appear in the output constraints. The "+" modifier (as in "+S", "+D", and "+c") tells GCC that the register is both an input and an output. The 'volatile' is necessary because all of the output operands are unused after the asm, so GCC would otherwise cheerfully delete it (on the theory that it was only there for what it did to the output operands). Putting "memory" in the clobber list is how you tell GCC that the operation modified memory. And finally, a micro-optimization: GCC never ever issues 'std', so you need not 'cld' (this is actually guaranteed by the x86 ABI).

Most of the changes I made would not affect whether a tiny test program like this behaves correctly; however, they are all essential in a full-size program to prevent subtle optimization errors. For instance, if you left out the "memory" clobber, GCC would be within its rights to hoist the load of b[1] above the asm!

Published: 2023-08-04 11:40:00
Article link: https://www.elefans.com/category/jswz/34/1415232.html