了解Strlen实施中的代码

编程入门 行业动态 更新时间:2024-10-24 01:55:27
本文介绍了了解Strlen实施中的代码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

关于glibc中string.h中strlen的实现,我有两个问题.

I have two questions regarding the implementation of strlen in string.h in glibc.

  • 该实现使用带有'holes'的幻数.我不明白这是如何工作的.有人可以帮我了解一下此片段:

  • The implementation uses a magic number with 'holes'. I am not able to understand how this works. Can someone please help me understand this snippet: size_t strlen (const char *str) { const char *char_ptr; const unsigned long int *longword_ptr; unsigned long int longword, himagic, lomagic; /* Handle the first few characters by reading one character at a time. Do this until CHAR_PTR is aligned on a longword boundary. */ for (char_ptr = str; ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; ++char_ptr) if (*char_ptr == '\0') return char_ptr - str; /* All these elucidatory comments refer to 4-byte longwords, but the theory applies equally well to 8-byte longwords. */ longword_ptr = (unsigned long int *) char_ptr; /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits the "holes." Note that there is a hole just to the left of each byte, with an extra at the end: bits: 01111110 11111110 11111110 11111111 bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD The 1-bits make sure that carries propagate to the next 0-bit. The 0-bits provide holes for carries to fall into. */ himagic = 0x80808080L; lomagic = 0x01010101L; if (sizeof (longword) > 4) { /* 64-bit version of the magic. */ /* Do the shift in two steps to avoid a warning if long has 32 bits. */ himagic = ((himagic << 16) << 16) | himagic; lomagic = ((lomagic << 16) << 16) | lomagic; } if (sizeof (longword) > 8) abort (); /* Instead of the traditional loop which tests each character, we will test a longword at a time. The tricky part is testing if *any of the four* bytes in the longword in question are zero. */ for (;;) { longword = *longword_ptr++; if (((longword - lomagic) & ~longword & himagic) != 0) { /* Which of the bytes was the zero? If none of them were, it was a misfire; continue the search. */ const char *cp = (const char *) (longword_ptr - 1); if (cp[0] == 0) return cp - str; if (cp[1] == 0) return cp - str + 1; if (cp[2] == 0) return cp - str + 2; if (cp[3] == 0) return cp - str + 3; if (sizeof (longword) > 4) { if (cp[4] == 0) return cp - str + 4; if (cp[5] == 0) return cp - str + 5; if (cp[6] == 0) return cp - str + 6; if (cp[7] == 0) return cp - str + 7; }}}

    什么是魔幻数字?

    为什么不简单地增加指针直到NULL字符并返回计数呢?这种方法更快吗?为什么会这样?

    Why not simply increment pointer until NULL character and return count? Is this approach faster? Why is it so?

    推荐答案

    这用于一次查看4个字节(32位)甚至8个(64位),以检查其中之一是否为零(结束)字符串),而不是单独检查每个字节.

    This is used to look at 4 bytes (32 bits) or even 8 (64 bits) in one go, to check if one of them is zero (end of string), instead of checking each byte individually.

    这里是检查空字节的一个示例:

    Here is one example to check for a null byte:

    unsigned int v; // 32-bit word to check if any 8-bit byte in it is 0 bool hasZeroByte = ~((((v & 0x7F7F7F7F) + 0x7F7F7F7F) | v) | 0x7F7F7F7F);

    有关更多信息,请参见位扭曲的黑客.

    For some more see Bit Twiddling Hacks.

    这里使用的那个(32位示例):

    The one used here (32-bit example):

    还有一种更快的方法-使用hasless(v,1),该方法已定义 以下;它可以进行4次操作,不需要后续操作 确认.简化为

    There is yet a faster method — use hasless(v, 1), which is defined below; it works in 4 operations and requires no subsquent verification. It simplifies to

    #define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)

    子表达式(v-0x01010101UL)的计算结果为 v中对应的字节为零或大于零时的任何字节 0x80.子表达式〜v& 0x80808080UL评估为高位设置 以字节为单位,其中v的字节未设置高位(因此 字节小于0x80).最后,通过对这两个子表达式进行与"运算 结果是设置高位,其中v中的字节为零,因为 由于第一个中的值大于0x80而设置了高位 子表达式被第二个掩模遮盖了.

    The subexpression (v - 0x01010101UL), evaluates to a high bit set in any byte whenever the corresponding byte in v is zero or greater than 0x80. The sub-expression ~v & 0x80808080UL evaluates to high bits set in bytes where the byte of v doesn't have its high bit set (so the byte was less than 0x80). Finally, by ANDing these two sub-expressions the result is the high bits set where the bytes in v were zero, since the high bits set due to a value greater than 0x80 in the first sub-expression are masked off by the second.

    一次查看一个字节至少要花费一个完整的整数值(寄存器宽)所需的cpu周期.在此算法中,将检查完整整数以查看它们是否包含零.如果不是,则几乎不使用指令,并且可以对下一个完整整数进行 jump .如果内部的字节数为零,则会进行进一步检查以查看其确切位置.

    Looking at one byte at a time costs at least as much cpu cycles as looking at a full interger value (register wide). In this algorithm, full integers are checked to see if they contain a zero. If not, little instructions are used, and a jump can be made to the next full integer. If there is a zero byte inside, a further check is done to see at what exact position it was.

  • 更多推荐

    了解Strlen实施中的代码

    本文发布于:2023-11-29 23:10:37,感谢您对本站的认可!
    本文链接:https://www.elefans.com/category/jswz/34/1647913.html
    版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
    本文标签:代码   Strlen

    发布评论

    评论列表 (有 0 条评论)
    草根站长

    >www.elefans.com

    编程频道|电子爱好者 - 技术资讯及电子产品介绍!