Why doesn't string.Substring share memory with the source string?

Updated: 2024-10-28 08:17:22

As we all know, strings in .NET are immutable. (Well, not 100% totally immutable, but immutable by design and used as such by any reasonable person, anyway.)

This makes it basically OK that, for example, the following code just stores a reference to the same string in two variables:

    string x = "shark";
    string y = x.Substring(0);

    // Proof (the fixed statement needs an unsafe context):
    fixed (char* c = y)
    {
        c[4] = 'p';
    }

    Console.WriteLine(x);
    Console.WriteLine(y);

The above outputs:

    sharp
    sharp

Clearly x and y refer to the same string object. So here's my question: why wouldn't Substring always share state with the source string? A string is essentially a char* pointer with a length, right? So it seems to me the following should at least in theory be allowed to allocate a single block of memory to hold 5 characters, with two variables simply pointing to different locations within that (immutable) block:

    string x = "shark";
    string y = x.Substring(1);

    // Does c[0] point to the same location as x[1]?
    fixed (char* c = y)
    {
        c[0] = 'p';
    }

    // Apparently not...
    Console.WriteLine(x);
    Console.WriteLine(y);

The above outputs:

    shark
    park
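For comparison, the kind of shared view described above can be approximated today with `ReadOnlyMemory<char>` and `ReadOnlySpan<char>`, which are separate types from `string`. This is not part of the original question. It is a minimal sketch, assuming a runtime where `AsMemory`, `AsSpan` and `MemoryMarshal.TryGetString` are available (for example .NET Core 2.1 or later). The class and method names are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SharedViewDemo
{
    // True when x.Substring(start) hands back the very same object as x.
    public static bool SubstringReturnsSameObject(string x, int start)
        => ReferenceEquals(x, x.Substring(start));

    // True when the memory view is backed by x itself at the given offset,
    // i.e. no characters were copied.
    public static bool ViewIsBackedBy(string x, int start)
    {
        ReadOnlyMemory<char> view = x.AsMemory(start);
        return MemoryMarshal.TryGetString(view, out var owner, out int offset, out int length)
            && ReferenceEquals(owner, x)
            && offset == start
            && length == x.Length - start;
    }

    public static void Main()
    {
        string x = "shark";
        Console.WriteLine(SubstringReturnsSameObject(x, 0)); // True
        Console.WriteLine(SubstringReturnsSameObject(x, 1)); // False
        Console.WriteLine(ViewIsBackedBy(x, 1));             // True
        Console.WriteLine(x.AsSpan(1).ToString());           // hark
    }
}
```

The view only becomes a real `string` again when `ToString()` is called on it, and that call copies the characters into a new block.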

Best answer

For two reasons:

The string metadata (e.g. the length) is stored in the same memory block as the characters. Allowing one string to use part of the character data of another string would mean allocating two memory blocks for most strings instead of one. As most strings are not substrings of other strings, that extra allocation would cost more memory than you could gain by reusing parts of strings.

An extra NUL character is stored after the last character of the string, so that the string is also usable by system functions that expect a null-terminated string. You can't put that extra NUL character after a substring that is part of another string.
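Both reasons can be illustrated with a small sketch. `SlicedString` below is a hypothetical type, not part of .NET and not how `System.String` is implemented. It assumes a design where the header (offset and length) and the characters live in two separate allocations, which is what sharing would require.

```csharp
using System;

// Hypothetical type, NOT part of .NET: header and characters are two
// separate allocations so that substrings can share the characters.
public sealed class SlicedString
{
    private readonly char[] _buffer; // second allocation: characters plus one trailing '\0'
    private readonly int _offset;    // where this string starts inside _buffer
    private readonly int _length;    // number of characters in this string

    public SlicedString(string text)
    {
        _buffer = new char[text.Length + 1]; // the last slot stays '\0'
        text.CopyTo(0, _buffer, 0, text.Length);
        _offset = 0;
        _length = text.Length;
    }

    private SlicedString(char[] buffer, int offset, int length)
    {
        _buffer = buffer;
        _offset = offset;
        _length = length;
    }

    // Allocates a new header only; the characters are shared, not copied.
    public SlicedString Substring(int start, int length)
    {
        if (start < 0 || length < 0 || start + length > _length)
            throw new ArgumentOutOfRangeException();
        return new SlicedString(_buffer, _offset + start, length);
    }

    public bool SharesBufferWith(SlicedString other)
        => ReferenceEquals(_buffer, other._buffer);

    // True only if the slot right after this slice holds '\0'.
    public bool IsNulTerminatedInPlace => _buffer[_offset + _length] == '\0';

    public override string ToString() => new string(_buffer, _offset, _length);
}

public static class SlicedStringDemo
{
    public static void Main()
    {
        var x = new SlicedString("shark");
        var tail = x.Substring(1, 4); // "hark"
        var head = x.Substring(0, 3); // "sha"

        Console.WriteLine(tail);                        // hark
        Console.WriteLine(tail.SharesBufferWith(x));    // True
        Console.WriteLine(tail.IsNulTerminatedInPlace); // True
        Console.WriteLine(head.IsNulTerminatedInPlace); // False
    }
}
```

In this sketch every string, substring or not, pays for a header object plus a buffer. A slice such as "sha" is followed by 'r' rather than a NUL, so it could not be handed to a function that expects a null-terminated string without copying it first.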

Published: 2023-07-20 20:41:00
Link: https://www.elefans.com/category/jswz/34/1203722.html