Memory management in C#

Updated: 2024-10-24 00:23:35

Good afternoon,

I have some text files containing a list of (2-gram, count) pairs collected by analysing a corpus of newspaper articles, and I need to load them into memory when I start an application I am developing. To store those pairs, I am using a structure like the following one:

private static Dictionary<String, Int64>[] ListaDigramas = new Dictionary<String, Int64>[27];

The idea of having an array of dictionaries comes from efficiency concerns, since I read somewhere that a long dictionary has a negative impact on performance. That said, every 2-gram goes into the dictionary that corresponds to its first character's ASCII code minus 97 (or 26 if the first character is not in the range from 'a' to 'z').
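For readers following along, the bucketing rule described above could be sketched roughly as follows. `DigramStore`, `BucketIndex`, `Add` and `TryGetCount` are names made up for illustration, and sending an empty string to bucket 26 is an assumption the question does not cover.

```csharp
using System;
using System.Collections.Generic;

static class DigramStore
{
    // 26 buckets for 'a'..'z' plus one catch-all bucket at index 26.
    private static readonly Dictionary<string, long>[] ListaDigramas = CreateBuckets();

    private static Dictionary<string, long>[] CreateBuckets()
    {
        var buckets = new Dictionary<string, long>[27];
        for (int i = 0; i < buckets.Length; i++)
        {
            buckets[i] = new Dictionary<string, long>();
        }
        return buckets;
    }

    // Hypothetical helper: first character's code minus 97 ('a'),
    // or 26 when the first character is outside 'a'..'z'.
    // Assumption: an empty or null key also goes to bucket 26.
    public static int BucketIndex(string digram)
    {
        if (string.IsNullOrEmpty(digram)) return 26;
        char first = digram[0];
        return (first >= 'a' && first <= 'z') ? first - 'a' : 26;
    }

    // Stores (or overwrites) the count for a 2-gram in its bucket.
    public static void Add(string digram, long count)
    {
        ListaDigramas[BucketIndex(digram)][digram] = count;
    }

    // Looks a 2-gram up in the bucket chosen by its first character.
    public static bool TryGetCount(string digram, out long count)
    {
        return ListaDigramas[BucketIndex(digram)].TryGetValue(digram, out count);
    }
}
```

Worth noting that `Dictionary<TKey, TValue>` lookups are close to O(1) on average regardless of size, so a single dictionary would probably perform about the same; the split is kept here only to mirror the structure in the question.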

When I load the (2-gram, count) pairs into memory, the application takes about 800 MB of RAM in total, and stays like this until I use a program called Memory Cleaner to free up memory. After this, the memory taken by the program goes down to somewhere between 7 MB and 100 MB, without losing functionality (I think).

Is there any way I can free up memory this way but without using an external application? I tried to use GC.Collect() but it doesn't work in this case.

Thank you very much.

Best answer

About the only other idea I could come up with, if you really want to keep your memory usage down, would be to store the dictionary in a stream and compress it. Factors to consider would be how often you're accessing/inflating this data, and how compressible the data is. Text from newspaper articles would compress extremely well, and the performance hit might be less than you'd think.

Using an open-source library like SharpZipLib ( http://www.icsharpcode.net/opensource/sharpziplib/ ), your code would look something like:

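// Needs System.IO, System.Runtime.Serialization.Formatters.Binary
// and ICSharpCode.SharpZipLib.Zip.Compression.Streams.
// Note: BinaryFormatter is marked obsolete in .NET 5 and later.
MemoryStream stream = new MemoryStream();
BinaryFormatter formatter = new BinaryFormatter();
formatter.Serialize(stream, ListaDigramas);
byte[] dictBytes = stream.ToArray();
// Keep a reference to the target stream so the compressed bytes can be read back.
MemoryStream compressed = new MemoryStream();
using (Stream zipStream = new DeflaterOutputStream(compressed))
{
    zipStream.Write(dictBytes, 0, dictBytes.Length);
}   // disposing finishes the deflate stream
byte[] compressedBytes = compressed.ToArray();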

Inflating requires an InflaterInputStream and a loop to inflate the stream in chunks, but is fairly straightforward.
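As a rough sketch of that loop, assuming the SharpZipLib package is referenced, a round trip over a plain byte array might look like the following. `CompressionSketch`, `Deflate` and `Inflate` are illustrative names, and the 4096-byte chunk size is an arbitrary choice.

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

static class CompressionSketch
{
    // Compresses a buffer with SharpZipLib's DeflaterOutputStream.
    public static byte[] Deflate(byte[] raw)
    {
        var target = new MemoryStream();
        using (var zip = new DeflaterOutputStream(target))
        {
            zip.Write(raw, 0, raw.Length);
        }   // disposing finishes the deflate stream

        // MemoryStream.ToArray still works after the stream has been closed.
        return target.ToArray();
    }

    // Inflates in fixed-size chunks until Read reports the end of the stream.
    public static byte[] Inflate(byte[] compressed)
    {
        var result = new MemoryStream();
        using (var zip = new InflaterInputStream(new MemoryStream(compressed)))
        {
            var buffer = new byte[4096];   // chunk size is an assumption
            int read;
            while ((read = zip.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, read);
            }
        }
        return result.ToArray();
    }
}
```

The inflated bytes could then be wrapped in a `MemoryStream` and handed to whatever deserializer produced them in the first place.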

You'd have to play with the app to see if performance is acceptable. Keep in mind, of course, that you'll still need enough memory to hold the dictionary when you inflate it for use (unless someone has a clever idea to work with the object in its compressed state).

Honestly, though, keeping it as-is in memory and letting Windows swap it to the page file is probably your best/fastest option.
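If you want to check how much of the footprint is live managed data before deciding, a small sketch like this compares the managed heap with the process working set. It assumes the figure reported in the question is the working set as shown by a tool such as Task Manager, which the question does not actually state; `MemoryReport` and its method names are made up.

```csharp
using System;
using System.Diagnostics;

static class MemoryReport
{
    // Bytes currently allocated on the managed heap, measured after a forced full collection.
    public static long ManagedBytes()
    {
        return GC.GetTotalMemory(true);
    }

    // Physical memory currently mapped to this process (the working set).
    public static long WorkingSetBytes()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.WorkingSet64;
        }
    }

    // Prints both figures in megabytes.
    public static void Print()
    {
        Console.WriteLine("Managed heap: {0} MB", ManagedBytes() / (1024 * 1024));
        Console.WriteLine("Working set:  {0} MB", WorkingSetBytes() / (1024 * 1024));
    }
}
```

Calling `MemoryReport.Print()` right after loading the pairs should give a first indication: a working set far above the managed figure would suggest that much of the footprint is memory the OS can page out when it needs to.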

Edit: I've never tried it, but you might be able to serialize directly to the compression stream, which avoids the intermediate uncompressed buffer (you'd still have the serialization overhead):

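// Keep a reference to the target stream so the compressed bytes can be retrieved.
MemoryStream compressed = new MemoryStream();
BinaryFormatter formatter = new BinaryFormatter();
using (Stream zipStream = new DeflaterOutputStream(compressed))
{
    formatter.Serialize(zipStream, ListaDigramas);
}   // disposing flushes and finishes the compressed data
byte[] compressedBytes = compressed.ToArray();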

Published: 2023-07-25 19:40:00
Link: https://www.elefans.com/category/jswz/34/1265184.html