A faster (unsafe) BinaryReader in .NET

Updated: 2024-10-27 08:29:50

Question

I have a fairly large file that I need to read binary data from.

In the process, I realized that the default BinaryReader implementation in .NET is pretty slow. Looking at it with .NET Reflector, I came across this:

    public virtual int ReadInt32()
    {
        if (this.m_isMemoryStream)
        {
            MemoryStream stream = this.m_stream as MemoryStream;
            return stream.InternalReadInt32();
        }
        this.FillBuffer(4);
        return (((this.m_buffer[0] | (this.m_buffer[1] << 8)) | (this.m_buffer[2] << 0x10)) | (this.m_buffer[3] << 0x18));
    }

This strikes me as extremely inefficient, considering that computers have been designed to work with 32-bit values ever since 32-bit CPUs were invented.
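For comparison, here is a minimal sketch of three ways to turn four bytes into an int. The class and method names are hypothetical, and BinaryPrimitives is assumed to be available (.NET Core 2.1 or later, or the System.Memory package). The shift-based version mirrors the decompiled code above and always decodes little-endian data, whereas a raw reinterpretation follows the byte order of the host CPU.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper class, for illustration only.
public static class Int32Decoding
{
    // Same byte-by-byte assembly as the decompiled ReadInt32:
    // b[0] is the least significant byte (little-endian).
    public static int FromShifts(byte[] b) =>
        b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);

    // Library equivalent: little-endian regardless of the host CPU.
    public static int FromPrimitives(byte[] b) =>
        BinaryPrimitives.ReadInt32LittleEndian(b);

    // Uses the host byte order, so it only agrees with the two
    // methods above when BitConverter.IsLittleEndian is true.
    public static int FromBitConverter(byte[] b) =>
        BitConverter.ToInt32(b, 0);
}
```

For the bytes 0x78, 0x56, 0x34, 0x12, the first two methods return 0x12345678 on any machine; the third does so only on a little-endian host.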

So I wrote my own (unsafe) FastBinaryReader class with code like this instead:

    public unsafe class FastBinaryReader : IDisposable
    {
        private static byte[] buffer = new byte[50];
        //private Stream baseStream;

        public Stream BaseStream { get; private set; }

        public FastBinaryReader(Stream input)
        {
            BaseStream = input;
        }

        public int ReadInt32()
        {
            BaseStream.Read(buffer, 0, 4);

            fixed (byte* numRef = &(buffer[0]))
            {
                return *(((int*)numRef));
            }
        }
        ...
    }

This is noticeably faster: I shaved 5-7 seconds off the time it takes to read a 500 MB file, but it is still pretty slow overall (29 seconds initially, ~22 seconds now with my FastBinaryReader).
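One detail worth noting about the ReadInt32 above: Stream.Read may return fewer bytes than requested, and the code ignores its return value. A minimal sketch of a helper that loops until the requested count has arrived could look like this. The class and method names are hypothetical and not part of the original reader.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class StreamReadHelper
{
    // Keeps calling Stream.Read until exactly 'count' bytes have been
    // stored in 'buffer' starting at 'offset'. Throws if the stream
    // ends before that many bytes are available.
    public static void ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int n = stream.Read(buffer, offset, count);
            if (n == 0)
            {
                throw new EndOfStreamException("Stream ended before the requested bytes were read.");
            }
            offset += n;
            count -= n;
        }
    }
}
```

This does not make the reads faster by itself; it only guards against decoding a partially filled buffer.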

It still baffles me why it takes so long to read such a relatively small file. Copying the file from one disk to another takes only a couple of seconds, so disk throughput is not the issue.

I then inlined the ReadInt32 and similar calls, and ended up with this code:

    using (var br = new FastBinaryReader(new FileStream(cacheFilePath, FileMode.Open, FileAccess.Read, FileShare.Read, 0x10000, FileOptions.SequentialScan)))
    {
        while (br.BaseStream.Position < br.BaseStream.Length)
        {
            var doc = DocumentData.Deserialize(br);
            docData[doc.InternalId] = doc;
        }
    }

    public static DocumentData Deserialize(FastBinaryReader reader)
    {
        byte[] buffer = new byte[4 + 4 + 8 + 4 + 4 + 1 + 4];
        reader.BaseStream.Read(buffer, 0, buffer.Length);

        DocumentData data = new DocumentData();
        fixed (byte* numRef = &(buffer[0]))
        {
            data.InternalId = *((int*)&(numRef[0]));
            data.b = *((int*)&(numRef[4]));
            data.c = *((long*)&(numRef[8]));
            data.d = *((float*)&(numRef[16]));
            data.e = *((float*)&(numRef[20]));
            data.f = numRef[24];
            data.g = *((int*)&(numRef[25]));
        }
        return data;
    }

Any further ideas on how to make this even faster? I was thinking that maybe I could use marshalling to map the entire file straight into memory on top of some custom structure, since the data is linear, fixed-size and sequential.
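The struct-overlay idea could be sketched roughly as follows. This is an illustration under assumptions, not a tested solution for the file in question: the struct name, the field names and the helper are hypothetical, the field types are inferred from the casts in Deserialize, and MemoryMarshal requires .NET Core 2.1 or later (or the System.Memory package). The bytes are assumed to be already in memory, whether read in bulk or memory-mapped, and to be in the host's byte order.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 29-byte record matching the offsets used in Deserialize.
// Pack = 1 removes padding so that G sits at offset 25.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct DocumentRecord
{
    public int InternalId; // offset 0
    public int B;          // offset 4
    public long C;         // offset 8
    public float D;        // offset 16
    public float E;        // offset 20
    public byte F;         // offset 24
    public int G;          // offset 25 (unaligned)
}

public static class RecordOverlay
{
    // Reinterprets raw bytes as a sequence of records without copying.
    // Trailing bytes that do not form a whole record are ignored.
    public static ReadOnlySpan<DocumentRecord> AsRecords(ReadOnlySpan<byte> bytes)
    {
        return MemoryMarshal.Cast<byte, DocumentRecord>(bytes);
    }
}
```

Because of the packed layout, some fields are read from unaligned addresses. That is fine on x86/x64 but may be slower or problematic on other architectures, so it is worth measuring before relying on it.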

SOLVED: I came to the conclusion that FileStream's buffering and BufferedStream are flawed. Please see the accepted answer and my own answer (with the solution) below.

Recommended answer

When you copy a file, large chunks of data are read and written to disk.

You are reading the entire file four bytes at a time, which is bound to be slower. Even if the stream implementation is smart enough to buffer, you still make at least 500 MB / 4 bytes = 131,072,000 API calls.

Wouldn't it be wiser to read a large chunk of data, go through it sequentially, and repeat until the whole file has been processed?
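A minimal sketch of that chunked approach, under some assumptions: records are the fixed 29 bytes from the question, only the leading InternalId is decoded to keep the example short, and the class name, method name and chunk size are hypothetical. BitConverter uses the host byte order, so this assumes the file was written on a machine with the same endianness.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical reader, for illustration only.
public static class ChunkedRecordReader
{
    // Record size taken from the question: 4 + 4 + 8 + 4 + 4 + 1 + 4 bytes.
    public const int RecordSize = 4 + 4 + 8 + 4 + 4 + 1 + 4;

    // Reads the stream in large chunks and returns the first int
    // (InternalId) of every record, in file order.
    public static List<int> ReadIds(Stream stream, int recordsPerChunk = 4096)
    {
        var ids = new List<int>();
        byte[] chunk = new byte[RecordSize * recordsPerChunk];
        int filled = 0; // bytes currently held in 'chunk'

        while (true)
        {
            int n = stream.Read(chunk, filled, chunk.Length - filled);
            if (n == 0) break; // end of stream
            filled += n;

            // Decode every complete record currently in the buffer.
            int whole = filled / RecordSize;
            for (int i = 0; i < whole; i++)
            {
                ids.Add(BitConverter.ToInt32(chunk, i * RecordSize));
            }

            // Move any partial record to the front for the next read.
            int consumed = whole * RecordSize;
            filled -= consumed;
            if (filled > 0)
            {
                Buffer.BlockCopy(chunk, consumed, chunk, 0, filled);
            }
        }

        if (filled != 0)
        {
            throw new EndOfStreamException("File ends with a partial record.");
        }
        return ids;
    }
}
```

With the default of 4096 records per chunk, a 500 MB file needs on the order of a few thousand Stream.Read calls instead of one or more per record; the right chunk size is something to measure rather than assume.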

Published: 2023-11-09 10:37:13
Link: https://www.elefans.com/category/jswz/34/1572157.html