Is there a fast and scalable solution to save data?

Programming Basics | Industry News | Updated: 2024-10-25 03:18:45

I'm developing a service that needs to be scalable on the Windows platform.

Initially it will receive approximately 50 connections per second (each connection will send approximately 5 KB of data), but it needs to scale to receive more than 500 per second in the future.

It's impractical (I guess) to save the received data to a common database like Microsoft SQL Server.

Is there another solution for saving the data, considering that it will receive more than 6 million "records" per day?

There are 5 steps:

1. Receive the data via an HTTP handler (C#);
2. Save the received data; <- HERE
3. Request the saved data to be processed;
4. Process the requested data;
5. Save the processed data. <- HERE

My pre-solution is:

1. Receive the data via an HTTP handler (C#);
2. Save the received data to a message queue;
3. Request the saved data from the queue using a Windows service;
4. Process the requested data;
5. Save the processed data to Microsoft SQL Server (here's the bottleneck).

Accepted answer

6 million records per day doesn't sound particularly huge. In particular, that's not 500 per second for 24 hours a day - do you expect traffic to be "bursty"?
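A quick back-of-the-envelope calculation makes this point concrete (a minimal sketch in Java; the numbers come straight from the question):

```java
public class RateEstimate {
    public static void main(String[] args) {
        long recordsPerDay = 6_000_000L;
        long secondsPerDay = 24 * 60 * 60;              // 86,400

        // Average arrival rate if 6 million records are spread over a day.
        double avgPerSecond = (double) recordsPerDay / secondsPerDay;
        System.out.printf("average: %.1f records/s%n", avgPerSecond);   // ~69.4

        // By contrast, a sustained 500/s for 24 hours would be far more.
        long sustainedDaily = 500L * secondsPerDay;     // 43,200,000
        System.out.println("500/s sustained for 24h = " + sustainedDaily + " records/day");
    }
}
```

So 6 million/day averages under 70 records per second; hitting 500/s only makes sense if the traffic arrives in bursts, which is exactly what a buffering layer smooths out.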

I wouldn't personally use a message queue - I've been bitten by instability and general difficulties before now. I'd probably just write straight to disk. In memory, use a producer/consumer queue with a single thread writing to disk. Producers just dump records to be written into the queue.
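The in-memory producer/consumer queue with a single writer thread could look roughly like this (a sketch in Java rather than the question's C#; the class and sentinel names are illustrative, and the "disk write" is simulated by collecting into a list):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class DiskWriterQueue {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);
    private final List<String> written = new ArrayList<>();   // stand-in for the log file
    private final Thread writer;

    public DiskWriterQueue() {
        // The single consumer: drains the queue and appends to "disk" in order.
        writer = new Thread(() -> {
            try {
                while (true) {
                    String record = queue.take();         // blocks until a record arrives
                    if (record.equals("<EOF>")) break;    // sentinel for clean shutdown
                    written.add(record);                  // real code: append to a file
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
    }

    // Called from the HTTP handler threads (the producers); blocks briefly if
    // the queue is full, which applies natural back-pressure.
    public void enqueue(String record) throws InterruptedException {
        queue.put(record);
    }

    public List<String> shutdownAndDrain() throws InterruptedException {
        queue.put("<EOF>");
        writer.join();
        return written;
    }
}
```

In C# the equivalent building block would be `BlockingCollection<T>` with a dedicated writer thread; the shape of the design is the same.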

Have a separate batch task which will insert a bunch of records into the database at a time.

Benchmark to find the optimal (or at least a "good") number of records to batch-upload at a time. You may well want to have one thread reading from disk and a separate one writing to the database (with the file thread blocking if the database thread has a big backlog) so that you don't wait for both file access and the database at the same time.
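The batching step itself is simple to isolate so the batch size can be benchmarked. A minimal sketch (Java, illustrative names; the flush callback stands in for a batched INSERT or bulk copy into SQL Server):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchInserter {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final Consumer<List<String>> flushToDatabase;

    public BatchInserter(int batchSize, Consumer<List<String>> flushToDatabase) {
        this.batchSize = batchSize;                 // the number to benchmark
        this.flushToDatabase = flushToDatabase;
    }

    public void add(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) flush();    // full batch: hand off to the DB
    }

    // Flush any remainder, e.g. on shutdown or a timer tick.
    public void flush() {
        if (buffer.isEmpty()) return;
        flushToDatabase.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With the callback swapped for a real database write (in .NET, `SqlBulkCopy` or a parameterised multi-row INSERT would be natural choices), varying `batchSize` and timing the flushes gives the benchmark the answer recommends.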

I suggest that you do some tests nice and early to see what the database can cope with (which also lets you test various different configurations). Work out where the bottlenecks are, and how much they're going to hurt you.

Published: 2023-07-24 00:19:00
Source: https://www.elefans.com/category/jswz/34/1239190.html