Updated: 2024-10-27 05:23:26

Does unsetting a field improve performance if I query for documents that contain it?

I'm building an app where I have a standard collection with 500+ products. The company constantly runs sales, so at any given point, 2-10 products will be on sale.

I'm still trying to wrap my head around how to model in Mongo, but I'm trying to think in the "model after how the data will be accessed" style. Since the products page will be accessed more often than anything else, I'm thinking of adding the sale information directly into the product collection. Like this:

{
  _id: 1,
  name: "Widget",
  price: 15.99,
  ...
  sale: {
    reducedPrice: 9.99,
    saleStarts: "Nov 11, 2016",
    saleEnds: "Nov 18, 2016"
  }
}

I do have a page where all the current sales will be listed. It is not accessed as often, but it needs to exist. My question is about performance in that query as I don't want to go through every product every time that page is loaded and I'm trying to avoid duplicating information by having a second Sales collection.

As I understand it, when Mongo goes through a collection, if I'm looking for something like this:

Products.find({ sale: { $exists: true } })

It doesn't really go through all the records. So if I unset "sale" whenever a sale ends and just keep the field in the records that are currently on sale, then performance shouldn't really be too bad.

My question is: Am I missing something here? Is there a better way to do it?

Best answer

The way MongoDB works, like many other databases, is that you need an index on any field you want to query with reasonable performance. MongoDB tries to keep indexes in memory, so a query against an indexed field can usually be answered by traversing an efficient in-memory data structure instead of scanning documents on disk, which results in much better performance. There are many other details to this that I don't need to get into here, and Google will explain them very well.

You can read more about indexes in MongoDB's docs, but to really answer your question: if you don't have an index on the sale field in your collection, MongoDB will be forced to scan all documents in that collection from disk (although some may be cached in memory). An $exists query does not skip documents on its own; each one still has to be checked for the field.
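To make the difference concrete, here is a toy model in plain JavaScript. It is not MongoDB code: an array stands in for the collection and a Map stands in for an index that only lists documents having the sale field. The function names and the product data are made up for illustration; the only thing it shows is how many documents each approach has to look at.

```javascript
// Toy model only: a plain array stands in for the collection and a Map for an index.
// Nothing here talks to MongoDB; it just counts how many documents get examined.

// Build `total` products; only the ids listed in `onSaleIds` get a `sale` subdocument.
function buildProducts(total, onSaleIds) {
  const products = [];
  for (let id = 1; id <= total; id++) {
    const doc = { _id: id, name: "Product " + id, price: 15.99 };
    if (onSaleIds.includes(id)) {
      doc.sale = { reducedPrice: 9.99 };
    }
    products.push(doc);
  }
  return products;
}

// No index: every document must be inspected to see whether `sale` exists.
function findOnSaleByScan(products) {
  let examined = 0;
  const results = [];
  for (const doc of products) {
    examined++;
    if (doc.sale !== undefined) results.push(doc);
  }
  return { results, examined };
}

// "Index": a separate structure that only lists documents having `sale`.
function buildSaleIndex(products) {
  const index = new Map();
  for (const doc of products) {
    if (doc.sale !== undefined) index.set(doc._id, doc);
  }
  return index;
}

// With the index, only the indexed documents are touched.
function findOnSaleByIndex(index) {
  const results = Array.from(index.values());
  return { results, examined: results.length };
}

const products = buildProducts(500, [3, 42, 117, 256, 499]);
const scan = findOnSaleByScan(products);
const indexed = findOnSaleByIndex(buildSaleIndex(products));

console.log(scan.examined);    // 500
console.log(indexed.examined); // 5
```

Both approaches return the same five products; only the amount of work differs. With 500 documents the full scan is cheap in absolute terms, so whether the index is worth it depends on how the collection grows.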

You will have to find the sweet spot for how many indexes your server can hold, and trade off indexes on collections that aren't accessed as often as other collections. The more indexes you have, the more RAM the mongod daemon will consume.
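For the sale field specifically, one option that keeps the index small is a sparse index, which only holds entries for documents that actually have the field. The sketch below assumes `products` is a Collection object from the official MongoDB Node.js driver; the three function names are made up for illustration and are not part of any library. Whether the query planner actually uses the index is worth confirming with explain() on your own data.

```javascript
// Sketch assuming `products` is a Collection from the official MongoDB Node.js driver.
// The function names are illustrative only.

// A sparse index only holds entries for documents that have the `sale` field,
// so with 2-10 products on sale it stays very small.
async function ensureSaleIndex(products) {
  return products.createIndex({ sale: 1 }, { sparse: true });
}

// List the products that are currently on sale.
async function findCurrentSales(products) {
  return products.find({ sale: { $exists: true } }).toArray();
}

// When a sale ends, remove the field so the document drops out of the sparse index.
async function endSale(products, productId) {
  return products.updateOne({ _id: productId }, { $unset: { sale: "" } });
}
```

This matches the approach from the question: keep the sale subdocument only on products that are on sale and unset it when the sale ends. The index is what lets the sales page avoid touching every product.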

Published: 2023-08-02 16:28:00

Source: https://www.elefans.com/category/jswz/34/1378605.html