So, I'm using a Jupyter Lab notebook to connect to our production database. After a few days of work, we noticed that the server shows hundreds upon hundreds of active connections to the database, listed as established (running "netstat -na").

This is terribly bad, and we identified the issue as coming from the Python kernel opening connections to the server without ever actually closing them, even when explicitly told to do so.

This is a redacted version of the code we are using to connect to the server, run in a notebook cell by itself, separated from the other code. We isolated the issue and we are certain it comes from these lines:

```python
client = MongoClient(url, maxIdleTimeMS=120000)
db = client["database"]
coll = db["data"]
query = # Our query
data = list(coll.find(query))
client.close()
```

Why is this happening? What are we doing wrong? Why doesn't the .close() method actually close the connection?
Accepted answer
I have been using MongoDB in our production environment for quite a while now and have faced problems like this in the past. The line `data = list(coll.find(query))` actually materializes the results of the query returned by your cursor and causes the connection to stay alive. The result of a query is a generator and should be consumed as-is, in a loop. Materializing the cursor into a list() pulls all the data into memory, which can cause crashes at times, whereas the cursor merely points to the first entry in the result set.
You can simply iterate over the cursor:

```python
for elem in cursor:
    do_something(elem)
```

and the call to the close() method is not required.
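As a rough illustration of the difference (using a plain Python generator as a stand-in for the PyMongo cursor, so it runs without a live server; the data and names here are hypothetical):

```python
# Stand-in for a PyMongo cursor: a generator that yields documents lazily.
# A real cursor streams results from the server in batches the same way.
def fake_cursor():
    for i in range(3):
        yield {"_id": i}

# Materializing: every document is pulled into memory at once.
data = list(fake_cursor())
print(len(data))  # 3

# Lazy consumption: documents are processed one at a time,
# and memory usage stays flat regardless of result-set size.
for doc in fake_cursor():
    print(doc["_id"])
```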
Secondly, with Jupyter notebooks, you need to stop the session after you are done with your work. Unless this is done, the notebook will keep the connection to MongoDB alive, eating up resources along the way.
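Separately, if you do want the connection released deterministically, PyMongo's MongoClient supports the context-manager protocol, which calls close() when the block exits, even on an exception. A minimal sketch of the pattern, using a stand-in client class so it runs without a live server (the real code would simply be `with MongoClient(url) as client: ...`):

```python
# Stand-in for pymongo.MongoClient, to demonstrate the context-manager
# pattern without a live database. The real client behaves the same way:
# __exit__ calls close(), so the connection is released even on errors.
class FakeMongoClient:
    def __init__(self, url):
        self.url = url
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions

    def close(self):
        self.closed = True

# Usage mirrors the real thing: with MongoClient(url) as client: ...
with FakeMongoClient("mongodb://localhost:27017") as client:
    pass  # run queries here

print(client.closed)  # True
```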