Logging in an asynchronous Tornado (python) server



I am working on an application in which I may potentially need to log the entire traffic reaching the server. This feature may be turned on or off, or may be used when exceptions are caught.

In any case, I am concerned about the blocking nature of disk I/O operations and their impact on the performance of the server. The business logic that is applied when a request is handled (mostly POST http requests) is asynchronous, in that every network or db call is executed asynchronously.

On the other hand, I am concerned about the delay to the thread while it is waiting for the disk IO operation to complete. The logged messages can be a few bytes to a few KBs, but in some cases a few MBs. There is no real need for the thread to pause while data is written to disk: the http request can definitely complete at that point, and there is no reason for the ioloop thread not to work on another task while data is written to disk.

So my questions are:

1. Am I over-worried about this issue?
2. Is logging to standard output and later redirecting it to a file "good enough"?
3. What is the common approach, or the one you found most practical, for logging in Tornado-based applications, even for simple logging rather than the (extreme) case I outlined above?
4. Is this basically an ideal case for queuing the log messages and consuming them from a dedicated thread?
5. Say I do offload the logging to a different thread (like Homer Simpson's "Can't Someone Else Do It?"): if the thread that performs the disk logging is waiting for the disk IO operation to complete, does the Linux kernel take that as an opportunity for a context switch?

Any comments or suggestions are much appreciated,

Erez

Accepted answer


For "normal" logging (a few lines per request), I've always found logging directly to a file to be good enough. That may not be true if you're logging all the traffic to the server. The one time I've needed to do something like that I just captured the traffic externally with tcpdump instead of modifying my server.

If you want to capture it in the process, start by just writing to a file from the main thread. As always, measure things in your own environment before taking drastic action (IOLoop.set_blocking_log_threshold is useful for determining if your logging is a problem).

If writing from the main thread blocks for too long, you can either write to a queue that is processed by another thread, or write asynchronously to a pipe or socket to another process (syslog?).
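The queue-plus-dedicated-thread option can be sketched with the standard library's `logging.handlers.QueueHandler` and `QueueListener` (Python 3.2+): the ioloop thread only enqueues records, which is non-blocking, while the listener's background thread pays the cost of the actual write. A `StringIO` stream stands in here for the real `FileHandler` to keep the sketch self-contained:

```python
import io
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded queue of log records
queue_handler = logging.handlers.QueueHandler(log_queue)

# In a real server this would be a FileHandler; only the listener's
# thread would block on the disk write, never the ioloop thread.
buffer = io.StringIO()
stream_handler = logging.StreamHandler(buffer)
listener = logging.handlers.QueueListener(log_queue, stream_handler)

logger = logging.getLogger("traffic")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(queue_handler)

listener.start()
logger.info("request body logged without blocking the ioloop")
listener.stop()  # drains the queue and flushes before returning
```

Regarding question 5: yes, a thread blocked on disk I/O sleeps in the kernel, so the scheduler is free to run other threads (including the ioloop thread) in the meantime.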
