I want to use threading in Python to download a lot of web pages, and I came across the following code on a website, which uses queues.
It puts an infinite while loop in each thread's run() method. Does each thread run continuously, without ending, until all of them are complete? Am I missing something?
    #!/usr/bin/env python
    import Queue
    import threading
    import urllib2
    import time

    hosts = ["yahoo", "google", "amazon", "ibm", "apple"]

    queue = Queue.Queue()

    class ThreadUrl(threading.Thread):
        """Threaded Url Grab"""
        def __init__(self, queue):
            threading.Thread.__init__(self)
            self.queue = queue

        def run(self):
            while True:
                #grabs host from queue
                host = self.queue.get()

                #grabs urls of hosts and prints first 1024 bytes of page
                url = urllib2.urlopen(host)
                print url.read(1024)

                #signals to queue job is done
                self.queue.task_done()

    start = time.time()

    def main():
        #spawn a pool of threads, and pass them queue instance
        for i in range(5):
            t = ThreadUrl(queue)
            t.setDaemon(True)
            t.start()

        #populate queue with data
        for host in hosts:
            queue.put(host)

        #wait on the queue until everything has been processed
        queue.join()

    main()
    print "Elapsed Time: %s" % (time.time() - start)

Recommended answer
Setting the threads to be daemon threads causes them to exit when the main thread is done. But yes, you are correct: each thread keeps running for as long as there is something in the queue, and when the queue is empty it blocks inside queue.get() rather than ending.
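To make the blocking behaviour concrete, here is a minimal sketch. It assumes Python 3 names (`queue` rather than `Queue`, `daemon=True` rather than `setDaemon(True)`) and replaces the download with a stand-in so it runs without network access. The `worker` function and the `results` list are illustrative, not part of the original code.

```python
import queue
import threading

def worker(q, results):
    # Same shape as ThreadUrl.run(): loop forever, blocking in get() when idle.
    while True:
        item = q.get()                 # blocks while the queue is empty
        results.append(item.upper())   # stand-in for the actual download
        q.task_done()                  # lets q.join() account for this item

q = queue.Queue()
results = []

# Daemon threads do not keep the interpreter alive once the main thread ends.
threads = [
    threading.Thread(target=worker, args=(q, results), daemon=True)
    for _ in range(3)
]
for t in threads:
    t.start()

for host in ["yahoo", "google", "amazon"]:
    q.put(host)

q.join()  # returns once task_done() has been called for every queued item

# The workers have not ended; they are parked inside q.get().
still_blocked = all(t.is_alive() for t in threads)
print(sorted(results), still_blocked)
```

After `q.join()` returns, every worker is still alive and waiting in `q.get()`, which is why the original code marks them as daemon threads.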
The Queue documentation explains this detail.
The Python threading documentation explains the daemon part as well:
"The entire Python program exits when no alive non-daemon threads are left."
So once the queue has been emptied, queue.join() returns and main() finishes; when the interpreter then exits, the daemon threads die.
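If relying on interpreter exit to kill the workers is not desirable, one common alternative (not part of the original answer) is to push a sentinel value per worker so that each loop ends on its own. The sketch below assumes Python 3; `STOP`, `worker`, and `results` are hypothetical names chosen for illustration, and the download is replaced by a stand-in.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel meaning "no more work"

def worker(q, results):
    while True:
        item = q.get()
        if item is STOP:
            q.task_done()
            break                      # leave the loop so the thread can end
        results.append(len(item))      # stand-in for the actual download
        q.task_done()

q = queue.Queue()
results = []

# Non-daemon threads this time: they must finish before the program can exit.
threads = [threading.Thread(target=worker, args=(q, results)) for _ in range(2)]
for t in threads:
    t.start()

for host in ["yahoo", "google", "amazon"]:
    q.put(host)
for _ in threads:
    q.put(STOP)        # one sentinel per worker

q.join()               # all items, including sentinels, have been processed
for t in threads:
    t.join()           # each worker has returned from its loop

all_stopped = not any(t.is_alive() for t in threads)
print(sorted(results), all_stopped)
```

With this variant the threads end deterministically, at the cost of having to enqueue exactly one sentinel per worker.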