I am trying to run the following code,
```python
for parname in parss:
    data = {'action': 'listp', 'parish': parname}
    data = urllib.urlencode(data)
    req = urllib2.Request('http://www.irishancestors.ie/search/townlands/ded_index.php', data)
    response = urllib2.urlopen(req)
```

but a few minutes after the code starts executing I get the error below:
```
urllib2.URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
```

These are my proxy settings.
Any help is highly appreciated
Accepted answer
As discussed in the comments, executing a large number of requests in a very short time can lead the server, especially web servers, to block your connection attempts.
This is a common countermeasure against automated attacks on the web. Depending on the server, waiting a short time between requests should solve your problem.
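As a minimal sketch of this idea (the helper name `throttled` and the delay value are illustrative, not from the original answer), a fixed delay between consecutive requests could be inserted like this:

```python
import time

def throttled(items, delay):
    """Yield items one by one, sleeping `delay` seconds between consecutive ones."""
    first = True
    for item in items:
        if not first:
            time.sleep(delay)
        first = False
        yield item

# Hypothetical usage with the question's loop:
# for parname in throttled(parss, delay=2.0):
#     ...build and send the request for parname...
```

The right delay depends on the server; a second or two is often enough to stay under rate limits.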
You could also use a more dynamic approach. First, execute as many requests as possible with no waits in between. If a request takes significantly longer than usual, it is most likely a timeout and you have to wait. At this point, cancel the request, wait, and try again. If the subsequent attempt also times out, double the waiting time. With this procedure, called adaptive backoff, you should (hopefully) be able to access the data you want with minimal overhead.
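The retry-and-double logic described above can be sketched as follows. The function name `fetch_with_backoff` and its parameters are illustrative, and the request callable is assumed to raise `TimeoutError` when it times out:

```python
import time

def fetch_with_backoff(do_request, max_retries=5, base_wait=1.0):
    """Call do_request(); on a timeout, wait and retry, doubling the wait each time."""
    wait = base_wait
    for attempt in range(max_retries):
        try:
            return do_request()
        except TimeoutError:
            if attempt == max_retries - 1:
                raise                 # out of retries, give up
            time.sleep(wait)
            wait *= 2                 # adaptive backoff: double the wait
```

You would wrap each parish request in a small callable and pass it to `fetch_with_backoff`, so a temporarily blocked connection is retried with increasing pauses instead of failing outright.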