Having trouble with requests.get()

The following code worked yesterday. Now, when I run it from the terminal, it hangs and gives the error list index out of range. However, when I run the same code from my IDE, it works perfectly.

I do not understand what is going on. And no, the URL isn't invalid.

    import requests
    import bs4
    import webbrowser
    import csv


    def CheckStock(url):
        '''checks for shoes in stock'''
        headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
        RawHTML = requests.get(url, headers=headers)
        Page = bs4.BeautifulSoup(RawHTML.text, "lxml")
        ListOfRawSizes = Page.select('.size-dropdown-block')
        Sizes = str(ListOfRawSizes[0].getText()).replace('\t', '')
        Sizes = Sizes.replace('\n\n', ' ')
        Sizes = Sizes.split()
        Sizes.remove('Select')
        Sizes.remove('size')
        return Sizes

Best answer

The problem is that if the page doesn't contain any elements with the .size-dropdown-block class, Page.select() returns an empty list and ListOfRawSizes[0] is out of range. You can't ask for the first item of an empty list; doing so raises IndexError: list index out of range. The most likely cause is that the page you received this time differs from the one you received the previous day.

You need to check that ListOfRawSizes has any items in it before trying to index it. Fortunately, Python makes it easy to check whether a list is empty: the condition if ListOfRawSizes: is true when the list has at least one item and false when it is empty.

    ListOfRawSizes = Page.select('.size-dropdown-block')
    # check to see if the list of raw sizes is not empty
    if ListOfRawSizes:
        # we have at least one size so get the first item and do our work
        Sizes = str(ListOfRawSizes[0].getText()).replace('\t', '')
        Sizes = Sizes.replace('\n\n', ' ')
        Sizes = Sizes.split()
        Sizes.remove('Select')
        Sizes.remove('size')
        return Sizes
    # if we hit the else clause, our list must be empty
    else:
        # ...so return an empty list
        return []
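A related failure mode is worth guarding against in the same spirit: list.remove() raises ValueError when the value is not in the list, so the two remove() calls would also fail if the dropdown text ever lacked the words Select or size. The sketch below is one possible way to make that step tolerant. The names clean_sizes and PLACEHOLDER_WORDS are hypothetical, and it assumes the placeholder words never appear as real sizes. Unlike remove(), the filter drops every occurrence of a placeholder word, not just the first.

```python
# Hypothetical helper, not part of the original code: turns the raw dropdown
# text into a list of size tokens without raising if a placeholder is absent.
PLACEHOLDER_WORDS = {"Select", "size"}  # assumption: never valid sizes


def clean_sizes(raw_text):
    """Split dropdown text into size tokens, dropping placeholder words."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and never yields empty strings.
    tokens = raw_text.split()
    # Filtering never raises, whereas list.remove() raises ValueError
    # when the value is missing.
    return [token for token in tokens if token not in PLACEHOLDER_WORDS]


if __name__ == "__main__":
    sample = "\n\tSelect size\n\n\t8\n\n\t8.5\n\n\t9\n"
    print(clean_sizes(sample))  # ['8', '8.5', '9']
```

Because split() already handles tabs and newlines, this sketch skips the two replace() calls; that is equivalent only if tabs in the real page appear as indentation around the sizes rather than between characters of a single size.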

Additionally, you really shouldn't name your variables starting with capital letters, because they can very easily be confused with, or conflict with, class names. By convention (PEP 8), variables and functions in Python use snake case: all lowercase, with underscores separating words (e.g. this_is_snake_case).
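As an illustration of the points in this answer, here is a minimal sketch of the fetch-and-select step written with snake_case names and the empty-list check. It is not the asker's code: the function names, the shortened User-Agent value, and the 10-second timeout are assumptions, and it uses the standard-library html.parser instead of lxml only to avoid an extra dependency. The timeout and raise_for_status() calls are additions not mentioned in the answer; requests does not time out by default, so they may help show whether the terminal run is hanging on the request or receiving an error page, but that is a guess about the cause.

```python
import bs4
import requests

# Assumption: a shortened placeholder; substitute the full User-Agent string
# from the question if the site requires it.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def first_size_block_text(html):
    """Return the text of the first .size-dropdown-block element, or None."""
    page = bs4.BeautifulSoup(html, "html.parser")
    size_blocks = page.select(".size-dropdown-block")
    # An empty result is falsy, so this guards the [0] index below.
    if not size_blocks:
        return None
    return size_blocks[0].get_text()


def fetch_size_block_text(url, timeout=10):
    """Download url and return the size dropdown text, or None if absent."""
    # Without a timeout, requests.get() can wait indefinitely for a response.
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    # Raise requests.HTTPError on a 4xx/5xx status instead of parsing an error page.
    response.raise_for_status()
    return first_size_block_text(response.text)
```

Returning None rather than raising lets the caller decide how to treat a page without a size dropdown, for example by logging response.text to see what the server actually sent.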
