I am new to Python and am using the BeautifulSoup module. I have to find and get the text of every id like MathJax-Element-1, MathJax-Element-2, MathJax-Element-3, MathJax-Element-4, and so on, if it exists.
My code is:

    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html_doc, 'html.parser')
    attempts = 0
    a=-1
    while attempts < 100:
        try:
            a+=1
            math="MathJax-Element-"
            math +=`a`
            soup=(soup.find(id=math))
            print(soup.get_text())
            attempts = 0
        except AttributeError:
            attempts +=1

But after an AttributeError the code fails. For example, if there is no id MathJax-Element-2, then I do not get the text of any id following it, such as MathJax-Element-3 and MathJax-Element-4.
After the exception, the next attempt seems to skip the line that raised it, i.e. soup=(soup.find(id=math)).
What has gone wrong in my code?
Best answer
    soup=(soup.find(id=math))
    print(soup.get_text())

These lines overwrite the existing soup BeautifulSoup object with the result of find. That result is either a single HTML element, in which case later searches only look inside that element, or None when no element has the requested id, and None has no find method. This means that soup.find will fail for every iteration after the first one.
Try using a different variable name.
    element = soup.find(id=math)
    print(element.get_text())
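For completeness, here is a minimal sketch of the whole loop with that rename applied, written for Python 3 (where the backtick syntax from the question no longer exists). The sample `html_doc`, the helper name `collect_mathjax_text`, and the `max_misses` parameter are assumptions made for illustration; they are not part of the original question. It checks for None explicitly instead of catching AttributeError, which is one possible design choice rather than the only one.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document: MathJax-Element-2 is deliberately missing.
html_doc = """
<div>
  <span id="MathJax-Element-1">x + y</span>
  <span id="MathJax-Element-3">a = b</span>
  <span id="MathJax-Element-4">E = mc^2</span>
</div>
"""


def collect_mathjax_text(soup, max_misses=100):
    """Collect the text of MathJax-Element-N ids, giving up after
    max_misses consecutive ids that are not present."""
    texts = []
    misses = 0
    n = 0
    while misses < max_misses:
        # Always search the full document; `soup` is never reassigned.
        element = soup.find(id="MathJax-Element-{}".format(n))
        if element is None:
            # No such id: count the miss and move on to the next number.
            misses += 1
        else:
            texts.append(element.get_text())
            misses = 0
        n += 1
    return texts


soup = BeautifulSoup(html_doc, "html.parser")
texts = collect_mathjax_text(soup)
for text in texts:
    print(text)
```

Because the missing MathJax-Element-2 only increments the miss counter, the loop carries on to MathJax-Element-3 and MathJax-Element-4 instead of stopping at the gap.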