How does BeautifulSoup get the content inside a span?

Updated: 2024-10-25 02:31:27

I'm trying to parse fixture contents from a website. I managed to parse the Match column, but I'm having difficulty parsing the date and time column.

My program

    import re
    import pytz
    import requests
    import datetime
    from bs4 import BeautifulSoup
    from espncricinfo.exceptions import MatchNotFoundError, NoScorecardError
    from espncricinfo.match import Match

    bigbash_article_link = "http://www.espncricinfo.com/ci/content/series/1128817.html?template=fixtures"
    r = requests.get(bigbash_article_link)
    bigbash_article_html = r.text
    soup = BeautifulSoup(bigbash_article_html, "html.parser")
    bigbash1_items = soup.find_all("span",{"class": "fixture_date"})
    bigbash_items = soup.find_all("span",{"class": "play_team"})
    bigbash_article_dict = {}
    date_dict = {}
    for div in bigbash_items:
        a = div.find('a')['href']
        bigbash_article_dict[div.find('a').string] = a
    print(bigbash_article_dict)
    for div in bigbash1_items:
        a = div.find('span').string
        date_dict[div.find('span').string] = a
    print(date_dict)

When I execute this, print(bigbash_article_dict) produces output, but the print(date_dict) part gives me an error. How can I parse the date and time content?

Best answer

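Going by your code, you want the content inside the span tag. Each item in bigbash1_items is already the span itself, so div.find('span') searches for a span nested inside it; when there is none, find() returns None, and accessing .string on None raises an error. Use "div.contents" to get the contents of the span instead.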
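To make the failure concrete, here is a minimal sketch that runs on an inline HTML snippet rather than the live page. The snippet is an assumption, modelled on the span shown in this answer's example output; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Assumed markup, modelled on the span shown in the answer's example output.
html = '<span class="fixture_date"> Thu Feb 22 </span>'
soup = BeautifulSoup(html, "html.parser")

span = soup.find("span", {"class": "fixture_date"})

# find() only searches the tag's descendants. This span has no nested span,
# so the lookup returns None, and None has no .string attribute.
nested = span.find("span")
print(nested)

# The text is the first child of the span itself.
date_text = span.contents[0].strip()
print(date_text)
```

If the span on the real page contains further child elements, contents[0] still refers only to its first child, so it is worth printing the span once to check what it holds.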

A clearer title for your question would be "How does BeautifulSoup get the content inside a span?".

For example, this loop:

    for div in bigbash1_items:
        print("div=",div)
        print("div.contents[0].strip()=",div.contents[0].strip(),"\r\n------------\r\n")

prints the following for each span:

    div= <span class="fixture_date"> Thu Feb 22 </span>
    div.contents[0].strip()= Thu Feb 22
    ------------
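As an alternative to indexing into contents, the same text can be collected with get_text(strip=True). This is a swapped-in technique, not what the answer above uses. The sketch below assumes hypothetical markup with two date spans and gathers the dates into a list instead of the question's date_dict.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two date spans in the style of the example above.
html = """
<span class="fixture_date"> Thu Feb 22 </span>
<span class="fixture_date"> Fri Feb 23 </span>
"""
soup = BeautifulSoup(html, "html.parser")

dates = []
for span in soup.find_all("span", {"class": "fixture_date"}):
    # get_text(strip=True) strips each text fragment inside the span
    # and joins the fragments into one string.
    dates.append(span.get_text(strip=True))

print(dates)
```

Unlike contents[0], get_text() also includes text from any nested tags, which may or may not be what you want depending on the page's actual structure.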

Published: 2023-08-05 00:12:00
Article link: https://www.elefans.com/category/jswz/34/1423299.html