How to remove HTML tags from strings in Python using BeautifulSoup
Programming newbie here :)
I'd like to print the prices from a website using BeautifulSoup. This is my code:
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from bs4 import BeautifulSoup, SoupStrainer
    from urllib2 import urlopen

    url = "Some retailer's url"
    html = urlopen(url).read()
    product = SoupStrainer('span', {'style': 'color:red;'})
    soup = BeautifulSoup(html, parse_only=product)
    print soup.prettify()

and it prints the prices in the following form:

    <span style="color:red;"> 180 </span>
    <span style="color:red;"> 1250 </span>
    <span style="color:red;"> 380 </span>

I tried print soup.text.strip() but it returned 1801250380
Please help me print each price on its own line :)
Many thanks!
Best answer
    >>> print "\n".join([p.get_text(strip=True) for p in soup.find_all(product)])
    180
    1250
    380
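For readers on Python 3, the same idea can be sketched roughly as follows. This is only an illustration under a few assumptions: the inline `html` string is hypothetical sample markup standing in for the retailer's page (the real URL is not given), and the built-in `"html.parser"` backend is chosen here just to avoid an extra dependency. It extracts the text of each matching `span` separately and joins the pieces with newlines, and it also shows `get_text` with a separator as a possible one-call alternative.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical sample markup; the original question fetched a real page instead.
html = """
<div>
  <span style="color:red;"> 180 </span>
  <span>not a price</span>
  <span style="color:red;"> 1250 </span>
  <span style="color:red;"> 380 </span>
</div>
"""

# Same filter as in the question: keep only the red <span> elements while parsing.
product = SoupStrainer("span", {"style": "color:red;"})
soup = BeautifulSoup(html, "html.parser", parse_only=product)

# Take the text of each tag on its own, stripped of surrounding whitespace.
prices = [span.get_text(strip=True) for span in soup.find_all("span")]
print("\n".join(prices))

# Possible alternative: let get_text insert the separator between text nodes.
joined = soup.get_text("\n", strip=True)
print(joined)
```

Calling `soup.text` concatenates every text node with no separator, which is why the prices ran together in the question; taking the text per tag, or passing a separator to `get_text`, keeps them apart.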