Dryscrape: scrape child node data from parent node list using xpath

Updated: 2024-10-25 06:21:31

I am trying to scrape http://quotes.toscrape.com/ with dryscrape and Python, for learning purposes. I can already get all the divs with class="quote". I would like to loop through that list of divs and use XPath to extract several pieces of data from each parent element.

```python
import dryscrape
from bs4 import BeautifulSoup

session = dryscrape.Session()
url = 'http://quotes.toscrape.com/'

print('Visiting the URL...')
session.visit(url)
print('Status:', session.status_code())

for div in session.xpath("//div[@class='quote']"):
    # please help me to scrape author and quote for each div elements
    pass
```

Best answer

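One option is to use requests and BeautifulSoup instead of dryscrape:

```python
import requests
from bs4 import BeautifulSoup

url = 'http://quotes.toscrape.com/'
r = requests.get(url)
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(r.text, 'html.parser')

# Each div with class="quote" holds one quote and its author
for div in soup.find_all('div', {'class': 'quote'}):
    print('Quote : ' + div.find('span').get_text())
    print('Author : ' + div.find('small').get_text())
```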

With dryscrape, we can loop over the elements returned by the XPath query. Each one is a node object that holds the content of a single matched element, and each node has its own methods for getting data out of it.

```python
import dryscrape

session = dryscrape.Session()
url = 'http://quotes.toscrape.com/'

print('Visiting the URL...')
session.visit(url)
print('Status:', session.status_code())

# The leading "." makes each inner XPath relative to the current div
for div in session.xpath("//div[@class='quote']"):
    print('Quote:', div.at_xpath('.//span').text())
    print('Author:', div.at_xpath('.//small').text())
```
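The same parent-then-child pattern can be tried offline. The sketch below is not dryscrape: it swaps in the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset, so that it runs without a browser backend or network access. The `SAMPLE` markup and the `extract_quotes` helper are hypothetical, and the class names (`text`, `author`, `tag`) are assumptions modeled on the quote blocks of the target site. It shows how a path beginning with `.//` is evaluated relative to each parent div, which is what makes it possible to pull several fields from the same block.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup modeled on the quote blocks of
# quotes.toscrape.com; the real page contains more elements.
SAMPLE = """
<html><body>
  <div class="quote">
    <span class="text">First quote.</span>
    <span>by <small class="author">Author One</small></span>
    <div class="tags"><a class="tag">life</a><a class="tag">books</a></div>
  </div>
  <div class="quote">
    <span class="text">Second quote.</span>
    <span>by <small class="author">Author Two</small></span>
    <div class="tags"><a class="tag">humor</a></div>
  </div>
</body></html>
"""


def extract_quotes(markup):
    """Return one dict per quote block, using paths relative to each block."""
    root = ET.fromstring(markup)
    results = []
    # First select every parent node ...
    for div in root.findall(".//div[@class='quote']"):
        # ... then query inside it; ".//" searches only below this div.
        results.append({
            "quote": div.find(".//span[@class='text']").text,
            "author": div.find(".//small[@class='author']").text,
            "tags": [a.text for a in div.findall(".//a[@class='tag']")],
        })
    return results


quotes = extract_quotes(SAMPLE)
for item in quotes:
    print("Quote:", item["quote"])
    print("Author:", item["author"])
    print("Tags:", ", ".join(item["tags"]))
```

Assuming the live page uses the same class names, the equivalent dryscrape calls would be along the lines of `div.at_xpath(".//small[@class='author']").text()` for a single child and `div.xpath(".//a[@class='tag']")` for a list of children.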


Published: 2023-08-03 12:49:00

Article link: https://www.elefans.com/category/jswz/34/1390132.html