检索特定产品的亚马逊评论

编程入门 行业动态 更新时间:2024-10-26 06:30:00
本文介绍了检索特定产品的亚马逊评论的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我目前正在研究一个项目,该项目需要分析特定产品的评论并获得有关该产品的总体思路.

I'm currently working on a research project which need to analyze reviews of a particular product and get an overall idea about the product.

我听说亚马逊是获得产品评论/评论的好地方.有没有办法通过API从Amazon检索这些用户评论/评论?我尝试了几种python代码,但没有用.如果没有API可以检索数据,我是否需要编写蜘蛛程序?

I heard that amazon is a good place to get product reviews/comments. Is there any way to retrieve those user reviews/comments from Amazon via an API?? I tried several python codes but it doesn't work.. Do i need to write a spider if there is no API to retrieve data?

是否有任何方法/场所来检索给定产品的用户评论?

Are there any approaches/places to retrieve user reviews for a given product?

推荐答案

www.Scrapehero上有一个很好的教程,介绍如何抓取Amazon产品详细信息:如何使用Python抓取Amazon产品详细信息和定价

www.Scrapehero has a great tutorial on how to scrape Amazon product details: How To Scrape Amazon Product Details and Pricing using Python

他们使用的完整纯文本代码是...产品由其ASIN标识,因此请将数组值更改为您感兴趣的产品.

The complete plain text code they use is ... Products are identified by their ASIN so change the array values to the products you are interested in watching.

from lxml import html import csv,os,json import requests from exceptions import ValueError from time import sleep def AmzonParser(url): headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'} page = requests.get(url,headers=headers) while True: sleep(3) try: doc = html.fromstring(page.content) XPATH_NAME = '//h1[@id="title"]//text()' XPATH_SALE_PRICE = '//span[contains(@id,"ourprice") or contains(@id,"saleprice")]/text()' XPATH_ORIGINAL_PRICE = '//td[contains(text(),"List Price") or contains(text(),"M.R.P") or contains(text(),"Price")]/following-sibling::td/text()' XPATH_CATEGORY = '//a[@class="a-link-normal a-color-tertiary"]//text()' XPATH_AVAILABILITY = '//div[@id="availability"]//text()' RAW_NAME = doc.xpath(XPATH_NAME) RAW_SALE_PRICE = doc.xpath(XPATH_SALE_PRICE) RAW_CATEGORY = doc.xpath(XPATH_CATEGORY) RAW_ORIGINAL_PRICE = doc.xpath(XPATH_ORIGINAL_PRICE) RAw_AVAILABILITY = doc.xpath(XPATH_AVAILABILITY) NAME = ' '.join(''.join(RAW_NAME).split()) if RAW_NAME else None SALE_PRICE = ' '.join(''.join(RAW_SALE_PRICE).split()).strip() if RAW_SALE_PRICE else None CATEGORY = ' > '.join([i.strip() for i in RAW_CATEGORY]) if RAW_CATEGORY else None ORIGINAL_PRICE = ''.join(RAW_ORIGINAL_PRICE).strip() if RAW_ORIGINAL_PRICE else None AVAILABILITY = ''.join(RAw_AVAILABILITY).strip() if RAw_AVAILABILITY else None if not ORIGINAL_PRICE: ORIGINAL_PRICE = SALE_PRICE if page.status_code!=200: raise ValueError('captha') data = { 'NAME':NAME, 'SALE_PRICE':SALE_PRICE, 'CATEGORY':CATEGORY, 'ORIGINAL_PRICE':ORIGINAL_PRICE, 'AVAILABILITY':AVAILABILITY, 'URL':url, } return data except Exception as e: print e def ReadAsin(): # AsinList = csv.DictReader(open(os.path.join(os.path.dirname(__file__),"Asinfeed.csv"))) AsinList = ['B0046UR4F4', 'B00JGTVU5A', 'B00GJYCIVK', 'B00EPGK7CQ', 'B00EPGKA4G', 'B00YW5DLB4', 'B00KGD0628', 'B00O9A48N2', 'B00O9A4MEW', 'B00UZKG8QU',] extracted_data = [] for i in AsinList: url = "www.amazon/dp/"+i print "Processing: "+url extracted_data.append(AmzonParser(url)) sleep(5) f=open('data.json','w') json.dump(extracted_data,f,indent=4) if __name__ == "__main__": ReadAsin()

更多推荐

检索特定产品的亚马逊评论

本文发布于:2023-10-12 20:54:41,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1485870.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:亚马逊   产品

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!