Scraping Rental Listings from a Website

Updated: 2024-10-08 12:38:49

# How to scrape data from the web with Python

from flask import Flask, request, jsonify
import requests
from bs4 import BeautifulSoup
import pymysql

app = Flask(__name__)

@app.route("/get_houses_db/")
def get_houses_db():
    # Data read from the database: url is the listing URL,
    # address is the listing's location.
    houses = []
    # Connect to the database
    connection = pymysql.connect(host='127.0.0.1',
                                 user='root',
                                 password='123',
                                 database='your_database_name',
                                 charset='utf8mb4',
                                 cursorclass=pymysql.cursors.DictCursor)
    try:
        with connection.cursor() as cursor:
            # Read every matching record
            sql = ("SELECT your_url_column, your_address_column "
                   "FROM your_listing_table WHERE 1=1")
            keyword = request.args.get('keyword')
            if keyword is not None:
                # Bind the keyword as a parameter instead of formatting it
                # into the SQL string, which would allow SQL injection.
                sql += " AND your_search_column LIKE %s"
                cursor.execute(sql, ('%' + keyword + '%',))
            else:
                cursor.execute(sql)
            houses = cursor.fetchall()
    finally:
        connection.close()
    return jsonify(houses)
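
The keyword filter is the part of this endpoint that is easiest to get wrong, so here is a minimal sketch of it in isolation. The function name build_house_query is hypothetical, and the table and column names are the same placeholders used in the listing, not real schema names. The idea is to keep %s as the driver's bind placeholder and put the LIKE wildcards into the bound value.

```python
def build_house_query(keyword=None):
    """Return (sql, params) for the listing lookup.

    Table and column names are placeholders, as in the article's listing.
    """
    sql = ("SELECT your_url_column, your_address_column "
           "FROM your_listing_table WHERE 1=1")
    params = []
    if keyword:
        # %s here is the database driver's placeholder, not Python string
        # formatting; the wildcard characters go into the bound value.
        sql += " AND your_search_column LIKE %s"
        params.append("%" + keyword + "%")
    return sql, tuple(params)


sql, params = build_house_query("abc")
print(sql)
print(params)
```

The resulting pair could then be handed to the cursor as cursor.execute(sql, params or None), so that the no-keyword case runs the plain query.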

@app.route("/get_houses", methods=['POST', 'GET'])
def get_houses():
    # Data scraped directly from the web pages: url is the listing URL,
    # address is the listing's location text.
    houses = []
    city = request.args.get('city')
    if city is None:
        city = 'bj'
    # Host exactly as given in the source text; its domain suffix appears to be cut off.
    city_url = 'http://%s.58' % city
    for page_num in range(1, 10):  # pages 1 through 9
        url = "%s/pinpaigongyu/pn/%d/" % (city_url, page_num)
        headers = {
            'connection': "keep-alive",
            'upgrade-insecure-requests': "1",
            'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36",
            'accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
            'accept-encoding': "gzip, deflate",
            'accept-language': "zh-CN,zh;q=0.9,en;q=0.8,da;q=0.7",
            'cookie': "f=n; f=n; id58=c5/njVsEqPqC7y9vB/RHAg==; 58tj_uuid=ac94c044-cbb8-451c-b6be-974f90197010; new_uv=1; utm_source=; spm=; init_refer=https%253A%252F%252Fcn.bing%252F; als=0; f=n; new_session=0; qz_gdt=; Hm_lvt_dcee4f66df28844222ef0479976aabf1=1527032264,1527032267,1527032270,1527032380; Hm_lpvt_dcee4f66df28844222ef0479976aabf1=1527032421; ppStore_fingerprint=3283C76981CCD1090B42ACBBF624A4C9613FE967CDC69C58%EF%BC%BF1527032420843",
            'cache-control': "no-cache",
        }
        response = requests.get(url, headers=headers, timeout=10)
        html_soup = BeautifulSoup(response.text, "html.parser")
        ul = html_soup.find(attrs={"class": "list"})
        if ul is None:
            continue
        # find_all returns an empty list (never None) when nothing matches
        for li in ul.find_all("li"):
            link = li.find("a")
            title = li.find("h2")
            # Skip list items that lack a link or a title
            if link is None or title is None or not link.get('href'):
                continue
            houses.append({
                'url': '%s/%s' % (city_url, link['href']),
                'address': title.text.strip(),
            })
    return jsonify(houses)
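
The scraper builds one URL per result page and then joins each listing's href onto the city host with '%s/%s'. If the href is already root-relative (starts with a slash), that join produces a double slash. The following is a small sketch of both steps using only the standard library; the helper names page_urls and absolute_link are hypothetical, and http://bj.example.com is a stand-in host, not the site used above.

```python
from urllib.parse import urljoin


def page_urls(city_url, first=1, last=9):
    """Build the paginated list URLs, mirroring range(1, 10) in the scraper."""
    base = city_url.rstrip("/")
    return ["%s/pinpaigongyu/pn/%d/" % (base, n) for n in range(first, last + 1)]


def absolute_link(city_url, href):
    """Resolve a listing href against the city host without doubling slashes."""
    return urljoin(city_url.rstrip("/") + "/", href)


urls = page_urls("http://bj.example.com")  # hypothetical host
print(len(urls), urls[0])
print(absolute_link("http://bj.example.com", "/pinpaigongyu/123.shtml"))
```

urljoin also leaves an href alone when it is already an absolute URL, which plain string formatting would not.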

@app.route('/')
def index():
    # Serve static/index.html as the front page
    return app.send_static_file('index.html')

if __name__ == '__main__':
    app.run(port=8888)

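Under Python 3, install Flask first with: pip install Flask
The script also imports requests, bs4 (installed as the beautifulsoup4 package), and pymysql, which can be installed with pip in the same way.
Then start the app with: python app.py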
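
Once the app is running, the two endpoints can be queried over HTTP. This is a minimal client sketch using only the standard library. It assumes the server listens on 127.0.0.1:8888 (Flask's default host plus the port passed to app.run); the helper names endpoint_url and fetch_houses are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://127.0.0.1:8888"  # assumption: Flask default host, port from app.run


def endpoint_url(path, **params):
    """Build the full URL for an endpoint, encoding any query parameters."""
    query = urlencode(params)
    return BASE + path + ("?" + query if query else "")


def fetch_houses(path, **params):
    """Call an endpoint on the running app and decode its JSON response."""
    with urlopen(endpoint_url(path, **params), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URLs needs no running server; fetch_houses does.
print(endpoint_url("/get_houses", city="sh"))
print(endpoint_url("/get_houses_db/", keyword="abc"))
```

Calling fetch_houses("/get_houses", city="sh") should return the list of dictionaries with url and address keys that the route serializes with jsonify.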

How to scrape publicly available domain-specific data (real-estate rental listings) from the web


Published: 2024-03-05 12:14:44
Article link: https://www.elefans.com/category/jswz/34/1712269.html