Personal notes only: code without explanation, usable as copied. Maoyan character anti-scraping

Updated: 2024-10-09 22:22:36


Maoyan character anti-scraping
import requests
import re
import os, base64
from fontTools.ttLib import TTFont
from lxml import etree

class MaoYan(object):
    def __init__(self):
        # NOTE: the scheme and host of this URL were stripped in the source text; left as-is
        self.url = '/?ver=normal'
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
            "Accept-Encoding": "gzip, deflate, br",
            "Connection": "keep-alive",
            "Upgrade-Insecure-Requests": "1",
            "Cache-Control": "max-age=0",
            # NOTE: the host value appears truncated in the source text; left as-is
            "Host": "piaofang.maoyan",
            # "Cookie": "_lxsdk_cuid=169c71f50e281-0e7c6e010a096-4c312c7c-13c680-169c71f50e3c8; _lxsdk_s=169c71f50e3-37f-516-194%7C%7C24; _lxsdk=4AC4EB9051C411E99C7A6F43C8178BB4643B432F07D64931B5C84EF95DFE1116; __mta=187110935.1553823912192.1553823918296.1553823929644.3; __mta=187110935.1553823912192.1553823912192.1553823914570.2; theme=moviepro; _lx_utm=utm_source%3DBaidu%26utm_medium%3Dorganic"
        }
        # Glyph name -> real digit, for the reference font ./fonts/123.woff
        self.font_dict = {
            'uniEF00': "2", 'uniF143': "1", 'uniEBC3': "8", 'uniEE14': "4", 'uniF115': "7",
            'uniE4D7': "5", 'uniF4A2': "0", 'uniF366': "3", 'uniE55B': "9", 'uniF4A8': "6"
        }

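    # Send the request and return the raw response body
    def get_html(self, url):
        response = requests.get(url, headers=self.headers)
        return response.content

    # Save the font embedded in the page to ./fonts/456.woff
    def create_font(self, font_file):
        # Make sure the font directory exists, then list the files already downloaded
        os.makedirs('./fonts', exist_ok=True)
        file_list = os.listdir('./fonts')
        # Check whether the font has already been downloaded
        if font_file not in file_list:
            # Not downloaded yet: decode the base64 data URI and save it
            print('Not in font library, downloading:', font_file)
            new_file = font_file.replace("data:application/font-woff;charset=utf-8;base64,", "")
            new_file = base64.b64decode(new_file)
            # print(new_file)
            with open('./fonts/' + "456.woff", 'wb') as f:
                f.write(new_file)
        # Open the font file and create the self.font attribute
        # self.font.saveXML(r'.\fonts\456.xml')
        # # print(self.font)

    # Match the downloaded font against the reference font to recover the real digits
    def modify_data(self):
        new_dict = {}
        # Font downloaded from the current page
        font = TTFont('./fonts/456.woff')
        result1_list = font.getGlyphOrder()[2:]
        # Reference font whose glyph names are the keys of self.font_dict
        font2 = TTFont('./fonts/123.woff')
        result2_list = font2.getGlyphOrder()[2:]
        for result1 in result1_list:
            obj1 = font['glyf'][result1]
            for result2 in result2_list:
                obj2 = font2['glyf'][result2]
                # Identical glyph outlines mean the two names stand for the same digit
                if obj2 == obj1:
                    # 'uniE4D7' -> '&#xe4d7', the form used in the page HTML
                    entity = "&#x" + result1[3:].lower()
                    new_dict[entity] = self.font_dict[result2]
                    break
        return new_dict

    def start_crawl(self):
        html = self.get_html(self.url).decode('utf-8')
        # print(html)
        # Extract the embedded font with a regular expression
        font_file = re.findall(r'src:url\((.+?)\) format', html)[0]
        # print(font_file)
        self.create_font(font_file)
        # # Extract the box office figures
        # # html = etree.HTML(html)
        # # star = html.xpath("//div[@id='ticket_tbody']/ul[1]/li[2]/b/i/text()")
        star = re.findall(r"""<li class="c2 ">.*?class="cs">(.+?)</i>.*?</li>.*?<li class="c3 "><i class="cs">(.+?)%</i></li>""", html, re.S)[0]
        print(star)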
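The code above builds a mapping in modify_data() but never applies it to the scraped text. A minimal sketch of that last step might look like the following. It assumes the raw HTML still contains the obfuscated digits as entities such as `&#xe4d7;`, and that the mapping has the same shape as the dict returned by modify_data(). The function name decode_obfuscated and the sample values are hypothetical and not part of the original notes.

```python
import re


def decode_obfuscated(text, mapping):
    """Replace each '&#x....;' entity found in `mapping` with its real digit.

    `mapping` is assumed to look like the dict returned by modify_data(),
    e.g. {'&#xe4d7': '5'} (keys lower-cased, without the trailing ';').
    """
    def substitute(match):
        key = match.group(1).lower()
        # Entities that are not in the mapping are left untouched
        return mapping.get(key, match.group(0))

    return re.sub(r'(&#x[0-9a-fA-F]+);', substitute, text)


# Hypothetical sample data, for illustration only
sample_mapping = {'&#xe4d7': '5', '&#xf4a2': '0', '&#xf143': '1'}
sample_text = '&#xf143;&#xe4d7;.&#xf4a2;'
decoded = decode_obfuscated(sample_text, sample_mapping)
print(decoded)  # prints 15.0
```

In start_crawl(), something like decode_obfuscated(star[0], self.modify_data()) could then be applied to each captured group, provided ./fonts/123.woff is present as the reference font.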


Published: 2024-02-14 12:50:34
Article link: https://www.elefans.com/category/jswz/34/1763705.html