How to unshorten (resolve) a URL using Python when the final URL is https?

Updated: 2024-10-24 08:28:49

Problem description

I am looking to unshorten (resolve) a URL in Python when the final URL is https. I have seen the question "How can I un-shorten a URL using python?" (as well as similar ones); however, as noted in a comment on the accepted answer, that solution only works when the URL does not redirect to https.

For reference, the code in that question (which works fine when redirecting to http urls) is:

    # This is for Py2k. For Py3k, use http.client and urllib.parse instead, and
    # use // instead of / for the division
    import httplib
    import urlparse

    def unshorten_url(url):
        parsed = urlparse.urlparse(url)
        h = httplib.HTTPConnection(parsed.netloc)
        resource = parsed.path
        if parsed.query != "":
            resource += "?" + parsed.query
        h.request('HEAD', resource)
        response = h.getresponse()
        if response.status/100 == 3 and response.getheader('Location'):
            return unshorten_url(response.getheader('Location'))  # changed to process chains of short urls
        else:
            return url

(Note: for obvious bandwidth reasons, I want to do this by requesting only the headers [i.e. like the http-only version above], not the content of the whole page.)

Solution

You can get the scheme from the URL and then use HTTPSConnection if parsed.scheme is https. You can also use the requests library to do this very simply.

    >>> import requests
    >>> r = requests.head('http://bit.ly/IFHzvO', allow_redirects=True)
    >>> print(r.url)
    www.google

Published: 2023-11-30 01:18:09
Article link: https://www.elefans.com/category/jswz/34/1648233.html