Python: Logging in to a website using urllib

Updated: 2024-10-12 01:29:28

Question

I want to log in to this website: https://www.fitbit/login. This is the code I use:

import urllib2
import urllib
import cookielib

login_url = 'https://www.fitbit/login'
acc_pwd = {'login':'Log In','email':'username','password':'pwd'}
cj = cookielib.CookieJar() ## add cookies
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent','Mozilla/5.0 \
                    (compatible; MSIE 6.0; Windows NT 5.1)')]
data = urllib.urlencode(acc_pwd)
try:
    opener.open(login_url,data,10)
    print 'log in - success!'
except:
    print 'log in - times out!', login_url

I used Chrome to inspect the elements of the input boxes and tried many key pairs, but none of them works. Can anyone help me take a look at this website? What is the correct data I should put in my variable acc_pwd?

Thank you very much

Solution

You're forgetting the hidden fields of the form:

<form id="loginForm" class="validate-enabled failure form" method="post" action="https://www.fitbit/login" name="login">
    <input type="hidden" value="Log In" name="login">
    <input type="hidden" value="" name="includeWorkflow">
    <input id="loginRedirect" type="hidden" value="" name="redirect">
    <input id="disableThirdPartyLogin" type="hidden" value="false" name="disableThirdPartyLogin">
    <input class="field email" type="text" tabindex="23" name="email" placeholder="E-mail">
    <input class="field password" type="password" tabindex="24" name="password" placeholder="Mot de passe">
</form>

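so you may want to update the dictionary, using each input's name attribute as the key (the redirect field has the id loginRedirect, but its name is redirect):

acc_pwd = {'login':'Log In',
           'email':'username',
           'password':'pwd',
           'disableThirdPartyLogin':'false',
           'redirect':'',
           'includeWorkflow':''
          }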

These hidden fields might get checked by their service. Given the name of the field disableThirdPartyLogin, though, I wonder whether some dirty JavaScript is bound to the form's submit action and adds a value before actually doing the POST. You might want to check that with the developer tools by analyzing the POST values.

Testing shows that it does not, although the JavaScript adds some values, which may come from cookies:

__fp    w686jv_O1ZZztQ7FkK21Ry2MI7JbqWTf
_sourcePage tJvTQfA5dkvGrJMFkFsv6XbX0f6OV1Ndj1zeGcz7OKzA3gkNXMXGnj27D-H9WXS-
disableThirdPartyLogin  false
email   foo@example
includeWorkflow 
login   Log In
password    aeou
redirect
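
Rather than copying the field names by hand, the inputs of the form can be collected from the login page's HTML. The following is only a minimal sketch using the standard library's html.parser on Python 3. The names FormFieldCollector and collect_form_fields are hypothetical helpers, not part of any library, and SAMPLE_FORM is simply the form markup quoted above.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> tags inside one form (hypothetical helper)."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Inputs without a value attribute (email, password) default to "".
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def collect_form_fields(html, form_id="loginForm"):
    """Return a dict mapping each input's name attribute to its value."""
    parser = FormFieldCollector(form_id)
    parser.feed(html)
    return parser.fields


# The form markup quoted in the answer above.
SAMPLE_FORM = """
<form id="loginForm" class="validate-enabled failure form" method="post" action="https://www.fitbit/login" name="login">
    <input type="hidden" value="Log In" name="login">
    <input type="hidden" value="" name="includeWorkflow">
    <input id="loginRedirect" type="hidden" value="" name="redirect">
    <input id="disableThirdPartyLogin" type="hidden" value="false" name="disableThirdPartyLogin">
    <input class="field email" type="text" tabindex="23" name="email" placeholder="E-mail">
    <input class="field password" type="password" tabindex="24" name="password" placeholder="Mot de passe">
</form>
"""

fields = collect_form_fields(SAMPLE_FORM)
# Placeholder credentials, as in the question.
fields.update({"email": "username", "password": "pwd"})
print(sorted(fields))
```

Note that this only sees what is in the markup. Values such as __fp and _sourcePage, which appear in the captured POST above, are not in the form as quoted, so this sketch would not pick them up.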

Here's my take on doing this with requests (which has a better API than urllib ;-) ):

>>> import requests
>>> session = requests.Session()  # a Session keeps cookies between requests
>>> login_url = 'https://www.fitbit/login'
>>> acc_pwd = {'login':'Log In',
...            'email':'username',
...            'password':'pwd',
...            'disableThirdPartyLogin':'false',
...            'redirect':'',
...            'includeWorkflow':''
...           }
>>> r = session.get(login_url)                 # fills the session's cookie jar
>>> r = session.post(login_url, data=acc_pwd)  # sends those cookies back with the form

And don't forget to first request the login page with a GET, so that your cookie jar is filled!
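
For readers on Python 3, where urllib2 and cookielib became urllib.request and http.cookiejar, the same GET-then-POST flow can be sketched with the standard library alone. This is an untested-against-the-site sketch: the helper names build_opener_with_cookies, encode_form and log_in are hypothetical, the field names are the ones from the form above, and the credentials are placeholders.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_opener_with_cookies():
    """Return an opener plus the cookie jar it fills from responses."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [
        ("User-agent", "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.1)")
    ]
    return opener, jar


def encode_form(fields):
    """URL-encode the fields as bytes; Python 3 requires bytes for a POST body."""
    return urllib.parse.urlencode(fields).encode("ascii")


def log_in(login_url, fields, timeout=10):
    """GET the login page to collect cookies, then POST the form (network I/O)."""
    opener, jar = build_opener_with_cookies()
    opener.open(login_url, timeout=timeout).close()  # first request fills the jar
    with opener.open(login_url, encode_form(fields), timeout) as response:
        return response.read().decode("utf-8", errors="replace"), jar


acc_pwd = {
    "login": "Log In",
    "email": "username",  # placeholder
    "password": "pwd",    # placeholder
    "disableThirdPartyLogin": "false",
    "redirect": "",
    "includeWorkflow": "",
}
body = encode_form(acc_pwd)

# Not executed here because it needs network access:
# html, jar = log_in("https://www.fitbit/login", acc_pwd)
```

A response that arrives without an exception does not by itself mean the login worked, so the returned HTML still needs to be inspected for error messages.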

Finally, I can't help you further, as I don't have a valid account on fitbit and I don't need/want one. So I can only get to the login failure page for my tests.

edit:

To parse the output, you can use:

>>> from lxml import etree
>>> p = etree.HTML(r.text)

For example, to get the error messages:

>>> p.xpath('//ul[@class="errorList"]/li/text()')
["L'utilisateur n'existe pas ou le mot de passe est incorrect."]

Resources:

lxml: http://lxml.de
requests: http://python-requests

Both are on PyPI:

pip install lxml requests

HTH

Published: 2023-04-28 13:44:21

Article link: https://www.elefans.com/category/jswz/34/1173725.html
