I want to extract data from the HTML of a website that requires login. I have been searching the web for 4 days with no clear results.
I need an answer for ASP.NET/C#.
What I have gathered so far:
- I need to use HttpWebRequest and HttpWebResponse
- I have to work with cookies
- The website uses PHP and JS/AJAX
- postData looks like "email=???&password=???&login_now=yes&login_submit=";
- The login button redirects to "www.wsName/ajax/login.ajax.php"
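For illustration only, here is a minimal sketch of how a body in that shape could be assembled so that special characters in the credentials are percent-encoded. The class name FormBody and the sample values are hypothetical and not part of the original code; the field names come from the login form. Uri.EscapeDataString encodes a space as %20 rather than +, which servers normally accept as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FormBody
{
    // Joins name/value pairs into an application/x-www-form-urlencoded body.
    // Uri.EscapeDataString percent-encodes reserved characters such as '@' and '&',
    // so a password containing '&' cannot break the field boundaries.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}

public static class FormBodyDemo
{
    public static string Example()
    {
        // Placeholder credentials; the real values stay elided as in the question.
        return FormBody.Build(new[]
        {
            new KeyValuePair<string, string>("email", "user@example.com"),
            new KeyValuePair<string, string>("password", "p&ss"),
            new KeyValuePair<string, string>("login_now", "yes"),
            new KeyValuePair<string, string>("login_submit", "")
        });
        // returns "email=user%40example.com&password=p%26ss&login_now=yes&login_submit="
    }
}
```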
Login page source:
<form method="post" action="/" id="login_form" onsubmit="return ajax_submit('#login_form',post_login)">
  <table class="login_table" cellspacing="0" cellpadding="0">
    <tr>
      <td><label for="email">Email</label></td>
      <td><label for="password">Password</label></td>
      <td></td>
    </tr>
    <tr>
      <td>
        <div class='sign_up_error' id="sign_up_error" style='margin-top:0;margin-left:-207px;'></div>
        <div class='light_input' style='width:150px'>
          <input type="text" name="email" id="email" class="login_field" style='width:142px' tabindex="1" />
        </div>
      </td>
      <td>
        <div class='light_input' style='width:150px'>
          <input type="password" name="password" id="password" class="login_field" style='width:142px' tabindex="2" />
        </div>
      </td>
      <td>
        <span class='button' id="login_button" onclick="ajax_submit('#login_form',post_login);" tabindex="3"><span class='button_border'>Log In <img src='/pics/cf_mini_arrow_right.png'></span></span>
        <input type="submit" class="" style="width:0px; height: 0px; overflow:hidden; border: none;" name="submit_login"/>
      </td>
    </tr>
    <tr>
      <td><input type="checkbox" name="remember" id="remember" value="1" tabindex="4"/><label for="remember">Remember me</label></td>
      <td class='right' style='padding-right:5px;'><a class='weight_normal' href="/forgot-password/">Forgot password?</a></td>
      <input type="hidden" name="login_now" value="yes" />
      <td><input type='hidden' name="login_submit" id="login_submit" /></td>
    </tr>
  </table>
</form>
Request headers look like this (wsName: website name):
- Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
- Accept-Encoding: gzip, deflate
- Accept-Language: tr-TR,tr;q=0.8,en-US;q=0.6,en;q=0.4
- Cache-Control: max-age=0
- Connection: keep-alive
- Content-Length: 59
- Content-Type: application/x-www-form-urlencoded
- Host: wsName
- Upgrade-Insecure-Requests: 1
- User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
Cookies in the request headers:
- wsName[css]=medium_text;
- cc_cookie_accept=cc_cookie_accept;
My code (which doesn't work and was taken from here) is like:
string loginURL = "www.*wsName*/ajax/login.ajax.php";
string loginURL2 = "www.*wsName*";
string formDataStr = "email=???&password=???&login_now=yes&login_submit=";

//First request: Get the cookies
CookieCollection cookies = new CookieCollection();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(loginURL);
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(cookies);

//Get the response from the server and save the cookies from the first request..
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
cookies = response.Cookies;

//Second request: POST the form data and recover the cookies from the first request..
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(loginURL);
getRequest.CookieContainer = new CookieContainer();
getRequest.CookieContainer.Add(cookies); //recover cookies First request
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36";
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version11;
//getRequest.Referer = "www.*wsName*/index.php";
getRequest.AllowAutoRedirect = true;
getRequest.ContentType = "application/x-www-form-urlencoded";

byte[] byteArray = Encoding.ASCII.GetBytes(formDataStr);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream(); //open connection
newStream.Write(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();

HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
    string sourceCode = sr.ReadToEnd();
}
I tried both URLs (loginURL and loginURL2), modified the code, and tried many things, but in the end I got nothing. This is the first time I have needed to work with HttpWebRequest and HttpWebResponse, so I think I am missing something here.
Please help, thank you.
Recommended answer:
You can use an HTTP debugging tool such as Fiddler to record the details of the request your web browser sends, then run your C# code again and compare the two captures to see where they differ.
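A minimal sketch of that comparison workflow, under stated assumptions: the class LoginSketch, its method names, and the example.com URL are hypothetical; 127.0.0.1:8888 is Fiddler's default listening address and may differ on your machine. The sketch keeps one CookieContainer for every request so cookies set by the login response are sent with later requests, and it optionally routes traffic through Fiddler so the C# request appears next to the browser's. Note that WebRequest.Create needs an absolute URL including the scheme (for example http://), and that it is marked obsolete in .NET 6 and later, where HttpClient is the suggested replacement.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class LoginSketch
{
    // Builds a form POST that mirrors the browser headers listed in the question.
    // Nothing is sent here, so the request can be inspected before use.
    public static HttpWebRequest CreatePost(string url, CookieContainer cookies, bool viaFiddler)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                          + "(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36";
        // Sends Accept-Encoding and transparently decompresses the response.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        // Shared container: cookies set by the response are stored here automatically.
        request.CookieContainer = cookies;
        if (viaFiddler)
        {
            // Assumption: Fiddler is running locally on its default port.
            request.Proxy = new WebProxy("127.0.0.1", 8888);
        }
        return request;
    }

    // Writes the body, sends the request, and returns the response text.
    public static string Send(HttpWebRequest request, string body)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(body);
        request.ContentLength = bytes.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(bytes, 0, bytes.Length);
        }
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call might look like LoginSketch.Send(LoginSketch.CreatePost("http://www.example.com/ajax/login.ajax.php", jar, true), postData), where jar is a CookieContainer that is then assigned to the request for the protected page as well. With Fiddler open, the headers, cookies, and body of that request can be compared line by line with the browser's login request.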