I want to access a webpage programmatically and extract some information from it.
I want to log in to a website through Java code and make the server believe that the request is actually coming from a real browser.
I am almost there, with one problem: the website requires a parameter, "sessid", to be passed with every request, and its value changes on every request.
For example, when I first access the page sessid=90334, and on the next page it is something like sessid=78204.
Therefore the URL I pass must contain the current value of sessid, otherwise authentication fails: www.somesite.com/somepage.php?sessid=75749.
The webpage contains an <input> tag that holds the value of sessid, and I have to retrieve the value of that tag.
How can I do that? The tag looks like this:
<input type="hidden" name="sessid" value="69529">
I am able to read the whole webpage successfully using the following code:
// conn is the already-open connection to the page
BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
StringBuilder response = new StringBuilder();
String line;
while ((line = rd.readLine()) != null) {
    response.append(line);
}

Best answer
You can use the indexOf method of the StringBuilder class:
String startInputFragment = "<input type=\"hidden\" name=\"sessid\" value=\"";
int startIdx = response.indexOf(startInputFragment);
if (startIdx >= 0) {
    // the value starts right after the fragment and runs up to the closing quote
    int valueStart = startIdx + startInputFragment.length();
    int endIdx = response.indexOf("\"", valueStart);
    if (endIdx >= 0) {
        String val = response.substring(valueStart, endIdx);
        System.out.println("-->" + val + "<--");
    }
} else {
    // tag not found: you may throw an exception or do something else
}
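
The indexOf approach depends on the tag being written exactly as shown, with the same spacing and attribute order. If the markup varies slightly, a regular expression may be a little more forgiving. The following is only a sketch under assumptions: the class name SessidExtractor and the helper nextUrl are hypothetical, the http:// scheme is assumed (the question gives the address without one), and the pattern still expects name to appear before value inside the tag. For anything less regular, a real HTML parser would be the safer choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class SessidExtractor {

    // Matches an <input ...> tag that has name="sessid" followed by value="...",
    // and captures the text between the value quotes.
    // Assumption: double-quoted attributes, name before value.
    private static final Pattern SESSID_INPUT = Pattern.compile(
            "<input\\b[^>]*\\bname=\"sessid\"[^>]*\\bvalue=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns the sessid value if the tag is present, otherwise an empty Optional.
    static Optional<String> extractSessid(CharSequence html) {
        Matcher m = SESSID_INPUT.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Builds the URL for the next request by appending the sessid parameter.
    static String nextUrl(String pageUrl, String sessid) {
        return pageUrl + "?sessid=" + sessid;
    }

    public static void main(String[] args) {
        // Stand-in for the page content read into the StringBuilder.
        String html = "<form><input type=\"hidden\" name=\"sessid\" value=\"69529\"></form>";
        extractSessid(html).ifPresent(id ->
                System.out.println(nextUrl("http://www.somesite.com/somepage.php", id)));
    }
}
```

Because StringBuilder implements CharSequence, the response object from the question can be passed to extractSessid directly, without calling toString() first.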