I am trying to download a CSV file from a URL using PHP Curl, but the Curl response is the HTML of the page, and not the CSV. How do I fix this?
I've tried matching the referer and user-agent.
Thanks!
Here is a working request in Firefox:
GET [someURL] HTTP/1.1
Host: [someHost]
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: [someReferer]
Cookie: [some cookie data, including jsession]
And here's the response:
HTTP/1.1 200 OK
Content-Type: text/csv
Date: Sun, 05 Jun 2011 00:33:21 GMT
Expires: Sun, 05 Jun 2011 00:34:21 GMT
Content-Disposition: attachment; filename=[some file name.csv]
Connection: Keep-Alive
Last-Modified: Sun, 05 Jun 2011 00:33:21 GMT
X-Powered-By: Servlet/2.5 JSP/2.1
Content-Length: 5012
Solution
I think the option you are looking for is -J:
From curl.haxx.se/docs/manpage.html#-J:
-J, --remote-header-name
(HTTP) This option tells the -O, --remote-name option to use the server-specified Content-Disposition filename instead of extracting a filename from the URL.
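Note that -J is a flag of the curl command-line tool, so one way to try it is from a shell. The sketch below wraps such a call in a function named download_csv, which is a hypothetical helper, not part of curl. The four arguments stand in for the bracketed placeholders in the request above and are assumptions. It also sends the Referer, User-Agent, and cookie that the browser sent, on the guess that the server needs them to return the CSV rather than the HTML page.

```shell
#!/bin/sh
# Sketch: repeat the browser's request with command-line curl and let the
# server's Content-Disposition header choose the output file name.
# download_csv is a hypothetical helper; every argument is a placeholder.
download_csv() {
    url=$1        # the CSV URL ([someURL] in the question)
    referer=$2    # the Referer value the browser sent
    cookie=$3     # cookie string in name=value form, including the session cookie
    ua=$4         # the User-Agent string to present

    # -O            save the response body to a file instead of stdout
    # -J            with -O, take the file name from Content-Disposition
    # --compressed  ask for a compressed response and decompress it
    # -e / -A / -b  send the Referer, User-Agent and cookies
    # Note: -b treats its argument as cookie data only if it contains '=';
    # otherwise curl reads it as the name of a cookie file.
    curl -O -J --compressed -e "$referer" -A "$ua" -b "$cookie" "$url"
}
```

If the saved file still contains HTML, a likely (but unconfirmed) cause is that the session cookie has expired or does not match the one the browser used.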