Updated: 2024-10-09 06:21:31
Can't download UTF-8 web content

I have some simple code that fetches the response from a Vietnamese website, http://vnexpress.net, but there is a small problem. The first time it runs, the download is fine, but after that the content contains unknown symbols like this: �\b\0\0\0\0\0\0�\a`I�%&/m.... What is the problem?

    string address = "http://vnexpress.net";
    WebClient webClient = new WebClient();
    webClient.Headers.Add("user-agent", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11 AlexaToolbar/alxg-3.1");
    webClient.Encoding = System.Text.Encoding.UTF8;
    return webClient.DownloadString(address);

Best answer

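You'll find that the response is gzip-compressed, which is why it looks like binary garbage: the \b\0\0\0... prefix is what a gzip header looks like when it is decoded as text. There doesn't appear to be a way to download that with WebClient directly, unless you create a derived class and modify the underlying HttpWebRequest to allow automatic decompression.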

Here's how you'd do that:

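    using System;
    using System.Net;

    public class MyWebClient : WebClient
    {
        protected override WebRequest GetWebRequest(Uri address)
        {
            WebRequest request = base.GetWebRequest(address);

            // Only HTTP(S) requests expose AutomaticDecompression.
            if (request is HttpWebRequest httpRequest)
            {
                httpRequest.AutomaticDecompression = DecompressionMethods.GZip;
            }

            return request;
        }
    }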

And to use it:

    string address = "http://vnexpress.net";
    MyWebClient webClient = new MyWebClient();
    webClient.Headers.Add("user-agent", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11 AlexaToolbar/alxg-3.1");
    webClient.Encoding = System.Text.Encoding.UTF8;
    return webClient.DownloadString(address);
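
If you want to confirm that the body really is gzipped, or decompress it yourself instead of subclassing, something like the following might help. This is only a sketch and not part of the original answer: GzipHelper, LooksLikeGzip and DecodeBody are hypothetical names, and it assumes the page is UTF-8 text once decompressed. It checks for the two gzip magic bytes (0x1F 0x8B, per RFC 1952) and, if they are present, runs the bytes through GZipStream before decoding.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, not part of WebClient or the original answer.
public static class GzipHelper
{
    // A gzip stream starts with the two magic bytes 0x1F 0x8B (RFC 1952).
    public static bool LooksLikeGzip(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Decodes a response body as UTF-8 text, decompressing it first
    // when it starts with the gzip magic bytes.
    public static string DecodeBody(byte[] body)
    {
        if (body == null)
        {
            throw new ArgumentNullException(nameof(body));
        }

        if (!LooksLikeGzip(body))
        {
            // Assumption: an uncompressed body is already UTF-8 text.
            return Encoding.UTF8.GetString(body);
        }

        using (var input = new MemoryStream(body))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With a plain WebClient you would then fetch the raw bytes with webClient.DownloadData(address) and pass them to GzipHelper.DecodeBody. Setting AutomaticDecompression as shown above is simpler for everyday use; the manual route is mostly useful for diagnosing what the server actually sent.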

Published: 2023-08-04 20:36:00
Article link: https://www.elefans.com/category/jswz/34/1421480.html