I have been tracking down a bug in a URL-rewriting application. It showed up as an encoding problem with some diacritic characters in the query string.
The request was essentially /search.aspx?search=heřmánek, and it was being rewritten with a query string of "search=he%c5%99m%c3%a1nek".
The correct value (produced by some different, working code) was a rewrite of the query string as "search=he%u0159m%u00e1nek".
Note the difference between the two strings. However, if you run both through URL decoding, you'll see that they reproduce the same string. It's not until you call the context.Rewrite function that the encoding breaks. After that call, the broken string returns 'heÅmánek' (using Request.QueryString["Search"]), while the working string returns 'heřmánek'.
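To make the difference between the two escaped forms concrete, here is a minimal sketch that builds both for the two diacritic characters in the example. The class and helper names are made up for illustration; only the `System.Text.Encoding` calls are real framework APIs.

```csharp
using System;
using System.Linq;
using System.Text;

public static class EscapeForms
{
    // Standard form: one %XX escape per UTF-8 byte of the character.
    public static string PercentEncodeUtf8(char c) =>
        string.Concat(Encoding.UTF8.GetBytes(c.ToString())
                                   .Select(b => "%" + b.ToString("x2")));

    // Non-standard form: a single %uXXXX escape holding the UTF-16 code unit.
    public static string PercentEncodeU(char c) =>
        "%u" + ((int)c).ToString("x4");

    public static void Main()
    {
        foreach (char c in "řá")
        {
            // ř gives %c5%99 and %u0159; á gives %c3%a1 and %u00e1
            Console.WriteLine($"{PercentEncodeUtf8(c)}  {PercentEncodeU(c)}");
        }
    }
}
```

Both forms describe the same characters; they differ only in whether the escape carries UTF-8 bytes or a UTF-16 code unit.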
I traced this down to one piece of code using Request.QueryString (working) and another using Request.Url.Query (Request.Url returns a Uri instance).
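For orientation, Request.QueryString is a collection of already-decoded values, while Request.Url.Query is the still-escaped query text of a Uri. The sketch below imitates that split outside ASP.NET, using a plain Uri plus HttpUtility.ParseQueryString. The class name, helper names, and the example URL are assumptions for illustration, and this is not how the Request object builds its collection internally.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class QueryViews
{
    // Raw view: the escaped query text, as Uri.Query exposes it.
    public static string RawQuery(Uri uri) => uri.Query;

    // Parsed view: escapes decoded (UTF-8 by default) into name/value pairs.
    public static string ParsedValue(Uri uri, string name)
    {
        NameValueCollection parsed = HttpUtility.ParseQueryString(uri.Query);
        return parsed[name];
    }

    public static void Main()
    {
        // Hypothetical URL carrying the standard UTF-8 percent-encoding.
        var uri = new Uri("http://www.example/search.aspx?search=he%C5%99m%C3%A1nek");

        Console.WriteLine(RawQuery(uri));              // escaped text, starts with "?search="
        Console.WriteLine(ParsedValue(uri, "search")); // decoded value
    }
}
```

Any code that takes the raw view has to decode it itself, so it also has to pick an encoding, which is where the two code paths can diverge.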
While I have worked out the bug, there is a hole in my understanding here, so if anyone knows the difference, I'm ready for the lesson.
Recommended answer
What you indicated as the "broken" encoded string is actually the correct encoding according to the standards. The one you indicated as the "correct" encoding uses a non-standard extension to the specifications that allows a %uXXXX format (I believe it's supposed to indicate UTF-16 encoding).
In any case, the "broken" encoded string is OK. You can use the following code to test that:
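// Requires: using System; using System.Web;
// The Uri constructor needs an absolute URI, so a scheme is included.
Uri uri = new Uri("http://www.example/test.aspx?search=heřmánek");
Console.WriteLine(uri.Query);                        // escaped query string
Console.WriteLine(HttpUtility.UrlDecode(uri.Query)); // decoded as UTF-8 (the default)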
Works fine. However... on a hunch, I tried UrlDecode with a Latin-1 code page specified, instead of the default UTF-8:
Console.WriteLine(HttpUtility.UrlDecode(uri.Query, Encoding.GetEncoding("iso-8859-1")));
... and I got the bad value you specified, 'heÅmánek'. In other words, it looks like the call to HttpContext.RewritePath() somehow changes the URL encoding/decoding to use the Latin-1 code page rather than UTF-8, which is the default encoding used by the UrlEncode/UrlDecode methods.
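The same effect can be reproduced at the byte level, without any URL handling. This is a minimal sketch of the mechanism only: it reads UTF-8 bytes back as Latin-1. The class and method names are illustrative, and it makes no claim about what RewritePath() does internally. The exact garbled text you see may differ from the string quoted above, because one of the resulting characters is a non-printing control character.

```csharp
using System;
using System.Text;

public static class Mojibake
{
    // Encode as UTF-8, then (incorrectly) read the bytes back as Latin-1.
    public static string DecodeUtf8BytesAsLatin1(string s)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(s);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8);
    }

    public static void Main()
    {
        string original = "heřmánek";
        string garbled = DecodeUtf8BytesAsLatin1(original);

        // Each two-byte UTF-8 sequence becomes two Latin-1 characters,
        // so the garbled string is longer than the original.
        Console.WriteLine(original.Length);
        Console.WriteLine(garbled.Length);
        Console.WriteLine(garbled.StartsWith("heÅ", StringComparison.Ordinal));
    }
}
```

Here ř (UTF-8 bytes C5 99) turns into Å followed by the control character U+0099, which matches the 'Å' seen in the broken value.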
This looks like a bug, if you ask me. You can look at the RewritePath() code in Reflector and see that it is definitely playing with the query string: passing it around to all kinds of virtual path functions, and out to some unmanaged IIS code.
I wonder if, somewhere along the way, the Uri at the core of the Request object gets manipulated with the wrong code page? That would explain why Request.QueryString (which is simply the raw values from the HTTP headers) would be correct, while the Uri, using the wrong encoding for the diacritics, would be incorrect.