I'm reading and parsing an Amazon XML file. The XML file shows a ' character, but when I try to print it I get the following error:
    'ascii' codec can't encode character u'\u2019' in position 16: ordinal not in range(128)
From what I've read online so far, the error comes from the fact that the XML file is in UTF-8, but Python wants to handle it as ASCII. Is there a simple way to make the error go away and have my program print the XML as it reads it?
Recommended answer
Most likely you parsed the file fine, and the error occurs when you try to print the contents of the XML, because it contains some non-ASCII Unicode characters. Try encoding your unicode string as ASCII first:
    unicodeData.encode('ascii', 'ignore')
The 'ignore' part tells it to just skip those characters. From the Python docs:
    >>> u = unichr(40960) + u'abcd' + unichr(1972)
    >>> u.encode('utf-8')
    '\xea\x80\x80abcd\xde\xb4'
    >>> u.encode('ascii')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in position 0: ordinal not in range(128)
    >>> u.encode('ascii', 'ignore')
    'abcd'
    >>> u.encode('ascii', 'replace')
    '?abcd?'
    >>> u.encode('ascii', 'xmlcharrefreplace')
    '&#40960;abcd&#1972;'
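To tie this back to the u'\u2019' character from the question, here is a minimal sketch of the same error handlers applied to it. The sample string and the to_ascii helper are made up for illustration and are not part of the original code. Encoding to UTF-8 instead of ASCII keeps the character rather than dropping it, which may be what you want if your output can handle UTF-8.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text; u'\u2019' is the right single quotation mark
# reported in the error message.
text = u'Amazon\u2019s catalog'


def to_ascii(value, errors='ignore'):
    """Illustrative helper: return ASCII-only text, handling any
    non-ASCII characters according to the given error handler."""
    return value.encode('ascii', errors).decode('ascii')


dropped = to_ascii(text)                        # non-ASCII characters removed
replaced = to_ascii(text, 'replace')            # non-ASCII characters become '?'
escaped = to_ascii(text, 'xmlcharrefreplace')   # numeric XML character references
utf8_bytes = text.encode('utf-8')               # lossless: keeps the character as bytes

print(dropped)
print(replaced)
print(escaped)
print(repr(utf8_bytes))
```

Of the three ASCII options, 'xmlcharrefreplace' is the only one that does not lose information, since an XML or HTML consumer can turn the reference back into the original character.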
You might want to read this article: www.joelonsoftware/articles/Unicode.html, which I found very useful as a basic tutorial on what's going on. After reading it, you'll stop feeling like you're just guessing which commands to use (or at least that's what happened to me).