Problem description
A simple test program that shows an encoding issue:
#!/bin/env python
# -*- coding: utf-8 -*-
print u"Råbjerg" # >>> unicodedata.name(u"å") = 'LATIN SMALL LETTER A WITH RING ABOVE'
Here is what I get when I run it from a Debian command line. I do not understand why redirecting the output breaks it, since the text displays correctly without the redirect.
Can someone help me understand what I have missed? And what is the right way to print these characters so that they come out correctly everywhere?
$ python testu.py
Råbjerg
$ python testu.py > A
Traceback (most recent call last):
  File "testu.py", line 3, in <module>
    print u"Råbjerg"
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 1: ordinal not in range(128)
This is on Debian GNU/Linux 6.0.7 (squeeze), configured with:
$ locale
LANG=fr_FR.UTF-8
LANGUAGE=
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=
From other similar questions that I found later through the pointers given below:
#!/bin/env python
# -*- coding: utf-8 -*-
import sys, locale
s = u"Råbjerg" # >>> unicodedata.name(u"å") = 'LATIN SMALL LETTER A WITH RING ABOVE'
if sys.stdout.encoding is None: # when stdout is a pipe, Python 2 seems to return None
    s = s.encode(locale.getpreferredencoding())
print s
Recommended answer
When the output is redirected, sys.stdout is not connected to a terminal and Python cannot determine the output encoding, so Python 2 falls back to the ASCII codec, which is why the error names 'ascii'. When the output is not redirected, Python can detect that sys.stdout is a TTY and will use the codec configured for that TTY when printing unicode.
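To see the difference described above, a small diagnostic along these lines may help. It is only a sketch: describe_stream is a hypothetical helper name, not something from the question, and it reports to stderr so that the report stays visible when stdout is redirected.

```python
import sys


def describe_stream(stream):
    """Return (is_tty, encoding) for a file-like object (hypothetical helper)."""
    # Some file-like objects have no isatty() or no encoding attribute at all.
    is_tty = hasattr(stream, "isatty") and stream.isatty()
    return is_tty, getattr(stream, "encoding", None)


if __name__ == "__main__":
    # Written to stderr so the report is still shown when stdout goes to a file.
    sys.stderr.write("stdout: %r\n" % (describe_stream(sys.stdout),))
```

Running it once normally and once with "> A" should show the TTY flag change. On Python 2 the encoding is expected to become None in the redirected case, while Python 3 normally still reports one.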
Set the PYTHONIOENCODING environment variable to tell Python what encoding to use in such cases, or encode explicitly before printing.
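For the "encode explicitly" option, one possible shape is sketched below. The helper name encode_for_output and the UTF-8 default are assumptions made for illustration. The question's own variant falls back to locale.getpreferredencoding() instead.

```python
# -*- coding: utf-8 -*-


def encode_for_output(text, stream_encoding, default="utf-8"):
    """Encode text with the stream's encoding, or with `default` when unknown.

    `default` is an assumption of this sketch; pick whatever the consumer
    of the redirected output expects.
    """
    # stream_encoding is None when Python 2 cannot determine it (pipe or file).
    return text.encode(stream_encoding or default)
```

On Python 2 this could be used as sys.stdout.write(encode_for_output(s, sys.stdout.encoding)). The environment variable route needs no code change: PYTHONIOENCODING=utf-8 python testu.py > A.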