Remove unicode characters from text files - sed, other bash/shell methods
How do I remove specific Unicode characters from a bunch of text files on the terminal? I've tried this, but it didn't work:

    sed 'g/\u'U+200E'//' -i *.txt

I need to remove these code points from the text files:

U+0091 - a sort of weird "control" space
U+0092 - the same sort of weird "control" space
U+00A0 - no-break space
U+200E - left-to-right mark

Best answer
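If you want to remove ONLY particular characters and you have Python 3, you can build the character list in UTF-8 and pass it to sed as a bracket expression:

    # Emit the four unwanted characters as raw UTF-8 bytes, independent of Python's own locale settings.
    CHARS=$(python3 -c 'import sys; sys.stdout.buffer.write("\u0091\u0092\u00a0\u200e".encode("utf8"))')
    # Delete every occurrence of any character in the list.
    sed 's/['"$CHARS"']//g' < /tmp/utf8_input.txt > /tmp/ascii_output.txt

Run sed in a UTF-8 locale, so that the bracket expression matches whole characters rather than individual bytes.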
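Since the question asks about a whole set of files rather than a single input, the same idea can also be sketched entirely in Python. This is only an illustration, not part of the accepted answer: the names `UNWANTED`, `strip_chars`, `clean_file` and `clean_directory` are hypothetical, and it assumes every file is valid UTF-8 and small enough to read into memory.

```python
from pathlib import Path

# The four code points listed in the question.
UNWANTED = "\u0091\u0092\u00a0\u200e"
# str.translate deletes every character whose code point maps to None.
DELETE_TABLE = dict.fromkeys(map(ord, UNWANTED))


def strip_chars(text: str) -> str:
    """Return text with every character in UNWANTED removed."""
    return text.translate(DELETE_TABLE)


def clean_file(path: Path) -> None:
    """Rewrite one UTF-8 file in place without the unwanted characters."""
    # Work on bytes so that line endings are left exactly as they were.
    original = path.read_bytes().decode("utf-8")
    cleaned = strip_chars(original)
    if cleaned != original:
        path.write_bytes(cleaned.encode("utf-8"))


def clean_directory(directory: Path, pattern: str = "*.txt") -> None:
    """Clean every file in directory whose name matches pattern."""
    for path in sorted(directory.glob(pattern)):
        clean_file(path)
```

Calling `clean_directory(Path("."))` would process every `*.txt` file in the current directory, much like the `-i *.txt` in the original attempt. It overwrites files in place, so try it on copies first.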