Word count from a txt file program

Updated: 2024-10-20 07:41:29

Problem description

I am counting the words in a txt file with the following code:

#!/usr/bin/python
file=open("D:\\zzzz\\names2.txt","r+")
wordcount={}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
print (word,wordcount)
file.close();

This gives me output like this:

>>> goat {'goat': 2, 'cow': 1, 'Dog': 1, 'lion': 1, 'snake': 1, 'horse': 1, 'ï»¿': 1, 'tiger': 1, 'cat': 2, 'dog': 1}

But I want the output in the following format:

word    wordcount
goat    2
cow     1
dog     1
.....

Also, I am getting an extra symbol in the output (ï»¿). How can I remove it?

Solution

The funny symbols you're encountering are a UTF-8 BOM (Byte Order Mark). To get rid of them, open the file using the correct encoding (I'm assuming you're on Python 3):

file = open(r"D:\zzzz\names2.txt", "r", encoding="utf-8-sig")
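To see the difference the encoding makes, here is a minimal sketch. It assumes Python 3 and writes its own small sample file (the name `names_sample.txt` and its contents are made up for the illustration), so it does not depend on the file from the question:

```python
import os
import tempfile

# Write a small sample file that starts with a UTF-8 BOM (bytes EF BB BF).
# The file name and contents are made up for this illustration.
path = os.path.join(tempfile.mkdtemp(), "names_sample.txt")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbfgoat cow goat\n")

# Plain "utf-8" keeps the BOM as the character U+FEFF in the decoded text.
with open(path, "r", encoding="utf-8") as f:
    words_plain = f.read().split()

# "utf-8-sig" strips a leading BOM if one is present.
with open(path, "r", encoding="utf-8-sig") as f:
    words_sig = f.read().split()

print(repr(words_plain[0]))  # first word still carries the BOM character
print(repr(words_sig[0]))    # first word is clean
```

With `utf-8-sig` the BOM never reaches the word list, so it cannot show up as a bogus entry in the counts.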

Furthermore, for counting, you can use collections.Counter:

from collections import Counter
wordcount = Counter(file.read().split())

Display them with:

>>> for item in wordcount.items(): print("{}\t{}".format(*item))
...
snake   1
lion    2
goat    2
horse   3
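Putting the pieces together, a complete script might look roughly like the sketch below. It assumes Python 3 and a UTF-8 encoded file. The helper names `count_words` and `format_counts` are hypothetical, not part of any library, and the sample file is created on the fly so the example runs anywhere:

```python
import os
import tempfile
from collections import Counter


def count_words(path):
    # Hypothetical helper: count whitespace-separated words in a text file.
    # "utf-8-sig" drops a leading BOM; assumes the file is UTF-8 encoded.
    with open(path, "r", encoding="utf-8-sig") as f:
        return Counter(f.read().split())


def format_counts(counts):
    # Hypothetical helper: a header row, then one "word<TAB>count" line per word.
    lines = ["word\twordcount"]
    for word, n in counts.most_common():  # highest counts first
        lines.append("{}\t{}".format(word, n))
    return "\n".join(lines)


# Sample file made up for this illustration; it starts with a UTF-8 BOM.
sample_path = os.path.join(tempfile.mkdtemp(), "names_sample.txt")
with open(sample_path, "wb") as f:
    f.write(b"\xef\xbb\xbfgoat cow Dog lion goat cat cat dog\n")

counts = count_words(sample_path)
table = format_counts(counts)
print(table)
```

Opening the file in a `with` block closes it automatically, so no explicit `file.close()` is needed. Note that `Dog` and `dog` are still counted separately, as in the question's output; calling `.lower()` on the text before splitting would merge them if that is what you want.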

Published: 2023-11-11 07:13:17

Article link: https://www.elefans.com/category/jswz/34/1577737.html