The purpose of my Python script is to compare the data present in multiple CSV files, looking for discrepancies. The data are ordered, but the ordering differs between files. The files contain about 70K lines, weighing around 15MB. Nothing fancy or hardcore here. Here's part of the code:

    def getCSV(fpath):
        with open(fpath,"rb") as f:
            csvfile = csv.reader(f)
            for row in csvfile:
                allRows.append(row)
        allCols = map(list, zip(*allRows))

Am I properly reading from my CSV files? I'm using csv.reader, but would I benefit from using csv.DictReader? How can I create a list containing whole rows which have a certain value in a precise column?

Best answer
Are you sure you want to be keeping all rows around? This creates a list with the matching rows only. fname could also come from glob.glob() or os.listdir() or whatever other data source you choose. Just to note, you mention the 20th column, but row[20] will be the 21st column, since indexing starts at 0.

    import csv

    matching20 = []
    for fname in ('file1.csv', 'file2.csv', 'file3.csv'):
        # On Python 3, the csv docs recommend opening files with newline=''
        with open(fname, newline='') as fin:
            csvin = csv.reader(fin)
            next(csvin)  # <--- if you want to skip header row
            for row in csvin:
                if row[20] == 'value':
                    matching20.append(row)  # or do something with it here

You only want csv.DictReader if you have a header row and want to access your columns by name.
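For the case where there is a header row and columns are accessed by name, a minimal sketch with csv.DictReader might look like the following. The helper name rows_matching, the column name status, the value bad, and the sample files are all made up for illustration; they are not from the question. The file list is built with glob.glob(), as suggested above.

```python
import csv
import glob
import os
import tempfile


def rows_matching(paths, column, value):
    """Collect rows (as dicts) whose `column` equals `value` across all files.

    Hypothetical helper: the name and signature are illustrative only.
    """
    matches = []
    for path in paths:
        # newline='' is what the csv docs recommend for files given to csv readers
        with open(path, newline='') as fin:
            # DictReader uses the first row of each file as the field names
            for row in csv.DictReader(fin):
                if row.get(column) == value:
                    matches.append(row)
    return matches


# Tiny demonstration with made-up data written to a temporary directory.
tmpdir = tempfile.mkdtemp()
samples = {
    'a.csv': 'id,status\n1,ok\n2,bad\n',
    'b.csv': 'id,status\n3,bad\n4,ok\n',
}
for name, text in samples.items():
    with open(os.path.join(tmpdir, name), 'w', newline='') as fout:
        fout.write(text)

# sorted() makes the file order (and so the result order) deterministic
paths = sorted(glob.glob(os.path.join(tmpdir, '*.csv')))
bad_rows = rows_matching(paths, 'status', 'bad')
print([row['id'] for row in bad_rows])
```

Using row.get(column) rather than row[column] means a file that lacks the column is skipped quietly instead of raising KeyError; whether that is desirable depends on how strict the comparison needs to be.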