Python: General CSV file parsing and manipulation

Updated: 2024-10-17 13:34:38

The purpose of my Python script is to compare the data present in multiple CSV files, looking for discrepancies. The data are ordered, but the ordering differs between files. The files contain about 70K lines, weighing around 15MB. Nothing fancy or hardcore here. Here's part of the code:

    def getCSV(fpath):
        with open(fpath,"rb") as f:
            csvfile = csv.reader(f)
            for row in csvfile:
                allRows.append(row)

    allCols = map(list, zip(*allRows))

Am I properly reading from my CSV files? I'm using csv.reader, but would I benefit from using csv.DictReader? How can I create a list containing whole rows which have a certain value in a precise column?

Best answer

Are you sure you want to keep all rows around? The code below builds a list of the matching rows only. fname could also come from glob.glob(), os.listdir(), or any other data source you choose. Note that you mention the 20th column, but row[20] is the 21st column, because indexing starts at 0.

    import csv

    matching20 = []
    for fname in ('file1.csv', 'file2.csv', 'file3.csv'):
        with open(fname) as fin:
            csvin = csv.reader(fin)
            next(csvin)  # <--- if you want to skip header row
            for row in csvin:
                if row[20] == 'value':
                    matching20.append(row)  # or do something with it here
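As a rough sketch only, the same filter can be wrapped in a function so that the column index and the value are not hard-coded. This assumes Python 3 (hence newline="" on open), and the name rows_matching and its parameters are made up for illustration; they are not part of the csv module.

```python
import csv


def rows_matching(paths, column_index, value, skip_header=True):
    """Collect every row whose cell at column_index (0-based) equals value."""
    matches = []
    for path in paths:
        # newline="" is what the Python 3 csv documentation recommends for files.
        with open(path, newline="") as fin:
            reader = csv.reader(fin)
            if skip_header:
                next(reader, None)  # the default avoids StopIteration on an empty file
            for row in reader:
                # Skip rows too short to have the column instead of raising IndexError.
                if len(row) > column_index and row[column_index] == value:
                    matches.append(row)
    return matches
```

With this, the 20th column would be requested as index 19, for example rows_matching(sorted(glob.glob('*.csv')), 19, 'value') after importing glob. Whether short rows should be skipped silently or reported is a design choice that depends on how clean the files are.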

You only want csv.DictReader if you have a header row and want to access your columns by name.
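For comparison, here is a minimal sketch of what name-based access with csv.DictReader looks like, assuming Python 3. The column names and the sample data are invented for the example; a real script would pass an opened file instead of the io.StringIO object.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file with a header row.
sample = io.StringIO("id,status,amount\n1,ok,10\n2,failed,20\n3,ok,30\n")

reader = csv.DictReader(sample)  # the first row supplies the dictionary keys
ok_rows = [row for row in reader if row["status"] == "ok"]

print([row["id"] for row in ok_rows])  # ['1', '3']
```

Each row is a mapping from column name to string value, so the filter no longer depends on the column's position in the file.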

Published: 2023-08-04 11:26:00
Article link: https://www.elefans.com/category/jswz/34/1415251.html