Using Python, I'd like to read into a dictionary all of the lines in a text file that come after a particular string. I'd like to do this over thousands of text files.
I'm able to identify and print out the particular string ('Abstract') using the following code (taken from this Stack Overflow answer):

    for files in filepath:
        with open(files, 'r') as f:
            for line in f:
                if 'Abstract' in line:
                    print(line)

But how do I tell Python to start reading only the lines that come after the string?
Best answer
Just start another loop when you reach the line you want to start from:

    for files in filepath:
        with open(files, 'r') as f:
            for line in f:
                if 'Abstract' in line:
                    for line in f:
                        # now you are at the lines you want
                        # do work
                        pass

A file object is its own iterator, so when we reach the line with Abstract in it, the inner loop continues the iteration from the following line until the iterator has been consumed.
A simple example:

    # range on Python 3; the original used xrange (Python 2)
    gen = (n for n in range(8))
    for x in gen:
        if x == 3:
            print("starting second loop")
            for x in gen:
                print("In second loop", x)
        else:
            print("In first loop", x)

Output:

    In first loop 0
    In first loop 1
    In first loop 2
    starting second loop
    In second loop 4
    In second loop 5
    In second loop 6
    In second loop 7

You can also use itertools.dropwhile to consume the lines up to the point you want.

    from itertools import dropwhile

    for files in filepath:
        with open(files, 'r') as f:
            # skip every line until one contains "Abstract"
            dropped = dropwhile(lambda _line: "Abstract" not in _line, f)
            # discard the "Abstract" line itself
            next(dropped, "")
            for line in dropped:
                print(line)
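
The question also asks for the lines to end up in a dictionary, which the snippets above do not show. Here is a minimal sketch of one way to do that, building on the dropwhile approach. The function names `lines_after` and `collect`, and the choice to key the dictionary by file path and strip trailing newlines, are assumptions made for illustration and are not part of the original answer.

```python
from itertools import dropwhile


def lines_after(lines, marker="Abstract"):
    """Return the lines that follow the first line containing `marker`.

    `lines` may be any iterable of strings, such as an open file object.
    Returns an empty list if the marker never appears.
    """
    # Skip lines for as long as the marker has not been seen.
    remaining = dropwhile(lambda line: marker not in line, lines)
    # Discard the marker line itself, if there is one.
    next(remaining, None)
    return [line.rstrip("\n") for line in remaining]


def collect(paths, marker="Abstract"):
    """Map each path to the lines that come after `marker` in that file."""
    results = {}
    for path in paths:
        with open(path, "r") as f:
            results[path] = lines_after(f, marker)
    return results
```

Calling `collect(filepath)` with the list of paths from the question would then give one entry per file. Each file is read lazily, line by line, but the collected lines are all held in memory, so with thousands of large files it may be worth processing each entry as it is produced instead.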