My code is trying to read all log files under the directory specified in rootDir and write certain pieces of information from each log file to an outputFile.
The issue I'm having is that searchObj_Archive_date.group(), fullpath, zDiscsVar, zCopiesVar, and searchObj_Year_3or6.group() aren't being written to my output file for certain lines within the log files. This happens for only about 10% of the output lines, so I'm confused about why it only happens some of the time. Instead of

    E:\filepath\text.txt | 5/23/2015 12:00 | C:\anotherFilePath\text.txt | 23 | 23 | 5Year

I get

    E:\filepath\text.txt | | | | |
Any insight as to why this error is occurring would be greatly appreciated. My code is below:
After doing some research, I found that the error shows up whenever a line has a comma in it. It stops reading the line at that comma and skips to the next line. Does anybody know a workaround for this?
An example of my input text that's giving me problems: 11/23/2015 12:34:58 Adding file D:\fp\fp1\fp2\text, text, text.txt
Normally these lines don't have commas, so does anyone know of a way to handle commas when reading in lines of text?
    import os
    import re

    fo = open('outputFile', 'w')
    fo.write("Col|Col|Col|Col|Col|Col \n")

    # 1.walk around directory and find log file in one of folders
    rootDir = "C:\\Users\\"
    for path, dirs, files in os.walk(rootDir, topdown=False):
        for filename in files:
            fullpath = os.path.join(path, filename)
            if (filename=="text.txt"):
                # 2.open file. read from file
                fi2 = open(fullpath, 'r+')
                fi2Content = fi2.read()
                zDiscs = re.search(r'(\sNumber of copies: (\d{1,2}))', fi2Content, re.M|re.I)
                if zDiscs:
                    zDiscsVar = str(zDiscs.group(2))
                zCopies = re.search(r'(Number of Discs in Set: (\d{1,2}))', fi2Content, re.M|re.I)
                if zCopies:
                    zCopiesVar = str(zCopies.group(2))
                fi = open(fullpath, 'r')
                # 3.parse text in incoming file and use regex to find PATH
                for line in fi:
                    #4.write path and info to outgoing file
                    m = re.search(r'(Adding file(.*))',line)
                    if m:
                        searchObj_Adding_file = re.search(r'[A-Z]:\\.+', line, re.M|re.I)
                        searchObj_Archive_date = re.search(r'^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}', line, re.M|re.I)
                        searchObj_Year_3or6 = re.search(r'\dyear', line, re.M|re.I)
                        if searchObj_Adding_file:
                            fo.write(searchObj_Adding_file.group() + "|")
                            fo.write(searchObj_Archive_date.group() + "|")
                            fo.write(fullpath + "|")
                            fo.write(zDiscsVar + "|")
                            fo.write(zCopiesVar + "|")
                            fo.write(searchObj_Year_3or6.group() + '\n')

    #5. close file
    fo.close()
    fi.close()
    fi2.close()

Best answer
I removed the commas before searching the line of text. To do this, I inserted lineWoCommas = line.replace(',', '') after if m:
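The workaround above might look roughly like the following sketch. It reuses the regular expressions from the question; the function name parse_adding_line and the empty-string fallbacks for missing matches are assumptions added for illustration, not part of the original code.

```python
import re

def parse_adding_line(line):
    """Hypothetical helper: extract (path, date, year) from an 'Adding file' line.

    Returns None when the line is not an 'Adding file' line.
    """
    if not re.search(r'Adding file', line):
        return None

    # Workaround from the answer: strip commas before running the searches.
    line_wo_commas = line.replace(',', '')

    path_match = re.search(r'[A-Z]:\\.+', line_wo_commas, re.I)
    date_match = re.search(r'^\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}', line_wo_commas)
    year_match = re.search(r'\dyear', line_wo_commas, re.I)

    # Assumption: fall back to an empty field instead of calling
    # .group() on None when a pattern does not match.
    path = path_match.group() if path_match else ''
    date = date_match.group() if date_match else ''
    year = year_match.group() if year_match else ''
    return path, date, year

# Sample line taken from the question.
sample = r'11/23/2015 12:34:58 Adding file D:\fp\fp1\fp2\text, text, text.txt'
print(parse_adding_line(sample))
```

One consequence of this approach is that the path written to the output no longer contains the commas, so it will not match the file name on disk exactly. If the original name matters, the comma-free copy could be used only for the searches that need it.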