Creating a dataframe from CSV files based on a timestamp interval

Updated: 2024-10-25 03:26:36
Problem description

I believe my problem is straightforward and there must be an easy way to solve it. However, as I am quite new to Python, and especially to pandas, I could not sort it out on my own.

I have hundreds of CSV files named in the following format: text_2014-02-22_13-00-00

So the format is str_YYYY-MM-DD_HH-MI-SS. To sum up, every file represents an interval of one hour.
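As an illustration of the naming scheme, here is a minimal sketch that pulls the timestamp out of such a file name. The names `FILENAME_PATTERN` and `parse_file_start` are hypothetical, and the sketch assumes a four-digit year (as in the example name) and an optional `.csv` extension.

```python
import re
from datetime import datetime

# Assumption: names end in _YYYY-MM-DD_HH-MM-SS, optionally followed by ".csv".
FILENAME_PATTERN = re.compile(
    r"_(\d{4}-\d{2}-\d{2})_(\d{2}-\d{2}-\d{2})(?:\.csv)?$"
)


def parse_file_start(name):
    """Return the datetime encoded in a file name (hypothetical helper)."""
    match = FILENAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    # The two captured groups are the date part and the time part.
    return datetime.strptime(" ".join(match.groups()), "%Y-%m-%d %H-%M-%S")


print(parse_file_start("text_2014-02-22_13-00-00"))
```

Raising `ValueError` on a name that does not match is one possible choice; silently skipping such files would work too, depending on what else lives in the folder.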

I want to create a dataframe covering an interval that I set with Start_Time and End_Time. For example, if I set Start_Time to 2014-02-22 21:40:00 and End_Time to 2014-02-22 22:55:00 (the time format here is only for illustration), I should get a dataframe containing the data within that interval, which in this case comes from two different files.
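To make the example concrete, the sketch below picks the files whose one-hour span overlaps a given interval. It assumes each file starts at the time in its name and covers exactly one hour; `FILE_SPAN`, `file_start`, and `files_overlapping` are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timedelta

FILE_SPAN = timedelta(hours=1)  # assumption: each file covers exactly one hour


def file_start(name):
    """Parse 'text_2014-02-22_21-00-00.csv' into a datetime (hypothetical helper)."""
    stem = name.rsplit(".", 1)[0]                   # drop the extension, if any
    _, date_part, time_part = stem.rsplit("_", 2)   # keep the last two fields
    return datetime.strptime(date_part + " " + time_part, "%Y-%m-%d %H-%M-%S")


def files_overlapping(names, start_time, end_time):
    """Return the names whose [start, start + 1h) span overlaps the interval."""
    selected = []
    for name in sorted(names):
        begin = file_start(name)
        if begin < end_time and begin + FILE_SPAN > start_time:
            selected.append(name)
    return selected


names = [f"text_2014-02-22_{hour:02d}-00-00.csv" for hour in (20, 21, 22, 23)]
picked = files_overlapping(
    names,
    datetime(2014, 2, 22, 21, 40, 0),
    datetime(2014, 2, 22, 22, 55, 0),
)
print(picked)
```

With the interval from the question, only the 21-00-00 and 22-00-00 files are selected, which matches the "two different files" mentioned above.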

So, I believe this problem can be divided into two parts:

1 - Read just the date out of the file name.

2 - Create a dataframe based on the time interval that I set.

I hope I managed to be succinct and precise. I would really appreciate your help on this one! Suggestions of what to look up are also welcome.

Recommended answer

The solution has a few different parts.

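  • Create the path to the folder
  • Manually create 3 CSV files
  • Save the CSV file names to a list
  • Write a custom function to parse the file name into a datetime object
  • Bring it all together, looping through the CSV files in the folder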

    import os
    import datetime

    import pandas as pd

    # step 1: create the path to the folder
    path_cwd = os.getcwd()

    # step 2: manually create 3 sample CSV files
    # (to_csv returns None when given a path, so the result is not assigned)
    pd.DataFrame({'Length': [10, 5, 6],
                  'Width': [5, 2, 3],
                  'Weight': [100, 120, 110]
                  }).to_csv('text_2014-02-22_13-00-00.csv', index=False)

    pd.DataFrame({'Length': [11, 7, 8],
                  'Width': [4, 1, 2],
                  'Weight': [101, 111, 131]
                  }).to_csv('text_2014-02-22_14-00-00.csv', index=False)

    pd.DataFrame({'Length': [15, 9, 7],
                  'Width': [1, 4, 2],
                  'Weight': [200, 151, 132]
                  }).to_csv('text_2014-02-22_15-00-00.csv', index=False)

    # step 3: save the contents of the folder to a list
    list_csv = os.listdir(path_cwd)
    # keep only names that end in '.csv' (not names that merely contain it)
    list_csv = [x for x in list_csv if x.endswith('.csv')]
    print('here are the 3 CSV files in the folder: ')
    print(list_csv)


    # step 4: extract the datetime from filenames
    def get_datetime_filename(str_filename):
        '''
        Function to grab the datetime from the filename.
        Example: 'text_2014-02-22_13-00-00.csv'
        '''
        # split the filename by the underscore
        list_split_file = str_filename.split('_')
        # the 2nd part is the date
        str_date = list_split_file[1]
        # the 3rd part is the time, remove the '.csv'
        str_time = list_split_file[2]
        str_time = str_time.split('.')[0]
        # combine the 2nd and 3rd parts
        str_datetime = str_date + ' ' + str_time
        # convert the string to a datetime object
        # chrisalbon/python/basics/strings_to_datetime/
        # stackoverflow/questions/10663720/converting-a-time-string-to-seconds-in-python
        dt_datetime = datetime.datetime.strptime(str_datetime, '%Y-%m-%d %H-%M-%S')
        return dt_datetime


    # step 5: bring it all together
    # create empty dataframe
    df_master = pd.DataFrame()

    # loop through each csv file
    for each_csv in list_csv:
        # full path to csv file
        temp_path_csv = os.path.join(path_cwd, each_csv)
        # temporary dataframe
        df_temp = pd.read_csv(temp_path_csv)
        # add a column with the datetime from filename
        df_temp['datetime_source'] = get_datetime_filename(each_csv)
        # concatenate dataframes
        df_master = pd.concat([df_master, df_temp])

    # reset the dataframe index
    df_master = df_master.reset_index(drop=True)

    # examine the master dataframe
    print(df_master.shape)
    print(df_master.head(10))
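The answer's code tags every row with `datetime_source` but stops short of applying Start_Time and End_Time. One possible way to finish that step is sketched below. It assumes `datetime_source` holds the start of each file's one-hour span, as in the code above; `filter_by_interval` and `df_demo` are hypothetical names, and `df_demo` stands in for `df_master` so the sketch runs on its own.

```python
import pandas as pd


def filter_by_interval(df, start_time, end_time, column="datetime_source"):
    """Keep rows whose source file overlaps [start_time, end_time).

    Assumption: `column` holds the start of a one-hour file, on the hour.
    """
    start = pd.Timestamp(start_time)
    end = pd.Timestamp(end_time)
    # The file containing `start` begins at the top of that hour.
    first_file_start = start.replace(minute=0, second=0, microsecond=0)
    mask = (df[column] >= first_file_start) & (df[column] < end)
    return df.loc[mask].reset_index(drop=True)


# Small stand-in for df_master, one row per sample file.
df_demo = pd.DataFrame({
    "Length": [10, 11, 15],
    "datetime_source": pd.to_datetime([
        "2014-02-22 13:00:00",
        "2014-02-22 14:00:00",
        "2014-02-22 15:00:00",
    ]),
})

df_selected = filter_by_interval(
    df_demo, "2014-02-22 13:40:00", "2014-02-22 14:55:00"
)
print(df_selected)
```

This selects whole files only. If the CSV rows carried their own timestamps, the same boolean mask could be applied to that column instead to trim the first and last file to the exact interval.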

Published: 2023-10-13 09:46:19
Link: https://www.elefans.com/category/jswz/34/1487606.html