Python: Converting seconds to datetime format in a dataframe column

Updated: 2024-10-23 23:32:59

Problem description

I am currently working with a large dataframe (12x47800). One of the twelve columns holds an integer number of seconds, and I want to convert it to a column of datetime.time values. schedule is my dataframe, and the column I am trying to change is named 'depTime'. Since I want datetime.time values and the times can cross midnight, I added the if-statement. This 'works', but it is really slow, as one would imagine. Is there a faster way to do this? My current code, the only version I could get working, is:

import datetime as dt

for i in range(len(schedule)):
    t_sec = schedule.iloc[i].depTime
    [t_min, t_sec] = divmod(t_sec, 60)
    [t_hour, t_min] = divmod(t_min, 60)
    if t_hour > 23:
        t_hour -= 24  # wrap past midnight: a day has 24 hours
    schedule['depTime'].iloc[i] = dt.time(int(t_hour), int(t_min), int(t_sec))

Thanks in advance, everyone.

PS: I'm pretty new to Python, so if anybody could help me I would be very grateful :)

Recommended answer

I'm adding a new solution which is much faster than the original since it relies on pandas vectorized functions instead of looping (pandas apply functions are essentially optimized loops on the data).

I tested it on a sample similar in size to yours, and the runtime dropped from 778 ms to 21.3 ms, so I definitely recommend the new version.
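For anyone who wants to reproduce that kind of comparison, here is a minimal timing sketch. The data is randomly generated (47,800 values, an assumption made to match the row count in the question), the helper names with_apply and vectorized are made up for illustration, and the vectorized variant uses pd.to_timedelta rather than the astype call from the answer. Absolute timings depend on the machine and the pandas version, so no numbers are claimed here.

```python
import datetime as dt
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 47,800 second counts spanning up to two days.
rng = np.random.default_rng(0)
seconds = pd.Series(rng.integers(0, 2 * 86400, size=47_800))
start = dt.datetime(2019, 1, 1)  # reference point at midnight


def with_apply(s):
    # Element-wise: one Timedelta object is built per row.
    return s.apply(lambda x: start + pd.Timedelta(seconds=int(x))).dt.time


def vectorized(s):
    # Whole-column conversion in one call.
    return (pd.to_timedelta(s, unit="s") + start).dt.time


slow_result = with_apply(seconds)
fast_result = vectorized(seconds)

# Time a single run of each approach.
t_slow = timeit.timeit(lambda: with_apply(seconds), number=1)
t_fast = timeit.timeit(lambda: vectorized(seconds), number=1)
print(f"apply: {t_slow * 1000:.1f} ms, vectorized: {t_fast * 1000:.1f} ms")
```

Both functions should return the same times, so the comparison measures speed only.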

Both solutions are based on transforming your seconds integers into timedelta format and adding it to a reference datetime. Then, I simply capture the time component of the resulting datetimes.
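To make the idea concrete, here is a small worked sketch with hand-picked values. It assumes pd.to_timedelta for the conversion step (a swapped-in but equivalent way of getting timedeltas from seconds), and the sample values are invented for illustration. Because the reference point is midnight, a value past 86,400 seconds simply rolls over to the next day, and taking the time component drops the date.

```python
import datetime as dt

import pandas as pd

# Hand-picked seconds; the last value is one day plus 90 seconds.
seconds = pd.Series([0, 59, 3661, 86399, 86490])
start = dt.datetime(2019, 1, 1)  # any midnight works as the reference point

# Step 1: seconds -> timedeltas, added to the reference datetime.
datetimes = pd.to_timedelta(seconds, unit="s") + start

# Step 2: keep only the time-of-day part (a Series of datetime.time objects).
times = datetimes.dt.time
print(times.tolist())
# 00:00:00, 00:00:59, 01:01:01, 23:59:59, 00:01:30
```

No explicit if-statement for midnight is needed, since the date arithmetic handles the wrap.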

New (faster) option:

import datetime as dt
import numpy as np
import pandas as pd

seconds = pd.Series(np.random.rand(50)*100).astype(int)  # Generating test data
start = dt.datetime(2019,1,1,0,0)  # You need a reference point
datetime_series = seconds.astype('timedelta64[s]') + start
time_series = datetime_series.dt.time
time_series
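Applied to the dataframe from the question, the same approach might look like the sketch below. The schedule frame here is a tiny stand-in with a made-up tripId column and invented values, and pd.to_timedelta is used for the conversion step as an assumption on my part; only the 'depTime' column name comes from the question.

```python
import datetime as dt

import pandas as pd

# Stand-in for the real dataframe; tripId and the values are hypothetical.
schedule = pd.DataFrame(
    {
        "tripId": [1, 2, 3],
        "depTime": [3600, 86399, 90000],  # 90000 s = 25 h, i.e. past midnight
    }
)

start = dt.datetime(2019, 1, 1)  # reference point at midnight

# Replace the whole column at once instead of assigning row by row.
schedule["depTime"] = (
    pd.to_timedelta(schedule["depTime"], unit="s") + start
).dt.time

print(schedule)
```

Replacing the column in a single assignment also avoids the chained schedule['depTime'].iloc[i] = ... pattern, which pandas may warn about.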

Original (slower) answer:

Not the most elegant solution, but it does the trick.

import datetime as dt
import numpy as np
import pandas as pd

seconds = pd.Series(np.random.rand(50)*100).astype(int)  # Generating test data
start = dt.datetime(2019,1,1,0,0)  # You need a reference point
time_series = seconds.apply(lambda x: start + pd.Timedelta(seconds=x)).dt.time

Published: 2023-10-16 12:11:40
Article link: https://www.elefans.com/category/jswz/34/1497515.html