Subsetting a data frame with dates as `str` and nan values in Python


I have a data frame, loaded from a .csv file using Data = pandas.read_csv

One of its columns holds dates such as '14/09/2015'; the values are of type str.

I need to create a subset, for which I use: NewDataFrame = DataFrame['DatesColumn'][DataFrame['DatesColumn'] == desired_date]

But I have two main problems:

First, since the dates are strings, I tried to pick out the last character with the slice [-1], but I get the error: KeyError: -1L

This is the code I tried, to select 2014:

    NewDataFrame = DataFrame['DatesColumn'][DataFrame['DatesColumn'][-1]==4]

Second, the empty fields have been imported as nan values. If I try to transform the data in a for loop, I get the error:

    TypeError: 'float' object has no attribute '__getitem__'

Q: How can I subset the data (or clean it) by year?

Many thanks.

Best answer

For the NaN values you can use fillna().

    # Fill NaNs with zeros
    noNans = withNans.fillna(0)
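Filling a date column with 0 leaves numbers mixed in with the date strings, so for this particular column it may be simpler to drop the missing rows or to use pandas' vectorised `.str` accessor, which passes NaN through instead of trying to index a float. The following is only a minimal sketch: the column name `DatesColumn` comes from the question, while the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the question: date strings plus one empty field.
df = pd.DataFrame({"DatesColumn": ["14/09/2015", np.nan, "14/10/2014"]})

# Option 1: drop the rows whose date is missing.
without_missing = df.dropna(subset=["DatesColumn"])

# Option 2: keep every row; .str slicing leaves NaN as NaN instead of
# raising the TypeError that plain indexing of a float would cause.
year_text = df["DatesColumn"].str[-4:]

# Comparing NaN with a string gives False, so missing rows drop out of the mask.
rows_2014 = df[year_text == "2014"]
print(rows_2014)
```

Slicing the last four characters only works while every date keeps the dd/mm/yyyy layout; parsing the column into real dates is the more robust route.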

As for the dates, rather than handling the date strings yourself, let an existing library parse them for you. In this case read_csv() can do it through its parse_dates parameter; see the read_csv() documentation.

Here's a little example:

Csv file:

    1,14/09/2016,dataa
    1,14/09/2015,dataa
    2,14/10/2014,dataa2

Code:

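    import pandas as pd

    # Parse column 1 as dates; dayfirst=True because the file uses dd/mm/yyyy
    df = pd.read_csv("test.csv", header=None, parse_dates=[1], dayfirst=True)

    # Keep only rows dated after today (a Timestamp, since recent pandas
    # versions refuse to compare a datetime column with datetime.date)
    df[df[1] > pd.Timestamp.today()]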

Run at the time of writing, when only the 2016 date lay in the future, this prints only

       0          1      2
    0  1 2016-09-14  dataa
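Once the column holds real dates, subsetting by year (the original question) can be done with the `.dt.year` accessor. The sketch below reuses the sample rows from the CSV file above and adds one row with an empty date field, which is an assumption made purely to illustrate the nan case. It parses the column with `pd.to_datetime` after loading, instead of `parse_dates`, so that the day/month/year format can be stated explicitly.

```python
import io

import pandas as pd

# Sample rows from the example above, plus a hypothetical row with no date.
csv_text = (
    "1,14/09/2016,dataa\n"
    "1,14/09/2015,dataa\n"
    "2,14/10/2014,dataa2\n"
    "3,,dataa3\n"
)
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Parse dd/mm/yyyy strings explicitly; empty or unparseable values become NaT.
df[1] = pd.to_datetime(df[1], format="%d/%m/%Y", errors="coerce")

# .dt.year is NaN where the date is NaT, so rows without a date never match.
rows_2014 = df[df[1].dt.year == 2014]
print(rows_2014)
```

Rows whose date could not be parsed stay in `df` as NaT, so they can still be inspected or dropped with `dropna(subset=[1])` as a separate step.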
