What is the most efficient way to convert time series data into cross-sectional data?

Updated: 2024-10-28 14:22:46

Problem description

Here's the thing: I have the dataset below, where date is the index:

date        value
2020-01-01  100
2020-02-01  140
2020-03-01  156
2020-04-01  161
2020-05-01  170
...

And I want to transform it into this other dataset:

value_t0  value_t1  value_t2  value_t3  value_t4  ...
100       NaN       NaN       NaN       NaN       ...
140       100       NaN       NaN       NaN       ...
156       140       100       NaN       NaN       ...
161       156       140       100       NaN       ...
170       161       156       140       100       ...

First I thought about using pandas.pivot_table, but that would just provide a different layout grouped by some column, which is not what I want. Later, I thought about using pandasql and applying 'case when', but that wouldn't work because I would have to type dozens of lines of code. So I'm stuck here.

Recommended answer

Try this:

import pandas as pd
new_df = pd.DataFrame({f"value_t{i}": df['value'].shift(i) for i in range(len(df))})

The Series .shift(n) method produces a single column of your desired output: it moves every value down n rows and fills the vacated rows at the top with NaN. So we build a new dataframe by passing pd.DataFrame a dictionary of the form {column name: column data, ...}, generated with a dictionary comprehension that loops over the shift offsets 0 through len(df) - 1.
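To make the answer easy to try, here is a minimal, self-contained sketch. It rebuilds the five sample rows from the question (the way `df` is constructed here is an assumption, since the question does not show how the data was loaded) and then applies the same dictionary comprehension.

```python
import pandas as pd

# Sample data copied from the question; `date` is the index.
# How the original dataframe was created is not shown, so this setup is an assumption.
df = pd.DataFrame(
    {"value": [100, 140, 156, 161, 170]},
    index=pd.to_datetime(
        ["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01"]
    ),
)
df.index.name = "date"

# One column per lag: value_t0 is the original series,
# value_t1 is the series shifted down by one row, and so on.
new_df = pd.DataFrame(
    {f"value_t{i}": df["value"].shift(i) for i in range(len(df))}
)

print(new_df)
```

The result keeps the date index of the original dataframe; if the index is not wanted, as in the question's target layout, calling new_df.reset_index(drop=True) afterwards should remove it. Note also that using range(len(df)) creates one column per row, so the result grows quadratically with the length of the series. For a long series it may be more practical to loop over a smaller, fixed number of lags instead.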