Python Pandas running total with reset

Updated: 2024-10-09 19:23:05

Problem description

I would like to perform the following task. Given two columns (good and bad), I would like to replace rows of the two columns with a running total. Here is an example of the current dataframe along with the desired dataframe.

I should have added what my intentions are. I am trying to create an equally binned variable (20 bins in this case) using a continuous variable as the input. I know the pandas cut and qcut functions are available, but the returned bins can contain zeros in the good or bad counts, which are needed to compute the weight of evidence and information value. A zero in either the numerator or the denominator makes those calculations undefined.
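To illustrate why the zeros are a problem, here is a minimal sketch using the counts from this question. It assumes one common convention, WoE = ln(share of goods / share of bads) and IV = sum((share of goods - share of bads) * WoE); other texts flip the ratio. The names dist_good, dist_bad, woe and iv are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'good': [3, 3, 13, 20, 28, 32, 59, 72, 64, 52,
             38, 24, 17, 19, 12, 5, 7, 6, 2, 0],
    'bad': [0, 0, 1, 1, 1, 0, 6, 8, 10, 6,
            6, 10, 5, 8, 2, 2, 1, 3, 1, 1],
})

# Share of all goods / all bads that falls into each bin.
dist_good = df['good'] / df['good'].sum()
dist_bad = df['bad'] / df['bad'].sum()

# Assumed convention: WoE = ln(dist_good / dist_bad).
# A zero bad count gives +inf, a zero good count gives -inf.
with np.errstate(divide='ignore', invalid='ignore'):
    woe = np.log(dist_good / dist_bad)

# The information value sums over all bins, so one infinite WoE spoils it.
iv = ((dist_good - dist_bad) * woe).sum()

print(woe[~np.isfinite(woe)])  # the bins that contain a zero count
print(iv)
```

Every bin with a zero count produces an infinite WoE, which in turn makes the information value non-finite, so those bins have to be merged with a neighbour first.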

import pandas as pd

d = {'AAA': range(0, 20),
     'good': [3, 3, 13, 20, 28, 32, 59, 72, 64, 52, 38, 24, 17, 19, 12, 5, 7, 6, 2, 0],
     'bad': [0, 0, 1, 1, 1, 0, 6, 8, 10, 6, 6, 10, 5, 8, 2, 2, 1, 3, 1, 1]}
df = pd.DataFrame(data=d)
print(df)

Here is an explanation of what I need to do to the above dataframe.

Roughly speaking, any time I encounter a zero in either column, I need to use a running total for the column that is not zero, up to the next row that has a non-zero value in the column that contained the zero.

Here is the desired output:

dd = {'AAA': range(0, 16),
      'good': [19, 20, 60, 59, 72, 64, 52, 38, 24, 17, 19, 12, 5, 7, 6, 2],
      'bad': [1, 1, 1, 6, 8, 10, 6, 6, 10, 5, 8, 2, 2, 1, 3, 2]}
desired_df = pd.DataFrame(data=dd)
print(desired_df)

Recommended answer

P.Tillmann, I appreciate your assistance with this. I assume more advanced readers will find this code appalling, as I do. I would be more than happy to take any recommendation that makes it more streamlined.

import pandas as pd

d = {'AAA': range(0, 20),
     'good': [3, 3, 13, 20, 28, 32, 59, 72, 64, 52, 38, 24, 17, 19, 12, 5, 7, 6, 2, 0],
     'bad': [0, 0, 1, 1, 1, 0, 6, 8, 10, 6, 6, 10, 5, 8, 2, 2, 1, 3, 1, 1]}
df = pd.DataFrame(data=d)
print(df)

row_good = 0
row_bad = 0
row_bad_zero_count = 0
row_good_zero_count = 0
row_out = 'NO'
# Output rows are collected in a list because DataFrame.append
# was removed in pandas 2.0.
rows = []

# Relies on the default 0..n-1 index, so index - 1 and index + 1
# are the neighbouring rows.
for index, row in df.iterrows():
    if row['good'] == 0 or row['bad'] == 0:
        # This row contains a zero: add it to the running totals.
        row_bad += row['bad']
        row_good += row['good']
        row_bad_zero_count += 1
        row_good_zero_count += 1
        output_ind = '1'
        row_out = 'NO'
    elif index + 1 < len(df) and (df.loc[index + 1, 'good'] == 0 or df.loc[index + 1, 'bad'] == 0):
        # The next row contains a zero: start a new running total here.
        row_bad = row['bad']
        row_good = row['good']
        output_ind = '2'
        row_out = 'NO'
    elif (row_bad_zero_count > 1 or row_good_zero_count > 1) and row['good'] != 0 and row['bad'] != 0:
        # First non-zero row after a run of zero rows: close the running total.
        row_bad += row['bad']
        row_good += row['good']
        row_bad_zero_count = 0
        row_good_zero_count = 0
        row_out = 'YES'
        output_ind = '3'
    else:
        # Ordinary row with no zeros nearby: output it unchanged.
        row_bad = row['bad']
        row_good = row['good']
        row_bad_zero_count = 0
        row_good_zero_count = 0
        row_out = 'YES'
        output_ind = '4'

    # A zero row that follows a non-zero row is output once both totals are non-zero.
    if ((row['good'] == 0 or row['bad'] == 0)
            and (index > 0 and (df.loc[index - 1, 'good'] != 0 or df.loc[index - 1, 'bad'] != 0))
            and row_good != 0 and row_bad != 0):
        row_out = 'YES'

    if row_out == 'YES':
        rows.append({'AAA': row['AAA'], 'good': row_good, 'bad': row_bad})

    # Debug trace: AAA - good - bad - totals - zero counts - output flag - branch taken.
    print(str(row['AAA']), '-', str(row['good']), '-', str(row['bad']), '-',
          str(row_good), '-', str(row_bad), '-',
          str(row_good_zero_count), '-', str(row_bad_zero_count), '-',
          row_out, '-', output_ind)

crappy_fix = pd.DataFrame(rows)
print(crappy_fix)
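As one possible way to streamline the loop above, here is a vectorized sketch that is not part of the original answer. It rests on an assumption read off the desired output rather than stated in the question: a row containing a zero is merged into the closest preceding row that has no zeros, and zero rows at the very top are merged into the first row that has no zeros. The names valid, group and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'AAA': range(0, 20),
    'good': [3, 3, 13, 20, 28, 32, 59, 72, 64, 52,
             38, 24, 17, 19, 12, 5, 7, 6, 2, 0],
    'bad': [0, 0, 1, 1, 1, 0, 6, 8, 10, 6,
            6, 10, 5, 8, 2, 2, 1, 3, 1, 1],
})

# A row is "valid" when neither count is zero.
valid = df['good'].ne(0) & df['bad'].ne(0)

# Each valid row opens a new group; a zero row inherits the id of the
# valid row before it. Leading zero rows get id 0, so clip them to 1
# to fold them into the first valid row.
group = valid.cumsum().clip(lower=1)

# Sum the counts per group and renumber the bins from 0.
result = df.groupby(group)[['good', 'bad']].sum().reset_index(drop=True)
result.insert(0, 'AAA', list(range(len(result))))

print(result)
```

For the sample data this reproduces the good and bad columns of the desired output. It only guarantees non-zero totals when at least one row has no zeros, and the merging rule would need adjusting if zero rows should instead be carried forward into the next non-zero row.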

Published: 2023-10-16 14:12:38

Link: https://www.elefans.com/category/jswz/34/1497801.html