Python: splitting one column into several after deleting some rows
Problem description
I am pretty new to Python and I am trying to cleanse some data. I've attached a link to the data file (Two tabs: Raw data and desired outcome). Please help!
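What I am trying to do:
- Delete rows 1-23
- Split column B into multiple columns, using '-' as the delimiter
- Assign column names to the new columns
- Keep the numeric columns
Link to the raw data (first tab) and the desired outcome (second tab): www .dropbox/s/kjgtwoelq21eetw/Example2.xlsx?dl = 0
What I have so far: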
    import pandas as pd
    import numpy as np

    data_xls = pd.read_excel("Example2.xlsx", index_col=None).fillna('')
    data_xls = data_xls.iloc[22:]
    data_xls.rename(columns=data_xls.iloc[0]).drop(data_xls.index[0])
    data_xls['Internal Link Tracking (non-promotions) - ENT (c20)'].str.split('-', expand=True)
    writer = pd.ExcelWriter('Output2.xlsx')
    data_xls.to_excel(writer, 'O1', index=False)
    writer.save()
Thank you so much in advance for your help! Tae
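A note on the attempt above: one likely reason it writes the data out unchanged is that `rename`, `drop`, and `str.split` return new objects rather than modifying `data_xls` in place, so their results are lost unless they are assigned. A minimal sketch on a made-up two-row frame (the column names `raw`, `visits`, and `part_1` to `part_3` are hypothetical and do not come from the workbook):

```python
import pandas as pd

# Hypothetical stand-in for the worksheet; the real column names differ.
df = pd.DataFrame({"raw": ["a-b-c", "d-e-f"], "visits": [10, 20]})

# The result is computed and discarded: df still has only 'raw' and 'visits'.
df["raw"].str.split("-", expand=True)
unchanged_columns = df.columns.tolist()

# Keeping the result requires an assignment.
parts = df["raw"].str.split("-", expand=True)
parts.columns = ["part_1", "part_2", "part_3"]
df = pd.concat([df.drop(columns="raw"), parts], axis=1)
changed_columns = df.columns.tolist()

print(unchanged_columns)
print(changed_columns)
```

Separately, `writer.save()` is no longer available in recent pandas releases (2.0 and later, as far as I know); using `pd.ExcelWriter` as a context manager, or calling `to_excel` with a file path directly, avoids the explicit save.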
Recommended answer
Use:
    import pandas as pd

    # Read the 'Raw data' sheet and skip the first 23 rows, which hold no needed data
    data_xls = pd.read_excel("Example2.xlsx", sheet_name='Raw data', skiprows=23)

    # Names for the new columns, matching the columns of the desired output
    dummy_col_names = ['Internal Link Tracking (non', 'Campaign Name', 'Creative', 'Action', 'Action 2']

    # str.split with expand=True returns a DataFrame with one column per piece
    dummy_df = data_xls['Internal Link Tracking (non-promotions) - ENT (c20)'].str.split('-', expand=True)

    # Rename the split columns using the list above
    dummy_df.columns = dummy_col_names

    # Drop the original combined column, which is no longer needed
    data_xls.drop('Internal Link Tracking (non-promotions) - ENT (c20)', axis=1, inplace=True)

    # Concatenate data_xls and dummy_df column-wise (axis=1)
    data_xls = pd.concat((data_xls, dummy_df), sort=False, axis=1)

    # Reorder the columns to match the desired output:
    # first column, then the five new columns, then the remaining original columns
    col_names = data_xls.columns.tolist()
    data_xls = data_xls[col_names[:1] + dummy_col_names + col_names[1:-5]]
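The answer depends on the workbook, which is not reproduced here. To try the same split-and-reorder pattern without the file, here is a small sketch on invented data; the frame contents and the names `Date`, `Tracking`, `Visits`, `Clicks`, and `Page` are assumptions for illustration only. It also passes `n=` to `str.split` so that a value with extra hyphens still yields the expected number of columns, which is an addition and not part of the original answer.

```python
import pandas as pd

# Invented sample rows; the real sheet has different columns and values.
df = pd.DataFrame({
    "Date": ["2019-01-01", "2019-01-02"],
    "Tracking": ["home-spring-banner-click", "cart-sale-tile-view-extra"],
    "Visits": [120, 45],
    "Clicks": [30, 9],
})

new_names = ["Page", "Campaign Name", "Creative", "Action"]

# n = len(new_names) - 1 caps the number of splits, so the last column
# absorbs any extra hyphens instead of producing a fifth column.
parts = df["Tracking"].str.split("-", n=len(new_names) - 1, expand=True)
parts.columns = new_names

# Put the split columns where the original column used to be.
position = df.columns.get_loc("Tracking")
remaining = df.drop(columns="Tracking")
result = pd.concat(
    [remaining.iloc[:, :position], parts, remaining.iloc[:, position:]],
    axis=1,
)

print(result.columns.tolist())
```

Locating the column with `get_loc` avoids hard-coding slice offsets such as `[1:-5]`, at the cost of a couple of extra lines; either approach works when the layout of the sheet is known.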