I have two dataframes: one has little information (df1) and the other has all the data (df2). I am trying to create a new column in df1 that looks up the Total2 values in df2 and populates the new column based on the Name. Note that every Name in df1 will always have a match among the Names in df2. Is there a function in pandas that already does this? My end goal is to create a bar chart.
import pandas as pd

alldatapath = "all_data.csv"
filteredpath = "filtered.csv"

df1 = pd.read_csv(
    filteredpath,      # file name
    sep=',',           # column separator
    quotechar='"',     # quoting character
    na_values="NA",    # treat the string "NA" as a missing value
    usecols=[0, 1],    # columns to use
    decimal='.')       # symbol for decimals

df2 = pd.read_csv(
    alldatapath,       # file name
    sep=',',           # column separator
    quotechar='"',     # quoting character
    na_values="NA",    # treat the string "NA" as a missing value
    usecols=[0, 1],    # columns to use
    decimal='.')       # symbol for decimals

df1 = df1.head(5)  # trim to top 5
print(df1)
print(df2)

Output (df1):
         Name  Total
0  Accounting      3
1   Reporting      1
2     Finance      1
3       Audit      1
4    Template      2

Output (df2):

          Name  Total2
0    Reporting     100
1   Accounting     120
2      Finance     400
3        Audit     500
4  Information      50
5     Template    1200
6      KnowHow    2000
Final Output (df1) should be something like:
         Name  Total  Total2(new column)
0  Accounting      3                 120
1   Reporting      1                 100
2     Finance      1                 400
3       Audit      1                 500
4    Template      2                1200

Recommended answer
Use map with a Series built from df2 (indexed by Name) to create the new column:

df1['Total2'] = df1['Name'].map(df2.set_index('Name')['Total2'])
print(df1)

         Name  Total  Total2
0  Accounting      3     120
1   Reporting      1     100
2     Finance      1     400
3       Audit      1     500
4    Template      2    1200

Then use set_index with DataFrame.plot.bar to draw the bar chart:

df1.set_index('Name').plot.bar()
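For anyone who wants to try the lookup without the CSV files, here is a minimal self-contained sketch. It rebuilds the sample data from the question inline and assumes each Name appears only once in df2. The variable names lookup, mapped and merged are illustrative and not part of the original code. A left merge is included only as a possible alternative that should give the same result on this data.

```python
import pandas as pd

# Inline copies of the sample data from the question (no CSV files needed).
df1 = pd.DataFrame({
    "Name": ["Accounting", "Reporting", "Finance", "Audit", "Template"],
    "Total": [3, 1, 1, 1, 2],
})
df2 = pd.DataFrame({
    "Name": ["Reporting", "Accounting", "Finance", "Audit",
             "Information", "Template", "KnowHow"],
    "Total2": [100, 120, 400, 500, 50, 1200, 2000],
})

# Approach from the answer: build a Name -> Total2 lookup Series, then map it.
# Assumption: names in df2 are unique, so each name has exactly one Total2.
lookup = df2.set_index("Name")["Total2"]
mapped = df1.assign(Total2=df1["Name"].map(lookup))

# Possible alternative: a left merge keeps every row of df1, in df1's order,
# and pulls in the matching Total2 from df2.
merged = df1.merge(df2, on="Name", how="left")

print(mapped)
print(merged)
```

If a Name in df1 had no match in df2, both variants would leave NaN in Total2 for that row. The question rules that case out.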