我有两个数据框,学生数据框和费用数据框。 费用 DataFrame中缺少某些学生的费用明细。我想返回所有缺少费用详细信息的学生的详细信息。 班级,部分和 RollNo这三个字段构成唯一的组合。
I have two DataFrames, 'Students' DataFrame and 'Fee' DataFrame. The fee details of some of the students are missing in 'Fee' DataFrame. I would like to return the details of all students whose fee details are missing. The three fields 'Class', 'Section' and 'RollNo' form a unique combination.
Students = pd.DataFrame({ 'Class': [7, 7, 8], 'Section': ['A', 'B', 'B'], 'RollNo': [2, 3, 4], 'Student': ['Ram', 'Rahim', 'Robert'] }) Fee = pd.DataFrame({ 'Class': [7, 7, 8], 'Section': ['A', 'B', 'B'], 'RollNo': [2, 2, 3], 'Fee': [10, 20, 30] })
Students Class RollNo Section Student 0 7 2 A Ram 1 7 3 B Rahim 2 8 4 B Robert
Fee Class Fee RollNo Section 0 7 10 2 A 1 7 20 2 B 2 8 30 3 B
基本上,我想在左侧表格中找到不匹配的记录根据上述3个字段在学生和费用数据框架之间进行左联接。在Python中使用 Pandas 来实现此目标的最简单方法是什么?
Essentially, I would like to find the unmatched records from the left table when I do a left join between 'Students' and 'Fee' DataFrames based on 3 fields mentioned above. What is the simplest way to achieve this using Pandas in Python?
非常感谢!
推荐答案如果 Fee 列中的 NaN 没有, code>费用 DataFrame使用 合并 由 布尔索引 与 isna :
If no NaNs in Fee column in Fee DataFrame use merge anf filter by boolean indexing with isna:
df = pd.merge(Students, Fee, how='left') print (df) Class RollNo Section Student Fee 0 7 2 A Ram 10.0 1 7 3 B Rahim NaN 2 8 4 B Robert NaN df1 = df[df['Fee'].isna()].drop('Fee', axis=1) #for oldier versions of pandas #df1 = df[df['Fee'].isnull()].drop('Fee', axis=1) print (df1) Class RollNo Section Student 1 7 3 B Rahim 2 8 4 B Robert使用 NaN 的更通用的解决方案,也将参数 indicator 添加到 merge 并使用 left_only 过滤行:
More general solution working with NaNs too add parameter indicator to merge and filter rows with left_only:
Fee = pd.DataFrame({'Class':[7,7,8], 'Section':['A','B','B'], 'RollNo':[2,2,3], 'Fee':[np.nan,20,30]}) print (Fee) Class Fee RollNo Section 0 7 NaN 2 A 1 7 20.0 2 B 2 8 30.0 3 B df = pd.merge(Students, Fee, how='left', indicator=True) print (df) Class RollNo Section Student Fee _merge 0 7 2 A Ram NaN both 1 7 3 B Rahim NaN left_only 2 8 4 B Robert NaN left_only df1 = df[df['_merge'].eq('left_only')].drop(['Fee','_merge'], axis=1) print (df1) Class RollNo Section Student 1 7 3 B Rahim 2 8 4 B Robert更多推荐
pandas 左联接中不匹配的左表记录
发布评论