drop_duplicates in pandas when the duplicate is only in the first column
I have a dataframe with two columns. The first column, say A, has duplicates; the second does not.
I have tried
    df["A"].drop_duplicates(inplace=True)
but that returns the same number of rows. How can I drop the rows where the value in column "A" is the same?
Example:
    John Miller
    John Smith
    Mark Robinson
    Jeffrey Robinson
should return
    John Miller
    Mark Robinson
    Jeffrey Robinson
Best answer
Use drop_duplicates with the subset parameter:
    df.drop_duplicates(subset=['A'], inplace=True)
    print(df)
             A         B
    0     John    Miller
    2     Mark  Robinson
    3  Jeffrey  Robinson
Docs:
subset : column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by default use all of the columns
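For completeness, here is a minimal, self-contained sketch of the same idea. The frame below is reconstructed from the example in the question, and the variable names (deduped, last_kept, no_dupes, names) are only illustrative. It also shows why the original attempt appeared to do nothing: df["A"].drop_duplicates() deduplicates the Series selected from the frame, so the number of rows in df itself does not change.

```python
import pandas as pd

# Example data reconstructed from the question (column names A and B assumed).
df = pd.DataFrame({
    "A": ["John", "John", "Mark", "Jeffrey"],
    "B": ["Miller", "Smith", "Robinson", "Robinson"],
})

# Calling drop_duplicates on the column returns a shorter Series,
# but df still has all four rows.
names = df["A"].drop_duplicates()

# Calling it on the frame with subset drops whole rows.
# The default keep="first" retains the first row for each value of A.
deduped = df.drop_duplicates(subset=["A"])

# keep="last" retains the last row for each value of A instead.
last_kept = df.drop_duplicates(subset=["A"], keep="last")

# keep=False drops every row whose A value occurs more than once.
no_dupes = df.drop_duplicates(subset=["A"], keep=False)

print(deduped)
```

Assigning the result to a new variable, as above, is an alternative to inplace=True and leaves the original frame untouched. If the old index labels (0, 2, 3) are not wanted, chaining .reset_index(drop=True) renumbers the rows.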