As an example, let's say I have a Python pandas DataFrame like the following:

    #  PERSON  THINGS
    0  Joe     Candy Corn, Popsicles
    1  Jane    Popsicles
    2  John    Candy Corn, Ice Packs
    3  Lefty   Ice Packs, Hot Dogs

I would like to use the pandas groupby functionality to get the following output:

    THINGS      COUNT
    Candy Corn  2
    Popsicles   2
    Ice Packs   2
    Hot Dogs    1

I generally understand the following groupby command:

    df.groupby(['THINGS']).count()

But the output is grouped by the entire string, not by individual item. I think I understand why this is, but it's not clear to me how best to approach the problem to get the desired output instead of the following:

    THINGS                 PERSON
    Candy Corn, Ice Packs  1
    Candy Corn, Popsicles  1
    Ice Packs, Hot Dogs    1
    Popsicles              1

Does pandas have a function like LIKE in SQL, or am I thinking about this the wrong way in pandas?

Any assistance appreciated.
Best answer
Create a Series by joining all the strings, splitting the result into individual items, and then use value_counts:

    In [292]: pd.Series(df.THINGS.str.cat(sep=', ').split(', ')).value_counts()
    Out[292]:
    Popsicles     2
    Ice Packs     2
    Candy Corn    2
    Hot Dogs      1
    dtype: int64
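For comparison, here is a sketch of another way to get the same counts. It is not part of the answer above. It assumes pandas 0.25 or later (for `explode`) and that items are always separated by exactly `", "`. The DataFrame is rebuilt from the question's data, and the variable names `items`, `counts`, `per_item`, and `with_popsicles` are only illustrative. The last line shows `str.contains`, which is probably the closest pandas counterpart to SQL's LIKE.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "PERSON": ["Joe", "Jane", "John", "Lefty"],
    "THINGS": [
        "Candy Corn, Popsicles",
        "Popsicles",
        "Candy Corn, Ice Packs",
        "Ice Packs, Hot Dogs",
    ],
})

# Split each string into a list, then give every list element its own row.
items = df["THINGS"].str.split(", ").explode()

# Count how often each individual item occurs.
counts = items.value_counts()

# The same result phrased with groupby, as the question originally intended:
# one row per (person, item) pair, then count persons per item.
per_item = (
    df.assign(THINGS=df["THINGS"].str.split(", "))
      .explode("THINGS")
      .groupby("THINGS")["PERSON"]
      .count()
)

# Rough equivalent of SQL's  WHERE THINGS LIKE '%Popsicles%'.
with_popsicles = df[df["THINGS"].str.contains("Popsicles", regex=False)]
```

The `explode` version keeps the original row index, so each item can still be traced back to its person. The `str.cat` one-liner discards that link but is shorter when only the totals are needed.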