How to create a histogram from grouped data

Updated: 2024-10-27 18:26:00

I'm trying to create a histogram from grouped data in pandas.

So far I have been able to create a standard line plot, but I can't figure out how to get the equivalent histogram (bar chart). I would like two age histograms, one for the people who survived the Titanic sinking and one for those who didn't, to see whether the age distributions differ.

Source data: https://www.udacity.com/api/nodes/5454512672/supplemental_media/titanic-datacsv/download

So far my code:

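    import pandas as pn

    # DataFrame.from_csv was removed in pandas 1.0; read_csv with index_col=0 is the closest replacement
    titanic = pn.read_csv('titanic_data.csv', index_col=0)
    # Count passengers for each (Survived, Age) pair
    SurvivedAge = titanic.groupby(['Survived', 'Age']).size()
    SurvivedAge = SurvivedAge.reset_index()
    SurvivedAge.columns = ['Survived', 'Age', 'Num']
    SurvivedAge.index = SurvivedAge['Survived']
    del SurvivedAge['Survived']
    # One row per age, one column per Survived value
    SurvivedAget = SurvivedAge.reset_index().pivot(index='Age', columns='Survived', values='Num')
    SurvivedAget.plot()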

When I try to plot a histogram from this data set, I get strange results.

    SurvivedAget.hist()

I would be grateful for help with that.

Best answer

You can:

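    import pandas as pd

    titanic = pd.read_csv('titanic_data.csv')
    # Count passengers per (Age, Survived) pair, then move Survived into the columns
    survival_by_age = titanic.groupby(['Age', 'Survived']).size().unstack('Survived')
    survival_by_age.columns = ['No', 'Yes']  # Survived == 0 -> 'No', Survived == 1 -> 'Yes'
    survival_by_age.plot.bar(title='Survival by Age')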

to get:

[Figure: bar chart titled "Survival by Age", with "No" and "Yes" bars for each age value]

which you can tweak further. You could also consolidate the fractional ages so that you can use integer indices, or bin the data into, say, 5-year age spans to get more user-friendly output. There is also seaborn, which offers various types of distribution plots.
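The binning idea could be sketched roughly as follows. This is only one possible approach, not part of the original answer: the helper name `survival_by_age_band`, the `AgeBand` label, and the tiny `sample` frame are made up for illustration, and the sketch assumes `Survived` is coded as 0/1 as in the Titanic file.

```python
import pandas as pd


def survival_by_age_band(df, width=5):
    """Count non-survivors and survivors per age band of `width` years.

    Each band is labelled by its lower edge, so 20 means ages from 20 up to
    (but not including) 20 + width. Hypothetical helper for illustration.
    """
    known = df.dropna(subset=["Age"])  # rows without an age cannot be binned
    # Floor each age to the start of its band; this also folds fractional
    # ages such as 0.42 or 28.5 into an integer label.
    band = (known["Age"] // width * width).astype(int).rename("AgeBand")
    counts = (
        known.groupby([band, "Survived"])
        .size()
        .unstack("Survived", fill_value=0)
    )
    # Add empty bands so the x-axis has no gaps, and make sure both outcome
    # columns exist even if one never occurs in the data.
    all_bands = pd.RangeIndex(0, int(band.max()) + width, width, name="AgeBand")
    counts = counts.reindex(index=all_bands, columns=[0, 1], fill_value=0)
    counts.columns = ["No", "Yes"]
    return counts


# Made-up rows standing in for titanic_data.csv (not real passenger data).
sample = pd.DataFrame(
    {
        "Age": [0.42, 4, 22, 24, 26, 38, 35, None],
        "Survived": [1, 1, 0, 1, 0, 1, 0, 1],
    }
)
table = survival_by_age_band(sample)
print(table)

# With matplotlib installed, the same bar plot as above works on the result:
# table.plot.bar(title="Survival by Age (5-year bands)")
```

With the real file, `survival_by_age_band(pd.read_csv('titanic_data.csv'))` should give one row per 5-year band, and `width=1` just folds the fractional ages into whole years. As for the strange output of `SurvivedAget.hist()` in the question: that call histograms the per-age counts rather than the ages themselves. If per-group histograms of the raw ages are what is wanted, `titanic.hist(column='Age', by='Survived')` may be the shorter route.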

Published: 2023-07-05 14:30:00