Grouped bar plot in ggplot

Updated: 2024-10-27 05:28:39
Problem description

I have a survey file in which the rows are observations and the columns are questions.

Here is some fake data showing what it looks like:

    People,Food,Music,People
    P1,Very Bad,Bad,Good
    P2,Good,Good,Very Bad
    P3,Good,Bad,Good
    P4,Good,Very Bad,Very Good
    P5,Bad,Good,Very Good
    P6,Bad,Good,Very Good

My aim is to create this kind of plot with ggplot2.

  • I don't care at all about the colors, design, etc.
  • The plot doesn't correspond to the fake data.

Here are my fake data:

    raw <- read.csv("pastebin/raw.php?i=L8cEKcxS",sep=",")
    raw[,2]<-factor(raw[,2],levels=c("Very Bad","Bad","Good","Very Good"),ordered=FALSE)
    raw[,3]<-factor(raw[,3],levels=c("Very Bad","Bad","Good","Very Good"),ordered=FALSE)
    raw[,4]<-factor(raw[,4],levels=c("Very Bad","Bad","Good","Very Good"),ordered=FALSE)

But if I use the count for Y, I run into the problem of choosing the X and group values. I don't know whether I can succeed without using reshape2. I've also tried to use reshape with the melt function, but I don't understand how to use it.

Solution

EDIT: Eight years later...

This needs a tidyverse solution, so here is one, with all non-base packages explicitly stated so that you know where each function comes from (except for read.csv which is from utils which comes with base R):

    library(magrittr) # needed for %>% if dplyr is not attached

    "pastebin/raw.php?i=L8cEKcxS" %>%
      read.csv(sep = ",") %>%
      tidyr::pivot_longer(cols = c(Food, Music, People.1),
                          names_to = "variable",
                          values_to = "value") %>%
      dplyr::group_by(variable, value) %>%
      dplyr::summarise(n = dplyr::n()) %>%
      dplyr::mutate(value = factor(
        value,
        levels = c("Very Bad", "Bad", "Good", "Very Good"))
      ) %>%
      ggplot2::ggplot(ggplot2::aes(variable, n)) +
      ggplot2::geom_bar(ggplot2::aes(fill = value),
                        position = "dodge",
                        stat = "identity")


The original answer:

First you need to get the counts for each category, i.e. how many Bads and Goods and so on are there for each group (Food, Music, People). This would be done like so:

    raw <- read.csv("pastebin/raw.php?i=L8cEKcxS",sep=",")
    raw[,2]<-factor(raw[,2],levels=c("Very Bad","Bad","Good","Very Good"),ordered=FALSE)
    raw[,3]<-factor(raw[,3],levels=c("Very Bad","Bad","Good","Very Good"),ordered=FALSE)
    raw[,4]<-factor(raw[,4],levels=c("Very Bad","Bad","Good","Very Good"),ordered=FALSE)

    raw=raw[,c(2,3,4)] # getting rid of the "people" variable as I see no use for it

    freq=table(col(raw), as.matrix(raw)) # get the counts of each factor level
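To make the counting step concrete, here is a minimal sketch that runs the same `table(col(...), as.matrix(...))` idea on the six sample rows shown in the question. The inline `csv_text` string and the `answers` name are assumptions made for illustration; the real file behind the pastebin link has more rows, so its counts differ.

```r
# Inline copy of the six sample rows from the question (assumption: the full
# pastebin file has the same layout, only with more respondents).
csv_text <- "People,Food,Music,People
P1,Very Bad,Bad,Good
P2,Good,Good,Very Bad
P3,Good,Bad,Good
P4,Good,Very Bad,Very Good
P5,Bad,Good,Very Good
P6,Bad,Good,Very Good"

# read.csv() renames the duplicated "People" header to "People.1".
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
answers <- raw[, 2:4]  # drop the respondent ID column

# col(answers) holds the column index of every cell and as.matrix(answers)
# holds the answer in that cell, so table() cross-tabulates question by answer.
freq <- table(col(answers), as.matrix(answers))
rownames(freq) <- c("Food", "Music", "People")
print(freq)
```

Each row of `freq` is one question and each column is one answer level, which is the wide count table that gets melted in the next step.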

Then you need to create a data frame out of it, melt it and plot it:

    library(reshape2) # provides melt()
    library(ggplot2)

    Names=c("Food","Music","People") # create a vector of names
    data=data.frame(cbind(freq),Names) # combine them into a data frame
    data=data[,c(5,3,1,2,4)] # sort columns

    # melt the data frame for plotting
    data.m <- melt(data, id.vars='Names')

    # plot everything
    ggplot(data.m, aes(Names, value)) +
      geom_bar(aes(fill = variable), position = "dodge", stat="identity")
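To see what the melt step produces without depending on the pastebin file, here is a minimal sketch that starts from the count table printed later in this answer and builds the same long layout with base R's `stack()`. Using `stack()` in place of `reshape2::melt()` is an assumption made so the example runs without extra packages; the plot itself is drawn only if ggplot2 is installed.

```r
# Count table as printed in the answer (wide format: one column per level).
data <- data.frame(
  Names     = c("Food", "Music", "People"),
  Very.Bad  = c(7, 5, 6),
  Bad       = c(6, 5, 3),
  Good      = c(5, 7, 7),
  Very.Good = c(2, 3, 4)
)

# Base-R stand-in for melt(data, id.vars = "Names"):
# stack() puts all count columns into one "values" column and records the
# source column in "ind" (a factor whose levels keep the column order).
stacked <- stack(data[, -1])
data.m <- data.frame(
  Names    = rep(data$Names, times = ncol(data) - 1),
  variable = stacked$ind,
  value    = stacked$values
)
print(head(data.m))

# Same plot as in the answer, drawn only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(data.m, ggplot2::aes(Names, value)) +
    ggplot2::geom_bar(ggplot2::aes(fill = variable),
                      position = "dodge", stat = "identity")
  print(p)
}
```

The long table has one row per question and answer level, which is why `Names` can go on the x axis, `value` on the y axis, and `variable` can drive the fill of the dodged bars.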

Is this what you're after?

To clarify a little bit, in the question "ggplot multiple grouping bar" you had a data frame that looked like this:

    > head(df)
      ID Type Annee X1PCE X2PCE X3PCE X4PCE X5PCE X6PCE
    1  1    A  1980   450   338   154    36    13     9
    2  2    A  2000   288   407   212    54    16    23
    3  3    A  2020   196   434   246    68    19    36
    4  4    B  1980   111   326   441    90    21    11
    5  5    B  2000    63   298   443   133    42    21
    6  6    B  2020    36   257   462   162    55    30

Since you have numerical values in columns 4-9, which would later be plotted on the y axis, this can be easily transformed with reshape and plotted.
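As a rough illustration of that transformation, the sketch below rebuilds the data frame shown above and converts it to long format with base R's `stats::reshape()`. This is an assumption standing in for whichever reshape tool the answer had in mind; the column names `variable` and `value` are chosen here only to mirror what `melt()` would produce.

```r
# The data frame from the printout above, typed in by hand.
df <- data.frame(
  ID    = 1:6,
  Type  = rep(c("A", "B"), each = 3),
  Annee = rep(c(1980, 2000, 2020), times = 2),
  X1PCE = c(450, 288, 196, 111, 63, 36),
  X2PCE = c(338, 407, 434, 326, 298, 257),
  X3PCE = c(154, 212, 246, 441, 443, 462),
  X4PCE = c(36, 54, 68, 90, 133, 162),
  X5PCE = c(13, 16, 19, 21, 42, 55),
  X6PCE = c(9, 23, 36, 11, 21, 30)
)

value_cols <- paste0("X", 1:6, "PCE")

# Wide to long: every XnPCE column becomes rows of (variable, value),
# while ID, Type and Annee are repeated for each of them.
long <- reshape(
  df,
  direction = "long",
  varying   = value_cols,
  v.names   = "value",
  timevar   = "variable",
  times     = value_cols,
  idvar     = "ID"
)
print(head(long))
```

Because the values are already numeric, no counting is needed here: the long table can be passed to ggplot directly.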

For our current data set, we needed something similar, so we used freq=table(col(raw), as.matrix(raw)) to get this:

    > data
       Names Very.Bad Bad Good Very.Good
    1   Food        7   6    5         2
    2  Music        5   5    7         3
    3 People        6   3    7         4

Just imagine you have Very.Bad, Bad, Good and so on instead of X1PCE, X2PCE, X3PCE. See the similarity? But we needed to create such a structure first, hence the freq=table(col(raw), as.matrix(raw)).

Published: 2023-07-18 00:49:16
Article link: https://www.elefans.com/category/jswz/34/1139265.html