Exclude categories in boxplot according to sample size / frequency

Updated: 2024-10-27 13:34:13

I have a large data set from which I would like to create a boxplot:

data: test.hospital

y: test results (%), 1500 samples in total

x1: year (2011-2017)

x2: hospital (30 different hospital names)

The sample size differs a lot across hospitals, so for some of them there is too little data to draw any conclusions. I would therefore like to exclude from my boxplot all hospitals with a sample size below 15.

What I would like to do is create an extra column holding how many times each hospital was sampled, and use that column to exclude the hospitals with a low sample size from my boxplot.

As you can probably tell, I am very new to R, so for most people this is probably a very easy question. Still, I would really appreciate an answer!

Thank you so much :)

Best answer

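Try the dplyr package. group_by splits the data by hospital, mutate with n() adds each hospital's row count as a new column, and filter keeps only the hospitals with at least 15 observations. %>% is the pipe operator, which passes the result of one function on to the next.

# Install once (the package name must be quoted), then load it
install.packages("dplyr")
library(dplyr)

test.hospital.filtered <- test.hospital %>%
  group_by(x2) %>%                  # one group per hospital
  mutate(sampled_count = n()) %>%   # rows per hospital, stored as a column
  filter(sampled_count >= 15) %>%   # keep hospitals with at least 15 samples
  ungroup()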
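For readers who prefer not to install a package, the same count-then-filter step can be sketched in base R. The data frame below is simulated purely for illustration, since the real test.hospital is not shown in the question; only the column names y, x1 and x2 are taken from the post.

```r
# Simulated stand-in for test.hospital (assumption: the real data is not shown)
set.seed(1)
test.hospital <- data.frame(
  y  = runif(60, 0, 100),
  x1 = sample(2011:2017, 60, replace = TRUE),
  x2 = rep(c("A", "B", "C"), times = c(40, 15, 5)),
  stringsAsFactors = FALSE
)

# Add a column holding the number of rows recorded for each hospital
test.hospital$sampled_count <- ave(
  seq_along(test.hospital$x2), test.hospital$x2, FUN = length
)

# Keep only hospitals with at least 15 observations
test.hospital.filtered <- test.hospital[test.hospital$sampled_count >= 15, ]

# Hospital C (5 rows) is dropped; A (40 rows) and B (15 rows) remain
table(test.hospital.filtered$x2)
```

ave() returns one value per row, so the count lands next to every observation, which matches the "extra column" idea from the question.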

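Now use ggplot2 to draw the boxplots. Years are on the x axis, test results are on the y axis, and only the hospitals that passed the filter are shown, one fill colour per hospital.

# The package is called ggplot2, not ggplot
install.packages("ggplot2")
library(ggplot2)

# factor(x1) makes ggplot2 treat each year as its own category
ggplot(test.hospital.filtered, aes(x = factor(x1), y = y, fill = x2)) +
  geom_boxplot()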
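One detail worth checking when x2 is stored as a factor: filtering removes the rows but not the factor levels, and base R's boxplot() keeps an empty slot for each unused level by default. The following is a minimal sketch using base graphics rather than ggplot2, again on simulated data (an assumption, as the real data is not available), showing droplevels() removing the leftover level before plotting.

```r
# Simulated stand-in for test.hospital (assumption: the real data is not shown)
set.seed(42)
test.hospital <- data.frame(
  y  = runif(50, 0, 100),
  x2 = factor(rep(c("A", "B", "C"), times = c(30, 16, 4)))
)

# Count rows per hospital and keep the names with at least 15 observations
counts <- table(test.hospital$x2)
keep   <- names(counts)[counts >= 15]
kept   <- test.hospital[test.hospital$x2 %in% keep, ]

# Remove the now-empty level "C" so it gets no slot on the axis
kept$x2 <- droplevels(kept$x2)

# One box per remaining hospital; bp$n holds the sample size behind each box
bp <- boxplot(y ~ x2, data = kept, xlab = "Hospital", ylab = "Test result (%)")
bp$names
bp$n
```

The value returned by boxplot() is handy for a quick sanity check: bp$n should contain no value below 15.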

Published: 2023-07-18 11:38:00
Article link: https://www.elefans.com/category/jswz/34/1159800.html