Getting basic statistics for multiple variables and multiple groups

Updated: 2024-10-27 14:29:45

Problem description

I want to calculate two basic statistics for my data below on the two variables y1 and y2.

First, for each group, I want to separately obtain variance * (n_of_group - 1) (e.g., for group == 1 the answer will be 6 on y1 and 2 on y2).

Second, for each group, I want to separately obtain covariance * (n_of_group - 1) (e.g., for group == 1 the answer will be 0).

I have tried something, but I wonder how to apply the * (n_of_group - 1) part to my R code below?

P.S. n_of_group is simply the count() or n() of each group. My desired output is shown below.

library(dplyr)

z <- "group y1 y2
1 1 2 3
2 1 3 4
3 1 5 4
4 1 2 5
5 2 4 8
6 2 5 6
7 2 6 7
8 3 7 6
9 3 8 7
10 3 10 8
11 3 9 5
12 3 7 6"
dat <- read.table(text = z, header = TRUE)

dat %>% group_by(group) %>% summarise(var1 = var(y1), var2 = var(y2))
# how to apply the `* (n_of_group - 1)` to var1 & var2

dat %>% group_by(group) %>% summarise(co = cov(y1, y2))
# how to apply the `* (n_of_group - 1)` to co, what if `co` was more than 1 number

Desired output (if we put the results above for each group in a 2x2 matrix):

group1 = matrix(c(6,0,0,2),2)  # The two repeated elements in the middle (0,0) are
                               # the second statistic, the other elements are the
                               # first statistic
group2 = matrix(c(2,-1,-1,2),2)
group3 = matrix(c(6.8,2.6,2.6,5.2),2)
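For reference, the two requested statistics can be read as the entries of one matrix per group: the within-group sums of squares and cross-products. This is a sketch that assumes `variance*n_of_group-1` means the variance multiplied by (n - 1), which is the reading that reproduces the 6, 2 and 0 quoted for group 1. The symbols S_g, n_g and the bars for group means are notation introduced here, not part of the original question.

```latex
% For group g with n_g rows, y_{1i} and y_{2i} are the observations in that group
% and \bar{y}_{1g}, \bar{y}_{2g} are the group means.
S_g = (n_g - 1)
\begin{pmatrix}
  \operatorname{var}(y_1) & \operatorname{cov}(y_1, y_2) \\
  \operatorname{cov}(y_1, y_2) & \operatorname{var}(y_2)
\end{pmatrix}
=
\begin{pmatrix}
  \sum_{i \in g} (y_{1i} - \bar{y}_{1g})^2 &
  \sum_{i \in g} (y_{1i} - \bar{y}_{1g})(y_{2i} - \bar{y}_{2g}) \\
  \sum_{i \in g} (y_{1i} - \bar{y}_{1g})(y_{2i} - \bar{y}_{2g}) &
  \sum_{i \in g} (y_{2i} - \bar{y}_{2g})^2
\end{pmatrix}

% Group 1: y_1 = (2, 3, 5, 2) with mean 3, y_2 = (3, 4, 4, 5) with mean 4
\sum (y_{1i} - 3)^2 = 1 + 0 + 4 + 1 = 6, \qquad
\sum (y_{2i} - 4)^2 = 1 + 0 + 0 + 1 = 2, \qquad
\sum (y_{1i} - 3)(y_{2i} - 4) = 1 + 0 + 0 - 1 = 0
```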

Recommended answer

We can also use across:

library(dplyr)
dat %>%
  group_by(group) %>%
  summarise(co = cov(y1, y2) * (n() - 1),
            across(c(y1, y2), ~ var(.) * (n() - 1), .names = 'var_{.col}'),
            .groups = 'drop')

-output

# A tibble: 3 x 4
#   group    co var_y1 var_y2
#   <int> <dbl>  <dbl>  <dbl>
#1     1   0      6      2
#2     2  -1      2      2
#3     3   2.6    6.8    5.2

Also, it may be better to create n first:

library(dplyr)  # add_count() is provided by dplyr
dat %>%
  add_count(group) %>%   # adds a column n holding each group's row count
  group_by(group) %>%
  summarise(co = cov(y1, y2) * (first(n) - 1),
            across(c(y1, y2), ~ var(.) * (first(n) - 1), .names = 'var_{.col}'),
            .groups = 'drop')
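If the 2x2 matrices from the desired output are wanted directly, one possible base R sketch is to split the two variables by group and scale each group's covariance matrix by (n - 1). This is not part of the original answer; the name `sscp` is chosen here for illustration, and the data are copied from the question.

```r
# Data from the question; the first value on each row is a row name.
z <- "group y1 y2
1 1 2 3
2 1 3 4
3 1 5 4
4 1 2 5
5 2 4 8
6 2 5 6
7 2 6 7
8 3 7 6
9 3 8 7
10 3 10 8
11 3 9 5
12 3 7 6"
dat <- read.table(text = z, header = TRUE)

# Split y1 and y2 by group, then multiply each group's covariance
# matrix by (n - 1), where n is the number of rows in that group.
# `sscp` is a hypothetical name for the resulting list of matrices.
sscp <- lapply(split(dat[c("y1", "y2")], dat$group),
               function(d) cov(d) * (nrow(d) - 1))

sscp[["1"]]  # matrix for group 1
```

The list is named by group ("1", "2", "3"), so each element can be compared with group1, group2 and group3 from the question. Because cov() on a data frame returns the full covariance matrix, the same call should also cover the case raised in the question where the covariance is more than one number, for example when more than two variables are selected.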

Published: 2023-06-04 21:18:35

Article link: https://www.elefans.com/category/jswz/34/505241.html