I want to calculate two basic statistics for my data below, on the two variables y1 and y2.
First, for each group, I want to separately obtain variance * (n_of_group - 1) (e.g., for group == 1 the answer will be 6 on y1 and 2 on y2).
Second, for each group, I want to separately obtain covariance * (n_of_group - 1) (e.g., for group == 1 the answer will be 0).
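For reference, multiplying a sample variance or covariance by (n - 1) simply recovers the sum of squared deviations (or of cross-products) about the group mean. A minimal base-R check for group 1, with the values copied from the data in this question; the variable names ss_y1, ss_y2 and sp_y1y2 are only illustrative:

```r
# Group 1 values, copied from the question's data
y1 <- c(2, 3, 5, 2)
y2 <- c(3, 4, 4, 5)
n  <- length(y1)

# (n - 1) * variance equals the sum of squared deviations from the mean
ss_y1 <- var(y1) * (n - 1)
ss_y2 <- var(y2) * (n - 1)

# (n - 1) * covariance equals the sum of cross-products of deviations
sp_y1y2 <- cov(y1, y2) * (n - 1)

# Compare against the sums computed directly
all.equal(ss_y1, sum((y1 - mean(y1))^2))
all.equal(sp_y1y2, sum((y1 - mean(y1)) * (y2 - mean(y2))))
```

This is why the parentheses matter: var(y1) * n - 1 would subtract 1 after multiplying and would not give 6 here.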
I have tried something, but how do I apply the * (n_of_group - 1) part to my R code below?
P.S. n_of_group is simply the count() or n() of each group. My desired output is shown below.
    library(dplyr)

    z <- "group y1 y2
    1 1 2 3
    2 1 3 4
    3 1 5 4
    4 1 2 5
    5 2 4 8
    6 2 5 6
    7 2 6 7
    8 3 7 6
    9 3 8 7
    10 3 10 8
    11 3 9 5
    12 3 7 6"
    dat <- read.table(text = z, header = TRUE)

    dat %>% group_by(group) %>%
      summarise(var1 = var(y1), var2 = var(y2))
    # how to apply the `*(n_of_group - 1)` to var1 & var2

    dat %>% group_by(group) %>%
      summarise(co = cov(y1, y2))
    # how to apply the `*(n_of_group - 1)` to co, what if `co` was more than 1 number
Desired output (if we put the results above for each group in a 2x2 matrix):
    group1 = matrix(c(6,0,0,2),2)   # The two repeated elements in the middle (0,0) are
                                    # the second statistic; the other elements are the
                                    # first statistic
    group2 = matrix(c(2,-1,-1,2),2)
    group3 = matrix(c(6.8,2.6,2.6,5.2),2)

Recommended answer

We can also use across:
    library(dplyr)
    dat %>%
      group_by(group) %>%
      summarise(co = cov(y1, y2) * (n() - 1),
                across(c(y1, y2), ~ var(.) * (n() - 1), .names = 'var_{.col}'),
                .groups = 'drop')

Output:
    # A tibble: 3 x 4
    #  group    co var_y1 var_y2
    #  <int> <dbl>  <dbl>  <dbl>
    #1     1   0      6      2
    #2     2  -1      2      2
    #3     3   2.6    6.8    5.2

Also, it may be better to create the count column n first:
    library(dplyr)  # add_count() is exported by dplyr
    dat %>%
      add_count(group) %>%
      group_by(group) %>%
      summarise(co = cov(y1, y2) * (first(n) - 1),
                across(c(y1, y2), ~ var(.) * (first(n) - 1), .names = 'var_{.col}'),
                .groups = 'drop')
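If the 2x2 matrices from the question are wanted directly, one possible base-R sketch (not part of the answer above) is to split the data by group and scale each group's covariance matrix by (n - 1). It relies on cov() returning the full variance-covariance matrix when given a data frame; the name sscp is only illustrative.

```r
# Data from the question; the first field of each row is the row name
z <- "group y1 y2
1 1 2 3
2 1 3 4
3 1 5 4
4 1 2 5
5 2 4 8
6 2 5 6
7 2 6 7
8 3 7 6
9 3 8 7
10 3 10 8
11 3 9 5
12 3 7 6"
dat <- read.table(text = z, header = TRUE)

# One 2x2 matrix per group: variances on the diagonal, covariance off it,
# each multiplied by (number of rows in the group - 1)
sscp <- lapply(split(dat[c("y1", "y2")], dat$group),
               function(d) cov(d) * (nrow(d) - 1))

sscp[["1"]]  # matrix for group 1
```

The result is a named list with one matrix per group, so the same code should carry over to more than two variables by extending the column selection, which also covers the case where the covariance is more than one number.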