问题描述
限时送ChatGPT账号..我正在创建一个 shiny 应用
,用户将在其中上传一个包含多个变量的 .csv 文件.使用dplyr
,我将select
前四个变量,如下所示,并将它们从长格式转换.
I'm creating a shiny app
in which the user will upload a .csv file that contains several variables. Using dplyr
, I will select
the first four variables, shown below, and convert them from long format.
数据
df <- read.table(text = c("
Customer Rate Factor Power
W1 6 TK1 5
W2 3 TK1 0
W3 1 TK1 0
W4 2 TK1 0
W5 4 TK1 0
W6 8 TK1 0
W7 5 TK1 0
W8 7 TK1 3
W1 6 TK2 0
W2 3 TK2 1
W3 1 TK2 0
W4 2 TK2 5
W5 4 TK2 0
W6 8 TK2 0
W7 5 TK2 0
W8 7 TK2 3
W1 6 TK3 0
W2 3 TK3 5
W3 1 TK3 1
W4 2 TK3 0
W5 4 TK3 0
W6 8 TK3 0
W7 5 TK3 0
W8 7 TK3 0
W1 6 TK4 0
W2 3 TK4 3
W3 1 TK4 0
W4 2 TK4 0
W5 4 TK4 0
W6 8 TK4 0
W7 5 TK4 0
W8 7 TK4 0
W1 6 TK5 1
W2 3 TK5 0
W3 1 TK5 5
W4 2 TK5 0
W5 4 TK5 1
W6 8 TK5 0
W7 5 TK5 0
W8 7 TK5 0
W1 6 TK6 0
W2 3 TK6 0
W3 1 TK6 0
W4 2 TK6 0
W5 4 TK6 0
W6 8 TK6 0
W7 5 TK6 5
W8 7 TK6 0
W1 6 TK7 0
W2 3 TK7 0
W3 1 TK7 0
W4 2 TK7 0
W5 4 TK7 0
W6 8 TK7 3
W7 5 TK7 3
W8 7 TK7 0
W1 6 TK8 0
W2 3 TK8 0
W3 1 TK8 1
W4 2 TK8 0
W5 4 TK8 0
W6 8 TK8 3
W7 5 TK8 0
W8 7 TK8 0
W1 6 TK9 0
W2 3 TK9 0
W3 1 TK9 0
W4 2 TK9 0
W5 4 TK9 5
W6 8 TK9 0
W7 5 TK9 0
W8 7 TK9 0
W1 6 TK10 0
W2 3 TK10 0
W3 1 TK10 0
W4 2 TK10 0
W5 4 TK10 0
W6 8 TK10 5
W7 5 TK10 0
W8 7 TK10 0
W1 6 TK11 0
W2 3 TK11 0
W3 1 TK11 0
W4 2 TK11 0
W5 4 TK11 0
W6 8 TK11 0
W7 5 TK11 0
W8 7 TK11 3
W1 6 TK12 0
W2 3 TK12 0
W3 1 TK12 0
W4 2 TK12 0
W5 4 TK12 0
W6 8 TK12 0
W7 5 TK12 0
W8 7 TK12 5"), header = T)
我使用下面的代码将长格式转换为宽格式
I used the code below to convert from long to wide format
从长到宽
library(dplyr)
library(tidyr)
df_wide <- df %>%
tidyr::spread(Factor, Power)
结果
> df_wide
Customer Rate TK1 TK10 TK11 TK12 TK2 TK3 TK4 TK5 TK6 TK7 TK8 TK9
1 W1 6 5 0 0 0 0 0 0 1 0 0 0 0
2 W2 3 0 0 0 0 1 5 3 0 0 0 0 0
3 W3 1 0 0 0 0 0 1 0 5 0 0 1 0
4 W4 2 0 0 0 0 5 0 0 0 0 0 0 0
5 W5 4 0 0 0 0 0 0 0 1 0 0 0 5
6 W6 8 0 5 0 0 0 0 0 0 0 3 3 0
7 W7 5 0 0 0 0 0 0 0 0 5 3 0 0
8 W8 7 3 0 3 5 3 0 0 0 0 0 0 0
宽格式显示 Factor
变量的级别为 TK1
然后是 TK10
The wide format is showing the levels of Factor
variable as TK1
and then TK10
> levels(df$Factor)
[1] "TK1" "TK10" "TK11" "TK12" "TK2" "TK3" "TK4" "TK5" "TK6" "TK7" "TK8" "TK9"
我希望 Factor 的级别从 TK1、TK2 到 TK12
I want the levels of Factor to be from TK1, TK2 till TK12
我可以解决如下
df$Factor <- factor(df$Factor, levels = c("TK1", "TK2" , "TK3" , "TK4", "TK5" , "TK6" , "TK7" , "TK8" , "TK9", "TK10", "TK11", "TK12"))
但是,因子变量的级别将是用户输入的函数.可能是 14、15 或 20.
However, the levels of Factor variable will be a function of the user's input. It might be 14, 15 or 20.
问题
有没有办法让 Factor 列的级别从低到高排列,而不管用户输入如何?
Is there any way to arrange the levels of Factor column from lowest to highest regardless of the user's input?
推荐答案
我们可以将其更改为 factor
并指定 levels
We can change it to factor
with levels
specified
df %>%
mutate(Factor = factor(Factor, levels = paste0("TK", 1:12))) %>%
spread(Factor, Power)
<小时>
或者让它更动态,我们提取
非数字和数字部分到单独的列('Factor1','Factor2'),将'Factor'更改为factor
与 levels
通过 paste
将 min
到 max
值的序列指定在 'Factor2' 中'Factor1' 中的第一个字符值,删除 'Factor1' 和 'Factor2',以及 spread
.
Or make it more dynamic, we extract
the non-numeric and numeric part into separate columns ('Factor1', 'Factor2'), change the 'Factor' to factor
with levels
specified by paste
ing the sequence of min
to max
values in 'Factor2' with that of the first character value in 'Factor1', remove the 'Factor1' and 'Factor2', and spread
.
library(tidyr)
res <- df %>%
extract(Factor, into = c("Factor1", "Factor2"), "(\\D+)(\\d+)",
remove = FALSE, convert=TRUE) %>%
mutate(Factor = factor(Factor, levels = paste0(Factor1[1],
min(Factor2):max(Factor2)))) %>%
select(-Factor1, -Factor2) %>%
spread(Factor, Power)
head(res, 2)
# Customer Rate TK1 TK2 TK3 TK4 TK5 TK6 TK7 TK8 TK9 TK10 TK11 TK12
#1 W1 6 5 0 0 0 1 0 0 0 0 0 0 0
#2 W2 3 0 1 5 3 0 0 0 0 0 0 0 0
这篇关于dplyr 和 tidyr:将长格式转换为宽格式并排列列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
更多推荐
[db:关键词]
发布评论