Observations becoming NA when ordering levels of factors in R with ordered()

Updated: 2024-10-28 21:30:32

I have a longitudinal data frame p that contains 5 variables and looks like this:

> head(p)
    date.1           County.x providers beds    price
1 Jan/2011              essex       258 5545 251593.4
2 Jan/2011 greater manchester       108 3259 152987.7
3 Jan/2011               kent       301 7191 231985.7
4 Jan/2011      tyne and wear       103 2649 143196.6
5 Jan/2011      west midlands       262 6819 149323.9
6 Jan/2012              essex         2   27 231398.5

The structure of my variables is the following:

'data.frame': 259 obs. of 5 variables:
 $ date.1   : Factor w/ 66 levels "Apr/2011","Apr/2012",..: 23 23 23 23 23 24 24 24 25 25 ...
 $ County.x : Factor w/ 73 levels "avon","bedfordshire",..: 22 24 32 65 67 22 32 67 22 32 ...
 $ providers: int 258 108 301 103 262 2 9 2 1 1 ...
 $ beds     : int 5545 3259 7191 2649 6819 27 185 24 70 13 ...
 $ price    : num 251593 152988 231986 143197 149324 ...

I want to order date.1 chronologically. Before I apply ordered(), this variable contains no NA values:

> summary(is.na(p$date.1))
   Mode   FALSE    NA's
logical     259       0

However, once I call ordered() to put the levels of date.1 in chronological order:

p$date.1 <- with(p, ordered(date.1, levels = c(
  "Jun/2010", "Jul/2010", "Aug/2010", "Sep/2010", "Oct/2010", "Nov/2010", "Dec/2010",
  "Jan/2011", "Feb/2011", "Mar/2011", "Apr/2011", "May/2011", "Jun/2011",
  "Jul/2011", "Aug/2011", "Sep/2011", "Oct/2011", "Nov/2011", "Dec/2011",
  "Jan/2012", "Feb/2012", "Mar/2012", "Apr/2012", "May/2012", "Jun/2012",
  "Jul/2012", "Aug/2012", "Sep/2012", "Oct/2012", "Nov/2012", "Dec/2012",
  "Jan/2013", "Feb/2013", "Mar/2013", "Apr/2013", "May/2013", "Jun/2013",
  "Jul/2013", "Aug/2013", "Sep/2013", "Oct/2013", "Nov/2013", "Dec/2013",
  "Jan/2014", "Feb/2014", "Mar/2014", "Apr/2014", "May/2014", "Jun/2014",
  "Jul/2014", "Aug/2014", "Sep/2014", "Oct/2014", "Nov/2014", "Dec/2014",
  "Jan/2015", "Feb/2015", "Mar/2015", "Apr/2015", "May/2015", "Jun/2015",
  "Jul/2015", "Aug/2015", "Sep/2015", "Oct/2015", "Nov/2015")))

some of my observations turn into NA:

> summary(is.na(p$date.1))
   Mode   FALSE    TRUE    NA's
logical     250       9       0

Has anyone come across this problem when using ordered()? Alternatively, is there another way to order my observations chronologically?

Best answer

It is possible that one of the values in p$date.1 does not match any of the levels you typed. Try using this ord.mon as the levels:

ord.mon <- do.call(paste, c(expand.grid(month.abb, 2010:2015), sep = "/"))
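To illustrate why a mismatch produces NA, here is a minimal sketch on toy data. The vectors x and lv are hypothetical and not taken from the question; the trailing space is just one example of how a value can fail to match a hand-typed level.

```r
# Toy data (hypothetical): the third value carries a trailing space
x <- factor(c("Jan/2011", "Feb/2011", "Mar/2011 "))

# Hand-typed levels that do not contain "Mar/2011 " exactly
lv <- c("Jan/2011", "Feb/2011", "Mar/2011")

# ordered() matches each value against lv; anything unmatched becomes NA
y <- ordered(x, levels = lv)
is.na(y)
# [1] FALSE FALSE  TRUE
```

Any value that is not spelled exactly like one of the supplied levels is silently converted to NA, with no warning.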

Then you can run this to check whether there is any mismatch between the two:

p$date.1 %in% ord.mon
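The logical vector above can be hard to read for 259 rows. As a rough sketch, the offending values can be listed directly. The date.1 vector below is a hypothetical stand-in for p$date.1, with a deliberate typo ("Sept/2011").

```r
# Toy stand-in for p$date.1 (hypothetical values; one is misspelled)
date.1 <- factor(c("Jan/2011", "Jan/2011", "Sept/2011", "Feb/2012"))

# Candidate levels, built as in the answer
ord.mon <- do.call(paste, c(expand.grid(month.abb, 2010:2015), sep = "/"))

# Levels present in the data but missing from the candidate levels
bad <- setdiff(levels(date.1), ord.mon)
bad
# [1] "Sept/2011"

# Number of observations that would become NA
sum(!date.1 %in% ord.mon)
# [1] 1
```

On the real data, the same two calls with p$date.1 in place of date.1 should show which labels are responsible for the 9 NA values.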

Finally, you can also sort the data frame after converting the date.1 column to Date. Note that as.Date() needs a complete date, so a day of the month ("01/") is prepended first:

p <- p[order(as.Date(paste0("01/", p$date.1), "%d/%b/%Y")), ]
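Note that "%b" in as.Date() depends on the current locale, so the line above assumes English month abbreviations are recognized. As an alternative sketch (a variation, not part of the original answer), the chronological levels can be derived from the data itself using month.abb, which avoids both the locale dependence and the hand-typed level list. The date.1 vector is again a hypothetical stand-in for p$date.1.

```r
# Toy stand-in for p$date.1 (hypothetical values)
date.1 <- factor(c("Jan/2012", "Apr/2011", "Jun/2010", "Apr/2011"))

# Split each existing level into its month and year parts
lv    <- levels(date.1)
parts <- strsplit(lv, "/", fixed = TRUE)
mon   <- match(vapply(parts, `[`, character(1), 1), month.abb)
year  <- as.integer(vapply(parts, `[`, character(1), 2))

# Reorder the existing levels by year, then by month
chron <- lv[order(year, mon)]

# Every value already appears in chron, so no observation turns into NA
date.ord <- factor(date.1, levels = chron, ordered = TRUE)
levels(date.ord)
# [1] "Jun/2010" "Apr/2011" "Jan/2012"
```

Because the levels come from the data, only months that actually occur become levels. If empty months are needed as levels too, the full ord.mon vector is the better choice.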

Published: 2023-07-22 19:38:00
Link: https://www.elefans.com/category/jswz/34/1222840.html
Tags: factor, levels, ordered, NA