Remove constant columns with or without NAs

Last updated: 2024-10-09 15:20:37

Problem description

I am trying to get many lm models to work in a function, and I need to automatically drop constant columns from my data.table. So I want to keep only columns with two or more unique values, excluding NA from the count.

I tried several methods found on SO, but I am still not able to drop columns that have two values: a constant and NAs.

My reproducible code:

library(data.table)

df <- data.table(x = c(1, 2, 3, NA, 5),
                 y = c(1, 1, NA, NA, NA),
                 z = c(NA, NA, NA, NA, NA),
                 d = c(2, 2, 2, 2, 2))

> df
    x  y  z d
1:  1  1 NA 2
2:  2  1 NA 2
3:  3 NA NA 2
4: NA NA NA 2
5:  5 NA NA 2

My intention is to drop columns y, z, and d since they are constant, including y, which has only one unique value when NAs are omitted.

I tried:

same <- sapply(df, function(.col) {
  all(is.na(.col)) || all(.col[1L] == .col)
})
df1 <- df[ , !same, with = FALSE]

> df1
    x  y
1:  1  1
2:  2  1
3:  3 NA
4: NA NA
5:  5 NA

As shown, y is still there. Any help?

Recommended answer

Because you have a data.table, you can use uniqueN and its na.rm argument:

df[ , lapply(.SD, function(v) if (uniqueN(v, na.rm = TRUE) > 1) v)]
#     x
# 1:  1
# 2:  2
# 3:  3
# 4: NA
# 5:  5
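Since the question mentions running this inside a function, the uniqueN call above could be wrapped in a small helper. This is only a sketch: the name drop_constant_cols is hypothetical and not part of data.table or of the original answer, and it assumes every column is an atomic vector.

```r
library(data.table)

# Hypothetical helper (the name is illustrative, not from the answer):
# returns a copy of `dt` without columns that have fewer than two
# distinct non-NA values.
drop_constant_cols <- function(dt) {
  # uniqueN(..., na.rm = TRUE) counts distinct values, ignoring NA
  n_distinct <- sapply(dt, uniqueN, na.rm = TRUE)
  keep <- names(dt)[n_distinct > 1]
  dt[ , keep, with = FALSE]
}

df <- data.table(x = c(1, 2, 3, NA, 5),
                 y = c(1, 1, NA, NA, NA),
                 z = c(NA, NA, NA, NA, NA),
                 d = c(2, 2, 2, 2, 2))

res <- drop_constant_cols(df)
names(res)
# [1] "x"
```

Selecting by column name with with = FALSE mirrors the subsetting style used in the question, so the helper can replace the `same` logic there directly.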

A base alternative could be Filter(function(x) length(unique(x[!is.na(x)])) > 1, df)
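For illustration, here is the base alternative written out on a plain data.frame holding the same values as the question's df. The names dat and has_variation are assumptions introduced for this sketch.

```r
# Plain data.frame with the same values as the question's df
# (the name `dat` is an assumption for this sketch)
dat <- data.frame(x = c(1, 2, 3, NA, 5),
                  y = c(1, 1, NA, NA, NA),
                  z = c(NA, NA, NA, NA, NA),
                  d = c(2, 2, 2, 2, 2))

# Keep a column only if it has more than one distinct non-NA value
has_variation <- function(col) length(unique(col[!is.na(col)])) > 1

kept <- Filter(has_variation, dat)
names(kept)
# [1] "x"
```

Filter applies the predicate to each column and keeps those for which it returns TRUE, so no data.table-specific functions are needed.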

Published: 2023-10-30 07:35:55

Article link: https://www.elefans.com/category/jswz/34/1542200.html