Selecting specific columns and adding csv names to final csv file
I'm trying to extract the same first 16 columns of data from many csv files that are in different sub-directories, and add the csv file name to each row of the final csv. My code:

    getwd()
    root <- list.dirs(".", recursive = TRUE)
    # get list of files ending in csv in directory root
    dir(root, pattern = 'csv$', recursive = TRUE, full.names = TRUE) %>%
      # read files into data frames
      lapply(FUN = read.csv) %>%
      # bind all data frames into a single data frame
      rbind_all %>%
      # write into a single csv file
      write.csv("all.csv")

I'd like to know where to put the code that selects the columns and adds the file names.
Answer:

    library(dplyr)  # provides %>%, select(), mutate() and bind_rows()

    getwd()
    root <- list.dirs(".", recursive = TRUE)
    # get list of files ending in csv in directory root
    dir(root, pattern = 'csv$', recursive = TRUE, full.names = TRUE) %>%
      # read each file, keep the first 16 columns and add the file path
      lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name = p)) %>%
      # bind all data frames into a single data frame
      # (bind_rows() replaces rbind_all(), which is deprecated in dplyr)
      bind_rows() %>%
      # write into a single csv file
      write.csv("all.csv")

Best answer:
You should do it at the point where you call lapply, since that is the last step at which you can still access the file name/path:

    dir(root, pattern = 'csv$', recursive = TRUE, full.names = TRUE) %>%
      lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name = p)) %>%
      bind_rows() %>%
      write.csv("all.csv")
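To try the idea without touching real data, here is a minimal self-contained sketch. It swaps the dplyr pipeline for plain base R (`lapply` plus `do.call(rbind, ...)`), so it is an equivalent illustration rather than the answer's exact code. The `csv_demo` directory, the sample columns, the helper `read_one` and its `n_cols` argument are all made up for the example; it keeps 2 columns instead of 16 only because the sample files are tiny. It also stores `basename(p)` so that only the file name, not the full path, ends up in the output.

```r
# Build a throwaway directory tree with two small CSV files (hypothetical data).
root <- file.path(tempdir(), "csv_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "a"), recursive = TRUE)
dir.create(file.path(root, "b"), recursive = TRUE)
write.csv(data.frame(x = 1:2, y = 3:4, z = 5:6),
          file.path(root, "a", "first.csv"), row.names = FALSE)
write.csv(data.frame(x = 7:8, y = 9:10, z = 11:12),
          file.path(root, "b", "second.csv"), row.names = FALSE)

# List every CSV below root with a single recursive call.
files <- list.files(root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)

# Hypothetical helper: read one file, keep the leading columns,
# and tag every row with the file it came from.
read_one <- function(p, n_cols = 2) {
  df <- read.csv(p)
  df <- df[, seq_len(n_cols), drop = FALSE]
  df$file_name <- basename(p)  # use p instead to keep the full path
  df
}

# Stack the per-file data frames into one.
combined <- do.call(rbind, lapply(files, read_one))

# row.names = FALSE avoids writing an extra index column.
write.csv(combined, file.path(tempdir(), "all.csv"), row.names = FALSE)
```

For the original question, calling `read_one(p, n_cols = 16)` inside the `lapply` would keep the first 16 columns. The file name has to be attached inside the per-file function because the path is no longer available once the data frames are bound together.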