Concatenating two structs: fields that are not identical

Updated: 2024-10-09 13:21:06

Problem description

I have multiple CSV files:

a.csv

    field_a, field_b
    111, 121
    112, 122

b.csv

    field_a, field_c
    211, 231
    212, 232

c.csv

    field_a, field_b, field_c
    311, 321, 331
    312, 322, 332

I want to concatenate them so that the result contains every field, with NA wherever a file lacks a field:

output.csv

    field_a,field_b,field_c
    111, 121, NA
    112, 122, NA
    211, NA, 231
    212, NA, 232
    311, 321, 331
    312, 322, 332

I would like to do this with Octave.

What I did so far:

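    % csv2cell comes from the Octave 'io' package (pkg load io)
    a = csv2cell('a.csv');
    % the field names run along dimension 2 (the columns) of the data cell
    A = cell2struct(a(2:end,:), a(1,:), 2);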

Now I am looking for something like

    merge(A,B,C) or vertcat(A,B,C)

but I could not get a result in which all fields appear in the output.

With R I did it like this:

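    filelist <- list.files()
    datas <- vector("list", length(filelist))
    merged <- NULL
    for (i in seq_along(filelist)) {
      datas[[i]] <- read.csv(filelist[i])
      # full outer join, so columns missing in one file are filled with NA
      merged <- if (is.null(merged)) datas[[i]] else merge(merged, datas[[i]], all=TRUE)
    }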

But the for loop is terribly slow, so I am looking for a way to merge them all at once.

Recommended answer

I finally managed it:

With Octave (MATLAB)

    % csv2cell comes from the Octave 'io' package (pkg load io)
    startdir = pwd;  % assumption: 'startdir' was not defined in the original snippet

    % FileNames=readdir(pwd);
    d = dir(pwd);
    isDirIdx = [d.isdir];
    names = {d.name};
    FileNames = names(~isDirIdx);

    fields = {};  % collects the field names of all files
    for ii = 1:numel(FileNames)
      % Load csv to cell
      datas{ii} = csv2cell(FileNames{ii});
      % Then I convert them to a struct
      Datas{ii} = cell2struct((datas{ii}(2:end,:)), [datas{ii}(1,:)], 2);
      fields = [fields, fieldnames(Datas{ii})'];
      Datalength(ii) = numel(Datas{ii});  % number of records in file ii
    end

    cd(startdir)

    for jj = 1:numel(Datas)
      % fields that occur in some file but not in this one
      missing_fields{jj} = setdiff(fields, fieldnames(Datas{jj}));
      for kk = 1:numel(missing_fields{jj})
        % add the missing field to every record, filled with NaN
        [Datas{jj}.(missing_fields{jj}{kk})] = deal(NaN);
      end
    end

The problem was that I did not see an easy way to export the struct to a CSV, so I switched back to R. Because I did not have enough memory, I could not load all files in R and export them as one CSV. So I first exported every NetCDF file to a CSV with exactly the same fields, and then concatenated them all with the Unix/GNU cat command.

    # Converts all NetCDF (*.nc) files in a folder to ASCII (csv).
    # When there is more than one, all csv files will have the same fields.
    # When a field is missing in one NetCDF file, this script adds 'NA' values.
    # It saves memory, because there is always only one NetCDF file in memory.
    # Needs package RNetCDF:
    # cran.r-project/web/packages/RNetCDF/index.html

    # load package
    library('RNetCDF')

    # get list of all files to merge
    filelist <- list.files()

    # initialise variable names
    varnames_all <- c()
    varnames_file <- list(filelist)
    n_files <- length(filelist)
    n_vars <- rep(NA, n_files)  # initialise

    # get variable names of each NetCDF file
    for (i in 1:n_files) {
      ncfile <- open.nc(filelist[i])  # open nc file
      print(paste(filelist[i], "opened!"))
      # get number of variables in the NetCDF file
      n_vars[i] <- file.inq.nc(ncfile)$nvars
      varnames <- character(0)  # initialise and clear
      # read every variable name
      # (NetCDF variable ids start at 0, R vector indices at 1)
      for (j in 0:(n_vars[i]-1)) {
        varnames[j+1] <- var.inq.nc(ncfile, j)$name
      }
      close.nc(ncfile)
      varnames_file[[i]] <- varnames  # add to the list of all files
      varnames_all <- c(varnames_all, varnames)  # concat to one array
    }
    varnames_all <- unique(varnames_all)  # take every varname only once
    print("Existing variable names:")
    print(varnames_all)

    # initialise a data.frame for loading the NetCDF data
    datas <- data.frame()
    for (i in 1:length(filelist)) {
      print(filelist[i])
      ncfile <- open.nc(filelist[i])  # open nc file
      print(paste("reading ", filelist[i], "..."))
      datas <- as.data.frame(read.nc(ncfile))  # import data from ncfile as data frame
      close.nc(ncfile)
      # check which variables are missing
      missing_vars <- setdiff(varnames_all, colnames(datas))
      # add missing variables as columns with NA
      datas[missing_vars] <- NA
      print(paste("writing ", filelist[i], " to ", filelist[i], ".csv ...", sep=""))
      # reorder columns in the same way as in the array varnames_all
      datas <- datas[varnames_all]
      # write file
      write.csv(datas, file=paste(filelist[i], ".csv", sep=""))
      # clear memory
      rm(datas)
    }
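For plain CSV inputs such as the a.csv, b.csv and c.csv from the question, the same two-pass idea (first collect the union of all column names, then pad each file with NA) can also be sketched directly in shell with awk. This is only a minimal sketch and not part of the original answer: `merge_csv` is a hypothetical helper name, and it assumes simple comma-separated files without quoted fields, embedded commas, or empty files.

```shell
#!/bin/bash
# Sketch: merge CSV files whose headers differ, padding absent columns with NA.
# merge_csv is a hypothetical helper, not taken from the original answer.
# Assumptions: plain comma-separated data, no quoted fields, no empty files.
merge_csv() {
  # The file list is handed to awk twice so that it can make two passes:
  # pass 1 collects the union of column names, pass 2 prints the rows.
  awk -F',' -v OFS=',' -v nfiles="$#" '
    function trim(s) { gsub(/^[ \t\r]+|[ \t\r]+$/, "", s); return s }
    FNR == 1 {
      filecount++
      pass = (filecount <= nfiles) ? 1 : 2
      # Print the merged header once, when the second pass begins
      if (pass == 2 && !header_done) {
        line = ""
        for (k = 1; k <= ncols; k++) line = line (k > 1 ? OFS : "") cols[k]
        print line
        header_done = 1
      }
      # Remember at which position each column sits in the current file
      split("", pos)
      for (i = 1; i <= NF; i++) {
        name = trim($i)
        pos[name] = i
        if (pass == 1 && !(name in seen)) { seen[name] = 1; cols[++ncols] = name }
      }
      next
    }
    pass == 2 && NF > 0 {
      line = ""
      for (k = 1; k <= ncols; k++) {
        v = (cols[k] in pos) ? trim($(pos[cols[k]])) : "NA"
        line = line (k > 1 ? OFS : "") v
      }
      print line
    }
  ' "$@" "$@"
}

# Usage with the sample files from the question, written to a temp directory
workdir=$(mktemp -d)
printf 'field_a, field_b\n111, 121\n112, 122\n' > "$workdir/a.csv"
printf 'field_a, field_c\n211, 231\n212, 232\n' > "$workdir/b.csv"
printf 'field_a, field_b, field_c\n311, 321, 331\n312, 322, 332\n' > "$workdir/c.csv"
merge_csv "$workdir/a.csv" "$workdir/b.csv" "$workdir/c.csv" > "$workdir/output.csv"
cat "$workdir/output.csv"
```

Because awk streams the files line by line, only the column names are held in memory, which fits the memory constraint described above. Columns appear in the order in which they are first seen across the input files.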

The cat step is straightforward:

    #!/bin/bash
    # Concatenate csv files which have exactly the same fields

    ## Change to the directory from which the script is executed
    path=$PWD
    cd "$path"
    if [ $# -gt 0 ]; then
      cd "$1"
    fi

    # get a list of all data files
    # (a glob instead of parsing the output of ls, which breaks on
    # newer bash versions and on file names containing spaces)
    datafile_array=( * )

    echo "copying files ..."
    echo "copying file:" "${datafile_array[0]}"
    # first file: copy it completely, including the header line
    cat < "./${datafile_array[0]}" > ../outputCat.csv
    for (( i=1; i<${#datafile_array[@]}; i++))
    do
      echo "copying file" "${datafile_array[$i]}"
      # all other files: skip the header line
      tail -n +2 "./${datafile_array[$i]}" >> ../outputCat.csv
    done
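The core of that concatenation step, keeping the header of the first file and dropping it from every later file, can be isolated in a small function. The following is a minimal sketch under the same assumption as the script above (all files share exactly the same header); `concat_csv` and the sample file names are hypothetical and used for illustration only.

```shell
#!/bin/bash
# Sketch: concatenate CSV files that share one header, keeping it only once.
# concat_csv is a hypothetical helper, not taken from the original answer.
concat_csv() {
  local first=1 f
  for f in "$@"; do
    if [ "$first" -eq 1 ]; then
      cat "$f"           # first file: keep the header line
      first=0
    else
      tail -n +2 "$f"    # later files: start at line 2, skipping the header
    fi
  done
}

# Usage with two small sample files in a temp directory
concat_dir=$(mktemp -d)
printf 'field_a,field_b\n1,2\n' > "$concat_dir/x.csv"
printf 'field_a,field_b\n3,4\n' > "$concat_dir/y.csv"
concat_csv "$concat_dir/x.csv" "$concat_dir/y.csv" > "$concat_dir/all.out"
cat "$concat_dir/all.out"
```

Writing the result to a name that the input pattern does not match (the original script writes to the parent directory for the same reason) avoids reading the output file back in as input. The sketch also assumes every input file ends with a newline; otherwise the last row of one file and the first row of the next would be joined.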


Published: 2023-11-29 04:22:20

Article link: https://www.elefans.com/category/jswz/34/1645258.html