I have simulated data that looks like this:
    LastName   Date      email                                                    CreditCardNum   AgeZip  Amount
    Paul       21/02/14  Aliquam.fringilla@dolordapibus.co.uk                     4241033422900360 6738851$14.39
    Bullock    2/7/2014  adipiscing.fringilla@lectusjusto                         5178789953524240 3336538$498.31
    Mcmahon    11/5/2013 lobortis.ultrices@lacus                                  5389589582467450 7734302$92.44
    Walters    25/09/13  consectetuer.cursus.et@sitamet                           5157094536097720 7794007$206.60
    Franco     17/06/13  et@disparturientmontes.ca                                345477952996264  2415873$89.12
and this is how I'm attempting to import it into R, with headers:
    w <- c(11,10,57,16,3,5,8)
    df <- read.fwf("data.txt",widths=w,stringsAsFactors=F)
    names(df) <- df[1,]; df <- df[-1,]
The reason I'm not using header=T is that it gives me the error:
    Error in read.table(file = FILE, header = header, sep = sep, row.names = row.names, :
      more columns than column names
which just isn't true. I know the widths (w) are correct. So where is this error coming from? My solution works fine, I'd just like to understand what's happening.
Recommended answer
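If you specify header=TRUE, then, as per ?read.fwf, the column names in the header line must be separated by sep. The default sep is \t (the tab character), and the header line in your data evidently contains no tabs. read.fwf cuts only the data rows by widths; the header line is passed through as-is and split on sep. Your header is therefore read as a single column name while every data row has seven fields, which is where "more columns than column names" comes from.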
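To see the mismatch concretely, here is a minimal sketch on a made-up three-column file (the file contents and the widths `c(5, 3, 6)` are assumptions for illustration, not the data from the question). It counts the tab-separated fields in the header line and then captures the error that read.fwf raises with the default sep.

```r
# Hypothetical fixed-width file: widths 5, 3 and 6, header laid out like the data.
w <- c(5, 3, 6)
path <- tempfile(fileext = ".txt")
writeLines(c("Name AgeAmount",
             "Paul  67$14.39",
             "Franc 24$89.12"), path)

# The header line has no tab in it, so splitting on the default sep gives one field.
n_header <- count.fields(path, sep = "\t")[1]

# The data rows are cut by widths into length(w) fields, so the counts disagree
# and the underlying read.table call stops with an error.
result <- tryCatch(
  read.fwf(path, widths = w, header = TRUE),
  error = function(e) conditionMessage(e)
)

print(n_header)
print(result)
```

Here `n_header` is the number of names read.table can find in the header, while each data row supplies `length(w)` fields.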
The following works fine:
    w <- c(11, 10, 57, 16, 3, 5, 8)
    read.fwf(widths=w, header=TRUE, sep='|', file=textConnection(
    'LastName |Date |email |CreditCardNum |Age|Zip |Amount
    Paul       21/02/14  Aliquam.fringilla@dolordapibus.co.uk                     4241033422900360 6738851$14.39
    Bullock    2/7/2014  adipiscing.fringilla@lectusjusto                         5178789953524240 3336538$498.31
    Mcmahon    11/5/2013 lobortis.ultrices@lacus                                  5389589582467450 7734302$92.44
    Walters    25/09/13  consectetuer.cursus.et@sitamet                           5157094536097720 7794007$206.60
    Franco     17/06/13  et@disparturientmontes.ca                                345477952996264  2415873$89.12'))
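If editing the header line of the file is not an option, one possible alternative is to cut the header with the same widths yourself and hand the names to read.fwf through col.names, skipping the first line. This is only a sketch on a made-up three-column file (the file contents, the widths `c(5, 3, 6)` and the variable names are assumptions for illustration); it is not part of the original answer.

```r
# Hypothetical fixed-width file whose header uses the same layout as the data.
w <- c(5, 3, 6)
path <- tempfile(fileext = ".txt")
writeLines(c("Name AgeAmount",
             "Paul  67$14.39",
             "Franc 24$89.12"), path)

# Cut the header line at the same column boundaries as the data rows.
header_line <- readLines(path, n = 1)
ends   <- cumsum(w)
starts <- ends - w + 1
col_names <- trimws(substring(header_line, starts, ends))

# Skip the header row and supply the recovered names explicitly.
df <- read.fwf(path, widths = w, skip = 1, col.names = col_names,
               stringsAsFactors = FALSE)

print(names(df))
```

This does the same job as the `names(df) <- df[1,]; df <- df[-1,]` workaround in the question, but the header never passes through the data columns, so it cannot affect how their types are guessed.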