- I have a .csv table (t1) with columns c1, c2, c3 in Amazon S3 storage
- I want to copy that into Amazon Redshift
- I create the table with columns c1, c2, c3, where all columns are nullable

I copy with the command:
copy t1a (c1,c3) from t1

I expected it would copy c1 and c3 over from t1 and place the default null value in c2, so a row in t1a might look like (c1_rowX, null, c3_rowX).
Instead I get a type error, because it's copying c2 (string type) data from t1 into c3 (int type) of t1a.

The copy command works fine when I don't specify the columns:
copy t1a from t1
I've included a link to the redshift copy command documentation:
docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
- The main question is whether I'm doing something wrong in how I specify the columns. Thanks.
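For what it's worth, the observed error is consistent with how a COPY column list behaves: the file's fields are mapped to the listed table columns positionally, so with (c1, c3) the file's second field (the c2 string) lands in the integer column c3. A hedged sketch of the full syntax, where the bucket path and IAM role are placeholders and not from the question:

```sql
-- Sketch, not the questioner's exact command. With a column list, COPY
-- maps file fields to the listed columns in order -- it does not skip
-- fields in the source file. So field 2 of the CSV (a string) is loaded
-- into c3 (an int), producing the type error described above.
COPY t1a (c1, c3)
FROM 's3://my-bucket/t1.csv'      -- placeholder bucket/key
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'  -- placeholder role
CSV;
```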
If you want to skip the preprocessing part, you can define the column to be skipped as CHAR(1) and then use the TRUNCATECOLUMNS parameter of the COPY command:
CREATE TABLE t1a (c1, c2 CHAR(1), c3);
COPY t1a FROM t1 TRUNCATECOLUMNS
TRUNCATECOLUMNS makes the import ignore any data longer than the length defined in the table schema, so all data in that column will be truncated to 1 character.
That's just a hack; preprocessing the input file is the recommended approach, but sometimes a hack is all that's needed.
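Another way to avoid preprocessing, staying entirely in SQL, is to load all three fields into a staging table of VARCHARs and then insert only the columns you want. A sketch with hypothetical names and placeholder COPY options:

```sql
-- Sketch: the staging table name, column lengths, and COPY options are
-- assumptions for illustration, not from the answer above.
CREATE TEMP TABLE t1_staging (c1 VARCHAR, c2 VARCHAR, c3 VARCHAR);

COPY t1_staging FROM 's3://my-bucket/t1.csv' IAM_ROLE '...' CSV;  -- placeholders

-- Keep only c1 and c3; c2 in t1a stays at its default (null).
INSERT INTO t1a (c1, c3)
SELECT c1, c3::INT
FROM t1_staging;
```

This costs an extra load step but avoids both the CHAR(1) hack and any external preprocessing of the file.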