I'm trying to load a Parquet file using a manifest file and am getting the error below.
query: 124138 failed due to an internal error. File 's3.amazonaws/sbredshift-east/data/000002_0 has an invalid version number: )
Here is my COPY command:
COPY testtable FROM 's3://sbredshift-east/manifest/supplier.manifest' IAM_ROLE 'arn:aws:iam::123456789:role/MyRedshiftRole123' FORMAT AS PARQUET MANIFEST;
Here is my manifest file:
{
  "entries": [
    {
      "url": "s3://sbredshift-east/data/000002_0",
      "mandatory": true,
      "meta": { "content_length": 1000 }
    }
  ]
}
I'm able to load the same file with the COPY command by specifying the file name directly:
COPY testtable FROM 's3://sbredshift-east/data/000002_0' IAM_ROLE 'arn:aws:iam::123456789:role/MyRedshiftRole123' FORMAT AS PARQUET;
INFO: Load into table 'supplier' completed, 800000 record(s) loaded successfully.
COPY
What could be wrong in my COPY statement?
Recommended answer:
The only way I've gotten a Parquet COPY to work with a manifest file is to add the meta key with the content_length.
From what I can gather from my error logs, the COPY command for Parquet (with a manifest) might first read the files through Redshift Spectrum as an external table. If that's the case, this hidden step does require the content_length, which contradicts their initial statement about COPY commands.
docs.amazonaws/zh_CN/redshift/latest/dg/loading-data-files-using-manifest.html
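To make the content_length point concrete, here is a minimal Python sketch that builds a manifest with a meta key for every entry. As far as I can tell from the Redshift documentation, content_length is meant to be the actual size of the object in bytes, so if the 1000 in the question's manifest is a placeholder rather than the true file size, that may be worth checking. The function name build_manifest and the byte count in the usage example are hypothetical, made up for illustration. The sizes are passed in by the caller; one way to get them (an assumption, not shown here) would be the ContentLength field returned by an S3 HeadObject call.

```python
import json


def build_manifest(objects):
    """Build a COPY manifest dict from (s3_url, size_in_bytes) pairs.

    build_manifest is a hypothetical helper, not part of any AWS library.
    Every entry gets a "meta" key holding "content_length".
    """
    entries = []
    for url, size in objects:
        if not url.startswith("s3://"):
            raise ValueError(f"not an S3 URL: {url!r}")
        if int(size) <= 0:
            raise ValueError(f"size must be a positive byte count: {size!r}")
        entries.append(
            {
                "url": url,
                "mandatory": True,
                "meta": {"content_length": int(size)},
            }
        )
    return {"entries": entries}


if __name__ == "__main__":
    # 123456 is a placeholder, NOT the real size of the file in the question.
    manifest = build_manifest([("s3://sbredshift-east/data/000002_0", 123456)])
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved as the manifest object (for example supplier.manifest) and referenced from the COPY ... FORMAT AS PARQUET MANIFEST statement, once the placeholder is replaced with the real byte size.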