I want to run a database export job that produces CSV. My plan is to use a MapReduce job: read all datastore entities with DatastoreInput, have the mapper just emit the CSV string for the current entity, and use ValueProjectionReducer to pass the values through. But CloudSqlFileOutput writes one file per shard. How can I export all the CSV lines into a single file with the MapReduce library?
Thanks.
Best answer
Set the number of reducers to 1. The output writes one file per reducer shard, so a single reducer produces a single file. (In versions up to 0.5 this is done on the output class.)
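To see why the reducer count decides the file count, here is a minimal toy model in plain Python. It is not the App Engine MapReduce API: `entity_to_csv`, `run_export`, the `export-N.csv` names, and the sample entities are all hypothetical, and the shuffle is simplified to `key % num_reducers`. It only assumes, as described in the question, that the output writes one file per reducer shard.

```python
import csv
import io
from collections import defaultdict


def entity_to_csv(entity, fields):
    """Mapper step (hypothetical): turn one entity dict into one CSV line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow([entity[f] for f in fields])
    return buf.getvalue()


def run_export(entities, fields, num_reducers):
    """Toy map/shuffle/reduce. Returns {output_file_name: [csv lines]}."""
    # Map: each entity becomes a (key, csv_line) pair.
    mapped = [(e["id"], entity_to_csv(e, fields)) for e in entities]

    # Shuffle (simplified): assign each key to exactly one reducer shard.
    shards = defaultdict(list)
    for key, line in mapped:
        shards[key % num_reducers].append(line)

    # Reduce + output: values pass through unchanged, one file per shard.
    return {f"export-{shard}.csv": lines for shard, lines in sorted(shards.items())}


# Hypothetical sample data standing in for datastore entities.
entities = [{"id": i, "name": f"user{i}"} for i in range(6)]

many = run_export(entities, ["id", "name"], num_reducers=3)
single = run_export(entities, ["id", "name"], num_reducers=1)

print(len(many), "files with 3 reducers")
print(len(single), "file with 1 reducer")
```

With three reducers the lines are spread over three files; with one reducer every line lands in the same file. The trade-off is that the reduce stage is no longer parallel, which is usually acceptable for a pass-through reducer.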