I want to create a Lambda that gets a zip file (which may contain a list of CSV files) from S3, unzips it, and uploads the contents back to S3. Since Lambda is limited by memory/disk size, I have to stream the data from S3 and back into it. I use Python (boto3); see my code below:
```python
count = 0
obj = s3.Object(bucket_name, key)
buffer = io.BytesIO(obj.get()["Body"].read())
print(buffer)
z = zipfile.ZipFile(buffer)
for x in z.filelist:
    with z.open(x) as foo2:
        print(sys.getsizeof(foo2))
        line_counter = 0
        out_buffer = io.BytesIO()
        for f in foo2:
            out_buffer.write(f)
            # out_buffer.writelines(f)
            line_counter += 1
        print(line_counter)
        print(foo2.name)
        s3.Object(bucket_name, "output/" + foo2.name + "_output").upload_fileobj(out_buffer)
        out_buffer.close()
z.close()
```
The result is that empty files are created in the bucket. For example: if the file input.zip contained the files 1.csv and 2.csv, I get two empty CSV files with the corresponding names in the bucket. Also, I'm not sure whether it indeed streams the files or just downloads the whole zip file. Thanks.
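The empty output files are consistent with the `BytesIO` stream position being left at the end after writing. A minimal stdlib-only sketch (no S3 involved) of that behavior:

```python
import io

buf = io.BytesIO()
buf.write(b"a,b,c\n1,2,3\n")

# After write(), the stream position sits at the end of the buffer,
# so a consumer that reads from the current position gets nothing:
print(buf.read())  # b''

buf.seek(0)        # rewind to the beginning
print(buf.read())  # b'a,b,c\n1,2,3\n'
```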
Answer:
You need to seek back to the beginning of the BytesIO buffer before uploading; otherwise `upload_fileobj` reads from the current stream position (the end), so it uploads nothing.
```python
out_buffer = io.BytesIO()
for f in foo2:
    out_buffer.write(f)
    # out_buffer.writelines(f)
    line_counter += 1
out_buffer.seek(0)  # Change stream position to beginning of file
s3.Object(bucket_name, "output/" + foo2.name + "_output").upload_fileobj(out_buffer)
out_buffer.close()
```
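For reference, a self-contained sketch of the corrected loop that can run without AWS credentials. The in-memory zip contents, the `fake_upload_fileobj` helper, and the output key names are stand-ins for illustration; a real run would call `s3.Object(bucket_name, key).upload_fileobj(out_buffer)` instead:

```python
import io
import zipfile

def make_zip() -> io.BytesIO:
    """Build a small in-memory zip with two CSV members (stand-in for the S3 object)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("1.csv", "a,b\n1,2\n")
        z.writestr("2.csv", "c,d\n3,4\n")
    buf.seek(0)
    return buf

def fake_upload_fileobj(fileobj) -> bytes:
    """Stand-in for upload_fileobj: consumes the stream from its current position."""
    return fileobj.read()

uploaded = {}
with zipfile.ZipFile(make_zip()) as z:
    for info in z.infolist():
        with z.open(info) as member:
            out_buffer = io.BytesIO()
            for line in member:      # ZipExtFile is iterable line by line
                out_buffer.write(line)
            out_buffer.seek(0)       # rewind, or the upload sees an empty stream
            uploaded["output/" + info.filename + "_output"] = fake_upload_fileobj(out_buffer)

print(sorted(uploaded))  # ['output/1.csv_output', 'output/2.csv_output']
```

Note this still reads the whole zip into memory first; iterating the members only streams the *decompression*, not the download itself.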