AWS Glue crawler not creating table

Updated: 2024-10-28 00:16:13

Problem description

I created a crawler in AWS Glue that does not create a table in the Data Catalog after it completes successfully.

The crawler takes roughly 20 seconds to run, and the logs show that it completed successfully. The CloudWatch log shows:

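Benchmark: Running Start Crawl for Crawler
Benchmark: Classification complete, writing results to DB
Benchmark: Finished writing to Catalog
Benchmark: Crawler has finished running and is in ready state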

I am at a loss as to why the tables in the Data Catalog are not being created. The AWS docs are not much help with debugging.

Recommended answer

Check the IAM role associated with the crawler. Most likely it does not have the correct permissions.

When you create the crawler, if you choose to create an IAM role (the default setting), it creates a policy that covers only the S3 object you specified. If you later edit the crawler and change only the S3 path, the role associated with the crawler will not have permission to access the new S3 path.
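To make the mismatch concrete, here is a minimal Python sketch that checks whether a policy document grants read access to objects under a given S3 path. The bucket name, the prefixes, and the policy contents are hypothetical and only stand in for whatever the crawler's role actually has attached (you can view the real policy on the role in the IAM console). The check is deliberately simplified: it looks only at Allow statements and ignores Deny statements, conditions, and NotResource, so treat it as an illustration rather than a reproduction of IAM's evaluation logic.

```python
from fnmatch import fnmatchcase


def s3_path_to_arn(s3_path: str) -> str:
    """Turn 's3://bucket/prefix' into 'arn:aws:s3:::bucket/prefix'."""
    if not s3_path.startswith("s3://"):
        raise ValueError(f"not an S3 path: {s3_path}")
    return "arn:aws:s3:::" + s3_path[len("s3://"):]


def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def policy_allows_read(policy: dict, s3_path: str, action: str = "s3:GetObject") -> bool:
    """Return True if some Allow statement grants `action` on objects under s3_path.

    Simplified: ignores Deny, Condition, NotAction and NotResource, and uses
    fnmatch-style wildcards as an approximation of IAM's '*' and '?'.
    """
    # Probe with a made-up object key under the prefix.
    object_arn = s3_path_to_arn(s3_path.rstrip("/") + "/example-object")
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # Action names are case-insensitive in IAM; resource ARNs are not.
        action_ok = any(fnmatchcase(action.lower(), a.lower()) for a in actions)
        resource_ok = any(fnmatchcase(object_arn, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False


# Hypothetical policy scoped to the path chosen when the crawler was created.
generated_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": ["arn:aws:s3:::example-bucket/old-data/*"],
        }
    ],
}

old_ok = policy_allows_read(generated_policy, "s3://example-bucket/old-data/")
new_ok = policy_allows_read(generated_policy, "s3://example-bucket/new-data/")
print(old_ok, new_ok)
```

If the check fails for the crawler's current S3 path, updating the role's policy so that its Resource covers the new path (or pointing the crawler at a role that already has access) is the fix the answer above describes.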

Published: 2023-10-16 10:48:01

Article link: https://www.elefans.com/category/jswz/34/1497353.html