I am using docker-compose to set up a scalable Airflow cluster. I based my approach on this Dockerfile: hub.docker.com/r/puckel/docker-airflow/
My problem is getting the logs set up to write to/read from S3. When a DAG has completed, I get an error like this:
*** Log file isn't local.
*** Fetching here: ea43d4d49f35:8793/log/xxxxxxx/2017-06-26T11:00:00
*** Failed to fetch log file from worker.
*** Reading remote logs...
Could not read logs from s3://buckets/xxxxxxx/airflow/logs/xxxxxxx/2017-06-26T11:00:00

I set up a new section like this in the airflow.cfg file:
[MyS3Conn]
aws_access_key_id = xxxxxxx
aws_secret_access_key = xxxxxxx
aws_default_region = xxxxxxx

And then in airflow.cfg:
remote_base_log_folder = s3://buckets/xxxx/airflow/logs
remote_log_conn_id = MyS3Conn
Did I set this up properly and is there a bug? Is there a recipe for success here that I am missing?
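For comparison, a minimal sketch of the remote-logging settings as they would look for an Airflow 1.8-era setup such as the puckel image (bucket name and paths are placeholders, not confirmed values from this setup):

```ini
; airflow.cfg -- remote logging sketch
[core]
; The first path component after s3:// is the bucket name itself;
; if "buckets" above is not literally the bucket's name, that prefix would be wrong.
remote_base_log_folder = s3://my-bucket/airflow/logs
remote_log_conn_id = MyS3Conn
encrypt_s3_logs = False
```

The `remote_log_conn_id` value must exactly match the Conn Id of an S3 connection that Airflow knows about (defined in the UI or via environment), not just a section name in airflow.cfg.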
Update:
I tried exporting the connection in both URI and JSON formats and neither seemed to work. I then exported aws_access_key_id and aws_secret_access_key directly, and Airflow started picking them up. Now I get this error in the worker logs:
6/30/2017 6:05:59 PM INFO:root:Using connection to: s3
6/30/2017 6:06:00 PM ERROR:root:Could not read logs from s3://buckets/xxxxxx/airflow/logs/xxxxx/2017-06-30T23:45:00
6/30/2017 6:06:00 PM ERROR:root:Could not write logs to s3://buckets/xxxxxx/airflow/logs/xxxxx/2017-06-30T23:45:00
6/30/2017 6:06:00 PM Logging into: /usr/local/airflow/logs/xxxxx/2017-06-30T23:45:00

Update:
I also found this link: www.mail-archive.com/dev@airflow.incubator.apache.org/msg00462.html
I then shelled into one of my worker machines (separate from the webserver and scheduler) and ran this bit of code in Python:
import airflow
s3 = airflow.hooks.S3Hook('s3_conn')
s3.load_string('test', airflow.conf.get('core', 'remote_base_log_folder'))

I receive this error:
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
I tried exporting several different types of AIRFLOW_CONN_ envs, as explained in the connections section of the docs (airflow.incubator.apache.org/concepts.html) and in other answers to this question:
s3://<AWS_ACCESS_KEY_ID>:<AWS_SECRET_ACCESS_KEY>@S3
{"aws_account_id":"<xxxxx>","role_arn":"arn:aws:iam::<xxxx>:role/<xxxxx>"}
{"aws_access_key_id":"<xxxxx>","aws_secret_access_key":"<xxxxx>"}
I have also exported AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY with no success.
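One common pitfall with the URI form (not confirmed as the cause here, just worth ruling out): AWS secret keys often contain characters such as `/` and `+`, which must be percent-encoded before being embedded in an `AIRFLOW_CONN_...` URI, or the credentials parse incorrectly. A minimal sketch using only the standard library, with made-up key values:

```python
from urllib.parse import quote

# Hypothetical credentials -- note the '/' and '+' in the secret key.
access_key = "AKIAXXXXXXXXXXXXXXXX"
secret_key = "abc/def+ghi"

# Percent-encode both parts; safe='' ensures '/' is escaped too
# (quote() leaves '/' alone by default).
conn_uri = "s3://{}:{}@S3".format(quote(access_key, safe=""), quote(secret_key, safe=""))

print(conn_uri)  # s3://AKIAXXXXXXXXXXXXXXXX:abc%2Fdef%2Bghi@S3
```

The resulting string would then be exported as the environment variable whose suffix matches the connection id in upper case, e.g. `export AIRFLOW_CONN_S3_CONN='s3://...'` for a connection named `s3_conn`.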
These credentials are stored in a database, so once I add them in the UI they should be picked up by the workers, but for some reason the workers are still not able to write/read logs.
Answer:
You need to set up the S3 connection through the Airflow UI. To do this, go to the Admin -> Connections tab in the Airflow UI and create a new row for your S3 connection.
An example configuration would be:
Conn Id: my_conn_S3
Conn Type: S3
Extra: {"aws_access_key_id":"your_aws_key_id", "aws_secret_access_key": "your_aws_secret_key"}
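The Extra field must hold valid JSON (double-quoted keys and values, no bare identifiers), or the hook will fail to read the credentials out of it. A quick stdlib sanity check, with placeholder values:

```python
import json

# Placeholder credentials -- substitute your real values before use.
extra = '{"aws_access_key_id": "your_aws_key_id", "aws_secret_access_key": "your_aws_secret_key"}'

creds = json.loads(extra)  # raises ValueError if the JSON is malformed
print(sorted(creds))  # ['aws_access_key_id', 'aws_secret_access_key']
```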