Airflow DockerOperator can't find some images but finds others

Problem description

I get the following error when trying to use the DockerOperator in Airflow. The Airflow setup is not visible to me (it is run by another team on a machine I cannot access, and that team has been unresponsive). I built the Docker image from a Dockerfile I wrote myself. The name cmprod refers to that image.

ImageNotFound: 404 Client Error: Not Found ("pull access denied for cmprod, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")

I am unfamiliar with docker login and I am not sure it applies here, since I am able to run some images but not others. At first I thought I had mistyped the image name, but I checked and double-checked. Below is the output of docker images. I was able to run the condatest image successfully through Airflow.

REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
cm_prod      latest   08f408557eb7   15 hours ago   2.12GB
cmprod       latest   08f408557eb7   15 hours ago   2.12GB
<none>       <none>   4af8c991ea19   15 hours ago   730MB
<none>       <none>   9da4759a3316   15 hours ago   64.2MB
condatest    latest   e24563f9bb48   5 days ago     2.12GB
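As a sanity check that the name really is present locally, the listing can be scanned programmatically. A minimal sketch (the helper function and embedded sample are illustrative only, not part of the DAG):

```python
# Sample text mirroring the `docker images` listing above.
SAMPLE = """\
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
cm_prod      latest   08f408557eb7   15 hours ago   2.12GB
cmprod       latest   08f408557eb7   15 hours ago   2.12GB
<none>       <none>   4af8c991ea19   15 hours ago   730MB
<none>       <none>   9da4759a3316   15 hours ago   64.2MB
condatest    latest   e24563f9bb48   5 days ago     2.12GB
"""

def has_image(listing, repo, tag='latest'):
    """Return True if repo:tag appears in `docker images` output."""
    for line in listing.splitlines()[1:]:  # skip the header row
        cols = line.split()
        if len(cols) >= 2 and cols[0] == repo and cols[1] == tag:
            return True
    return False

print(has_image(SAMPLE, 'cmprod'))     # True: the image Airflow claims not to find
print(has_image(SAMPLE, 'condatest'))  # True: the image that worked
```

Both names are present in the local listing, which is what makes the ImageNotFound error so confusing: the daemon at the `docker_url` the operator actually connects to may not be the one this listing came from.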

I thought I might be using the DockerOperator incorrectly, but I am able to run other images with it. I also wondered whether there was an Airflow configuration issue that disallowed certain operating systems or certain permissions, but I have been unable to find any documentation on whether that is even possible.

My testing has not identified any factor that reliably determines whether Airflow's DockerOperator can find a given Docker image. The problem does not seem amenable to trial and error. Any advice on what may be happening would be appreciated.

I can see the Airflow UI in my browser and trigger DAGs, and there is a shared directory where I can drop my DAG definition script. The Airflow version is 1.10.3.

The version info from docker version follows:

Client: Docker Engine - Community
 Version:           19.03.6
 API version:       1.40
 Go version:        go1.12.16
 Git commit:        369ce74a3c
 Built:             Thu Feb 13 01:29:29 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.6
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.16
  Git commit:       369ce74a3c
  Built:            Thu Feb 13 01:28:07 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

The Airflow DAG code was requested. I am hesitant to post the whole thing because I inherited some of it from a team member who left, and I feel parts of the DAG would be better implemented as a separate script. Below are the most relevant code blocks; let me know if anything seems missing. There is a section between these blocks that I omit for clarity but can include if nothing else works.

CODE BLOCK 1: Import dependencies

from functools import reduce
import os, os.path
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.mssql_operator import MsSqlOperator
from airflow.operators.docker_operator import DockerOperator
from airflow.utils.helpers import chain

CODE BLOCK 2: DAG and OPERATOR Instantiation

# create SQL operators
def create_SQL_operator(taskfile, dag):
    """Creates a MsSQL operator for a given DAG."""
    op = MsSqlOperator(
        task_id=taskfile,
        sql=readSQL(os.path.join(ProjDir, taskfile)),
        mssql_conn_id='clarity',
        autocommit=True,
        database='clarity',
        dag=dag
    )
    return op

# Airflow arguments
default_args = {
    'owner': 'airflow',
    'description': 'Parallel SQL DAG',
    'depend_on_past': False,
    'start_date': datetime(2020, 1, 1),
    'email': ['*PERSONTOEMAIL*'],
    'email_on_failure': False,
    'email_on_retry': True
}

# DAG definition
DAG = DAG(
    ProjName + '_and_infer',
    description='Running parallel SQLs for project: {} and inference on the data'.format(ProjName),
    default_args=default_args,
    schedule_interval=CronTime,  # '0 */2 * * *', i.e. every 2 hours
    concurrency=50,              # allow 50 concurrent parallel tasks
    catchup=False
)

t_predict = DockerOperator(
    task_id='dockerPredict',
    image='cmprod',
    api_version='auto',
    auto_remove=True,
    volumes=['*ABSOLUTEPATHTOMOUNT*:/ds-cm'],
    command='bash inference.sh ',
    docker_url='unix://var/run/docker.sock',
    network_mode='bridge',
    dag=DAG
)

# Create SQL task operators in Airflow global space
ops = [(order, create_SQL_operator(taskfile, DAG)) for order, taskfile in sql_rank]
ops.sort(key=lambda tup: tup[0])

# create clustered ops list
from itertools import groupby
from operator import itemgetter
opsList = [[j for i, j in grouper] for order, grouper in groupby(ops, key=itemgetter(0))]

# flatten lists with only 1 element: Airflow chain() cannot accept a list of lists!
chainList = [reduce(plus, lst) if len(lst) == 1 else lst for lst in opsList]
chainList.append(t_predict)

# create final DAG graph
exec(' >> '.join('chainList[' + str(i) + ']' for i in range(len(chainList))))
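The final exec() line just wires chainList[0] >> chainList[1] >> ... together. The same wiring can be done without exec via reduce, since Airflow's `>>` returns its right-hand operand. A sketch with mock operators (MockOp and the task names are stand-ins, not Airflow classes; the real chainList may also contain lists of operators, which Airflow's `>>` handles by fanning out):

```python
from functools import reduce

class MockOp:
    """Stand-in for an Airflow operator that records its downstream links."""
    def __init__(self, name):
        self.name = name
        self.downstream = []
    def __rshift__(self, other):
        # Mimics Airflow's `a >> b`: b is set downstream of a,
        # and b is returned so the chain can continue.
        self.downstream.append(other)
        return other

chainList = [MockOp('sql_step'), MockOp('more_sql'), MockOp('dockerPredict')]

# Equivalent of the exec(' >> '.join(...)) trick, without exec:
reduce(lambda a, b: a >> b, chainList)

print(chainList[0].downstream[0].name)  # more_sql
print(chainList[1].downstream[0].name)  # dockerPredict
```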

UPDATE: Since I originally posted this question, I substituted the condatest image into the above code and managed to fail in a different way: a shell script was missing from the mounted directory.

When I copied the missing file in and ran again, Airflow could no longer find the condatest image. I checked, saw that the newly copied script lacked execute permission, and added it. Airflow still could not find the previously working Docker image.

I deleted the shell script and Airflow can find the image again. Does this mean the problem has to do with Linux permissions? It is unclear to me how the contents of the mounted drive could affect Airflow's ability to find the image. Furthermore, I know I was able to run that same script in the past in a Docker container started by a DockerOperator in Airflow.

Answer

After upgrading to Airflow 2, the logs provided some additional information. Airflow had been configured to run on multiple servers, and Docker had been set up on each server, but no image registry was in use. When the scheduler placed the DAG's task on a server other than the one where I built my Docker image, the image was unavailable there. The workarounds I found earlier apparently just coincided with a lucky draw of which server my job was scheduled on.

To resolve this, we configured the scheduler to use only one server.
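The more general fix, since no image registry was in use, is to push the image to a registry every Airflow worker can reach, so the task can run on any server. A sketch (registry.example.com is a placeholder; these are standard docker CLI commands, not commands from the original setup):

```shell
# Tag the locally built image with a registry name all workers can resolve,
# then push it so every worker's Docker daemon can pull it:
docker tag cmprod registry.example.com/cmprod:latest
docker push registry.example.com/cmprod:latest

# In the DAG, reference the registry-qualified name, e.g.:
#   image='registry.example.com/cmprod:latest'
```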

Published: 2023-11-24 12:23:42
Original link: https://www.elefans.com/category/jswz/34/1625182.html