I am trying to submit an Oozie job using the Java client API from another job's Java action. The cluster uses Kerberos.
Here is my code:

    // get a OozieClient for local Oozie
    String oozieUrl = "hadooputl02.northamerica.xyz:11000/oozie/";
    AuthOozieClient wc = new AuthOozieClient(oozieUrl);
    wc.setDebugMode(1);

    // create a workflow job configuration and set the workflow application path
    Properties conf = wc.createConfiguration();
    conf.setProperty(OozieClient.APP_PATH, wfAppPath);
    conf.setProperty("jobTracker", "yarnRM");
    conf.setProperty("nameNode", "hdfs://ingestiondev");

    // submit and start the workflow job
    String jobId = wc.run(conf);
    System.out.println("Workflow job submitted");

But I am getting the following error:

    org.apache.oozie.action.hadoop.JavaMainException: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    ...
    Caused by: AUTHENTICATION : Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    ...
    Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    ...
    Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

I believe something more is required in the code to give the node/user access to the Oozie server through Kerberos.
Can someone point me to the correct way to use Oozie Java API on a Kerberized cluster?
thanks!
Solution
The error message is explicit: Failed to find any Kerberos tgt. Your job runs in a YARN container, on a random node, and has no Kerberos ticket available there.
Did you ever wonder how Oozie can start a job with your Kerberos credentials, even though it does not know your password? That's because it uses a backdoor built into Hadoop. But it also means your job itself has no proper Kerberos credentials, hence the message you see as soon as you try to do something that the backdoor does not cover.
How Oozie manages authentication without credentials
- you connect to an Edge Node, create a Kerberos ticket with kinit, run an Oozie command line to submit a Coordinator (which will fire a Workflow at specific dates and times)
- the Oozie CLI authenticates against the Oozie server with the local Kerberos ticket, so the Coordinator (and Workflow) "belong to you"
- when the Coordinator triggers the Workflow, and the Workflow starts an Action, and the Action starts a YARN job... it's the Oozie server that authenticates against YARN ResourceManager (typically as oozie) -- your Kerberos ticket has probably expired long ago
- but since oozie is defined as a privileged proxy account in the YARN config, the RM agrees to start the job under your account, even though you did not properly authenticate via Kerberos
- how is that possible? Because internally YARN and HDFS use a delegation token -- usually, you authenticate once with Kerberos, then you get a token, and you are good for all core services on all nodes; with Oozie in the mix, you don't even have to authenticate...
But there's a catch: the delegation token does not work for any service that uses pure Kerberos authentication -- i.e. Hive Metastore, Hive JDBC, HBase, ZooKeeper, Oozie, etc. That's why Oozie has a workaround: explicit <credential> requests for Hive actions, Hive2 actions, HBase actions, etc. [disclaimer: I don't really know how it actually works]
I doubt that any of these "credentials" would work against Oozie itself...!
How you can manage your own custom authentication
You will find more details in that post of mine: Error when connect to impala with JDBC under kerberos authrication
Disclaimer: I don't know which JAAS "subject" is expected by Oozie (for instance, ZooKeeper expects Client, Hive expects com.sun.security.jgss.krb5.initiate)
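
For illustration only, here is a minimal sketch of what a JAAS-based keytab login can look like in plain Java. It assumes a keytab has been shipped to the container and that the JVM provides com.sun.security.auth.module.Krb5LoginModule (Oracle/OpenJDK). The class name KeytabJaasConfig, the entry name "keytab-login", and the keytab/principal values are hypothetical, and I have not verified that the Oozie client picks up a Subject obtained this way.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

/** Hypothetical helper: an in-memory JAAS configuration for a keytab login. */
class KeytabJaasConfig extends Configuration {
    private final String keytabPath;
    private final String principal;

    KeytabJaasConfig(String keytabPath, String principal) {
        this.keytabPath = keytabPath;
        this.principal = principal;
    }

    /** Returns the same Krb5LoginModule entry whatever section name is requested. */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");     // authenticate from a keytab, not a password
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");   // never fall back to interactive prompts
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    /** Performs the Kerberos login and returns the authenticated Subject. */
    static Subject login(String keytabPath, String principal) throws LoginException {
        LoginContext ctx = new LoginContext(
            "keytab-login", null, null, new KeytabJaasConfig(keytabPath, principal));
        ctx.login();
        return ctx.getSubject();
    }
}
```

A call such as KeytabJaasConfig.login("myname.keytab", "myname@REALM") would return a Subject holding the Kerberos credentials. Whether the Oozie client call then has to run under that Subject is something you would have to test on your cluster.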
Alternative: forget about JAAS and use the cache.
- set env variable KRB5CCNAME to a temp file in the CWD of the container (which will be destroyed automatically when the job stops)
- spawn a Linux command kinit -kt myname.keytab myname@REALM which will obtain a Kerberos ticket in the cache defined by KRB5CCNAME
- and let JAAS follow the default process
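
The steps above could be sketched in Java roughly as follows. This is only a sketch under assumptions: the keytab is already present in the container's working directory, kinit is on the PATH of the node, and the class and method names (KinitLauncher, buildKinit, runKinit) are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that runs kinit against a private ticket cache. */
class KinitLauncher {

    /** Prepares "kinit -kt <keytab> <principal>" with KRB5CCNAME pointing at cacheFile. */
    static ProcessBuilder buildKinit(Path cacheFile, String keytab, String principal) {
        ProcessBuilder pb = new ProcessBuilder("kinit", "-kt", keytab, principal);
        // Only the child process sees this variable; the JVM's own environment is unchanged.
        pb.environment().put("KRB5CCNAME", cacheFile.toAbsolutePath().toString());
        pb.inheritIO(); // let kinit errors show up in the container logs
        return pb;
    }

    /** Runs kinit and fails loudly if no ticket could be obtained. */
    static void runKinit(Path cacheFile, String keytab, String principal)
            throws IOException, InterruptedException {
        int exit = buildKinit(cacheFile, keytab, principal).start().waitFor();
        if (exit != 0) {
            throw new IOException("kinit failed with exit code " + exit);
        }
    }
}
```

You would call something like KinitLauncher.runKinit(Paths.get("krb5cc_job"), "myname.keytab", "myname@REALM") before creating the AuthOozieClient. One caveat: a running JVM cannot change its own environment, so the Java process must already have been started with KRB5CCNAME pointing at the same file (for example through the action's launcher environment settings, which is an assumption on my side), otherwise the default lookup will not find the cache.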