Check the JobManager log: standalonesession-0-master.log
2020-05-16 21:46:53,511 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@master:3821] has failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-16 21:46:53,511 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@master:6123] has failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-16 21:47:36,620 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - The heartbeat of JobManager with id bc6c72e5dded7b29a59ecc5417a12aee timed out.
2020-05-16 21:47:36,620 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Close JobManager connection for job 53a1798d17013773e5622cc42c9bb39b.
2020-05-16 21:47:36,621 INFO org.apache.flink.runtime.taskmanager.Task - Attempting to fail task externally Source: Custom Source -> Map (1/2) (2dad626977317551ef95c85e3f44cc3a).
2020-05-16 21:47:36,621 INFO org.apache.flink.runtime.taskmanager.Task - Source: Custom Source -> Map (1/2) (2dad626977317551ef95c85e3f44cc3a) switched from RUNNING to FAILED.
org.apache.flink.util.FlinkException: JobManager responsible for 53a1798d17013773e5622cc42c9bb39b lost the leadership.
at org.apache.flink.runtime.taskexecutor.TaskExecutor.closeJobManagerConnection(TaskExecutor.java:1272)
at org.apache.flink.runtime.taskexecutor.TaskExecutor.access$1200(TaskExecutor.java:154)
at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobManagerHeartbeatListener.notifyHeartbeatTimeout(TaskExecutor.java:1791)
at org.apache.flink.runtime.heartbeat.HeartbeatMonitorImpl.run(HeartbeatMonitorImpl.java:109)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:517)
at akka.actor.Actor.aroundReceive$(Actor.scala:515)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.concurrent.TimeoutException: The heartbeat of JobManager with id bc6c72e5dded7b29a59ecc5417a12aee timed out.
at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobManagerHeartbeatListener.notifyHeartbeatTimeout(TaskExecutor.java:1792)
... 26 more
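From the log alone I could not find the root cause (I am still too inexperienced). I later asked a colleague, who traced it to a submitted job that contained this statement: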
System.exit(-1);
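When an exception occurred, the job executed this statement, which shut down the JVM that Flink itself was running in. That is why the heartbeat timed out and the task failed with "lost the leadership".
Therefore, avoid calling System.exit() in job code wherever possible.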
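
As an illustration of the alternative, here is a minimal sketch in plain Java with no Flink dependency. The class RecordParser and the exception BadRecordException are hypothetical names invented for this example, not part of Flink or of the original job. The idea is to rethrow the error instead of exiting: an exception thrown from user code normally fails only the affected task, and Flink's configured restart strategy then decides what happens next, whereas System.exit(-1) kills the whole process.

```java
// Hypothetical helper, not part of Flink: it reports bad input by throwing
// an exception instead of calling System.exit(-1).
final class RecordParser {

    /** Unchecked exception carrying the offending input (name is an assumption). */
    static final class BadRecordException extends RuntimeException {
        BadRecordException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Problematic pattern: catch (Exception e) { System.exit(-1); }
    // That terminates the entire JVM, including every other task it hosts.
    // Here the error is wrapped and rethrown so that only the caller fails.
    static int parse(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new BadRecordException("Cannot parse record: '" + raw + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(" 42 "));
        try {
            parse("oops");
        } catch (BadRecordException e) {
            // Inside a job, this exception would propagate out of the user function.
            System.out.println("task would fail with: " + e.getMessage());
        }
    }
}
```

Calling a method like parse from inside a map function would let the exception surface in the task's own log with a full stack trace, which is much easier to diagnose than a heartbeat timeout.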