Submitting a Hadoop Job

Updated: 2024-10-26 22:18:17
Problem description

I need to constantly get the mappers' and reducers' running time. I have submitted the job as follows.

    JobClient jobclient = new JobClient(conf);
    RunningJob runjob = jobclient.submitJob(conf);
    TaskReport[] maps = jobclient.getMapTaskReports(runjob.getID());
    long mapDuration = 0;
    for (TaskReport rpt : maps) {
        mapDuration += rpt.getFinishTime() - rpt.getStartTime();
    }

However, when I run the program, it seems that the job is not submitted and the mapper never starts. How can I use JobClient.runJob(conf) and still be able to get the running time?

Solution

The submitJob() method returns control to the calling program immediately, without waiting for the Hadoop job to start, much less complete. If you want to wait, use the waitForCompletion() method instead, which returns only after the Hadoop job has finished. I think you want something in between, since you want to run follow-on code after the job is submitted but before it completes.

I suggest you put your follow-on code in a loop that continues until the job is complete (use the isComplete() method for that test) and observe the mappers and reducers as the job progresses. You will probably want to put a Thread.sleep(xxx) in the loop somewhere, too.
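
The loop described above can be sketched roughly as follows. To keep the example runnable without a cluster, it polls a small stand-in interface, JobProgress, which is hypothetical; its three methods are named after isComplete(), mapProgress() and reduceProgress() on Hadoop's RunningJob, so the same loop should carry over if you pass in the object returned by submitJob() instead. The FakeJob class, the class name PollingSketch, and the sleep interval are assumptions made for illustration only.

```java
import java.io.IOException;

class PollingSketch {
    // Hypothetical stand-in for org.apache.hadoop.mapred.RunningJob:
    // only the three status methods that the loop needs.
    interface JobProgress {
        boolean isComplete() throws IOException;
        float mapProgress() throws IOException;    // fraction between 0 and 1
        float reduceProgress() throws IOException; // fraction between 0 and 1
    }

    // Fake job for the demo: reports "not complete" for the first
    // pollsNeeded checks and "complete" afterwards.
    static class FakeJob implements JobProgress {
        private final int pollsNeeded;
        private int polls = 0;

        FakeJob(int pollsNeeded) { this.pollsNeeded = pollsNeeded; }

        public boolean isComplete() { polls++; return polls > pollsNeeded; }

        public float mapProgress() {
            return pollsNeeded == 0 ? 1f : Math.min(1f, (float) polls / pollsNeeded);
        }

        public float reduceProgress() { return polls > pollsNeeded ? 1f : 0f; }
    }

    // Polls the job until it reports completion, printing progress and
    // sleeping between checks. Returns how many times the job was found
    // still running.
    static int pollUntilComplete(JobProgress job, long sleepMillis)
            throws IOException, InterruptedException {
        int stillRunning = 0;
        while (!job.isComplete()) {
            stillRunning++;
            System.out.printf("map %.0f%%, reduce %.0f%%%n",
                    job.mapProgress() * 100, job.reduceProgress() * 100);
            Thread.sleep(sleepMillis);
        }
        return stillRunning;
    }

    public static void main(String[] args) throws Exception {
        int checks = pollUntilComplete(new FakeJob(3), 10L);
        System.out.println("job finished after " + checks + " progress checks");
    }
}
```

The body of the while loop is where the follow-on code would go, for example fetching the task reports on each pass rather than once right after submission.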

To respond to your comment, you want to...

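    // Assumes job is the RunningJob returned by submitJob().
    job.waitForCompletion();
    // getTaskCompletionEvents() takes a starting index and may return
    // only a batch of events per call.
    TaskCompletionEvent[] events = job.getTaskCompletionEvents(0);
    for (int i = 0; i < events.length; i++) {
        System.out.println("Task " + i + " took " + events[i].getTaskRunTime() + " ms");
    }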
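
To get separate totals for mappers and reducers, the per-task run times can be summed by task type. The sketch below does this over a hypothetical TaskEvent class, a stand-in that keeps only the isMapTask() and getTaskRunTime() accessors of Hadoop's TaskCompletionEvent. The EventSource interface models getTaskCompletionEvents(startFrom); since that call may return only a batch of events, the loop keeps asking from the next index until an empty batch comes back (the empty-batch stop condition is an assumption here). All class and interface names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class RunTimeTotals {
    // Hypothetical stand-in for TaskCompletionEvent (two accessors only).
    static class TaskEvent {
        private final boolean mapTask;
        private final int runTimeMs;

        TaskEvent(boolean mapTask, int runTimeMs) {
            this.mapTask = mapTask;
            this.runTimeMs = runTimeMs;
        }

        boolean isMapTask() { return mapTask; }
        int getTaskRunTime() { return runTimeMs; }
    }

    // Models job.getTaskCompletionEvents(startFrom): returns the next
    // batch of events, or an empty array when none are left.
    interface EventSource {
        TaskEvent[] fetch(int startFrom);
    }

    // Pages through all events and returns {total map ms, total reduce ms}.
    static long[] totalRunTimes(EventSource source) {
        long mapMs = 0;
        long reduceMs = 0;
        int next = 0;
        while (true) {
            TaskEvent[] batch = source.fetch(next);
            if (batch.length == 0) {
                break; // no more events
            }
            for (TaskEvent e : batch) {
                if (e.isMapTask()) {
                    mapMs += e.getTaskRunTime();
                } else {
                    reduceMs += e.getTaskRunTime();
                }
            }
            next += batch.length; // continue after the last event seen
        }
        return new long[] { mapMs, reduceMs };
    }

    // Demo source that serves a fixed list in batches of at most batchSize.
    static EventSource batched(List<TaskEvent> all, int batchSize) {
        return startFrom -> {
            int end = Math.min(all.size(), startFrom + batchSize);
            if (startFrom >= end) {
                return new TaskEvent[0];
            }
            return all.subList(startFrom, end).toArray(new TaskEvent[0]);
        };
    }

    public static void main(String[] args) {
        List<TaskEvent> events = Arrays.asList(
                new TaskEvent(true, 1200),
                new TaskEvent(true, 800),
                new TaskEvent(false, 3000));
        long[] totals = totalRunTimes(batched(events, 2));
        System.out.println("map: " + totals[0] + " ms, reduce: " + totals[1] + " ms");
    }
}
```

Completion events in Hadoop are reported per task attempt, so a real version would probably also check each event's status and skip failed or killed attempts before adding up the times.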

Published: 2023-11-24 05:09:53