I have a Hive query that runs fine on a small dataset, but when I run it against 250 million records I get the errors below in the logs:
```
FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:640)
    at org.apache.hadoop.mapred.Task$TaskReporter.startCommunicationThread(Task.java:725)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
2013-03-18 14:12:58,907 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Cannot run program "ln": java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at java.lang.Runtime.exec(Runtime.java:593)
    at java.lang.Runtime.exec(Runtime.java:431)
    at java.lang.Runtime.exec(Runtime.java:369)
    at org.apache.hadoop.fs.FileUtil.symLink(FileUtil.java:567)
    at org.apache.hadoop.mapred.TaskRunner.symlink(TaskRunner.java:787)
    at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:752)
    at org.apache.hadoop.mapred.Child.main(Child.java:225)
Caused by: java.io.IOException: java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 7 more
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Child: Error cleaning up
java.lang.NullPointerException
    at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:1048)
    at org.apache.hadoop.mapred.Child.main(Child.java:281)
```

Need help on this.
Solution: Thank you all. You are correct: it was because of the file descriptors, as my program was generating a lot of files in the target table due to the multilevel partition structure.
I increased the ulimit and also the xceivers property. It did help, but in our situation even those higher limits were eventually crossed.
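For anyone hitting the same wall, the kind of changes involved might look like the sketch below. The exact values are illustrative, not the ones we used, and the property name follows the historical (misspelled) form used by older Hadoop versions:

```shell
# Raise per-user limits for the account running the Hadoop daemons.
# Usually made permanent in /etc/security/limits.conf, e.g.:
#   hdfs  -  nofile  64000   # open file descriptors
#   hdfs  -  nproc   32000   # processes/native threads
ulimit -n 64000   # session-level: max open file descriptors
ulimit -u 32000   # session-level: max user processes/threads

# The DataNode transceiver limit goes in hdfs-site.xml
# (old Hadoop releases spell it "xcievers"):
#   <property>
#     <name>dfs.datanode.max.xcievers</name>
#     <value>4096</value>
#   </property>
```

Both the `nproc` limit and `dfs.datanode.max.xcievers` matter here: "unable to create new native thread" and `error=11, Resource temporarily unavailable` typically mean the OS refused a new thread or process, not that the JVM heap was exhausted.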
Then we decided to distribute the data according to the partitions, so that we get only one file per partition.
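A sketch of the kind of insert this approach leads to (the table and column names here are made up for illustration, not from our actual job):

```sql
-- DISTRIBUTE BY sends all rows of a given partition to a single reducer,
-- so each dynamic partition is written as one file instead of many.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE target_table PARTITION (dt, country)
SELECT col1, col2, dt, country
FROM source_table
DISTRIBUTE BY dt, country;
```

With one file per partition, the number of open files, symlinks, and communication threads per task stays bounded, which is what kept us under the limits.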
It worked for us. We scaled our system to 50+ billion records and it kept working.