What will happen if a single file is larger than an executor in a map operation in Spark on YARN?

I'm working on a solution where the driver program reads an XML file, takes an HDFS file path from it, and that file is then read inside a map operation. I have a few questions here. The map operation will be performed in containers, and the containers are allocated when the job starts.

What happens if the single input file is larger than an executor? Since the file is not read in the driver program, can it not be allocated more resources? Or will the application master get more memory from the resource manager?

Any help is highly appreciated.

Best answer

What if the single input file is larger than an executor?

Because the file is in HDFS, Spark creates one partition per HDFS block. Each partition is processed by a task on a worker, so the whole file never has to fit in a single executor at once.

If the file has more blocks than can be computed at one time, Spark makes sure the pending partitions are computed once resources become free (that is, after earlier tasks in the stage finish their transformations).
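
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in plain Python (no Spark required). It assumes the default HDFS block size of 128 MiB and one partition per block; the function names `partition_count` and `task_waves` are made up for illustration, and the real split count can differ (for example with a non-splittable compression format or custom split-size settings).

```python
def partition_count(file_size_bytes: int,
                    block_size_bytes: int = 128 * 1024 * 1024) -> int:
    """Approximate number of input partitions, assuming one per HDFS block.

    128 MiB is the default HDFS block size; a cluster may be configured
    with a different value.
    """
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: a partially filled last block still counts.
    return -(-file_size_bytes // block_size_bytes)


def task_waves(partitions: int, executors: int, cores_per_executor: int) -> int:
    """Approximate number of rounds needed to process all partitions.

    Assumes each executor core runs one task at a time; the remaining
    partitions wait until a slot becomes free.
    """
    slots = executors * cores_per_executor
    if partitions <= 0:
        return 0
    return -(-partitions // slots)


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    parts = partition_count(ten_gib)
    print(parts)                     # 80 partitions of up to 128 MiB each
    print(task_waves(parts, 4, 2))   # 10 rounds with 4 executors x 2 cores
```

So under these assumptions a 10 GiB file becomes about 80 tasks, and an executor only works on a few block-sized partitions at any moment rather than on the whole file.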

Published: 2023-07-16 12:48:00