Why doesn't Spark (on Google Dataproc) use all vcores?

Problem description

I'm running a Spark job on a Google Dataproc cluster, but it looks like Spark is not using all of the vcores available in the cluster, as you can see below.

Based on some other questions like this and this, I had set up the cluster to use the DominantResourceCalculator, so that both vcpus and memory are considered for resource allocation:

gcloud dataproc clusters create cluster_name --bucket="profiling-job-default" \
    --zone=europe-west1-c \
    --master-boot-disk-size=500GB \
    --worker-boot-disk-size=500GB \
    --master-machine-type=n1-standard-16 \
    --num-workers=10 \
    --worker-machine-type=n1-standard-16 \
    --initialization-actions gs://custom_init_gcp.sh \
    --metadata MINICONDA_VARIANT=2 \
    --properties=^--^yarn:yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

But when I submit my job with custom Spark flags, YARN does not appear to respect these custom parameters and defaults to using memory as the yardstick for resource calculation:

gcloud dataproc jobs submit pyspark --cluster cluster_name \
    --properties spark.sql.broadcastTimeout=900\
,spark.network.timeout=800\
,yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator\
,spark.dynamicAllocation.enabled=true\
,spark.executor.instances=10\
,spark.executor.cores=14\
,spark.executor.memory=15g\
,spark.driver.memory=50g \
    src/my_python_file.py
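To see why the choice of resource calculator matters, here is a rough sketch of the scheduling arithmetic for the 15 GB / 14-core executors requested above. The 52 GB of YARN-allocatable memory per n1-standard-16 worker is an assumed illustrative figure, not a verified Dataproc default:

```python
def containers_by_memory(node_mem_gb, container_mem_gb):
    # DefaultResourceCalculator: only memory is counted when placing
    # containers; each container is booked as 1 vcore in the YARN UI
    # regardless of spark.executor.cores.
    return node_mem_gb // container_mem_gb

def containers_by_dominant(node_mem_gb, node_vcores,
                           container_mem_gb, container_vcores):
    # DominantResourceCalculator: the scarcer of the two dimensions
    # (memory or vcores) limits how many containers fit on the node.
    return min(node_mem_gb // container_mem_gb,
               node_vcores // container_vcores)

mem_only = containers_by_memory(52, 15)            # 3 containers by memory alone
dominant = containers_by_dominant(52, 16, 15, 14)  # 1 container once vcores count
print(mem_only, dominant)
```

Under the memory-only calculator, YARN happily packs executors by memory while reporting a single vcore per container, which is consistent with the under-used vcore count observed here.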

Can somebody help me figure out what's going on here?

Answer

What I did wrong was to add the configuration yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator under the yarn: prefix at cluster creation, instead of under the capacity-scheduler: prefix, so that it lands in capacity-scheduler.xml, where it rightly belongs.
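For reference, the capacity-scheduler: prefix in Dataproc's --properties flag writes into capacity-scheduler.xml; setting the same property by hand in that file would look roughly like this:

```xml
<!-- capacity-scheduler.xml (sketch of the property the
     capacity-scheduler: prefix sets at cluster creation) -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```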

Secondly, I changed yarn:yarn.scheduler.minimum-allocation-vcores, which was initially set to 1, to 4.
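The effect of raising yarn.scheduler.minimum-allocation-vcores is that YARN rounds small vcore requests up to that floor. A minimal sketch of the rounding rule (real YARN normalization also rounds to a multiple of the increment allocation, which is omitted here):

```python
def normalize_vcores(requested, minimum=4):
    # Sketch of YARN request normalization: a container's vcore request
    # is rounded up to at least yarn.scheduler.minimum-allocation-vcores.
    return max(requested, minimum)

print(normalize_vcores(1))   # a 1-vcore request becomes a 4-vcore container
print(normalize_vcores(14))  # 14 is already above the floor and is unchanged
```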

I'm not sure whether one of these changes or both of them led to the solution (I will update soon). My new cluster-creation command looks like this:

gcloud dataproc clusters create cluster_name --bucket="profiling-job-default" \
    --zone=europe-west1-c \
    --master-boot-disk-size=500GB \
    --worker-boot-disk-size=500GB \
    --master-machine-type=n1-standard-16 \
    --num-workers=10 \
    --worker-machine-type=n1-standard-16 \
    --initialization-actions gs://custom_init_gcp.sh \
    --metadata MINICONDA_VARIANT=2 \
    --properties=^--^yarn:yarn.scheduler.minimum-allocation-vcores=4--capacity-scheduler:yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator


Published: 2023-11-25 03:06:43 · https://www.elefans.com/category/jswz/34/1628067.html