I am creating clusters on EMR and configuring Zeppelin to read its notebooks from S3. To do that, I am using a JSON object that looks like this:
    [
      {
        "Classification": "zeppelin-env",
        "Properties": {},
        "Configurations": [
          {
            "Classification": "export",
            "Properties": {
              "ZEPPELIN_NOTEBOOK_STORAGE": "org.apache.zeppelin.notebook.repo.S3NotebookRepo",
              "ZEPPELIN_NOTEBOOK_S3_BUCKET": "hs-zeppelin-notebooks",
              "ZEPPELIN_NOTEBOOK_USER": "user"
            },
            "Configurations": []
          }
        ]
      }
    ]

I am pasting this object into the Software configuration page of EMR. My question is: how and where can I configure the Spark interpreter directly, without having to configure it manually from Zeppelin each time I start a cluster?
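Recommended answer

This is a bit involved. You will need to do two things: edit the Spark interpreter settings in Zeppelin's interpreter.json, and then restart the Spark interpreter.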
So what you need to do is write a shell script, and then add an extra step to the EMR cluster configuration that runs this script.
The Zeppelin configuration is stored as JSON, so you can use jq (a command-line JSON processor) to manipulate it. I don't know exactly what you want to change, but here is an example that adds the (mysteriously missing) DepInterpreter:
    #!/bin/bash
    set -e

    # 1. Edit the Spark interpreter. Write to a temporary file first: reading and
    # overwriting interpreter.json in one pipeline can truncate it before jq reads it.
    tmp=$(mktemp)
    jq '.interpreterSettings."2ANGGHHMQ".interpreterGroup += [{"class":"org.apache.zeppelin.spark.DepInterpreter", "name":"dep"}]' /etc/zeppelin/conf/interpreter.json > "$tmp"
    sudo -u zeppelin tee /etc/zeppelin/conf/interpreter.json < "$tmp" > /dev/null
    rm -f "$tmp"

    # 2. Trigger a restart of the Spark interpreter.
    curl -X PUT localhost:8890/api/interpreter/setting/restart/2ANGGHHMQ

Put this shell script in an S3 bucket. Then start your EMR cluster with
    --steps Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://eu-west-1.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[s3://mybucket/script.sh]
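
The setting ID 2ANGGHHMQ used in the script may not be the same on every cluster, so it can help to look it up rather than hard-code it. The following is only a rough sketch: it assumes that each entry under interpreterSettings in interpreter.json carries a "name" field and that the Spark entry is named "spark", which you should verify against your own file. The helper name spark_setting_id is made up for this example.

```shell
#!/bin/bash
# Sketch: look up the Spark interpreter setting ID instead of hard-coding it.
# Assumption: each entry under .interpreterSettings has a "name" field and the
# Spark entry is named "spark". Check your own interpreter.json first.

# spark_setting_id FILE: print the key of the setting whose name is "spark".
# (Hypothetical helper name, not part of Zeppelin or EMR.)
spark_setting_id() {
  jq -r '.interpreterSettings | to_entries[] | select(.value.name == "spark") | .key' "$1"
}

# Usage on the cluster (commented out so that sourcing this file has no side effects):
# id=$(spark_setting_id /etc/zeppelin/conf/interpreter.json)
# curl -X PUT "localhost:8890/api/interpreter/setting/restart/$id"
```

The resulting ID could then replace the literal 2ANGGHHMQ in both the jq filter and the restart URL of the script above.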