ZEPPELIN-2898. Support Yarn-Cluster for Spark Interpreter #2577
Conversation
@Leemoonsoo @jongyoul Could you help review?
```java
// Only one of py4j-0.9-src.zip and py4j-0.8.2.1-src.zip should exist
// TODO(zjffdu), this is not maintainable when a new version is added.
String[] pythonLibs = new String[]{"pyspark.zip", "py4j-0.9-src.zip", "py4j-0.8.2.1-src.zip",
    "py4j-0.10.1-src.zip", "py4j-0.10.3-src.zip", "py4j-0.10.4-src.zip"};
```
we really need to fix this....
Yes, I wouldn't have thought it would be used for such a long time...
Sure, this is on my plan for the next step.
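One version-agnostic alternative (a hypothetical sketch, not part of this PR) would be to glob for the single `py4j-*-src.zip` instead of enumerating every known version:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: pick the single py4j-*-src.zip from a directory
// listing instead of hardcoding each released py4j version.
public class Py4jFinder {
  public static String findPy4j(String[] fileNames, String dir) {
    List<String> matches = new ArrayList<>();
    for (String name : fileNames) {
      if (name.startsWith("py4j-") && name.endsWith("-src.zip")) {
        matches.add(name);
      }
    }
    if (matches.isEmpty()) {
      throw new RuntimeException("No py4j files found under " + dir);
    }
    if (matches.size() > 1) {
      throw new RuntimeException("Multiple py4j files found under " + dir);
    }
    return matches.get(0);
  }
}
```

With this, a new Spark release would not require touching a hardcoded list, at the cost of failing loudly on environments that ship more than one py4j zip.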
Force-pushed from b116cd7 to 3f34f8c
@Leemoonsoo @jongyoul As I mentioned in the PR description, this is not a perfect PR. If you don't have any more comments, I will merge it and continue with the next PR to improve it. Thanks
Force-pushed from 3f34f8c to 91a9b47
Can you document `MiniHadoopCluster` and `MiniZeppelin` in more detail?
```
@@ -0,0 +1,23 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
```
why do we need to release with a new .properties file?
Because the default log4j.properties uses `DailyRollingFileAppender`, which is not suitable for a YARN container.
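For illustration, a minimal console-only log4j configuration (a hypothetical sketch; the actual file added in this PR may differ) avoids the file appender entirely:

```properties
# Log to stdout only; YARN captures stdout/stderr into container logs.
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

A file appender writing to a fixed local path is a poor fit inside a YARN container, whose working directory is transient and cleaned up when the container exits.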
```java
  additionalPythonPath = additionalPythonPath + ":" + py4jLibPath;
} else {
  additionalPythonPath = py4jLibPath;
}
if (addBuiltinPy4j) {
```
nit: `addBuiltinPy4j` is a bit confusing? `useBuiltinPy4j`? `addBuiltinPy4jPath`?
```diff
  Map<String, String> envs = EnvironmentUtils.getProcEnvironment();
- if (envs.containsKey("PYTHONPATH")) {
+ if (envs.containsKey("PYTHONPATH") && additionalPythonPath != null) {
```
`||` instead? If PYTHONPATH is not set but additionalPythonPath is, don't we want to set it?
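The intent behind the `||` suggestion can be captured in a small helper (a hypothetical sketch, not the code in this PR): append `additionalPythonPath` to any existing `PYTHONPATH`, and set it outright when `PYTHONPATH` is absent:

```java
public class PythonPathUtil {
  // Hypothetical sketch: merge an additional path into an existing
  // PYTHONPATH value, covering all four null/non-null combinations.
  public static String mergePythonPath(String existing, String additional) {
    if (additional == null || additional.isEmpty()) {
      return existing;
    }
    if (existing == null || existing.isEmpty()) {
      return additional;
    }
    return existing + ":" + additional;
  }
}
```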
```java
if (py4j.length == 0) {
  throw new RuntimeException("No py4j files found under " + sparkHome + "/python/lib");
} else if (py4j.length > 1) {
  throw new RuntimeException("Multiple py4j files found under " + sparkHome + "/python/lib");
```
One wonders if this might be a bit too restrictive... I for one have multiple py4j versions, and Spark always picks the right one.
Because Spark hardcodes its py4j version, while Zeppelin needs to work with multiple versions of Spark.
Sure, but it might break people with existing setups...
Why? If there are multiple versions of py4j under `SPARK_HOME/python/lib` or `ZEPPELIN_HOME/interpreter/spark/pyspark`, something must be wrong with the user's environment.
Maybe, maybe it is an edge case. Do we need to force the logic here? Is it possible to call, say, `sbin/spark-config.sh`?
I think `sbin/spark-config.sh` only affects the pyspark shell. Spark also hardcodes the py4j version in Scala code when it needs to detect py4j and distribute it to YARN containers. See https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala#L35
```java
  return env;
}

private void setupPropertiesForPySpark(Properties sparkProperties, String sparkMaster) {
  if (sparkMaster.startsWith("yarn")) {
```
Use `isYarnMode()`?
```java
}
if (key.equals("master")) {
  sparkMaster = property.getProperty("master");
```
Should this do the same as the other code (or perhaps be re-used as a method) and fall back to `spark.master`?
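The suggested shared fallback could look like this (a hypothetical sketch; the method name is illustrative, not code from this PR):

```java
import java.util.Properties;

public class MasterUtil {
  // Hypothetical shared helper: prefer the "master" property and fall
  // back to "spark.master" when it is not set.
  public static String getSparkMaster(Properties props) {
    String master = props.getProperty("master");
    return master != null ? master : props.getProperty("spark.master");
  }
}
```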
```java
setupPropertiesForPySpark(sparkProperties, sparkMaster);
setupPropertiesForSparkR(sparkProperties, sparkMaster, property.getProperty("SPARK_HOME"));
if (sparkMaster != null && sparkMaster.equals("yarn-cluster")) {
```
Shouldn't this match `isYarnMode()`?
```java
if (sparkMaster != null) {
  sparkConfBuilder.append(" --master " + sparkMaster);
}
if (sparkMaster.equals("yarn-cluster")) {
```
ditto
No, this only works for yarn-cluster mode, not yarn-client mode. yarn-client mode still uses the default log4j.properties, because the driver runs on the Zeppelin host.
Well, my point is we are not handling yarn-cluster mode consistently. yarn-cluster mode can be set by `master=yarn, deployMode=cluster` or by `master=yarn-cluster` (deprecated), and we have both in this PR. Either we should support both, or only the supported, non-deprecated way, I think.
Right, I was planning to fix it in the next PR. This is messy. Let me fix it in this PR.
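The two spellings of yarn-cluster mode discussed above could be normalized in one place. A hypothetical sketch (the class and method names are illustrative, not the code this PR ends up with):

```java
// Hypothetical helpers covering both ways yarn-cluster mode can be
// requested: master=yarn with deployMode=cluster, or the deprecated
// master=yarn-cluster.
public class YarnModeUtil {
  public static boolean isYarnMode(String master) {
    return master != null && master.startsWith("yarn");
  }

  public static boolean isYarnClusterMode(String master, String deployMode) {
    return "yarn-cluster".equals(master)
        || ("yarn".equals(master) && "cluster".equals(deployMode));
  }
}
```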
```java
if (sparkRPath.exists() && sparkRPath.isFile()) {
  mergeSparkProperty(sparkProperties, "spark.yarn.dist.archives", sparkRPath.getAbsolutePath());
} else {
  LOGGER.warn("sparkr.zip is not found, sparkr may not work.");
```
`sparkr may not work` => `SparkR may not work`
Force-pushed from 42beef2 to ed04545
Force-pushed from ed04545 to 9da7c4b
@Leemoonsoo @jongyoul @felixcheung Any more comments?
I don't.
Thanks, I will merge it and continue with the next follow-up PR.
hey @zjffdu how are we tracking the follow-up tasks, if there are any?
@felixcheung I will do the follow-up in https://issues.apache.org/jira/browse/ZEPPELIN-2685
### What is this PR for?
Follow up of #2577. Main changes on Interpreter:
* Add `throws InterpreterException`, a checked exception, to the abstract methods of `Interpreter`; this enforces that interpreter implementations throw `InterpreterException`.
* Field name refactoring:
  * `property` -> `properties`
  * `getProperty()` -> `getProperties()`
* Introduce a launcher layer for interpreter launching. Currently we only use a shell script to launch interpreters, but it could be any other service or component, such as a livy server or other 3rd-party tools, or we may even create a separate module for the interpreter launcher.
  * abstract class `InterpreterLauncher`
  * For now, only 2 implementations: `ShellScriptLauncher` & `SparkInterpreterLauncher`. We could add a method in class `Interpreter` to allow an interpreter to specify its own launcher class, but that could be future work.

### What type of PR is it?
[Improvement | Refactoring]

### Todos
* [ ] - Task

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-2685

### How should this be tested?
Unit tests are covered: `ShellScriptLauncherTest` & `SparkInterpreterLauncherTest`

### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: Jeff Zhang <[email protected]>

Closes #2592 from zjffdu/ZEPPELIN-2685 and squashes the following commits:

17dc2f1 [Jeff Zhang] address comments
e545cc3 [Jeff Zhang] ZEPPELIN-2685. Improvement on Interpreter class
What is this PR for?
This is the first version supporting yarn-cluster mode in `SparkInterpreter`. I just delegate all the work to `spark-submit`; since yarn-cluster is natively supported by Spark, we don't need to reinvent the wheel. But there is still improvement to be done in the future, e.g. I put some Spark-specific logic in `InterpreterSetting`, which is not a good practice. I plan to improve it when I refactor the `Interpreter` class (ZEPPELIN-2685).

Besides that, I also add `MiniHadoopCluster` & `MiniZeppelin`, which help with the integration tests of yarn-client & yarn-cluster mode; otherwise I would have to verify yarn-client & yarn-cluster mode manually, which could easily cause regressions in the future.

To be noticed:

What type of PR is it?
[Feature]

Todos

What is the Jira issue?
https://github.com/zjffdu/zeppelin/tree/ZEPPELIN-2898

How should this be tested?
A system test is added in `SparkInterpreterIT`.

Questions: