mirror of
https://github.com/apache/zeppelin
synced 2026-05-24 09:38:26 +00:00
Zeppelin currently embeds all Spark dependencies under interpreter/spark and loads them at runtime. This is useful because a user can try Zeppelin + Spark in local mode without installing or configuring Spark. However, when a user already has Spark and Hadoop installed, it is much more convenient to point Zeppelin at them than to build Zeppelin for a specific combination of Spark and Hadoop versions.

This PR implements the ability to use an external Spark and Hadoop installation by doing the following:

* The spark-dependencies module packages the Spark/Hadoop dependencies under interpreter/spark/dep, to support local mode (the current behavior).
* When SPARK_HOME and HADOOP_HOME are defined, bin/interpreter.sh excludes interpreter/spark/dep from the classpath and includes the system-installed Spark and Hadoop in the classpath.

This patch makes the Zeppelin binary independent of the Spark version. Once Zeppelin has been built, SPARK_HOME can point to any version of Spark.

Author: Lee moon soo <moon@apache.org>

Closes #244 from Leemoonsoo/spark_provided and squashes the following commits:

654c378 [Lee moon soo] use consistant, simpler expressions
57b3f96 [Lee moon soo] Add comment
eb4ec09 [Lee moon soo] fix reading spark-*.conf file
bacfd93 [Lee moon soo] Update readme
3a88c77 [Lee moon soo] Test use explicitly %spark
5a17d9c [Lee moon soo] Call sqlContext.sql using reflection
615c395 [Lee moon soo] get correct method
0c28561 [Lee moon soo] call listenerBus() using reflection
62b8c45 [Lee moon soo] Print all logs
5edb6fd [Lee moon soo] Use reflection to call addListener
af7a925 [Lee moon soo] add pyspark flag
5f8a734 [Lee moon soo] test -> package
a0150cf [Lee moon soo] not use travis-install for mvn test
cd4519c [Lee moon soo] try sys.stdout.write instead of print
6304180 [Lee moon soo] enable 1.2.x test
797c0e2 [Lee moon soo] enable 1.3.x test
8de7add [Lee moon soo] trying to find why travis is not closing the test
cf0a61e [Lee moon soo] rm -rf only interpreter directory instead of mvn clean
2606c04 [Lee moon soo] bringing travis-install.sh back
df8f0ba [Lee moon soo] test more efficiently
9d6b40f [Lee moon soo] Update .travis
2ca3d95 [Lee moon soo] set SPARK_HOME
2a61ecd [Lee moon soo] Clear interpreter directory on mvn clean
f1e8789 [Lee moon soo] update travis config
9e812e7 [Lee moon soo] Use reflection not to use import org.apache.spark.scheduler.Stage
c3d96c1 [Lee moon soo] Handle ZEPPELIN_CLASSPATH proper way
0f9598b [Lee moon soo] py4j version as a property
1b7f951 [Lee moon soo] Add dependency for compile and test
b1d62a5 [Lee moon soo] Add scala-library in test scope
c49be62 [Lee moon soo] Add hadoop jar and spark jar from HADOOP_HOME, SPARK_HOME when they are defined
2052aa3 [Lee moon soo] Load interpreter/spark/dep only when SPARK_HOME is undefined
54fdf0d [Lee moon soo] Separate spark-dependency into submodule
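
The classpath rule stated in the PR description can be sketched in a few lines of Python. This is only an illustration of the decision, not the actual logic of bin/interpreter.sh (which is a shell script and may treat the two variables separately): the function name spark_classpath_roots and the zeppelin_home parameter are hypothetical, and the sketch returns only the root directories rather than guessing which jar files live under them.

```python
import os


def spark_classpath_roots(env, zeppelin_home):
    """Return the directories whose jars would be put on the classpath.

    Hypothetical helper: it mirrors the rule in the PR description,
    not the real shell code in bin/interpreter.sh.
    """
    spark_home = env.get("SPARK_HOME")
    hadoop_home = env.get("HADOOP_HOME")
    if spark_home and hadoop_home:
        # External installations are present: leave out the bundled
        # interpreter/spark/dep and use the system Spark and Hadoop.
        return [spark_home, hadoop_home]
    # Local mode (the previous behavior): use the bundled dependencies.
    return [os.path.join(zeppelin_home, "interpreter", "spark", "dep")]
```

Passing a plain dictionary instead of reading os.environ directly keeps the rule easy to check for both cases without touching the real environment.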
53 lines
1.8 KiB
Python
Executable file
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys
import subprocess
from datetime import datetime, timedelta


def main(file, cmd):
    """Run cmd, copy its output to file, and print a short progress line."""
    sys.stdout.write("%s writing to %s\n" % (cmd, file))
    # Binary mode: the pipe yields bytes, so this works on Python 2 and 3.
    out = open(file, "wb")
    count = 0
    process = subprocess.Popen(cmd,
                               stderr=subprocess.STDOUT,
                               stdout=subprocess.PIPE)

    start = datetime.now()
    nextPrint = datetime.now() + timedelta(seconds=1)
    # Copy the process output line by line until the pipe is closed.
    pout = process.stdout
    line = pout.readline()
    while line:
        count = count + 1
        if datetime.now() > nextPrint:
            # Report progress so the CI job does not look stalled.
            diff = datetime.now() - start
            sys.stdout.write("\r%d seconds %d log lines" % (diff.seconds, count))
            sys.stdout.flush()
            nextPrint = datetime.now() + timedelta(seconds=10)
        out.write(line)
        line = pout.readline()
    out.close()
    errcode = process.wait()
    diff = datetime.now() - start
    sys.stdout.write("\r%d seconds %d log lines" % (diff.seconds, count))
    sys.stdout.write("\n" + str(cmd) + " done " + str(errcode) + "\n")
    return errcode


if __name__ == "__main__":
    # Need the log file name and at least one word of the command to run.
    if len(sys.argv) < 3:
        sys.stdout.write("Usage: %s <log file> <command> [args...]\n" % sys.argv[0])
        sys.exit(1)

    sys.exit(main(sys.argv[1], sys.argv[2:]))