doc for spark standalone

2026-05-24 09:38:26 +00:00 · 2016-07-26 00:43:59 +09:00 · 2016-07-26 00:43:59 +09:00 · 83fdef6e2b
commit 83fdef6e2b
parent 483a89705b
7 changed files with 138 additions and 2 deletions
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@@ -32,7 +32,6 @@
                <li><a href="{{BASE_PATH}}/manual/notebookashomepage.html">Customize Zeppelin Homepage</a></li>
                <li role="separator" class="divider"></li>
                <li class="title"><span><b>More</b><span></li>
-                <li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
                <li><a href="{{BASE_PATH}}/install/upgrade.html">Upgrade Zeppelin Version</a></li>
              </ul>
            </li>
@@ -102,6 +101,10 @@
                <li><a href="{{BASE_PATH}}/security/notebook_authorization.html">Notebook Authorization</a></li>
                <li><a href="{{BASE_PATH}}/security/datasource_authorization.html">Data Source Authorization</a></li>
                <li role="separator" class="divider"></li>
+                <li class="title"><span><b>Advanced</b></span></li>
+                <li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
+                <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
+                <li role="separator" class="divider"></li>
                <li class="title"><span><b>Contibute</b><span></li>
                <li><a href="{{BASE_PATH}}/development/writingzeppelininterpreter.html">Writing Zeppelin Interpreter</a></li>
                <li><a href="{{BASE_PATH}}/development/writingzeppelinapplication.html">Writing Zeppelin Application (Experimental)</a></li>                
--- a/docs/assets/themes/zeppelin/img/docs-img/spark_ui.png
+++ b/docs/assets/themes/zeppelin/img/docs-img/spark_ui.png
--- a/docs/assets/themes/zeppelin/img/docs-img/standalone_conf.png
+++ b/docs/assets/themes/zeppelin/img/docs-img/standalone_conf.png
--- a/docs/index.md
+++ b/docs/index.md
@@ -133,7 +133,6 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
  * [Publish your Paragraph](./manual/publish.html) results into your external website
  * [Customize Zeppelin Homepage](./manual/notebookashomepage.html) with one of your notebooks
 * More
-  * [Apache Zeppelin on Vagrant VM](./install/virtual_machine.html): a guide for installing Apache Zeppelin on Vagrant virtual machine
  * [Upgrade Apache Zeppelin Version](./install/upgrade.html): a manual procedure of upgrading Apache Zeppelin version

 ####Interpreter
@@ -168,6 +167,9 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
  * [Shiro Authentication](./security/shiroauthentication.html)
  * [Notebook Authorization](./security/notebook_authorization.html)
  * [Data Source Authorization](./security/datasource_authorization.html)
+* Advanced
+  * [Apache Zeppelin on Vagrant VM](./install/virtual_machine.html)
+  * [Zeppelin on Spark Cluster Mode (Standalone)](./install/spark_cluster_mode.html#spark-standalone-mode)
 * Contribute
  * [Writing Zeppelin Interpreter](./development/writingzeppelininterpreter.html)
  * [Writing Zeppelin Application (Experimental)](./development/writingzeppelinapplication.html)
--- a/docs/install/spark_cluster_mode.md
+++ b/docs/install/spark_cluster_mode.md
@@ -0,0 +1,74 @@
+---
+layout: page
+title: "Apache Zeppelin on Spark cluster mode"
+description: ""
+group: install
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Apache Zeppelin on Spark Cluster Mode
+
+<div id="toc"></div>
+
+## Overview 
+[Apache Spark](http://spark.apache.org/) supports three cluster manager types: [Standalone](http://spark.apache.org/docs/latest/spark-standalone.html), [Apache Mesos](http://spark.apache.org/docs/latest/running-on-mesos.html) and [Hadoop YARN](http://spark.apache.org/docs/latest/running-on-yarn.html).
+This document describes how to build and configure an environment for each of these Spark cluster managers with Apache Zeppelin using [Docker](https://www.docker.com/) scripts.
+[Install Docker](https://docs.docker.com/engine/installation/) on your machine before you begin.
+
+## Spark standalone mode
+[Spark standalone](http://spark.apache.org/docs/latest/spark-standalone.html) is a simple cluster manager included with Spark that makes it easy to set up a cluster.
+You can set up a Spark standalone environment with the steps below.
+
+> **Note:** Apache Zeppelin and Spark both use port `8080` for their web UI, so you might need to change `zeppelin.server.port` in `conf/zeppelin-site.xml`.
+
+### 1. Build the Docker image
+You can find the Docker script files under `scripts/docker/spark-cluster-managers`.
+
+```
+cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_standalone
+docker build -t "spark_standalone" .
+```
+
+### 2. Run the Docker container
+
+```
+docker run -it \
+-p 8080:8080 \
+-p 7077:7077 \
+-p 8888:8888 \
+-p 8081:8081 \
+-h sparkmaster \
+--name spark_standalone \
+spark_standalone bash; 
+```
+
+### 3. Configure the Spark interpreter in Zeppelin
+Set the Spark master to `spark://localhost:7077` on the Zeppelin **Interpreters** setting page.
+
+<img src="../assets/themes/zeppelin/img/docs-img/standalone_conf.png" />
+
+### 4. Run Zeppelin with the Spark interpreter
+After running a single paragraph with the Spark interpreter in Zeppelin, browse to `http://localhost:8080` and check that the Spark cluster is running.
+
+<img src="../assets/themes/zeppelin/img/docs-img/spark_ui.png" />
+
+You can also verify that Spark is running inside the Docker container with the command below.
+
+```
+ps -ef | grep spark
+```
+
+
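Note on the port caveat in the page above: step 2 publishes ports 8080, 7077, 8888 and 8081 on the host, so a Zeppelin server on the same machine needs a `zeppelin.server.port` outside that set. The sketch below shows one way to pick such a value. The helper name `pick_free_port` is hypothetical and not part of Zeppelin or this commit, and it only compares against the list it is given rather than probing real sockets.

```shell
#!/bin/sh
# Hypothetical helper, not part of Zeppelin or this commit.
# Prints the first port >= $1 that is absent from the space-separated list $2.
pick_free_port() {
  candidate="$1"
  used=" $2 "   # pad with spaces so every port can be matched as a whole word
  while :; do
    case "$used" in
      *" $candidate "*) candidate=$((candidate + 1)) ;;  # taken, try the next one
      *) break ;;                                        # not in the list
    esac
  done
  echo "$candidate"
}

# Ports published by the `docker run` command in step 2.
pick_free_port 8080 "8080 7077 8888 8081"
```

The printed value is a candidate for `zeppelin.server.port` in `conf/zeppelin-site.xml`. Whether the port is actually free on the host still has to be checked separately.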
--- a/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
+++ b/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
@@ -0,0 +1,40 @@
+FROM centos:centos6
+MAINTAINER hsshim@nflabs.com
+
+ENV SPARK_PROFILE 1.6
+ENV SPARK_VERSION 1.6.2
+ENV HADOOP_PROFILE 2.3
+ENV SPARK_HOME /usr/local/spark
+
+# Update the image with the latest packages
+RUN yum update -y; yum clean all
+
+# Get utils
+RUN yum install -y \
+wget \
+tar \
+curl \
+&& \
+yum clean all
+
+# Remove old jdk
+RUN yum remove -y java; yum remove -y jdk
+
+# install jdk7 
+RUN yum install -y java-1.7.0-openjdk-devel
+ENV JAVA_HOME /usr/lib/jvm/java
+ENV PATH $PATH:$JAVA_HOME/bin
+
+# install spark
+RUN curl -s http://apache.mirror.cdnetworks.com/spark/spark-$SPARK_VERSION/spark-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE.tgz | tar -xz -C /usr/local/
+RUN cd /usr/local && ln -s spark-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE spark
+
+# update boot script
+COPY entrypoint.sh /etc/entrypoint.sh
+RUN chown root:root /etc/entrypoint.sh
+RUN chmod 700 /etc/entrypoint.sh
+
+#spark
+EXPOSE 8080 7077 8888 8081
+
+ENTRYPOINT ["/etc/entrypoint.sh"]
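Note on the Dockerfile above: the `RUN curl` line builds its download URL from `SPARK_VERSION` and `HADOOP_PROFILE`, while `SPARK_PROFILE` is set but not referenced in this file. The sketch below expands the same template outside of Docker, which can help when checking the tarball name before a build. The helper name `spark_tarball_url` is hypothetical, and the values are copied from the `ENV` lines.

```shell
#!/bin/sh
# Values copied from the ENV lines of the Dockerfile.
SPARK_VERSION=1.6.2
HADOOP_PROFILE=2.3

# Hypothetical helper: prints the URL that the `RUN curl` line expands to.
spark_tarball_url() {
  printf 'http://apache.mirror.cdnetworks.com/spark/spark-%s/spark-%s-bin-hadoop%s.tgz\n' \
    "$SPARK_VERSION" "$SPARK_VERSION" "$HADOOP_PROFILE"
}

spark_tarball_url
```

Changing either variable changes both the directory and the file name in the URL, so a version bump in the Dockerfile only needs the two `ENV` lines edited.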
--- a/scripts/docker/spark-cluster-managers/spark_standalone/entrypoint.sh
+++ b/scripts/docker/spark-cluster-managers/spark_standalone/entrypoint.sh
@@ -0,0 +1,17 @@
+#!/bin/bash
+
+export SPARK_MASTER_PORT=7077
+
+# run spark 
+cd /usr/local/spark/sbin
+./start-master.sh
+./start-slave.sh spark://$(hostname):$SPARK_MASTER_PORT
+
+CMD=${1:-"exit 0"}
+if [[ "$CMD" == "-d" ]];
+then
+	service sshd stop
+	/usr/sbin/sshd -D -d
+else
+	/bin/bash -c "$*"
+fi
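Note on `entrypoint.sh` above: after starting the master and worker, the script branches on its first argument. With `-d` it runs `sshd` in the foreground; otherwise it joins all arguments into one string and hands it to `/bin/bash -c`. The sketch below is a dry-run mirror of that branching that prints the action instead of performing it. The function name `describe_entrypoint_action` is hypothetical, and it uses POSIX `[` rather than the script's `[[`.

```shell
#!/bin/sh
# Dry-run mirror of entrypoint.sh's argument handling. Hypothetical helper:
# it prints what would run instead of starting sshd or a shell.
describe_entrypoint_action() {
  first="${1:-exit 0}"   # same default as CMD=${1:-"exit 0"} in the script
  if [ "$first" = "-d" ]; then
    echo "sshd in foreground"
  else
    # "$*" joins every argument into one string, as in /bin/bash -c "$*"
    echo "bash -c '$*'"
  fi
}

describe_entrypoint_action -d
describe_entrypoint_action bash
describe_entrypoint_action ls -l /tmp
describe_entrypoint_action
```

The last call shows that with no arguments the string passed to `bash -c` is empty, so the `"exit 0"` default only affects the comparison, not the command that runs. The `docker run ... spark_standalone bash` command in the documentation corresponds to the second call.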