[ZEPPELIN-5483] Update spark doc

2026-05-24 09:38:26 +00:00 · 2021-08-09 11:53:55 +08:00 · 2021-08-09 11:53:55 +08:00 · 42b6213d47
commit 42b6213d47
parent d4d82e3a84
13 changed files with 1738 additions and 116 deletions
--- a/docs/README.md
+++ b/docs/README.md
@ -42,7 +42,7 @@ bundle exec jekyll serve --watch

 **Run locally using docker**
 ```
-docker run --rm -it \                                                  
+docker run --rm -it \
       -v $PWD:/docs \
       -w /docs \
       -p '4000:4000' \
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@ -34,8 +34,10 @@
                <li><a href="{{BASE_PATH}}/quickstart/yarn.html">Yarn</a></li>
                <li role="separator" class="divider"></li>
                <li><a href="{{BASE_PATH}}/quickstart/spark_with_zeppelin.html">Spark with Zeppelin</a></li>
+                <li><a href="{{BASE_PATH}}/quickstart/flink_with_zeppelin.html">Flink with Zeppelin</a></li>
                <li><a href="{{BASE_PATH}}/quickstart/sql_with_zeppelin.html">SQL with Zeppelin</a></li>
                <li><a href="{{BASE_PATH}}/quickstart/python_with_zeppelin.html">Python with Zeppelin</a></li>
+                <li><a href="{{BASE_PATH}}/quickstart/r_with_zeppelin.html">R with Zeppelin</a></li>
              </ul>
            </li>

@ -131,6 +133,7 @@
                <li><a href="{{BASE_PATH}}/usage/interpreter/overview.html">Overview</a></li>
                <li role="separator" class="divider"></li>
                <li><a href="{{BASE_PATH}}/interpreter/spark.html">Spark</a></li>
+                <li><a href="{{BASE_PATH}}/interpreter/flink.html">Flink</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/jdbc.html">JDBC</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/python.html">Python</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/r.html">R</a></li>
@ -140,7 +143,6 @@
                <li><a href="{{BASE_PATH}}/interpreter/bigquery.html">BigQuery</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/cassandra.html">Cassandra</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/elasticsearch.html">Elasticsearch</a></li>
-                <li><a href="{{BASE_PATH}}/interpreter/flink.html">Flink</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/geode.html">Geode</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/groovy.html">Groovy</a></li>
                <li><a href="{{BASE_PATH}}/interpreter/hazelcastjet.html">Hazelcast Jet</a></li>
--- a/docs/interpreter/ksql.md
+++ b/docs/interpreter/ksql.md
@ -57,7 +57,7 @@ Following some examples:
 PRINT 'orders';
 ```

-![PRINT image]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ksql.1.gif)
+![PRINT image]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ksql.1.png)

 ```
 %ksql
@ -66,7 +66,7 @@ CREATE STREAM ORDERS WITH
   KAFKA_TOPIC ='orders');
 ```

-![CREATE image]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ksql.1.gif)
+![CREATE image]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ksql.2.png)

 ```
 %ksql
@ -75,4 +75,4 @@ FROM ORDERS
 LIMIT 10
 ```

-![LIMIT image]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ksql.3.gif)
+![LIMIT image]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ksql.3.png)
--- a/docs/interpreter/spark.md
+++ b/docs/interpreter/spark.md
@ -26,7 +26,7 @@ limitations under the License.
 ## Overview
 [Apache Spark](http://spark.apache.org) is a fast and general-purpose cluster computing system.
 It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.
-Apache Spark is supported in Zeppelin with Spark interpreter group which consists of below six interpreters.
+Apache Spark is supported in Zeppelin with Spark interpreter group which consists of following interpreters.

 <table class="table-configuration">
  <tr>
@ -52,7 +52,17 @@ Apache Spark is supported in Zeppelin with Spark interpreter group which consist
  <tr>
    <td>%spark.r</td>
    <td>SparkRInterpreter</td>
-    <td>Provides an R environment with SparkR support</td>
+    <td>Provides an vanilla R environment with SparkR support</td>
+  </tr>
+  <tr>
+    <td>%spark.ir</td>
+    <td>SparkIRInterpreter</td>
+    <td>Provides an R environment with SparkR support based on Jupyter IRKernel</td>
+  </tr>
+  <tr>
+    <td>%spark.shiny</td>
+    <td>SparkShinyInterpreter</td>
+    <td>Used to create R shiny app with SparkR support</td>
  </tr>
  <tr>
    <td>%spark.sql</td>
@ -66,6 +76,69 @@ Apache Spark is supported in Zeppelin with Spark interpreter group which consist
  </tr>
 </table>

+## Main Features
+
+<table class="table-configuration">
+  <tr>
+    <th>Feature</th>
+    <th>Description</th>
+  </tr>
+  <tr>
+    <td>Support multiple versions of Spark</td>
+    <td>You can run different versions of Spark in one Zeppelin instance</td>
+  </tr>
+  <tr>
+    <td>Support multiple versions of Scala</td>
+    <td>You can run different Scala versions (2.10/2.11/2.12) of Spark in on Zeppelin instance</td>
+  </tr>
+  <tr>
+    <td>Support multiple languages</td>
+    <td>Scala, SQL, Python, R are supported, besides that you can also collaborate across languages, e.g. you can write Scala UDF and use it in PySpark</td>
+  </tr>
+  <tr>
+    <td>Support multiple execution modes</td>
+    <td>Local | Standalone | Yarn | K8s </td>
+  </tr>
+  <tr>
+    <td>Interactive development</td>
+    <td>Interactive development user experience increase your productivity</td>
+  </tr>
+
+  <tr>
+    <td>Inline Visualization</td>
+    <td>You can visualize Spark Dataset/DataFrame vis Python/R's plotting libraries, and even you can make SparkR Shiny app in Zeppelin</td>
+  </tr>
+
+  </tr>
+    <td>Multi-tenancy</td>
+    <td>Multiple user can work in one Zeppelin instance without affecting each other.</td>
+  </tr>
+
+  </tr>
+    <td>Rest API Support</td>
+    <td>You can not only submit Spark job via Zeppelin notebook UI, but also can do that via its rest api (You can use Zeppelin as Spark job server).</td>
+  </tr>
+</table>
+
+## Play Spark in Zeppelin docker
+
+For beginner, we would suggest you to play Spark in Zeppelin docker.
+In the Zeppelin docker image, we have already installed
+miniconda and lots of [useful python and R libraries](https://github.com/apache/zeppelin/blob/branch-0.10/scripts/docker/zeppelin/bin/env_python_3_with_R.yml)
+including IPython and IRkernel prerequisites, so `%spark.pyspark` would use IPython and `%spark.ir` is enabled.
+Without any extra configuration, you can run most of tutorial notes under folder `Spark Tutorial` directly.
+
+First you need to download Spark, because there's no Spark binary distribution shipped with Zeppelin.
+e.g. Here we download Spark 3.1.2 to`/mnt/disk1/spark-3.1.2`,
+and we mount it to Zeppelin docker container and run the following command to start Zeppelin docker container.
+
+```bash
+docker run -u $(id -u) -p 8080:8080 --rm -v /mnt/disk1/spark-3.1.2:/opt/spark -e SPARK_HOME=/opt/spark  --name zeppelin apache/zeppelin:0.10.0
+```
+
+After running the above command, you can open `http://localhost:8080` to play Spark in Zeppelin. We only verify the spark local mode in Zeppelin docker, other modes may not work due to network issues.
+
+
 ## Configuration
 The Spark interpreter can be configured with properties provided by Zeppelin.
 You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to [Spark Available Properties](http://spark.apache.org/docs/latest/configuration.html#available-properties).
@ -201,14 +274,15 @@ You can also set other Spark properties which are not listed in the table. For a
    <td></td>
    <td>
      Overrides Spark UI default URL. Value should be a full URL (ex: http://{hostName}/{uniquePath}.
-      In Kubernetes mode, value can be Jinja template string with 3 template variables 'PORT', 'SERVICE_NAME' and 'SERVICE_DOMAIN'.
-      (ex: http://{{PORT}}-{{SERVICE_NAME}}.{{SERVICE_DOMAIN}})
+      In Kubernetes mode, value can be Jinja template string with 3 template variables PORT, {% raw %} SERVICE_NAME {% endraw %}  and  {% raw %} SERVICE_DOMAIN {% endraw %}.
+      (e.g.: {% raw %}http://{{PORT}}-{{SERVICE_NAME}}.{{SERVICE_DOMAIN}} {% endraw %}). In yarn mode, value could be a knox url with {% raw %} {{applicationId}} {% endraw %} as placeholder,
+      (e.g.: {% raw %}https://knox-server:8443/gateway/yarnui/yarn/proxy/{{applicationId}}/{% endraw %})
     </td>
  </tr>
  <tr>
    <td>spark.webui.yarn.useProxy</td>
    <td>false</td>
-    <td>whether use yarn proxy url as spark weburl, e.g. http://localhost:8088/proxy/application_1583396598068_0004</td>
+    <td>whether use yarn proxy url as Spark weburl, e.g. http://localhost:8088/proxy/application_1583396598068_0004</td>
  </tr>
  <tr>
    <td>spark.repl.target</td>
@ -224,17 +298,21 @@ You can also set other Spark properties which are not listed in the table. For a

 Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you'll need to follow below two simple steps.

-### Export SPARK_HOME
+* Set SPARK_HOME
+* Set master
+
+
+### Set SPARK_HOME

 There are several options for setting `SPARK_HOME`.

 * Set `SPARK_HOME` in `zeppelin-env.sh`
-* Set `SPARK_HOME` in Interpreter setting page
+* Set `SPARK_HOME` in interpreter setting page
 * Set `SPARK_HOME` via [inline generic configuration](../usage/interpreter/overview.html#inline-generic-confinterpreter) 

-#### 1. Set `SPARK_HOME` in `zeppelin-env.sh`
+#### Set `SPARK_HOME` in `zeppelin-env.sh`

-If you work with only one version of spark, then you can set `SPARK_HOME` in `zeppelin-env.sh` because any setting in `zeppelin-env.sh` is globally applied.
+If you work with only one version of Spark, then you can set `SPARK_HOME` in `zeppelin-env.sh` because any setting in `zeppelin-env.sh` is globally applied.

 e.g. 

@ -251,21 +329,21 @@ export HADOOP_CONF_DIR=/usr/lib/hadoop
 ```


-#### 2. Set `SPARK_HOME` in Interpreter setting page
+#### Set `SPARK_HOME` in interpreter setting page

-If you want to use multiple versions of spark, then you need create multiple spark interpreters and set `SPARK_HOME` for each of them. e.g.
-Create a new spark interpreter `spark24` for spark 2.4 and set `SPARK_HOME` in interpreter setting page
+If you want to use multiple versions of Spark, then you need to create multiple Spark interpreters and set `SPARK_HOME` separately. e.g.
+Create a new Spark interpreter `spark24` for Spark 2.4 and set its `SPARK_HOME` in interpreter setting page as following,
 <center>
 <img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/spark_SPARK_HOME24.png" width="80%">
 </center>

-Create a new spark interpreter `spark16` for spark 1.6 and set `SPARK_HOME` in interpreter setting page
+Create a new Spark interpreter `spark16` for Spark 1.6 and set its `SPARK_HOME` in interpreter setting page as following,
 <center>
 <img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/spark_SPARK_HOME16.png" width="80%">
 </center>


-#### 3. Set `SPARK_HOME` via [inline generic configuration](../usage/interpreter/overview.html#inline-generic-confinterpreter) 
+#### Set `SPARK_HOME` via [inline generic configuration](../usage/interpreter/overview.html#inline-generic-confinterpreter) 

 Besides setting `SPARK_HOME` in interpreter setting page, you can also use inline generic configuration to put the 
 configuration with code together for more flexibility. e.g.
@ -273,23 +351,26 @@ configuration with code together for more flexibility. e.g.
 <img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/spark_inline_configuration.png" width="80%">
 </center>

-### Set master in Interpreter menu
-After starting Zeppelin, go to **Interpreter** menu and edit **spark.master** property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.
+### Set master
+
+After setting `SPARK_HOME`, you need to set **spark.master** property in either interpreter setting page or inline configuartion. The value may vary depending on your Spark cluster deployment type.

 For example,

 * **local[*]** in local mode
 * **spark://master:7077** in standalone cluster
- * **yarn-client** in Yarn client mode  (Not supported in spark 3.x, refer below for how to configure yarn-client in Spark 3.x)
- * **yarn-cluster** in Yarn cluster mode  (Not supported in spark 3.x, refer below for how to configure yarn-client in Spark 3.x)
+ * **yarn-client** in Yarn client mode  (Not supported in Spark 3.x, refer below for how to configure yarn-client in Spark 3.x)
+ * **yarn-cluster** in Yarn cluster mode  (Not supported in Spark 3.x, refer below for how to configure yarn-cluster in Spark 3.x)
 * **mesos://host:5050** in Mesos cluster

 That's it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin in this way.
 For the further information about Spark & Zeppelin version compatibility, please refer to "Available Interpreters" section in [Zeppelin download page](https://zeppelin.apache.org/download.html).

-> Note that without exporting `SPARK_HOME`, it's running in local mode with included version of Spark. The included version may vary depending on the build profile.
+Note that without setting `SPARK_HOME`, it's running in local mode with included version of Spark. The included version may vary depending on the build profile. And this included version Spark has limited function, so it 
+is always recommended to set `SPARK_HOME`.

-> Yarn client mode and local mode will run driver in the same machine with zeppelin server, this would be dangerous for production. Because it may run out of memory when there's many spark interpreters running at the same time. So we suggest you only allow yarn-cluster mode via setting `zeppelin.spark.only_yarn_cluster` in `zeppelin-site.xml`.
+Yarn client mode and local mode will run driver in the same machine with zeppelin server, this would be dangerous for production. Because it may run out of memory when there's many Spark interpreters running at the same time. So we suggest you 
+only allow yarn-cluster mode via setting `zeppelin.spark.only_yarn_cluster` in `zeppelin-site.xml`.

 #### Configure yarn mode for Spark 3.x

@ -314,77 +395,55 @@ Specifying `yarn-client` & `yarn-cluster` in `spark.master` is not supported in
 </table>


+## Interpreter binding mode
+
+The default [interpreter binding mode](../usage/interpreter/interpreter_binding_mode.html) is `globally shared`. That means all notes share the same Spark interpreter.
+
+So we recommend you to use `isolated per note` which means each note has own Spark interpreter without affecting each other. But it may run out of your machine resource if too many
+Spark interpreters are created, so we recommend to always use yarn-cluster mode in production if you run Spark in hadoop cluster. And you can use [inline configuration](../usage/interpreter/overview.html#inline-generic-configuration) via `%spark.conf` in the first paragraph to customize your spark configuration.
+
+You can also choose `scoped` mode. For `scoped` per note mode, Zeppelin creates separated scala compiler/python shell for each note but share a single `SparkContext/SqlContext/SparkSession`.
+
+
 ## SparkContext, SQLContext, SparkSession, ZeppelinContext

-SparkContext, SQLContext, SparkSession (for spark 2.x) and ZeppelinContext are automatically created and exposed as variable names `sc`, `sqlContext`, `spark` and `z`, respectively, in Scala, Kotlin, Python and R environments.
+SparkContext, SQLContext, SparkSession (for spark 2.x, 3.x) and ZeppelinContext are automatically created and exposed as variable names `sc`, `sqlContext`, `spark` and `z` respectively, in Scala, Kotlin, Python and R environments.


 > Note that Scala/Python/R environment shares the same SparkContext, SQLContext, SparkSession and ZeppelinContext instance.

-## YARN Mode
+## Yarn Mode
+
 Zeppelin support both yarn client and yarn cluster mode (yarn cluster mode is supported from 0.8.0). For yarn mode, you must specify `SPARK_HOME` & `HADOOP_CONF_DIR`. 
-Usually you only have one hadoop cluster, so you can set `HADOOP_CONF_DIR` in `zeppelin-env.sh` which is applied to all spark interpreters. If you want to use spark against multiple hadoop cluster, then you need to define
+Usually you only have one hadoop cluster, so you can set `HADOOP_CONF_DIR` in `zeppelin-env.sh` which is applied to all Spark interpreters. If you want to use spark against multiple hadoop cluster, then you need to define
 `HADOOP_CONF_DIR` in interpreter setting or via inline generic configuration.

-## Dependency Management
+## K8s Mode

-For spark interpreter, it is not recommended to use Zeppelin's [Dependency Management](../usage/interpreter/dependency_management.html) for managing 
-third party dependencies (`%spark.dep` is removed from Zeppelin 0.9 as well). Instead you should set the standard Spark properties.
-
-<table class="table-configuration">
-  <tr>
-    <th>Spark Property</th>
-    <th>Spark Submit Argument</th>
-    <th>Description</th>
-  </tr>
-  <tr>
-    <td>spark.files</td>
-    <td>--files</td>
-    <td>Comma-separated list of files to be placed in the working directory of each executor. Globs are allowed.</td>
-  </tr>
-  <tr>
-    <td>spark.jars</td>
-    <td>--jars</td>
-    <td>Comma-separated list of jars to include on the driver and executor classpaths. Globs are allowed.</td>
-  </tr>
-  <tr>
-    <td>spark.jars.packages</td>
-    <td>--packages</td>
-    <td>Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths. The coordinates should be groupId:artifactId:version. If spark.jars.ivySettings is given artifacts will be resolved according to the configuration in the file, otherwise artifacts will be searched for in the local maven repo, then maven central and finally any additional remote repositories given by the command-line option --repositories.</td>
-  </tr>
-</table>
-
-You can either set Spark properties in interpreter setting page or set Spark submit arguments in `zeppelin-env.sh` via environment variable `SPARK_SUBMIT_OPTIONS`. 
-For examples:
-
-```bash
-export SPARK_SUBMIT_OPTIONS="--files <my_file> --jars <my_jar> --packages <my_package>"
-```
-
-But it is not recommended to set them in `SPARK_SUBMIT_OPTIONS`. Because it will be shared by all spark interpreters, which means you can not set different dependencies for different users.
+Regarding how to run Spark on K8s in Zeppelin, please check [this doc](../quickstart/kubernetes.html).


 ## PySpark

-There're 2 ways to use PySpark in Zeppelin:
+There are 2 ways to use PySpark in Zeppelin:

 * Vanilla PySpark
 * IPySpark

 ### Vanilla PySpark (Not Recommended)
-Vanilla PySpark interpreter is almost the same as vanilla Python interpreter except Zeppelin inject SparkContext, SQLContext, SparkSession via variables `sc`, `sqlContext`, `spark`.

-By default, Zeppelin would use IPython in `%spark.pyspark` when IPython is available, Otherwise it would fall back to the original PySpark implementation.
-If you don't want to use IPython, then you can set `zeppelin.pyspark.useIPython` as `false` in interpreter setting. For the IPython features, you can refer doc
-[Python Interpreter](python.html)
+Vanilla PySpark interpreter is almost the same as vanilla Python interpreter except Spark interpreter inject SparkContext, SQLContext, SparkSession via variables `sc`, `sqlContext`, `spark`.
+
+By default, Zeppelin would use IPython in `%spark.pyspark` when IPython is available (Zeppelin would check whether ipython's prerequisites are met), Otherwise it would fall back to the vanilla PySpark implementation.

 ### IPySpark (Recommended)
-You can use `IPySpark` explicitly via `%spark.ipyspark`. IPySpark interpreter is almost the same as IPython interpreter except Zeppelin inject SparkContext, SQLContext, SparkSession via variables `sc`, `sqlContext`, `spark`.
-For the IPython features, you can refer doc [Python Interpreter](python.html)
+
+You can use `IPySpark` explicitly via `%spark.ipyspark`. IPySpark interpreter is almost the same as IPython interpreter except Spark interpreter inject SparkContext, SQLContext, SparkSession via variables `sc`, `sqlContext`, `spark`.
+For the IPython features, you can refer doc [Python Interpreter](python.html#ipython-interpreter-pythonipython-recommended)

 ## SparkR

-Zeppelin support SparkR via `%spark.r`. Here's configuration for SparkR Interpreter.
+Zeppelin support SparkR via `%spark.r`, `%spark.ir` and `%spark.shiny`. Here's configuration for SparkR Interpreter.

 <table class="table-configuration">
  <tr>
@ -412,12 +471,28 @@ Zeppelin support SparkR via `%spark.r`. Here's configuration for SparkR Interpre
    <td>out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F, fig.retina = 2</td>
    <td>R plotting options.</td>
  </tr>
+  <tr>
+    <td>zeppelin.R.shiny.iframe_width</td>
+    <td>100%</td>
+    <td>IFrame width of Shiny App</td>
+  </tr>
+  <tr>
+    <td>zeppelin.R.shiny.iframe_height</td>
+    <td>500px</td>
+    <td>IFrame height of Shiny App</td>
+  </tr>
+  <tr>
+    <td>zeppelin.R.shiny.portRange</td>
+    <td>:</td>
+    <td>Shiny app would launch a web app at some port, this property is to specify the portRange via format '<start>:<end>', e.g. '5000:5001'. By default it is ':' which means any port</td>
+  </tr>
 </table>

+Refer [R doc](r.html) for how to use R in Zeppelin.

 ## SparkSql

-Spark Sql Interpreter share the same SparkContext/SparkSession with other Spark interpreter. That means any table registered in scala, python or r code can be accessed by Spark Sql.
+Spark sql interpreter share the same SparkContext/SparkSession with other Spark interpreters. That means any table registered in scala, python or r code can be accessed by Spark sql.
 For examples:

 ```scala
@ -435,11 +510,13 @@ df.createOrReplaceTempView("people")
 select * from people
 ```

-By default, each sql statement would run sequentially in `%spark.sql`. But you can run them concurrently by following setup.
+You can write multiple sql statements in one paragraph. Each sql statement is separated by semicolon.
+Sql statement in one paragraph would run sequentially. 
+But sql statements in different paragraphs can run concurrently by the following configuration.

-1. Set `zeppelin.spark.concurrentSQL` to true to enable the sql concurrent feature, underneath zeppelin will change to use fairscheduler for spark. And also set `zeppelin.spark.concurrentSQL.max` to control the max number of sql statements running concurrently.
+1. Set `zeppelin.spark.concurrentSQL` to true to enable the sql concurrent feature, underneath zeppelin will change to use fairscheduler for Spark. And also set `zeppelin.spark.concurrentSQL.max` to control the max number of sql statements running concurrently.
 2. Configure pools by creating `fairscheduler.xml` under your `SPARK_CONF_DIR`, check the official spark doc [Configuring Pool Properties](http://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties)
-3. Set pool property via setting paragraph property. e.g.
+3. Set pool property via setting paragraph local property. e.g.

 ```
 %spark(pool=pool1)
@ -448,25 +525,61 @@ By default, each sql statement would run sequentially in `%spark.sql`. But you c
 ```

 This pool feature is also available for all versions of scala Spark, PySpark. For SparkR, it is only available starting from 2.3.0.
- 
-## Interpreter Setting Option

-You can choose one of `shared`, `scoped` and `isolated` options when you configure Spark interpreter.
-e.g. 
+## Dependency Management
+
+For Spark interpreter, it is not recommended to use Zeppelin's [Dependency Management](../usage/interpreter/dependency_management.html) for managing
+third party dependencies (`%spark.dep` is removed from Zeppelin 0.9 as well). Instead, you should set the standard Spark properties as following:
+
+<table class="table-configuration">
+  <tr>
+    <th>Spark Property</th>
+    <th>Spark Submit Argument</th>
+    <th>Description</th>
+  </tr>
+  <tr>
+    <td>spark.files</td>
+    <td>--files</td>
+    <td>Comma-separated list of files to be placed in the working directory of each executor. Globs are allowed.</td>
+  </tr>
+  <tr>
+    <td>spark.jars</td>
+    <td>--jars</td>
+    <td>Comma-separated list of jars to include on the driver and executor classpaths. Globs are allowed.</td>
+  </tr>
+  <tr>
+    <td>spark.jars.packages</td>
+    <td>--packages</td>
+    <td>Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths. The coordinates should be groupId:artifactId:version. If spark.jars.ivySettings is given artifacts will be resolved according to the configuration in the file, otherwise artifacts will be searched for in the local maven repo, then maven central and finally any additional remote repositories given by the command-line option --repositories.</td>
+  </tr>
+</table>
+
+As general Spark properties, you can set them in via inline configuration or interpreter setting page or in `zeppelin-env.sh` via environment variable `SPARK_SUBMIT_OPTIONS`.
+For examples:
+
+```bash
+export SPARK_SUBMIT_OPTIONS="--files <my_file> --jars <my_jar> --packages <my_package>"
+```
+
+To be noticed, `SPARK_SUBMIT_OPTIONS` is deprecated and will be removed in future release.

-* In `scoped` per user mode, Zeppelin creates separated Scala compiler for each user but share a single SparkContext.
-* In `isolated` per user mode, Zeppelin creates separated SparkContext for each user.

 ## ZeppelinContext
+
 Zeppelin automatically injects `ZeppelinContext` as variable `z` in your Scala/Python environment. `ZeppelinContext` provides some additional functions and utilities.
-See [Zeppelin-Context](../usage/other_features/zeppelin_context.html) for more details.
+See [Zeppelin-Context](../usage/other_features/zeppelin_context.html) for more details. For Spark interpreter, you can use z to display Spark `Dataset/Dataframe`.
+
+
+<img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/spark_zshow.png">
+

 ## Setting up Zeppelin with Kerberos
+
 Logical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:

 <img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/kdc_zeppelin.png">

-There're several ways to make spark work with kerberos enabled hadoop cluster in Zeppelin. 
+There are several ways to make Spark work with kerberos enabled hadoop cluster in Zeppelin. 

 1. Share one single hadoop cluster.
 In this case you just need to specify `zeppelin.server.kerberos.keytab` and `zeppelin.server.kerberos.principal` in zeppelin-site.xml, Spark interpreter will use these setting by default.
@ -474,11 +587,26 @@ In this case you just need to specify `zeppelin.server.kerberos.keytab` and `zep
 2. Work with multiple hadoop clusters.
 In this case you can specify `spark.yarn.keytab` and `spark.yarn.principal` to override `zeppelin.server.kerberos.keytab` and `zeppelin.server.kerberos.principal`.

+### Configuration Setup
+
+1. On the server that Zeppelin is installed, install Kerberos client modules and configuration, krb5.conf.
+   This is to make the server communicate with KDC.
+
+2. Add the two properties below to Spark configuration (`[SPARK_HOME]/conf/spark-defaults.conf`):
+
+    ```
+    spark.yarn.principal
+    spark.yarn.keytab
+    ```
+
+> **NOTE:** If you do not have permission to access for the above spark-defaults.conf file, optionally, you can add the above lines to the Spark Interpreter setting through the Interpreter tab in the Zeppelin UI.
+
+3. That's it. Play with Zeppelin!

 ## User Impersonation

-In yarn mode, the user who launch the zeppelin server will be used to launch the spark yarn application. This is not a good practise.
-Most of time, you will enable shiro in Zeppelin and would like to use the login user to submit the spark yarn app. For this purpose,
+In yarn mode, the user who launch the zeppelin server will be used to launch the Spark yarn application. This is not a good practise.
+Most of time, you will enable shiro in Zeppelin and would like to use the login user to submit the Spark yarn app. For this purpose,
 you need to enable user impersonation for more security control. In order the enable user impersonation, you need to do the following steps

 **Step 1** Enable user impersonation setting hadoop's `core-site.xml`. E.g. if you are using user `zeppelin` to launch Zeppelin, then add the following to `core-site.xml`, then restart both hdfs and yarn. 
@ -508,19 +636,7 @@ You can get rid of this message by setting `zeppelin.spark.deprecatedMsg.show` t

 <img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/spark_deprecate.png">

-### Configuration Setup

-1. On the server that Zeppelin is installed, install Kerberos client modules and configuration, krb5.conf.
-This is to make the server communicate with KDC.
-
-2. Add the two properties below to Spark configuration (`[SPARK_HOME]/conf/spark-defaults.conf`):
-
-    ```
-    spark.yarn.principal
-    spark.yarn.keytab
-    ```
-
-  > **NOTE:** If you do not have permission to access for the above spark-defaults.conf file, optionally, you can add the above lines to the Spark Interpreter setting through the Interpreter tab in the Zeppelin UI.
-
-3. That's it. Play with Zeppelin!
+## Community

+[Join our community](http://zeppelin.apache.org/community.html) to discuss with others.
--- a/docs/quickstart/flink_with_zeppelin.md
+++ b/docs/quickstart/flink_with_zeppelin.md
@ -0,0 +1,42 @@
+---
+layout: page
+title: "Flink with Zeppelin"
+description: ""
+group: quickstart
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Flink support in Zeppelin 
+
+<div id="toc"></div>
+
+<br/>
+
+For a brief overview of Apache Flink fundamentals with Apache Zeppelin, see the following guide:
+
+- **built-in** Apache Flink integration.
+- With [Flink Scala Scala](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/repls/scala_shell/) [PyFlink Shell](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/repls/python_shell/), [Flink SQL](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/overview/)
+- Inject ExecutionEnvironment, StreamExecutionEnvironment, BatchTableEnvironment, StreamTableEnvironment.
+- Canceling job and displaying its progress 
+- Supports different modes: local, remote, yarn, yarn-application
+- Dependency management
+- Streaming Visualization
+
+<br/>
+
+For the further information about Flink support in Zeppelin, please check 
+
+- [Flink Interpreter](../interpreter/flink.html)
--- a/docs/quickstart/install.md
+++ b/docs/quickstart/install.md
@ -50,7 +50,7 @@ Two binary packages are available on the [download page](http://zeppelin.apache.

 - **all interpreter package**: unpack it in a directory of your choice and you're ready to go.
 - **net-install interpreter package**: only spark, python, markdown and shell interpreter included. Unpack and follow [install additional interpreters](../usage/interpreter/installation.html) to install other interpreters. If you're unsure, just run `./bin/install-interpreter.sh --all` and install all interpreters.
-  
+
 ### Building Zeppelin from source

 Follow the instructions [How to Build](../setup/basics/how_to_build.html), If you want to build from source instead of using binary package.
@ -67,9 +67,11 @@ bin/zeppelin-daemon.sh start

 After Zeppelin has started successfully, go to [http://localhost:8080](http://localhost:8080) with your web browser.

-By default Zeppelin is listening at `127.0.0.1:8080`, so you can't access it when it is deployed in another remote machine.
+By default Zeppelin is listening at `127.0.0.1:8080`, so you can't access it when it is deployed on another remote machine.
 To access a remote Zeppelin, you need to change `zeppelin.server.addr` to `0.0.0.0` in `conf/zeppelin-site.xml`.

+Check log file at `ZEPPELIN_HOME/logs/zeppelin-server-*.log` if you can not open Zeppelin.
+
 #### Stopping Zeppelin

 ```
@ -84,15 +86,27 @@ Make sure that [docker](https://www.docker.com/community-edition) is installed i
 Use this command to launch Apache Zeppelin in a container.

 ```bash
-docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.9.0
+docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.10.0

 ```
+
 To persist `logs` and `notebook` directories, use the [volume](https://docs.docker.com/engine/reference/commandline/run/#mount-volume--v-read-only) option for docker container.

 ```bash
-docker run -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook \
+docker run -u $(id -u) -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook \
           -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' \
-           --name zeppelin apache/zeppelin:0.9.0
+           --name zeppelin apache/zeppelin:0.10.0
+```
+
+`-u $(id -u)` is to make sure you have the permission to write logs and notebooks. 
+
+For many interpreters, they require other dependencies, e.g. Spark interpreter requires Spark binary distribution
+and Flink interpreter requires Flink binary distribution. You can also mount them via docker volumn. e.g.
+
+```bash
+docker run -u $(id -u) -p 8080:8080 --rm -v /mnt/disk1/notebook:/notebook \
+-v /usr/lib/spark-current:/opt/spark -v /mnt/disk1/flink-1.12.2:/opt/flink -e FLINK_HOME=/opt/flink  \
+-e SPARK_HOME=/opt/spark  -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.10.0
 ```

 If you have trouble accessing `localhost:8080` in the browser, Please clear browser cache.
@ -146,13 +160,15 @@ Congratulations, you have successfully installed Apache Zeppelin! Here are a few

 #### New to Apache Zeppelin...
 * For an in-depth overview, head to [Explore Zeppelin UI](../quickstart/explore_ui.html).
- * And then, try run [Tutorial Notebook](http://localhost:8080/#/notebook/2A94M5J1Z) in your Zeppelin.
+ * And then, try run Tutorial Notebooks shipped with your Zeppelin distribution.
 * And see how to change [configurations](../setup/operation/configuration.html) like port number, etc.

-#### Spark, Python, SQL, and more 
+#### Spark, Flink, SQL, Python, R and more 
 * [Spark support in Zeppelin](./spark_with_zeppelin.html), to know more about deep integration with [Apache Spark](http://spark.apache.org/). 
+ * [Flink support in Zeppelin](./flink_with_zeppelin.html), to know more about deep integration with [Apache Flink](http://flink.apache.org/).
 * [SQL support in Zeppelin](./sql_with_zeppelin.html) for SQL support
 * [Python support in Zeppelin](./python_with_zeppelin.html), for Matplotlib, Pandas, Conda/Docker integration.
+ * [R support in Zeppelin](./r_with_zeppelin.html)
 * [All Available Interpreters](../#available-interpreters)

 #### Multi-user support ...
--- a/docs/quickstart/python_with_zeppelin.md
+++ b/docs/quickstart/python_with_zeppelin.md
@ -27,16 +27,17 @@ limitations under the License.

 The following guides explain how to use Apache Zeppelin that enables you to write in Python:

+- supports [vanilla python](../interpreter/python.html#vanilla-python-interpreter-python) and [ipython](../interpreter/python.html#ipython-interpreter-pythonipython-recommended)
 - supports flexible python environments using [conda](../interpreter/python.html#conda), [docker](../interpreter/python.html#docker)  
 - can query using [PandasSQL](../interpreter/python.html#sql-over-pandas-dataframes)
 - also, provides [PySpark](../interpreter/spark.html)
+- [run python interpreter in yarn cluster](../interpreter/python.html#run-python-in-yarn-cluster) with customized conda python environment.
 - with [matplotlib integration](../interpreter/python.html#matplotlib-integration)
- support [ipython](../interpreter/python.html#ipython-interpreter-pythonipython-recommended) 
 - can create results including **UI widgets** using [Dynamic Form](../interpreter/python.html#using-zeppelin-dynamic-forms)

 <br/>

-For the further information about Spark support in Zeppelin, please check 
+For the further information about Python support in Zeppelin, please check 

 - [Python Interpreter](../interpreter/python.html)

--- a/docs/quickstart/r_with_zeppelin.md
+++ b/docs/quickstart/r_with_zeppelin.md
@ -0,0 +1,42 @@
+---
+layout: page
+title: "R with Zeppelin"
+description: ""
+group: quickstart
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# R support in Zeppelin
+
+<div id="toc"></div>
+
+<br/>
+
+The following guides explain how to use Apache Zeppelin that enables you to write in R:
+
+- Supports [vanilla R](../interpreter/r.html#how-to-use-r-interpreter) and [IRkernel](../interpreter/r.html#how-to-use-r-interpreter)
+- Visualize R dataframe via [ZeppelinContext](../interpreter/r.html#zshow)
+- [Run R interpreter in yarn cluster](../interpreter/r.html#run-r-in-yarn-cluster) with customized conda R environment.
+- [Make R Shiny App] (../interpreter/r.html#make-shiny-app-in-zeppelin)
+
+<br/>
+
+For the further information about R support in Zeppelin, please check
+
+- [R Interpreter](../interpreter/r.html)
+
+
+
--- a/docs/quickstart/spark_with_zeppelin.md
+++ b/docs/quickstart/spark_with_zeppelin.md
@ -28,12 +28,13 @@ limitations under the License.
 For a brief overview of Apache Spark fundamentals with Apache Zeppelin, see the following guide:

 - **built-in** Apache Spark integration.
- with [SparkSQL](http://spark.apache.org/sql/), [PySpark](https://spark.apache.org/docs/latest/api/python/pyspark.html), [SparkR](https://spark.apache.org/docs/latest/sparkr.html)
- inject [SparkContext](https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html), [SQLContext](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [SparkSession](https://spark.apache.org/docs/latest/sql-programming-guide.html) automatically
- canceling job and displaying its progress 
- supporting [Spark Cluster Mode](../setup/deployment/spark_cluster_mode.html#apache-zeppelin-on-spark-cluster-mode) for external spark clusters
- supports [different context per user / note](../usage/interpreter/interpreter_binding_mode.html) 
- sharing variables among PySpark, SparkR and Spark through [ZeppelinContext](../interpreter/spark.html#zeppelincontext)
+- With [Spark Scala](https://spark.apache.org/docs/latest/quick-start.html) [SparkSQL](http://spark.apache.org/sql/), [PySpark](https://spark.apache.org/docs/latest/api/python/pyspark.html), [SparkR](https://spark.apache.org/docs/latest/sparkr.html)
+- Inject [SparkContext](https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html), [SQLContext](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [SparkSession](https://spark.apache.org/docs/latest/sql-programming-guide.html) automatically
+- Canceling job and displaying its progress 
+- Supports different modes: local, standalone, yarn(client & cluster), k8s
+- Dependency management
+- Supports [different context per user / note](../usage/interpreter/interpreter_binding_mode.html) 
+- Sharing variables among PySpark, SparkR and Spark through [ZeppelinContext](../interpreter/spark.html#zeppelincontext)
 - [Livy Interpreter](../interpreter/livy.html)

 <br/>
--- a/docs/quickstart/sql_with_zeppelin.md
+++ b/docs/quickstart/sql_with_zeppelin.md
@ -38,6 +38,7 @@ The following guides explain how to use Apache Zeppelin that enables you to writ
  * [Apache Tajo](../interpreter/jdbc.html#apache-tajo)
  * and so on 
 - [Spark Interpreter](../interpreter/spark.html) supports [SparkSQL](http://spark.apache.org/sql/)
+- [Flink Interpreter](../interpreter/flink.html) supports [Flink SQL](https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/overview/)
 - [Python Interpreter](../interpreter/python.html) supports [pandasSQL](../interpreter/python.html#sql-over-pandas-dataframes) 
 - can create query result including **UI widgets** using [Dynamic Form](../usage/dynamic_form/intro.html)

@ -56,6 +57,7 @@ For the further information about SQL support in Zeppelin, please check

 - [JDBC Interpreter](../interpreter/jdbc.html)
 - [Spark Interpreter](../interpreter/spark.html)
+- [Flink Interpreter](../interpreter/flink.html)
 - [Python Interpreter](../interpreter/python.html)
 - [IgniteSQL Interpreter](../interpreter/ignite.html#ignite-sql-interpreter) for [Apache Ignite](https://ignite.apache.org/)
 - [Kylin Interpreter](../interpreter/kylin.html) for [Apache Kylin](http://kylin.apache.org/)
--- a/docs/setup/deployment/flink_and_spark_cluster.md
+++ b/docs/setup/deployment/flink_and_spark_cluster.md
@ -20,6 +20,8 @@ limitations under the License.

 {% include JB/setup %}

+<font color=red>This document is outdated, it is not verified in the latest Zeppelin.</font>
+
 # Install with Flink and Spark cluster

 <div id="toc"></div>
--- a/docs/usage/interpreter/dependency_management.md
+++ b/docs/usage/interpreter/dependency_management.md
@ -24,13 +24,14 @@ limitations under the License.

 You can include external libraries to interpreter by setting dependencies in interpreter menu.

+To be noticed, this approach doesn't work for spark and flink interpreters. They have their own dependency management, please refer their doc for details.
+
 When your code requires external library, instead of doing download/copy/restart Zeppelin, you can easily do following jobs in this menu.

 * Load libraries recursively from Maven repository
 * Load libraries from local filesystem
 * Add additional maven repository
- * Automatically add libraries to SparkCluster
-
+ 
 <hr>
 <div class="row">
  <div class="col-md-6">
--- a/Mode_2GE79Y5FV.zpln
+++ b/Mode_2GE79Y5FV.zpln