zeppelin/docs/_includes
Lee moon soo b13651cedf [ZEPPELIN-3840] Zeppelin on Kubernetes
### What type of PR is it?
This PR adds the ability to run Zeppelin on Kubernetes. It aims for:

 - Zero configuration to start Zeppelin (and Spark) on Kubernetes.
 - Running everything on Kubernetes: Zeppelin, interpreters, and Spark.
 - High customizability, to accommodate various user configurations and extensions.

Key features are

 - Provides a `zeppelin-server.yaml` file for `kubectl` to run the Zeppelin server.
 - All interpreters automatically run as Pods.
 - The Spark interpreter is automatically configured to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html).
 - A reverse proxy is configured to give access to the Spark UI.

To do
 - [x] Document how reverse proxy for Spark UI works and how to configure custom domain.
 - [x] Document how to customize zeppelin-server and interpreter yaml.
 - [x] Document new configurations
 - [x] Document how to mount volume for notebook and configurations

### How it works

#### Run Zeppelin Server on Kubernetes
`k8s/zeppelin-server.yaml` is provided to run Zeppelin Server with a few sidecars and configurations.
This file is easy to publish (users can fetch it with `curl`) and highly customizable, while still including everything needed.

#### K8s Interpreter launcher
This PR adds a new module, `launcher-k8s-standard`, under the `zeppelin/zeppelin-plugins/launcher/k8s-standard/` directory. This launcher is [automatically selected](https://github.com/apache/zeppelin/pull/3240/files#diff-82fddd2ffb77aaffc4b9cf7b5b1eaa79) when Zeppelin is running on Kubernetes. It handles both the Spark interpreter and all other interpreters.

The launcher launches each interpreter as a Pod using the template [k8s/interpreter/100-interpreter-pod.yaml](https://github.com/apache/zeppelin/pull/3240/files#diff-d9ce62e2c992d32f0184d7edb862f3c4).
The filename has a `100-` prefix because the launcher consumes all files in the directory in alphabetical order on interpreter start/stop. Users can drop more files here to extend or customize the interpreter, and can use filenames to control the order. The template is rendered by [jinjava](https://github.com/HubSpot/jinjava).
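The ordering rule above can be illustrated with a small Python sketch. It is not the launcher's code (the launcher is a Java module); the helper name `ordered_spec_files` and the extra filenames are hypothetical, and it assumes "alphabetical order" means plain string ordering of filenames.

```python
from pathlib import Path
import tempfile


def ordered_spec_files(spec_dir):
    """Return the regular files in spec_dir sorted by filename (plain string order)."""
    return sorted(
        (p for p in Path(spec_dir).iterdir() if p.is_file()),
        key=lambda p: p.name,
    )


# Demo in a throwaway directory. Only 100-interpreter-pod.yaml comes from the PR;
# the other two filenames are made up to show how a prefix controls the order.
with tempfile.TemporaryDirectory() as d:
    for name in ["200-extra-service.yaml", "100-interpreter-pod.yaml", "050-configmap.yaml"]:
        (Path(d) / name).write_text("# placeholder\n")
    apply_order = [p.name for p in ordered_spec_files(d)]

print(apply_order)
# ['050-configmap.yaml', '100-interpreter-pod.yaml', '200-extra-service.yaml']
```

Under plain string ordering, a file named `20-x.yaml` would sort after `100-interpreter-pod.yaml`, so keeping numeric prefixes the same width (for example `020-`) avoids surprises.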

#### Spark interpreter

When the interpreter group is `spark`, K8sRemoteInterpreterProcess automatically [sets the necessary Spark configuration](https://github.com/apache/zeppelin/pull/3240/files#diff-6d1d3084f55bdd519e39ede4a619e73dR297) to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html). Users don't have to configure anything. Spark runs in client mode.
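For orientation, here is a rough Python sketch of what a client-mode Spark on Kubernetes configuration can look like. The property keys are standard Spark properties, but the exact set that K8sRemoteInterpreterProcess writes is not listed in this description, so treat the selection, the function names, and all example values (namespace, host, Pod name) as assumptions.

```python
def spark_k8s_client_conf(namespace, image, driver_host, driver_pod_name):
    """Build an illustrative set of Spark properties for client mode on Kubernetes.

    This is an assumed selection of standard Spark properties, not the list
    that Zeppelin actually sets.
    """
    return {
        # In-cluster address of the Kubernetes API server.
        "spark.master": "k8s://https://kubernetes.default.svc",
        # Client mode: the driver runs inside the interpreter Pod.
        "spark.submit.deployMode": "client",
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.container.image": image,
        # Executors must be able to reach the driver at this address.
        "spark.driver.host": driver_host,
        "spark.kubernetes.driver.pod.name": driver_pod_name,
    }


def to_submit_args(conf):
    """Render the properties as repeated `--conf key=value` arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


# Hypothetical values for illustration only.
conf = spark_k8s_client_conf(
    namespace="default",
    image="spark:2.4.0",
    driver_host="spark-axefeg.default.svc",
    driver_pod_name="spark-axefeg",
)
print(" ".join(to_submit_args(conf)))
```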

#### Spark UI

We could require users to configure port-forwarding manually, or take some other step, to access the Spark UI, but that is not optimal. Ideally the Spark UI is automatically accessible whenever the user has access to the Zeppelin UI, without any extra configuration.

To enable this, the Zeppelin server Pod has a reverse proxy as a sidecar, which splits traffic between the Zeppelin server and the Spark UI running in another Pod. It assumes that both `service.domain.com` and `*.service.domain.com` point to the nginx proxy address. Traffic for `service.domain.com` is directed to ZeppelinServer, and traffic for `*.service.domain.com` is directed to the interpreter Pod.
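The routing decision can be sketched in Python as a function of the request's `Host` header. The real proxy is nginx, so this only models the rule described above; the function name `route` and the returned labels are hypothetical.

```python
SERVICE_DOMAIN = "service.domain.com"


def route(host, service_domain=SERVICE_DOMAIN):
    """Decide where a request goes based on its Host header.

    Returns "zeppelin-server" for the bare service domain,
    "interpreter-pod" for any subdomain of it, and "unknown" otherwise.
    """
    hostname = host.split(":", 1)[0].lower()  # drop an optional :port suffix
    if hostname == service_domain:
        return "zeppelin-server"
    if hostname.endswith("." + service_domain):
        return "interpreter-pod"
    return "unknown"


print(route("service.domain.com"))        # zeppelin-server
print(route("anything.service.domain.com"))  # interpreter-pod
```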

`<port>-<interpreter pod svc name>.service.domain.com` is the convention for accessing any application running in an interpreter Pod. If the Spark interpreter Pod is running with the name `spark-axefeg` and the Spark UI is running on port 4040,

```
4040-spark-axefeg.service.domain.com
```

is the address to access the Spark UI. The default service domain is [local.zeppelin-project.org:8080](https://github.com/apache/zeppelin/pull/3240/files#diff-56ccb2e2c2617b27dbaae866d9431e51R22). Both `local.zeppelin-project.org` and `*.local.zeppelin-project.org` resolve to `127.0.0.1`, so the default works with `kubectl port-forward`.
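The naming convention is simple enough to express as a pair of helpers. This Python sketch builds and parses such addresses; the function names are hypothetical, and the parser assumes the service name contains no dots and keeps any `:port` suffix as part of the service domain.

```python
import re


def app_address(port, svc_name, service_domain):
    """Build `<port>-<interpreter pod svc name>.<service domain>`."""
    return f"{port}-{svc_name}.{service_domain}"


_ADDRESS = re.compile(r"^(?P<port>\d+)-(?P<svc>[^.]+)\.(?P<domain>.+)$")


def parse_app_address(address):
    """Split an address back into (port, service name, service domain)."""
    m = _ADDRESS.match(address)
    if m is None:
        raise ValueError(f"not a <port>-<svc>.<domain> address: {address!r}")
    return int(m["port"]), m["svc"], m["domain"]


spark_ui = app_address(4040, "spark-axefeg", "local.zeppelin-project.org:8080")
print(spark_ui)                     # 4040-spark-axefeg.local.zeppelin-project.org:8080
print(parse_app_address(spark_ui))  # (4040, 'spark-axefeg', 'local.zeppelin-project.org:8080')
```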

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-3840

### How should this be tested?

Prepare a Kubernetes cluster with enough resources (cpus > 5, mem > 6g).
If you're using [minikube](https://github.com/kubernetes/minikube), check your capacity with the `kubectl describe node` command before you start.

You'll need to build a Zeppelin docker image and a Spark docker image to test. Please follow the guide in docs/quickstart/kubernetes.md.

To try it quickly without building docker images, I have uploaded pre-built images to Docker Hub: `moon/zeppelin:0.9.0-SNAPSHOT` and `moon/spark:2.4.0`. Try the following commands:

```
ZEPPELIN_SERVER_YAML="curl -s https://raw.githubusercontent.com/Leemoonsoo/zeppelin/kubernetes/k8s/zeppelin-server.yaml"
$ZEPPELIN_SERVER_YAML | sed 's/apache\/zeppelin:0.9.0-SNAPSHOT/moon\/zeppelin:0.9.0-SNAPSHOT/' | sed 's/spark:2.4.0/moon\/spark:2.4.0/' | kubectl apply -f -
```

Then forward the port:

```
kubectl port-forward zeppelin-server 8080:80
```

Then browse to http://localhost:8080

To clean up

```
$ZEPPELIN_SERVER_YAML | sed 's/apache\/zeppelin:0.9.0-SNAPSHOT/moon\/zeppelin:0.9.0-SNAPSHOT/' | sed 's/spark:2.4.0/moon\/spark:2.4.0/' | kubectl delete -f -
```

### Screenshots (if appropriate)
See this video https://youtu.be/7E4ZGn4pnTo

### Future work

 - Per-interpreter docker image.
 - Block communication between interpreter Pods.
 - The Spark interpreter Pod has a Role with CRUD permissions for any pod/service in the same namespace; this should be restricted to Spark executor Pods only.
 - Per-note interpreter mode by default when Zeppelin is running on Kubernetes.

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? yes

Author: Lee moon soo <leemoonsoo@gmail.com>
Author: Lee moon soo <moon@apache.org>

Closes #3240 from Leemoonsoo/kubernetes and squashes the following commits:

0100a36f2 [Lee moon soo] update how it works on docs, add some comments on yaml files
423412a93 [Lee moon soo] zeppelin.k8s.mode -> zeppelin.run.mode
4e7d8170d [Lee moon soo] localtest.me -> local.zeppelin-project.org
993a0e44e [Lee moon soo] document configurations
9ab6fc420 [Lee moon soo] address code review
22e090f61 [Lee moon soo] logger -> LOGGER
11960dd59 [Lee moon soo] update corresponding test as well
3b652a48e [Lee moon soo] Make spark executor set ownerreference correctly
1a3a07098 [Lee moon soo] Set ownerreference to Role and Rolebinding of interpreter
e2dc88a19 [Lee moon soo] suppress error log when wait target is already removed
fa36c18e3 [Lee moon soo] Make spark master configurable
b4f58a9a1 [Lee moon soo] sig term for quick termination
64a56b5c9 [Lee moon soo] Add docs
e9ce64fe7 [Lee moon soo] update dockerfile
ec09b8b88 [Lee moon soo] add test
3078bac55 [Lee moon soo] spark ui support
9341fcbfe [Lee moon soo] install kubectl and configure log4j in docker image
0f7c0d4e8 [Lee moon soo] add license
f30561189 [Lee moon soo] rename file
2b579ff12 [Lee moon soo] let user override namespace
f4166ad04 [Lee moon soo] make spark container image configurable
0d472ea52 [Lee moon soo] load properties and environment variables
b0e2c36c6 [Lee moon soo] Rbac role, rolebinding
2960dcb87 [Lee moon soo] configure namespace
a4072e6b9 [Lee moon soo] add signal handler
7a8736756 [Lee moon soo] configure spark on kubernetes
263d859d4 [Lee moon soo] use headless service for interpreter pod
7fe9823b1 [Lee moon soo] interpreter pod cascade delete on zeppelin-server delete
86e876435 [Lee moon soo] add services on RBAC
18b8f68cb [Lee moon soo] print spec file contents on debug log
0dea3836b [Lee moon soo] create and connect interpreter pod
9f1b7a169 [Lee moon soo] run kubernetes launcher
2fd2ac8c3 [Lee moon soo] kubernetes mode configuration
58f9f1909 [Lee moon soo] add rbac
36cf391a4 [Lee moon soo] correct plugin name
52bb6c7e1 [Lee moon soo] add k8s dir in package
5f602a65e [Lee moon soo] K8sRemoteInterpreterProcess
07489f76d [Lee moon soo] kubectl with exec
d2f3d5b7e [Lee moon soo] add k8s-standard launcher module
2019-01-18 09:00:07 -08:00
JB ZEPPELIN-279: move website w/ docs to master branch 2015-09-05 19:48:22 +09:00
themes/zeppelin [ZEPPELIN-3840] Zeppelin on Kubernetes 2019-01-18 09:00:07 -08:00