### What is this PR for?
Configure IPython interpreter in Docker image by installing necessary dependencies.
### What type of PR is it?
Improvement
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-3989
### How should this be tested?
Build the Docker image and check that the %python interpreter works with the IPython kernel.
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: Lee moon soo <moon@apache.org>
Closes #3303 from Leemoonsoo/ZEPPELIN-3989 and squashes the following commits:
e64ac428e [Lee moon soo] install ipython interpreter dependencies
### What is this PR for?
This PR adds the ability to run Zeppelin on Kubernetes. It aims for:
- Zero configuration to start Zeppelin (and Spark) on Kubernetes.
- Running everything on Kubernetes: Zeppelin, interpreters, Spark.
- High customizability, to accommodate various user configurations and extensions.
Key features are:
- Provides a zeppelin-server.yaml file for `kubectl` to run the Zeppelin server
- All interpreters automatically run as Pods.
- The Spark interpreter is automatically configured to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html)
- A reverse proxy is configured to access the Spark UI
To do
- [x] Document how reverse proxy for Spark UI works and how to configure custom domain.
- [x] Document how to customize zeppelin-server and interpreter yaml.
- [x] Document new configurations
- [x] Document how to mount volume for notebook and configurations
### How it works
#### Run Zeppelin Server on Kubernetes
`k8s/zeppelin-server.yaml` is provided to run Zeppelin Server with a few sidecars and configurations.
This file is easy to publish (users can fetch it with `curl`) and highly customizable, while still including everything necessary.
#### K8s Interpreter launcher
This PR adds a new module, `launcher-k8s-standard`, under the `zeppelin/zeppelin-plugins/launcher/k8s-standard/` directory. This launcher is [automatically selected](https://github.com/apache/zeppelin/pull/3240/files#diff-82fddd2ffb77aaffc4b9cf7b5b1eaa79) when Zeppelin is running on Kubernetes. The launcher handles both the Spark interpreter and all other interpreters.
The launcher launches each interpreter as a Pod using the template [k8s/interpreter/100-interpreter-pod.yaml](https://github.com/apache/zeppelin/pull/3240/files#diff-d9ce62e2c992d32f0184d7edb862f3c4).
The filename has the `100-` prefix because the launcher consumes all files in the directory in alphabetical order on interpreter start/stop. Users can drop more files here to extend or customize the interpreter, and filenames can be used to control the order. The template is rendered by [jinjava](https://github.com/HubSpot/jinjava).
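As a rough illustration of how filename prefixes control ordering, the sketch below lists the spec files of a directory in plain alphabetical order. The function name `list_specs` and the extra file names are hypothetical; the actual launcher does this in Java, and this is not its implementation.

```shell
# List the *.yaml files of a directory in byte-wise alphabetical order.
# $1: directory holding the spec templates (illustrative)
list_specs() {
  for f in "$1"/*.yaml; do
    [ -e "$f" ] && basename "$f"
  done | LC_ALL=C sort
}

# Demo with a throwaway directory; only 100-interpreter-pod.yaml is a real name.
demo_dir=$(mktemp -d)
touch "$demo_dir/100-interpreter-pod.yaml" \
      "$demo_dir/050-hypothetical-config.yaml" \
      "$demo_dir/200-hypothetical-service.yaml"
list_specs "$demo_dir"
rm -rf "$demo_dir"
```

A file named with a prefix lower than `100-` would be consumed before the interpreter Pod template, and a higher prefix after it.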
#### Spark interpreter
When the interpreter group is `spark`, K8sRemoteInterpreterProcess [sets the necessary Spark configuration](https://github.com/apache/zeppelin/pull/3240/files#diff-6d1d3084f55bdd519e39ede4a619e73dR297) automatically to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html). Users don't have to configure anything. It uses client mode.
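For orientation, a client-mode Spark on Kubernetes setup typically involves options like the ones printed below. This is only a sketch: the property names are standard Spark settings, but the exact set and values that K8sRemoteInterpreterProcess applies are not listed in this description, and the function name, master URL, and arguments are assumptions.

```shell
# Print spark-submit style options for client mode on Kubernetes.
# $1: namespace, $2: container image, $3: driver host (all illustrative)
spark_k8s_conf() {
  printf '%s\n' \
    "--master k8s://https://kubernetes.default.svc" \
    "--deploy-mode client" \
    "--conf spark.kubernetes.namespace=$1" \
    "--conf spark.kubernetes.container.image=$2" \
    "--conf spark.driver.host=$3"
}

spark_k8s_conf default spark:2.4.0 spark-axefeg.default.svc
```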
#### Spark UI
We could make users manually configure port-forwarding or take some other step to access the Spark UI, but that's not optimal. Ideally, the Spark UI is automatically accessible whenever the user has access to the Zeppelin UI, without any extra configuration.
To enable this, the Zeppelin server Pod has a reverse proxy as a sidecar, which splits traffic between the Zeppelin server and the Spark UI running in the other Pod. It assumes both `service.domain.com` and `*.service.domain.com` point to the nginx proxy address. `service.domain.com` is directed to ZeppelinServer, and `*.service.domain.com` is directed to the interpreter Pod.
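The routing rule can be sketched as a small host-matching function. This only mirrors the rule as described; the real routing lives in the nginx sidecar configuration, and the function name and returned labels are hypothetical.

```shell
# Decide where a request would be sent, based on its host name.
# $1: requested host (without port), $2: service domain
route_for_host() {
  case "$1" in
    "$2")   echo "zeppelin-server" ;;   # exact match on the service domain
    *".$2") echo "interpreter-pod" ;;   # any subdomain of the service domain
    *)      echo "unknown" ;;
  esac
}

route_for_host service.domain.com service.domain.com                    # prints: zeppelin-server
route_for_host 4040-spark-axefeg.service.domain.com service.domain.com  # prints: interpreter-pod
```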
`<port>-<interpreter pod svc name>.service.domain.com` is the convention for accessing any application running in an interpreter Pod. If the Spark interpreter Pod is running with the name `spark-axefeg` and the Spark UI is running on port 4040,
```
4040-spark-axefeg.service.domain.com
```
is the address to access the Spark UI. The default service domain is [local.zeppelin-project.org:8080](https://github.com/apache/zeppelin/pull/3240/files#diff-56ccb2e2c2617b27dbaae866d9431e51R22). Both `local.zeppelin-project.org` and `*.local.zeppelin-project.org` point to `127.0.0.1`, so it works with `kubectl port-forward`.
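The naming convention can be expressed as a pair of helper functions, one that builds the host name and one that splits it again. Both function names are hypothetical, and the parsing assumes the port never contains a `-`.

```shell
# Build the host name for an application running in an interpreter Pod.
# $1: port, $2: interpreter Pod service name, $3: service domain
app_host() {
  printf '%s-%s.%s\n' "$1" "$2" "$3"
}

# Split such a host name back into "<port> <service name>".
# $1: host name, $2: service domain
parse_app_host() {
  prefix=${1%".$2"}    # strip the trailing ".<service domain>"
  printf '%s %s\n' "${prefix%%-*}" "${prefix#*-}"
}

app_host 4040 spark-axefeg service.domain.com
# prints: 4040-spark-axefeg.service.domain.com
parse_app_host 4040-spark-axefeg.service.domain.com service.domain.com
# prints: 4040 spark-axefeg
```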
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-3840
### How should this be tested?
Prepare a Kubernetes cluster with enough resources (cpus > 5, mem > 6g).
If you're using [minikube](https://github.com/kubernetes/minikube), check your capacity with the `kubectl describe node` command before starting.
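The capacity check can also be scripted. The sketch below compares a CPU count and a memory value against the thresholds above; it assumes memory is reported in Ki (as in `status.capacity.memory`) and reads "6g" as 6 GiB. The function name is hypothetical.

```shell
# Succeed when capacity exceeds the suggested minimum (cpus > 5, mem > 6g).
# $1: CPU count, $2: memory in Ki, with or without the "Ki" suffix
enough_capacity() {
  cpus=$1
  mem_ki=${2%Ki}
  [ "$cpus" -gt 5 ] && [ "$mem_ki" -gt $((6 * 1024 * 1024)) ]
}

# Example wiring (needs a reachable cluster; reads the first node only):
# cpus=$(kubectl get nodes -o jsonpath='{.items[0].status.capacity.cpu}')
# mem=$(kubectl get nodes -o jsonpath='{.items[0].status.capacity.memory}')
# enough_capacity "$cpus" "$mem" && echo "capacity looks sufficient"
```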
You'll need to build the Zeppelin Docker image and the Spark Docker image to test. Please follow the guide in docs/quickstart/kubernetes.md.
To try it quickly without building Docker images, I have uploaded pre-built images to Docker Hub: `moon/zeppelin:0.9.0-SNAPSHOT` and `moon/spark:2.4.0`. Try the following commands:
```
ZEPPELIN_SERVER_YAML="curl -s https://raw.githubusercontent.com/Leemoonsoo/zeppelin/kubernetes/k8s/zeppelin-server.yaml"
$ZEPPELIN_SERVER_YAML | sed 's/apache\/zeppelin:0.9.0-SNAPSHOT/moon\/zeppelin:0.9.0-SNAPSHOT/' | sed 's/spark:2.4.0/moon\/spark:2.4.0/' | kubectl apply -f -
```
And port forward
```
kubectl port-forward zeppelin-server 8080:80
```
Then browse to http://localhost:8080
To clean up:
```
$ZEPPELIN_SERVER_YAML | sed 's/apache\/zeppelin:0.9.0-SNAPSHOT/moon\/zeppelin:0.9.0-SNAPSHOT/' | sed 's/spark:2.4.0/moon\/spark:2.4.0/' | kubectl delete -f -
```
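Since the apply and clean-up commands repeat the same two `sed` substitutions, they could be factored into one filter. This is an optional convenience sketch; the function name `use_prebuilt_images` is hypothetical.

```shell
# Rewrite image references on stdin to point at the pre-built images.
use_prebuilt_images() {
  sed -e 's|apache/zeppelin:0.9.0-SNAPSHOT|moon/zeppelin:0.9.0-SNAPSHOT|' \
      -e 's|spark:2.4.0|moon/spark:2.4.0|'
}

# Usage (needs a reachable cluster):
# $ZEPPELIN_SERVER_YAML | use_prebuilt_images | kubectl apply -f -
# $ZEPPELIN_SERVER_YAML | use_prebuilt_images | kubectl delete -f -
```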
### Screenshots (if appropriate)
See this video https://youtu.be/7E4ZGn4pnTo
### Future work
- Per-interpreter Docker image
- Blocking communication between interpreter Pods
- The Spark interpreter Pod has a Role with CRUD permissions on any pod/service in the same namespace; this should be restricted to Spark executor Pods only.
- Per-note interpreter mode by default when Zeppelin is running on Kubernetes
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? yes
Author: Lee moon soo <leemoonsoo@gmail.com>
Author: Lee moon soo <moon@apache.org>
Closes #3240 from Leemoonsoo/kubernetes and squashes the following commits:
0100a36f2 [Lee moon soo] update how it works on docs, add some comments on yaml files
423412a93 [Lee moon soo] zeppelin.k8s.mode -> zeppelin.run.mode
4e7d8170d [Lee moon soo] localtest.me -> local.zeppelin-project.org
993a0e44e [Lee moon soo] document configurations
9ab6fc420 [Lee moon soo] address code review
22e090f61 [Lee moon soo] logger -> LOGGER
11960dd59 [Lee moon soo] update corresponding test as well
3b652a48e [Lee moon soo] Make spark executor set ownerreference correctly
1a3a07098 [Lee moon soo] Set ownerreference to Role and Rolebinding of interpreter
e2dc88a19 [Lee moon soo] suppress error log when wait target is already removed
fa36c18e3 [Lee moon soo] Make spark master configurable
b4f58a9a1 [Lee moon soo] sig term for quick termination
64a56b5c9 [Lee moon soo] Add docs
e9ce64fe7 [Lee moon soo] update dockerfile
ec09b8b88 [Lee moon soo] add test
3078bac55 [Lee moon soo] spark ui support
9341fcbfe [Lee moon soo] install kubectl and configure log4j in docker image
0f7c0d4e8 [Lee moon soo] add license
f30561189 [Lee moon soo] rename file
2b579ff12 [Lee moon soo] let user override namespace
f4166ad04 [Lee moon soo] make spark container image configurable
0d472ea52 [Lee moon soo] load properties and environment variables
b0e2c36c6 [Lee moon soo] Rbac role, rolebinding
2960dcb87 [Lee moon soo] configure namespace
a4072e6b9 [Lee moon soo] add signal handler
7a8736756 [Lee moon soo] configure spark on kubernetes
263d859d4 [Lee moon soo] use headless service for interpreter pod
7fe9823b1 [Lee moon soo] interpreter pod cascade delete on zeppelin-server delete
86e876435 [Lee moon soo] add services on RBAC
18b8f68cb [Lee moon soo] print spec file contents on debug log
0dea3836b [Lee moon soo] create and connect interpreter pod
9f1b7a169 [Lee moon soo] run kubernetes launcher
2fd2ac8c3 [Lee moon soo] kubernetes mode configuration
58f9f1909 [Lee moon soo] add rbac
36cf391a4 [Lee moon soo] correct plugin name
52bb6c7e1 [Lee moon soo] add k8s dir in package
5f602a65e [Lee moon soo] K8sRemoteInterpreterProcess
07489f76d [Lee moon soo] kubectl with exec
d2f3d5b7e [Lee moon soo] add k8s-standard launcher module
### What is this PR for?
trivial pom file change
### What type of PR is it?
[Improvement]
### Todos
* [ ] - Task
### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-3192
### How should this be tested?
* Travis pass
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No
Author: Jeff Zhang <zjffdu@apache.org>
Closes #2748 from zjffdu/ZEPPELIN-3192 and squashes the following commits:
635acde [Jeff Zhang] ZEPPELIN-3192. Bump up Zeppelin version to 0.9.0-SNAPSHOT
### What is this PR for?
This PR updates the Spark version to the current 2.1.2, since the version currently used (2.1.1) is no longer available.
### What type of PR is it?
Fix
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-3017
### How should this be tested?
manually
Author: mark91 <marcogaido91@gmail.com>
Closes #2636 from mgaido91/ZEPPELIN-3017 and squashes the following commits:
b3cfb44 [mark91] [ZEPPELIN-3017] fix Spark version in Dockerfiles
### What is this PR for?
Using OpenJDK when distributing the Docker image will reduce legal risk.
### What type of PR is it?
Bug Fix
### Todos
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-2861
### How should this be tested?
`docker build scripts/docker/zeppelin/bin`
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? **NO**
* Is there breaking changes for older versions? **NO**
* Does this needs documentation? **NO**
Author: Jinkyu Yi <jincreator@jincreator.net>
Closes #2536 from jincreator/ZEPPELIN-2861 and squashes the following commits:
3b4fbcb [Jinkyu Yi] [ZEPPELIN-2861] Use OpenJDK in docker image.
### What is this PR for?
A few sentences describing the overall goals of the pull request's commits.
First time? Check out the contributing guide - https://zeppelin.apache.org/contribution/contributions.html
### What type of PR is it?
[Bug Fix | Improvement | Feature | Documentation | Hot Fix | Refactoring]
### Todos
* [ ] - Task
### What is the Jira issue?
* Open an issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN/
* Put link here, and add [ZEPPELIN-*Jira number*] in PR title, eg. [ZEPPELIN-533]
### How should this be tested?
Outline the steps to test the PR here.
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update?
* Is there breaking changes for older versions?
* Does this needs documentation?
Author: sven0726 <sven0726@gmail.com>
Closes #2370 from sven0726/master and squashes the following commits:
15af90e [sven0726] change hadoop profile version
3b68a5f [sven0726] change the spark version of dockerfile
### What is this PR for?
Use a single `Dockerfile` for each release, since [Apache infra uses tag pushes](https://issues.apache.org/jira/browse/INFRA-12781) to build an image.
- https://issues.apache.org/jira/browse/INFRA-12781
After the release process finishes, Docker Hub will build using the pushed tag.
### What type of PR is it?
[Improvement]
### What is the Jira issue?
[ZEPPELIN-2492](https://issues.apache.org/jira/browse/ZEPPELIN-2492)
### How should this be tested?
1. `./dev/change_zeppelin_version.sh 0.8.0-SNAPSHOT 0.7.1`
2. Check that the version is properly set: `vi scripts/docker/zeppelin/bin/Dockerfile`
3. Build docker image `cd scripts/docker/zeppelin/bin; docker build -t zeppelin:0.7.1 ./`
4. Run the image: `docker run -p 8080:8080 --rm --name zeppelin zeppelin:0.7.1`
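The version check in step 2 can also be done non-interactively. The helper below only tests whether the file mentions the version string anywhere; it makes no assumption about how the Dockerfile declares it. The function name is hypothetical.

```shell
# Succeed when the given file mentions the given version string.
# $1: path to a Dockerfile, $2: version
mentions_version() {
  grep -qF -- "$2" "$1"
}

# Usage, from the repository root after running change_zeppelin_version.sh:
# mentions_version scripts/docker/zeppelin/bin/Dockerfile 0.7.1 && echo "version set"
```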
### Screenshots (if appropriate)
NONE
### Questions:
* Does the licenses files need update? - NO
* Is there breaking changes for older versions? - NO
* Does this needs documentation? - NO
Author: 1ambda <1amb4a@gmail.com>
Closes #2318 from 1ambda/ZEPPELIN-2492/use-single-dockerfile-for-each-tag and squashes the following commits:
483bec3 [1ambda] docs: Update README for Dockerfile
5826c8c [1ambda] fix: Use single dockerfile for tag push
### What is this PR for?
Created `Dockerfile` for released bin
- based on **Ubuntu:16.04 (LTS)** for desktop usage
- **JDK 8**
- **R** with basic packages
- **Python 2** with basic packages
- **miniconda2** for `%python.conda`
### Details
We already discussed using an Alpine image in https://github.com/apache/zeppelin/pull/1761.
- However, it's not designed for desktop usage
- It doesn't have some official packages (R, ...)
- It's not familiar to users as a desktop OS
That is the reason why Ubuntu is used as the base image.
```
zeppelin base b3818f9ae4b1 11 hours ago 1.67 GB
zeppelin 0.6.2 c0a4d8556f92 7 hours ago 2.29 GB
zeppelin 0.7.0 c4a5ad0d04bd 8 hours ago 2.5 GB
zeppelin 0.7.1 54173b77743b 7 hours ago 2.49 GB
```
### What type of PR is it?
[Feature]
### Todos
* [x] - base image
* [x] - script for creating bin images
* [x] - bin image template
### What is the Jira issue?
[ZEPPELIN-1711](https://issues.apache.org/jira/browse/ZEPPELIN-1711)
### How should this be tested?
1. build base image `cd scripts/docker/zeppelin/base; docker build -t zeppelin:base ./`
2. build bin image `cd scripts/docker/zeppelin/0.7.1; docker build -t zeppelin:0.7.1 ./`
3. execute docker images
```
docker run -p 8080:8080 --rm --name zeppelin zeppelin:0.7.1
```
Since building takes time, you can use the already [published Docker images](https://hub.docker.com/r/1ambda/docker-zeppelin/):
```
docker run -p 8080:8080 --rm --name zeppelin 1ambda/docker-zeppelin:0.7.1
```
4. You should be able to run the Spark, Python and R tutorials.
### Screenshots (if appropriate)
NO
### Questions:
* Does the licenses files need update? - NO
* Is there breaking changes for older versions? - NO
* Does this needs documentation? - YES, updated
Author: 1ambda <1amb4a@gmail.com>
Closes #2264 from 1ambda/ZEPPELIN-1711/bin-dockerfile and squashes the following commits:
69a0b1f [1ambda] docs: Update docker.md
ced897f [1ambda] fix: DON'T remove /tmp
1f6da76 [1ambda] feat: Dockerfiles for 060, 070, 071
0fc3f75 [1ambda] feat: Add template for bin image
5cba56e [1ambda] feat: Use ubuntu for base image
### What is this PR for?
This PR creates Docker images for Zeppelin releases. It contains a script for building an image for each release. Another script publishes the images to the Zeppelin Docker Hub account.
This repo, https://github.com/mfelgamal/zeppelin-dockers, is a demonstration of this PR. It contains a zeppelin-base image and an image for each Zeppelin release.
### What type of PR is it?
[Feature]
### Todos
- Review Comments
- Documentation
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1386
### How should this be tested?
- Run the create_release script or the publish_release script.
### Screenshots (if appropriate)
### Questions:
- Does the licenses files need update? no
- Is there breaking changes for older versions? no
- Does this needs documentation? yes
Author: mahmoudelgamal <mahmoudf.elgamal@gmail.com>
Author: mfelgamal <mahmoudf.elgamal@gmail.com>
Author: Mahmoud Elgamal <mahmoudf.elgamal@gmail.com>
Author: 1ambda <1amb4a@gmail.com>
Closes #1538 from mfelgamal/zeppelin-dockers and squashes the following commits:
cc8493f [Mahmoud Elgamal] Merge pull request #3 from 1ambda/fix/remove-startzeppelinsh
d48ecef [1ambda] fix: Remove start-zeppelin.sh
b64c680 [mahmoudelgamal] Remove gcc and g++ for decreasing the size
1f093d4 [mahmoudelgamal] Add script start-zeppelin to zeppelin-base
d2c744e [mahmoudelgamal] add scala to zeppelin-base
fd23970 [mahmoudelgamal] remove bash erorr message.
e1d4b77 [mahmoudelgamal] add R and python to zeppelin-base
e731cb4 [mahmoudelgamal] Add java-cacerts to zeppelin-base
e642309 [mahmoudelgamal] Add documentation and some modifications
231a414 [mahmoudelgamal] Add zeppelin-base image
ac06f3a [mahmoudelgamal] Make docker image for zeppelin release
48d0a01 [mfelgamal] Merge pull request #1 from apache/master
### What is this PR for?
This PR adds documentation for running Zeppelin with a CDH Docker environment.
It is part of https://issues.apache.org/jira/browse/ZEPPELIN-1198.
Tested with CDH 5.7 on Ubuntu.
### What type of PR is it?
Documentation
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1281
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: astroshim <hsshim@nflabs.com>
Author: AhyoungRyu <ahyoungryu@apache.org>
Author: HyungSung <hsshim@nflabs.com>
Closes #1451 from astroshim/ZEPPELIN-1281 and squashes the following commits:
5dcb8c1 [astroshim] move configurations to right path and add excluding rat-plugin
09408e3 [HyungSung] Merge pull request #11 from AhyoungRyu/ZEPPELIN-1281-ahyoung
850119c [AhyoungRyu] Generate TOC & change some sentences
e687a53 [AhyoungRyu] Replace zeppelin_with_cdh.png to crop the url part
cc9a023 [AhyoungRyu] Remove main title link anchor
b525f68 [astroshim] separate cdh doc with spark_cluster_mode.md
e66993f [astroshim] fix doc
a7b5b2d [astroshim] cdh docker environment
### What is this PR for?
This PR adds documentation for running Zeppelin in production environments, especially Spark on Mesos via Docker.
Related PRs are https://github.com/apache/zeppelin/pull/1227 and https://github.com/apache/zeppelin/pull/1318. I also got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.
### What type of PR is it?
Documentation
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1279
### How should this be tested?
You can refer to https://github.com/apache/zeppelin/blob/master/docs/README.md#build-documentation.
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: astroshim <hsshim@nflabs.com>
Author: AhyoungRyu <fbdkdud93@hanmail.net>
Author: HyungSung <hsshim@nflabs.com>
Closes #1389 from astroshim/ZEPPELIN-1279 and squashes the following commits:
974366a [HyungSung] Merge pull request #10 from AhyoungRyu/ZEPPELIN-1279-ahyoung
076fdba [AhyoungRyu] Change zeppelin_mesos_conf.png file
1cbe9d3 [astroshim] fix spark version and mesos
2b821b4 [astroshim] fix docs
159bafc [astroshim] fix anchor
d8c43b4 [astroshim] add navigation
c808350 [astroshim] add image file and doc
a3b0ded [astroshim] create dockerfile for mesos
### What is this PR for?
This PR adds documentation for running Zeppelin in production environments, especially Spark on YARN.
The related PR is https://github.com/apache/zeppelin/pull/1227. I also got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.
### What type of PR is it?
Documentation
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1280
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: astroshim <hsshim@nflabs.com>
Author: AhyoungRyu <fbdkdud93@hanmail.net>
Author: HyungSung <hsshim@nflabs.com>
Closes #1318 from astroshim/ZEPPELIN-1280 and squashes the following commits:
60958cd [astroshim] small changes for doc
6c44b7b [astroshim] Merge branch 'master' into ZEPPELIN-1280
dad297c [astroshim] update version
4c8d72d [astroshim] merge with Ayoung's
8c62cf1 [astroshim] fixed felixcheung pointed out.
86ca513 [HyungSung] Merge pull request #9 from AhyoungRyu/ZEPPELIN-1280-ahyoung
cde5f8d [AhyoungRyu] Modify document description so that this docs can be searched
9e9390c [AhyoungRyu] Minor update for spark_cluster_mode.md
633c930 [astroshim] running zeppelin on yarn
### What is this PR for?
This PR adds documentation for running Zeppelin in production environments.
### What type of PR is it?
Documentation
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1198
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: astroshim <hsshim@nflabs.com>
Closes #1227 from astroshim/ZEPPELIN-1198/standalone and squashes the following commits:
53a32f2 [astroshim] add 'via Docker'
61a0e5e [astroshim] add apache license header
83fdef6 [astroshim] doc for spark standalone