zeppelin/docs/install/install.md
Lee moon soo caa664d6ee [ZEPPELIN-1683] Run python process in docker container
### What is this PR for?
Inspired by ZEPPELIN-1671 conda interpreter.
Docker can provide a kind of virtual environment for Python, as conda does.
This PR implements the %python.docker interpreter, which runs the Python process in a Docker container.
This PR builds on https://github.com/apache/zeppelin/pull/1645

### What type of PR is it?
Feature

### Todos
* [x] - basic feature
* [x] - unittest
* [x] - documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1683

### How should this be tested?
see screenshot

### Screenshots (if appropriate)
![pydocker](https://cloud.githubusercontent.com/assets/1540981/20421814/38a93a9c-ad1b-11e6-8a64-2d0230ff4d8a.gif)

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? yes

Author: Lee moon soo <moon@apache.org>

Closes #1654 from Leemoonsoo/pydocker and squashes the following commits:

22507e6 [Lee moon soo] Add new line at the end of the file
41c09d9 [Lee moon soo] Run python process in docker container
2016-11-24 09:08:52 -08:00


| layout | title | description | group |
|--------|-------|-------------|-------|
| page | Quick Start | This page will help you get started and will guide you through installing Apache Zeppelin, running it in the command line and configuring options. | install |

{% include JB/setup %}

Quick Start

Welcome to Apache Zeppelin! On this page are instructions to help you get started.

Installation

Apache Zeppelin officially supports and is tested on the following environments:

| Name | Value |
|------|-------|
| Oracle JDK | 1.7 (set JAVA_HOME) |
| OS | Mac OSX, Ubuntu 14.X, CentOS 6.X, Windows 7 Pro SP1 |

Downloading Binary Package

Two binary packages are available on the Apache Zeppelin Download Page. The only difference between them is which interpreters are included in the package file.

  • Package with all interpreters.

    Just unpack it in a directory of your choice and you're ready to go.

  • Package with net-install interpreters.

    Unpack it and follow the "install additional interpreters" guide to install the interpreters you need. If you're unsure, just run ./bin/install-interpreter.sh --all to install all interpreters.

Starting Apache Zeppelin from the Command Line

Starting Apache Zeppelin

On all Unix-like platforms:

bin/zeppelin-daemon.sh start

If you are on Windows:

bin\zeppelin.cmd

After Zeppelin has started successfully, go to http://localhost:8080 with your web browser.
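The server can take a few seconds to come up after the start command returns. As a minimal sketch (not part of Zeppelin itself), the hypothetical helper `wait_for_zeppelin` below polls the web UI with `curl` until it answers. The function name, the default of 30 attempts, and the one-second pause are all assumptions made for illustration; the default URL is the one given above.

```shell
# Hypothetical helper, not shipped with Zeppelin: poll the web UI until it
# responds or the attempt budget is used up.
# Usage: wait_for_zeppelin [url] [attempts]
wait_for_zeppelin() {
  url="${1:-http://localhost:8080}"   # default port from this page
  attempts="${2:-30}"                 # assumed retry budget
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors as well as on
    # connection failures; --max-time keeps a hung connection from blocking.
    if curl --silent --fail --max-time 2 --output /dev/null "$url"; then
      echo "Zeppelin is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Zeppelin did not respond at $url after $attempts attempts" >&2
  return 1
}
```

One way to use it is `bin/zeppelin-daemon.sh start && wait_for_zeppelin`, so that a script only continues once the UI is reachable.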

Stopping Zeppelin

bin/zeppelin-daemon.sh stop

Next Steps

Congratulations, you have successfully installed Apache Zeppelin! Here are a few steps you might find useful:

New to Apache Zeppelin...

Zeppelin with Apache Spark ...

Zeppelin with JDBC data sources ...

  • See JDBC Interpreter to learn how to configure and use multiple JDBC data sources.

Zeppelin with Python ...

  • See Python interpreter to learn more about Matplotlib, Pandas, and Conda/Docker environment integration.

Multi-user environment ...

Other useful information ...

Apache Zeppelin Configuration

You can configure Apache Zeppelin with either environment variables in conf/zeppelin-env.sh (conf\zeppelin-env.cmd for Windows) or Java properties in conf/zeppelin-site.xml. If both are defined, then the environment variables will take priority.
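As an illustration of the environment-variable route, the sketch below writes a small `zeppelin-env.sh` into a scratch directory and sources it. The scratch directory and the specific values (port 8180, a 2 GB heap) are assumptions chosen for the example; in a real installation the file lives in `conf/` under the Zeppelin home and is read by the startup scripts.

```shell
# Sketch only: build a throwaway zeppelin-env.sh instead of touching a real
# installation. In practice, edit conf/zeppelin-env.sh in the Zeppelin home.
conf_dir="$(mktemp -d)"

cat > "$conf_dir/zeppelin-env.sh" <<'EOF'
# Example values, chosen for illustration.
export ZEPPELIN_PORT=8180
export ZEPPELIN_MEM="-Xmx2048m -XX:MaxPermSize=512m"
EOF

# Source the file the same way a POSIX shell script would.
. "$conf_dir/zeppelin-env.sh"

echo "Zeppelin would listen on port ${ZEPPELIN_PORT}"
```

Because environment variables take priority, a `zeppelin.server.port` value in `conf/zeppelin-site.xml` would be overridden by the `ZEPPELIN_PORT` setting shown here.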

| zeppelin-env.sh | zeppelin-site.xml | Default value | Description |
|-----------------|-------------------|---------------|-------------|
| ZEPPELIN_PORT | zeppelin.server.port | 8080 | Zeppelin server port |
| ZEPPELIN_SSL_PORT | zeppelin.server.ssl.port | 8443 | Zeppelin Server ssl port (used when ssl environment/property is set to true) |
| ZEPPELIN_MEM | N/A | -Xmx1024m -XX:MaxPermSize=512m | JVM mem options |
| ZEPPELIN_INTP_MEM | N/A | ZEPPELIN_MEM | JVM mem options for interpreter process |
| ZEPPELIN_JAVA_OPTS | N/A | | JVM options |
| ZEPPELIN_ALLOWED_ORIGINS | zeppelin.server.allowed.origins | `*` | Enables a way to specify a ',' separated list of allowed origins for REST and websockets. i.e. http://localhost:8080 |
| N/A | zeppelin.anonymous.allowed | true | The anonymous user is allowed by default. |
| ZEPPELIN_SERVER_CONTEXT_PATH | zeppelin.server.context.path | / | Context path of the web application |
| ZEPPELIN_SSL | zeppelin.ssl | false | |
| ZEPPELIN_SSL_CLIENT_AUTH | zeppelin.ssl.client.auth | false | |
| ZEPPELIN_SSL_KEYSTORE_PATH | zeppelin.ssl.keystore.path | keystore | |
| ZEPPELIN_SSL_KEYSTORE_TYPE | zeppelin.ssl.keystore.type | JKS | |
| ZEPPELIN_SSL_KEYSTORE_PASSWORD | zeppelin.ssl.keystore.password | | |
| ZEPPELIN_SSL_KEY_MANAGER_PASSWORD | zeppelin.ssl.key.manager.password | | |
| ZEPPELIN_SSL_TRUSTSTORE_PATH | zeppelin.ssl.truststore.path | | |
| ZEPPELIN_SSL_TRUSTSTORE_TYPE | zeppelin.ssl.truststore.type | | |
| ZEPPELIN_SSL_TRUSTSTORE_PASSWORD | zeppelin.ssl.truststore.password | | |
| ZEPPELIN_NOTEBOOK_HOMESCREEN | zeppelin.notebook.homescreen | | Display note IDs on the Apache Zeppelin homescreen i.e. 2A94M5J1Z |
| ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE | zeppelin.notebook.homescreen.hide | false | Hide the note ID set by ZEPPELIN_NOTEBOOK_HOMESCREEN on the Apache Zeppelin homescreen. For the further information, please read Customize your Zeppelin homepage. |
| ZEPPELIN_WAR_TEMPDIR | zeppelin.war.tempdir | webapps | Location of the jetty temporary directory |
| ZEPPELIN_NOTEBOOK_DIR | zeppelin.notebook.dir | notebook | The root directory where notebook directories are saved |
| ZEPPELIN_NOTEBOOK_S3_BUCKET | zeppelin.notebook.s3.bucket | zeppelin | S3 Bucket where notebook files will be saved |
| ZEPPELIN_NOTEBOOK_S3_USER | zeppelin.notebook.s3.user | user | User name of an S3 bucket i.e. bucket/user/notebook/2A94M5J1Z/note.json |
| ZEPPELIN_NOTEBOOK_S3_ENDPOINT | zeppelin.notebook.s3.endpoint | s3.amazonaws.com | Endpoint for the bucket |
| ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID | zeppelin.notebook.s3.kmsKeyID | | AWS KMS Key ID to use for encrypting data in S3 (optional) |
| ZEPPELIN_NOTEBOOK_S3_EMP | zeppelin.notebook.s3.encryptionMaterialsProvider | | Class name of a custom S3 encryption materials provider implementation to use for encrypting data in S3 (optional) |
| ZEPPELIN_NOTEBOOK_AZURE_CONNECTION_STRING | zeppelin.notebook.azure.connectionString | | The Azure storage account connection string i.e. `DefaultEndpointsProtocol=https;AccountName=<accountName>;AccountKey=<accountKey>` |
| ZEPPELIN_NOTEBOOK_AZURE_SHARE | zeppelin.notebook.azure.share | zeppelin | Azure Share where the notebook files will be saved |
| ZEPPELIN_NOTEBOOK_AZURE_USER | zeppelin.notebook.azure.user | user | Optional user name of an Azure file share i.e. share/user/notebook/2A94M5J1Z/note.json |
| ZEPPELIN_NOTEBOOK_STORAGE | zeppelin.notebook.storage | org.apache.zeppelin.notebook.repo.VFSNotebookRepo | Comma separated list of notebook storage locations |
| ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC | zeppelin.notebook.one.way.sync | false | If there are multiple notebook storage locations, should we treat the first one as the only source of truth? |
| ZEPPELIN_NOTEBOOK_PUBLIC | zeppelin.notebook.public | true | Make notebook public (set only `owners`) by default when created/imported. If set to `false` will add `user` to `readers` and `writers` as well, making it private and invisible to other users unless permissions are granted. |
| ZEPPELIN_INTERPRETERS | zeppelin.interpreters | org.apache.zeppelin.spark.SparkInterpreter, org.apache.zeppelin.spark.PySparkInterpreter, org.apache.zeppelin.spark.SparkSqlInterpreter, org.apache.zeppelin.spark.DepInterpreter, org.apache.zeppelin.markdown.Markdown, org.apache.zeppelin.shell.ShellInterpreter, ... | Comma separated interpreter configurations [Class]. NOTE: This property is deprecated since Zeppelin-0.6.0 and will not be supported from Zeppelin-0.7.0 on. |
| ZEPPELIN_INTERPRETER_DIR | zeppelin.interpreter.dir | interpreter | Interpreter directory |
| ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE | zeppelin.websocket.max.text.message.size | 1024000 | Size (in characters) of the maximum text message that can be received by websocket. |
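To make the S3 rows more concrete, the sketch below sets the bucket and user variables to their documented defaults and assembles the object path that the table's example describes (`bucket/user/notebook/<note id>/note.json`). The variables `note_id` and `note_path` are hypothetical names used only to print the layout; Zeppelin builds this path internally. The storage class name `org.apache.zeppelin.notebook.repo.S3NotebookRepo` is not listed on this page and is included as an assumption about how S3 storage would be selected.

```shell
# Values for the S3-related settings; bucket and user are the defaults
# listed in the table above.
export ZEPPELIN_NOTEBOOK_S3_BUCKET="zeppelin"
export ZEPPELIN_NOTEBOOK_S3_USER="user"

# Assumption: S3-backed storage is selected through this class name.
export ZEPPELIN_NOTEBOOK_STORAGE="org.apache.zeppelin.notebook.repo.S3NotebookRepo"

# Hypothetical variables, used only to show the resulting object layout.
note_id="2A94M5J1Z"
note_path="${ZEPPELIN_NOTEBOOK_S3_BUCKET}/${ZEPPELIN_NOTEBOOK_S3_USER}/notebook/${note_id}/note.json"

echo "$note_path"
```

The Azure share settings follow the same pattern, with `share/user/notebook/<note id>/note.json` as the layout given in the table.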

Start Apache Zeppelin with a service manager

Note: The description below is based on Ubuntu Linux.

Apache Zeppelin can be auto-started as a service with an init script, using a service manager like upstart.

The example upstart script shown at the end of this section is saved as /etc/init/zeppelin.conf. This allows the service to be managed with commands such as:

sudo service zeppelin start  
sudo service zeppelin stop  
sudo service zeppelin restart

Other service managers could use a similar approach with the upstart argument passed to the zeppelin-daemon.sh script.

bin/zeppelin-daemon.sh upstart

zeppelin.conf

description "zeppelin"

start on (local-filesystems and net-device-up IFACE!=lo)
stop on shutdown

# Respawn the process on unexpected termination
respawn

# respawn the job up to 7 times within a 5 second period.
# If the job exceeds these values, it will be stopped and marked as failed.
respawn limit 7 5

# zeppelin was installed in /usr/share/zeppelin in this example
chdir /usr/share/zeppelin
exec bin/zeppelin-daemon.sh upstart

Building from Source

If you want to build from source instead of using a binary package, follow the instructions here.