### What is this PR for?
Inspired by the ZEPPELIN-1671 conda interpreter. Docker can provide a kind of virtual environment for Python, like conda does. This PR implements the %python.docker interpreter, which helps run the Python process in a Docker container.

This PR implements the feature on top of https://github.com/apache/zeppelin/pull/1645

### What type of PR is it?
Feature

### Todos
* [x] - basic feature
* [x] - unittest
* [x] - documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1683

### How should this be tested?
see screenshot

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? yes

Author: Lee moon soo <moon@apache.org>

Closes #1654 from Leemoonsoo/pydocker and squashes the following commits:

22507e6 [Lee moon soo] Add new line at the end of the file
41c09d9 [Lee moon soo] Run python process in docker container
| layout | title | description | group |
|---|---|---|---|
| page | Quick Start | This page will help you get started and will guide you through installing Apache Zeppelin, running it in the command line and configuring options. | install |
{% include JB/setup %}
Quick Start
Welcome to Apache Zeppelin! On this page are instructions to help you get started.
Installation
Apache Zeppelin officially supports and is tested on the following environments:
| Name | Value |
|---|---|
| Oracle JDK | 1.7 (set JAVA_HOME) |
| OS | Mac OS X, Ubuntu 14.X, CentOS 6.X, Windows 7 Pro SP1 |
Downloading Binary Package
Two binary packages are available on the Apache Zeppelin Download Page. The only difference between them is which interpreters are included in the package file.
- Package with `all` interpreters. Just unpack it in a directory of your choice and you're ready to go.
- Package with `net-install` interpreters. Unpack it and follow install additional interpreters to install the interpreters you need. If you're unsure, just run `./bin/install-interpreter.sh --all` to install all interpreters.
Starting Apache Zeppelin from the Command Line
Starting Apache Zeppelin
On all Unix-like platforms:
bin/zeppelin-daemon.sh start
If you are on Windows:
bin\zeppelin.cmd
After Zeppelin has started successfully, go to http://localhost:8080 with your web browser.
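To confirm from a script that the server is reachable, something like the following sketch could be used. It assumes `curl` is installed and that Zeppelin listens on `localhost`; `zeppelin_url` and `wait_for_zeppelin` are hypothetical helper names, not part of Zeppelin. The fallback values (port 8080, context path `/`) are the defaults listed in the configuration table of this guide.

```shell
# Build the URL Zeppelin is expected to serve, honouring ZEPPELIN_PORT and
# ZEPPELIN_SERVER_CONTEXT_PATH when set and falling back to the documented
# defaults (8080 and /) otherwise.
zeppelin_url() {
  printf 'http://localhost:%s%s\n' \
    "${ZEPPELIN_PORT:-8080}" "${ZEPPELIN_SERVER_CONTEXT_PATH:-/}"
}

# Poll the URL once per second until it answers or the attempts run out.
# Hypothetical helper: the first argument is the number of attempts (default 30).
wait_for_zeppelin() {
  attempts="${1:-30}"
  url="$(zeppelin_url)"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      echo "Zeppelin is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Zeppelin did not respond at $url" >&2
  return 1
}
```

A typical use would be `bin/zeppelin-daemon.sh start && wait_for_zeppelin 60` before opening the browser or running further automation.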
Stopping Zeppelin
bin/zeppelin-daemon.sh stop
Next Steps
Congratulations, you have successfully installed Apache Zeppelin! Here are a few steps you might find useful:
New to Apache Zeppelin...
- For an in-depth overview, head to Explore Apache Zeppelin UI.
- Then, try running the tutorial notebook in your Zeppelin.
- And see how to change configurations like port number, etc.
Zeppelin with Apache Spark ...
- To learn more about the deep integration with Apache Spark, check Spark Interpreter.
Zeppelin with JDBC data sources ...
- Check JDBC Interpreter to learn how to configure and use multiple JDBC data sources.
Zeppelin with Python ...
- Check Python interpreter to learn more about Matplotlib, Pandas, and Conda/Docker environment integration.
Multi-user environment ...
- Turn on authentication.
- Manage your notebook permissions.
- For more information, go to the More -> Security section.
Other useful information ...
- Learn how Display System works.
- Use Service Manager to start Zeppelin.
- If you're using a previous version, please see Upgrade Zeppelin version.
Apache Zeppelin Configuration
You can configure Apache Zeppelin with either environment variables in conf/zeppelin-env.sh (conf\zeppelin-env.cmd for Windows) or Java properties in conf/zeppelin-site.xml. If both are defined, then the environment variables will take priority.
| zeppelin-env.sh | zeppelin-site.xml | Default value | Description |
|---|---|---|---|
| ZEPPELIN_PORT | zeppelin.server.port | 8080 | Zeppelin server port |
| ZEPPELIN_SSL_PORT | zeppelin.server.ssl.port | 8443 | Zeppelin Server ssl port (used when ssl environment/property is set to true) |
| ZEPPELIN_MEM | N/A | -Xmx1024m -XX:MaxPermSize=512m | JVM mem options |
| ZEPPELIN_INTP_MEM | N/A | ZEPPELIN_MEM | JVM mem options for interpreter process |
| ZEPPELIN_JAVA_OPTS | N/A | | JVM options |
| ZEPPELIN_ALLOWED_ORIGINS | zeppelin.server.allowed.origins | * | Enables a way to specify a ',' separated list of allowed origins for REST and websockets, e.g. http://localhost:8080 |
| N/A | zeppelin.anonymous.allowed | true | The anonymous user is allowed by default. |
| ZEPPELIN_SERVER_CONTEXT_PATH | zeppelin.server.context.path | / | Context path of the web application |
| ZEPPELIN_SSL | zeppelin.ssl | false | |
| ZEPPELIN_SSL_CLIENT_AUTH | zeppelin.ssl.client.auth | false | |
| ZEPPELIN_SSL_KEYSTORE_PATH | zeppelin.ssl.keystore.path | keystore | |
| ZEPPELIN_SSL_KEYSTORE_TYPE | zeppelin.ssl.keystore.type | JKS | |
| ZEPPELIN_SSL_KEYSTORE_PASSWORD | zeppelin.ssl.keystore.password | | |
| ZEPPELIN_SSL_KEY_MANAGER_PASSWORD | zeppelin.ssl.key.manager.password | | |
| ZEPPELIN_SSL_TRUSTSTORE_PATH | zeppelin.ssl.truststore.path | | |
| ZEPPELIN_SSL_TRUSTSTORE_TYPE | zeppelin.ssl.truststore.type | | |
| ZEPPELIN_SSL_TRUSTSTORE_PASSWORD | zeppelin.ssl.truststore.password | | |
| ZEPPELIN_NOTEBOOK_HOMESCREEN | zeppelin.notebook.homescreen | | Display note IDs on the Apache Zeppelin homescreen, e.g. 2A94M5J1Z |
| ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE | zeppelin.notebook.homescreen.hide | false | Hide the note ID set by ZEPPELIN_NOTEBOOK_HOMESCREEN on the Apache Zeppelin homescreen. For further information, see Customize your Zeppelin homepage. |
| ZEPPELIN_WAR_TEMPDIR | zeppelin.war.tempdir | webapps | Location of the jetty temporary directory |
| ZEPPELIN_NOTEBOOK_DIR | zeppelin.notebook.dir | notebook | The root directory where notebook directories are saved |
| ZEPPELIN_NOTEBOOK_S3_BUCKET | zeppelin.notebook.s3.bucket | zeppelin | S3 Bucket where notebook files will be saved |
| ZEPPELIN_NOTEBOOK_S3_USER | zeppelin.notebook.s3.user | user | User name of an S3 bucket, e.g. bucket/user/notebook/2A94M5J1Z/note.json |
| ZEPPELIN_NOTEBOOK_S3_ENDPOINT | zeppelin.notebook.s3.endpoint | s3.amazonaws.com | Endpoint for the bucket |
| ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID | zeppelin.notebook.s3.kmsKeyID | | AWS KMS Key ID to use for encrypting data in S3 (optional) |
| ZEPPELIN_NOTEBOOK_S3_EMP | zeppelin.notebook.s3.encryptionMaterialsProvider | | Class name of a custom S3 encryption materials provider implementation to use for encrypting data in S3 (optional) |
| ZEPPELIN_NOTEBOOK_AZURE_CONNECTION_STRING | zeppelin.notebook.azure.connectionString | | The Azure storage account connection string i.e. DefaultEndpointsProtocol=https; |
| ZEPPELIN_NOTEBOOK_AZURE_SHARE | zeppelin.notebook.azure.share | zeppelin | Azure Share where the notebook files will be saved |
| ZEPPELIN_NOTEBOOK_AZURE_USER | zeppelin.notebook.azure.user | user | Optional user name of an Azure file share, e.g. share/user/notebook/2A94M5J1Z/note.json |
| ZEPPELIN_NOTEBOOK_STORAGE | zeppelin.notebook.storage | org.apache.zeppelin.notebook.repo.VFSNotebookRepo | Comma separated list of notebook storage locations |
| ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC | zeppelin.notebook.one.way.sync | false | If there are multiple notebook storage locations, should we treat the first one as the only source of truth? |
| ZEPPELIN_NOTEBOOK_PUBLIC | zeppelin.notebook.public | true | Make notebook public (set only `owners`) by default when created/imported. If set to `false` will add `user` to `readers` and `writers` as well, making it private and invisible to other users unless permissions are granted. |
| ZEPPELIN_INTERPRETERS | zeppelin.interpreters | org.apache.zeppelin.spark.SparkInterpreter, org.apache.zeppelin.spark.PySparkInterpreter, org.apache.zeppelin.spark.SparkSqlInterpreter, org.apache.zeppelin.spark.DepInterpreter, org.apache.zeppelin.markdown.Markdown, org.apache.zeppelin.shell.ShellInterpreter, ... | Comma separated interpreter configurations [Class]. NOTE: This property is deprecated since Zeppelin-0.6.0 and will not be supported from Zeppelin-0.7.0 on. |
| ZEPPELIN_INTERPRETER_DIR | zeppelin.interpreter.dir | interpreter | Interpreter directory |
| ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE | zeppelin.websocket.max.text.message.size | 1024000 | Size (in characters) of the maximum text message that can be received by websocket. |
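As a rough illustration of the environment-variable route, a `conf/zeppelin-env.sh` could contain entries like the ones below. The variable names come from the table above; the values are purely illustrative assumptions, not recommendations.

```shell
# conf/zeppelin-env.sh (illustrative values only)

# Serve the web UI on port 9090 instead of the default 8080.
export ZEPPELIN_PORT=9090

# JVM memory options for the Zeppelin server
# (documented default: -Xmx1024m -XX:MaxPermSize=512m).
export ZEPPELIN_MEM="-Xmx2048m -XX:MaxPermSize=512m"

# Root directory where notebook directories are saved (documented default: notebook).
export ZEPPELIN_NOTEBOOK_DIR="notebook"
```

Because environment variables take priority, a `zeppelin.server.port` value in `conf/zeppelin-site.xml` would be overridden by the `ZEPPELIN_PORT` setting shown here. Restart Zeppelin for the changes to apply.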
Start Apache Zeppelin with a service manager
Note: The description below was written for Ubuntu Linux.
Apache Zeppelin can be auto-started as a service with an init script, using a service manager like upstart.
The example upstart script shown under zeppelin.conf at the end of this section is saved as /etc/init/zeppelin.conf.
This allows the service to be managed with commands such as
sudo service zeppelin start
sudo service zeppelin stop
sudo service zeppelin restart
Other service managers could use a similar approach with the upstart argument passed to the zeppelin-daemon.sh script.
bin/zeppelin-daemon.sh upstart
zeppelin.conf
description "zeppelin"
start on (local-filesystems and net-device-up IFACE!=lo)
stop on shutdown
# Respawn the process on unexpected termination
respawn
# respawn the job up to 7 times within a 5 second period.
# If the job exceeds these values, it will be stopped and marked as failed.
respawn limit 7 5
# zeppelin was installed in /usr/share/zeppelin in this example
chdir /usr/share/zeppelin
exec bin/zeppelin-daemon.sh upstart
Building from Source
If you want to build from source instead of using a binary package, follow the instructions here.