zeppelin/python
Mina Lee 85d70579f5 [ZEPPELIN-986] Create publish release script
### What is this PR for?
This PR is to automate release publish to maven repository.
We used to use maven-deploy-plugin and maven-release-plugin for release but somehow it didn't work well with Zeppelin so 0.5.5 and 0.5.6 haven't been published to maven repository.

Publishing release to maven repository will eventually help zeppelin to reduce binary package size by leading users to use Dynamic interpreter loading(#908).
Originally below modules were skipped for maven release
 - all interpreters(except spark)
 - zeppelin-display
 - zeppelin-server
 - zeppelin-distribution

on the other hand this pr will skip only:
 - zeppelin-distribution

### What type of PR is it?
Infra

### Todos
- [x] Include SparkR/R interpreter in release
- [x] Create common_release.sh to remove build configuration duplication
- [x] Check curl networking failure

### What is the Jira issue?
[ZEPPELIN-986](https://issues.apache.org/jira/browse/ZEPPELIN-986)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes, https://cwiki.apache.org/confluence/display/ZEPPELIN/Preparing+Zeppelin+Release will be updated accordingly once this pr is merged.

Author: Mina Lee <minalee@apache.org>

Closes #994 from minahlee/ZEPPELIN-986 and squashes the following commits:

b0e8e67 [Mina Lee] Revert "Add geode, scalding profile in maven artifact build"
cd4cbcd [Mina Lee] curl failure check
c0ea07c [Mina Lee] Fix wrong indentation
a88bc1d [Mina Lee] Add geode, scalding profile in maven artifact build
2cced61 [Mina Lee] Add r to binary package and maven build
903bc12 [Mina Lee] Move duplicate code to common_release.sh
a3eb676 [Mina Lee] Include zeppelin-server module in publish artifiact
48d338f [Mina Lee] Rollback mistakenly removed plugin
aafaf42 [Mina Lee] Follow google shell  style guide
30dcc65 [Mina Lee] remove deploy plugin from pom since custom script will be used instead for deploy
cd1f08c [Mina Lee] Refactor create release script
e764f5f [Mina Lee] Add maven publish release script
2016-06-22 21:22:07 -07:00
..
src Python: fix for 'run all' paragraphs 2016-06-20 10:54:36 +09:00
pom.xml [ZEPPELIN-986] Create publish release script 2016-06-22 21:22:07 -07:00
README.md Python interpreter and doc cleanup 2016-06-17 10:30:58 +09:00

Overview

Python interpreter for Apache Zeppelin

Architecture

Current interpreter implementation spawns new system python process through ProcessBuilder and re-directs it's stdin\strout to Zeppelin

Details

  • Py4j support

Py4j enables Python programs to dynamically access Java objects in a JVM. It is required in order to use Zeppelin dynamic forms feature.

  • bootstrap process

Interpreter environment is setup with thex bootstrap.py It defines help() and z convenience functions

Dev prerequisites

  • Python 2 or 3 installed with py4j (0.9.2) and matplotlib (1.31 or later) installed on each

  • Tests only checks the interpreter logic and starts any Python process! Python process is mocked with a class that simply output it input.

  • Code wrote in bootstrap.py and bootstrap_input.py should always be Python 2 and 3 compliant.

  • Use PEP8 convention for python code.

Technical overview

  • When interpreter is starting it launches a python process inside a Java ProcessBuilder. Python is started with -i (interactive mode) and -u (unbuffered stdin, stdout and stderr) options. Thus the interpreter has a "sleeping" python process.

  • Interpreter sends command to python with a Java outputStreamWiter and read from an InputStreamReader. To know when stop reading stdout, interpreter sends print "*!?flush reader!?*"after each command and reads stdout until he receives back the *!?flush reader!?*.

  • When interpreter is starting, it sends some Python code (bootstrap.py and bootstrap_input.py) to initialize default behavior and functions (help(), z.input()...). bootstrap_input.py is sent only if py4j library is detected inside Python process.

  • Py4J python and java libraries is used to load Input zeppelin Java class into the python process (make java code with python code !). Therefore the interpreter can directly create Zeppelin input form inside the Python process (and eventually with some python variable already defined). JVM opens a random open port to be accessible from python process.

  • JavaBuilder can't send SIGINT signal to interrupt paragraph execution. Therefore interpreter directly send a kill SIGINT PID to python process to interrupt execution. Python process catch SIGINT signal with some code defined in bootstrap.py

  • Matplotlib display feature is made with SVG export (in string) and then displays it with html code.