zeppelin/python
Malay Majithia 84cb4b5fb9 ZEPPELIN-277 Add Tab as Autocomplete for Notebook non-md paragraphs
### What is this PR for?
This PR will add tab as auto complete invoker if paragraph is non-md and user has not pressed the tab as a first character in the line

### What type of PR is it?
[Improvement]

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-277
* https://issues.apache.org/jira/browse/ZEPPELIN-2736

### How should this be tested?

- Build: mvn clean package -Denforcer.skip -DskipTests -Drat.skip
- Open a paragraph
- Press tab with following options: first character, after space

### Questions:
* Does the licenses files need an update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Malay Majithia <malay.majithia@gmail.com>
Author: Lee moon soo <moon@apache.org>

Closes #2542 from malayhm/ZEPPELIN-277 and squashes the following commits:

436f22d [Malay Majithia] Added Tab auto completion flag for python sql and r
b37e084 [Malay Majithia] Fixed lint error
18fc814 [Malay Majithia] Merge branch 'master' into ZEPPELIN-277
b09730e [Malay Majithia] Merge branch 'master' into ZEPPELIN-277
63d69e1 [Malay Majithia] Merge branch 'ZEPPELIN-277' of github.com:malayhm/zeppelin into ZEPPELIN-277
a75f0fe [Malay Majithia] Improved the first character check logic
2ec879d [Malay Majithia] Merge pull request #1 from Leemoonsoo/ZEPPELIN-277-completion-key
77afdba [Lee moon soo] fix style
77b47b6 [Malay Majithia] If all the previous line characters are tab, don't show autocomplete on tab
46f612a [Malay Majithia] ZEPPELIN-277 Add Tab as Autocomplete for Notebook non-md paragraphs
865c0a6 [Lee moon soo] Set python and spark interpreter completionKey
05d5860 [Lee moon soo] Update doc
973068b [Lee moon soo] apply tab completion based on editor.completionKey
5f4d81c [Malay Majithia] If all the previous line characters are tab, don't show autocomplete on tab
655ba88 [Malay Majithia] ZEPPELIN-277 Add Tab as Autocomplete for Notebook non-md paragraphs
2017-10-18 02:12:19 -07:00
..
src ZEPPELIN-277 Add Tab as Autocomplete for Notebook non-md paragraphs 2017-10-18 02:12:19 -07:00
pom.xml ZEPPELIN-2982. Copy interpreter-setting.json to interpreter dir 2017-10-13 13:51:22 +08:00
README.md [ZEPPELIN-2753] Basic Implementation of IPython Interpreter 2017-08-28 08:11:04 +08:00

Overview

Python interpreter for Apache Zeppelin

Architecture

Current interpreter implementation spawns new system python process through ProcessBuilder and re-directs it's stdin\strout to Zeppelin

Details

  • UnitTests

To run full suit of tests, including ones that depend on real Python interpreter AND external libraries installed (like Pandas, Pandasql, etc) do

mvn -Dpython.test.exclude='' test -pl python -am
  • Py4j support

Py4j enables Python programs to dynamically access Java objects in a JVM. It is required in order to use Zeppelin dynamic forms feature.

  • bootstrap process

Interpreter environment is setup with thex bootstrap.py It defines help() and z convenience functions

Dev prerequisites

  • Python 2 or 3 installed with py4j (0.9.2) and matplotlib (1.31 or later) installed on each

  • Tests only checks the interpreter logic and starts any Python process! Python process is mocked with a class that simply output it input.

  • Code wrote in bootstrap.py and bootstrap_input.py should always be Python 2 and 3 compliant.

  • Use PEP8 convention for python code.

Technical overview

  • When interpreter is starting it launches a python process inside a Java ProcessBuilder. Python is started with -i (interactive mode) and -u (unbuffered stdin, stdout and stderr) options. Thus the interpreter has a "sleeping" python process.

  • Interpreter sends command to python with a Java outputStreamWiter and read from an InputStreamReader. To know when stop reading stdout, interpreter sends print "*!?flush reader!?*"after each command and reads stdout until he receives back the *!?flush reader!?*.

  • When interpreter is starting, it sends some Python code (bootstrap.py and bootstrap_input.py) to initialize default behavior and functions (help(), z.input()...). bootstrap_input.py is sent only if py4j library is detected inside Python process.

  • Py4J python and java libraries is used to load Input zeppelin Java class into the python process (make java code with python code !). Therefore the interpreter can directly create Zeppelin input form inside the Python process (and eventually with some python variable already defined). JVM opens a random open port to be accessible from python process.

  • JavaBuilder can't send SIGINT signal to interrupt paragraph execution. Therefore interpreter directly send a kill SIGINT PID to python process to interrupt execution. Python process catch SIGINT signal with some code defined in bootstrap.py

  • Matplotlib figures are displayed inline with the notebook automatically using a built-in backend for zeppelin in conjunction with a post-execute hook.

  • %python.sql support for Pandas DataFrames is optional and provided using https://github.com/yhat/pandasql if user have one installed

IPython Overview

IPython interpreter for Apache Zeppelin

IPython Requirements

You need to install the following python packages to make the IPython interpreter work.

  • jupyter 5.x
  • IPython
  • ipykernel
  • grpcio

If you have installed anaconda, then you just need to install grpc.

IPython Architecture

Current interpreter delegate the whole work to ipython kernel via jupyter_client. Zeppelin would launch a python process which host the ipython kernel. Zeppelin interpreter process will communicate with the python process via grpc. Ideally every feature works in IPython should work in Zeppelin as well.