### What is this PR for?
Currently, Zeppelin supports only three kinds of dynamic form controls: TextBox, Select, and CheckBox. All of the logic lives in `Input.java`, which makes it hard to add new controls. This PR refactors `Input` to make dynamic forms extensible. Main changes:
* Make `Input` the base class of dynamic forms and also use it as the factory class.
* All the concrete dynamic forms extend `Input`.
* Add `toJson` and `fromJson` methods to `GUI` for its serialization/deserialization. I plan to do the same for other classes, so that we can remove duplicated serde code and make serialization/deserialization easier to test.
* Rename `z.input` to `z.textbox`, as I think `z.input` is a little misleading. `z.input` is kept but marked as deprecated.
* Ideally the new input form JSON should be the same as the old input form JSON. But there is a bug in the old input form: `type` is missing when textbox and select forms are created in the frontend. So I keep the old input forms for compatibility: the old input form JSON is loaded and converted into new input forms, and after saving, `note.json` contains the new input form JSON.
After this PR, a user needs to do three things to add a new UI control:
* Implement the UI control class (refer to TextBox/CheckBox/Select) and specify it in the `TypeAdapterFactory` of `Input` for serde.
* Add parsing logic in `Input.getInputForm` if you want to support this control in the frontend.
* Add display logic in `paragraph-parameterizedQueryForm.html`.
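The base-class-as-factory design with a `type` tag for serde can be illustrated with a small, language-neutral sketch. It is written in Python rather than Zeppelin's Java, it is not the code from this PR, and the names `register`, `to_json`, `from_json` and the `Slider` control are hypothetical, chosen only for illustration.

```python
import json


class Input:
    """Base class that doubles as the factory (a sketch, not Zeppelin's Input.java)."""

    _registry = {}  # maps a type tag to the concrete form class

    def __init__(self, name, default_value=None):
        self.name = name
        self.default_value = default_value

    @classmethod
    def register(cls, subclass):
        # Hypothetical stand-in for listing the subtype in a type adapter factory.
        cls._registry[subclass.__name__] = subclass
        return subclass

    def to_json(self):
        data = dict(vars(self))
        data["type"] = type(self).__name__  # always write the type tag
        return json.dumps(data, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        form_cls = cls._registry[data.pop("type")]  # dispatch on the type tag
        return form_cls(**data)


@Input.register
class TextBox(Input):
    pass


@Input.register
class Slider(Input):
    """Hypothetical new control, added without touching the base class."""

    def __init__(self, name, default_value=None, minimum=0, maximum=100):
        super().__init__(name, default_value)
        self.minimum = minimum
        self.maximum = maximum


slider = Slider("threshold", 10, minimum=0, maximum=50)
restored = Input.from_json(slider.to_json())
```

Because every serialized form carries its `type`, deserialization can pick the right subclass, which is the property the old frontend-created forms lacked.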
### What type of PR is it?
[ Improvement | Refactoring]
### Todos
* [ ] - Task
### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-2395
### How should this be tested?
Tests are added.
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? Yes
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Jeff Zhang <zjffdu@apache.org>
Closes #2245 from zjffdu/ZEPPELIN-2395 and squashes the following commits:
Overview
Python interpreter for Apache Zeppelin
Architecture
The current interpreter implementation spawns a new system Python process through ProcessBuilder and redirects its stdin/stdout to Zeppelin.
Details
- UnitTests
To run the full suite of tests, including those that depend on a real Python interpreter AND on installed external libraries (like Pandas, Pandasql, etc.), run:
mvn -Dpython.test.exclude='' test -pl python -am
- Py4j support
Py4j enables Python programs to dynamically access Java objects in a JVM. It is required in order to use Zeppelin dynamic forms feature.
- bootstrap process
The interpreter environment is set up with bootstrap.py.
It defines the help() and z convenience functions.
Dev prerequisites
- Python 2 or 3 installed with py4j (0.9.2) and matplotlib (1.31 or later) installed on each
- Tests check only the interpreter logic and do not start any Python process! The Python process is mocked with a class that simply outputs its input.
- Code written in `bootstrap.py` and `bootstrap_input.py` should always be Python 2 and 3 compliant.
- Use the PEP8 convention for Python code.
Technical overview
- When the interpreter starts, it launches a Python process through a Java ProcessBuilder. Python is started with the -i (interactive mode) and -u (unbuffered stdin, stdout and stderr) options. Thus the interpreter has a "sleeping" Python process.
- The interpreter sends commands to Python with a Java `OutputStreamWriter` and reads from an `InputStreamReader`. To know when to stop reading stdout, the interpreter sends `print "*!?flush reader!?*"` after each command and reads stdout until it receives back the `*!?flush reader!?*`.
- When the interpreter starts, it sends some Python code (bootstrap.py and bootstrap_input.py) to initialize default behavior and functions (`help(), z.input()...`). bootstrap_input.py is sent only if the py4j library is detected inside the Python process.
- The Py4J Python and Java libraries are used to load the Zeppelin Input Java class into the Python process (mixing Java code with Python code!). Therefore the interpreter can directly create Zeppelin input forms inside the Python process (possibly using Python variables that are already defined). The JVM opens a random open port to be accessible from the Python process.
- JavaBuilder can't send a SIGINT signal to interrupt paragraph execution. Therefore the interpreter directly sends a `kill SIGINT PID` to the Python process to interrupt execution. The Python process catches the SIGINT signal with some code defined in bootstrap.py.
- Matplotlib figures are displayed inline with the notebook automatically using a built-in backend for Zeppelin in conjunction with a post-execute hook.
- `%python.sql` support for Pandas DataFrames is optional and provided using https://github.com/yhat/pandasql if the user has it installed.
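The flush-marker handshake described above can be sketched in a few lines. This is a minimal illustration driven from Python with `subprocess` rather than from Java, so it is not the interpreter's actual code; the helper names `start_python` and `run_command` are hypothetical, and the marker is printed with the `print()` function so the sketch also works on Python 3.

```python
import subprocess
import sys

MARKER = "*!?flush reader!?*"


def start_python():
    # -i forces interactive mode even though stdin is a pipe; -u disables buffering.
    # The interactive banner and prompts go to stderr, which is discarded here.
    return subprocess.Popen(
        [sys.executable, "-i", "-u"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )


def run_command(proc, command):
    """Send one single-line statement, then read stdout up to the marker line."""
    proc.stdin.write(command + "\n")
    proc.stdin.write('print("%s")\n' % MARKER)  # ask Python to echo the marker
    proc.stdin.flush()
    lines = []
    for line in proc.stdout:
        line = line.rstrip("\n")
        if line == MARKER:  # everything for this command has been read
            break
        lines.append(line)
    return "\n".join(lines)


proc = start_python()
first = run_command(proc, "x = 6 * 7")  # produces no output
second = run_command(proc, "print(x)")  # state persists between commands
proc.stdin.close()  # EOF ends the interactive session
proc.wait()
```

The sketch assumes the command's own output never contains the marker line, which is why an unlikely string is used as the marker.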