zeppelin/docs/interpreter/python.md
Alexander Bezzubov 230d890142 ZEPPELIN-1048: Pandas support for python interpreter
### What is this PR for?
Display Pandas DataFrame using Zeppelin's Table Display system.

### What type of PR is it?
Feature

### Todos
* [x] fix NPE in logs on empty paragraph execution
* [x] matplotlib: refactor `zeppelin_show(plt)` -> `z.show(plt)`
* [x] pandas: support `z.show(df)`
* [x] update docs

### What is the Jira issue?
[ZEPPELIN-1048](https://issues.apache.org/jira/browse/ZEPPELIN-1048)

### How should this be tested?
"Zeppelin Tutorial: Python - matplotlib basic" should work, and

```python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
z.show(rates)
```
### Screenshots (if appropriate)
![screen shot 2016-06-23 at 10 29 00](https://cloud.githubusercontent.com/assets/5582506/16289133/85f0ddbc-392d-11e6-86a3-28d10e73f68d.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1067 from bzz/python/pandas-support and squashes the following commits:

3b1ad36 [Alexander Bezzubov] Python: update docs to reffer new API
ee6668b [Alexander Bezzubov] Python: update docs, add Pandas integration
71be418 [Alexander Bezzubov] Python: limit 1000 for table display system on DataFrame
52e787d [Alexander Bezzubov] Python: pandas DataFrame using Table display system
bc91b86 [Alexander Bezzubov] Python: skip interpreting empty paragraphs
a7248cd [Alexander Bezzubov] Python: draft of pandas support
15646a1 [Alexander Bezzubov] Python: refactoring to z.show()
2016-06-23 22:22:19 +09:00


| layout | title | description | group |
|--------|-------|-------------|-------|
| page | Python Interpreter | Python Interpreter | manual |

{% include JB/setup %}

Python 2 & 3 Interpreter for Apache Zeppelin

Configuration

| Property | Default | Description |
|----------|---------|-------------|
| zeppelin.python | python | Path of the already installed Python binary (could be python2 or python3). If python is not in your $PATH, you can set the absolute path to the binary (example: /usr/bin/python). |
| zeppelin.python.maxResult | 1000 | Maximum number of DataFrame rows to display. |
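
As a rough illustration of the lookup rule described for zeppelin.python, the sketch below resolves a configured value with the standard library's `shutil.which` (Python 3.3+). The helper name `find_python_binary` is hypothetical, and this is not Zeppelin's own lookup code; it only shows the difference between a bare command name searched on $PATH and a full path.

```python
import shutil
import sys


def find_python_binary(configured="python"):
    """Return the resolved path for a configured value, or None if not found.

    `configured` plays the role of the zeppelin.python property: either a
    bare command name searched on $PATH or a full path to the binary.
    Hypothetical helper, for illustration only.
    """
    return shutil.which(configured)


if __name__ == "__main__":
    # The default value relies on $PATH; a full path bypasses the search.
    print(find_python_binary("python"))
    print(find_python_binary(sys.executable))
```

If the first call prints `None`, the default value would presumably not work on that machine and the property would need the full path instead.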

Enabling Python Interpreter

In a notebook, to enable the Python interpreter, click the gear icon and select Python.

Using the Python Interpreter

In a paragraph, use %python to select the Python interpreter and then input all commands.

The interpreter only works if Python is already installed (the interpreter doesn't bring its own Python binaries).

To access the help, type help().

Python modules

The interpreter can use any module already installed for that Python binary (for example with pip or easy_install).
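
To see which modules the interpreter would be able to import, a small check along these lines can be run in a paragraph. It is a minimal sketch for Python 3 using `importlib.util.find_spec`; the helper name `missing_modules` is made up for this example, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names mentioned on this page; adjust the list to your notebook.
print(missing_modules(["py4j", "matplotlib", "pandas"]))
```

An empty list means every listed module is importable by the Python binary that runs the check.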

Use Zeppelin Dynamic Forms

You can use Zeppelin Dynamic Forms inside your Python code.

Zeppelin Dynamic Forms can only be used if the py4j Python library is installed on your system. If it is not, you can install it with pip install py4j.

Example:

%python
### Input form
print(z.input("f1", "defaultValue"))

### Select form (uses its own form name so it does not collide with the input form)
print(z.select("f2", [("o1", "1"), ("o2", "2")], "2"))

### Checkbox form
print("".join(z.checkbox("f3", [("o1", "1"), ("o2", "2")], ["1"])))

Zeppelin features not fully supported by the Python Interpreter

  • Interrupting a paragraph execution (cancel() method) is currently supported only on Linux and macOS. If the interpreter runs on another operating system (for instance MS Windows), interrupting a paragraph closes the whole interpreter. A JIRA ticket (ZEPPELIN-893) is open to implement this feature in a future release of the interpreter.
  • The progress bar in the web UI (getProgress() method) is currently not implemented.
  • Code-completion is currently not implemented.

Matplotlib integration

The Python interpreter can display matplotlib graphs with the function z.show(). You need to have the matplotlib module installed and an X server running to use this functionality.

%python
import matplotlib.pyplot as plt
plt.figure()
# (.. your plotting commands ..)
z.show(plt)
plt.close()

The z.show() function accepts optional width and height parameters to adjust the size of the graph.

%python
z.show(plt, width='50px')
z.show(plt, height='150px')
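
To give a feel for how width and height could end up in the rendered output, here is a sketch that embeds image bytes in Zeppelin's `%html` display output as a data URI. This is an assumption made for illustration, not the interpreter's actual implementation; the helper name `html_image` is hypothetical, and the placeholder bytes stand in for a figure saved with `plt.savefig(buffer, format="png")`.

```python
import base64


def html_image(png_bytes, width=None, height=None):
    """Build a `%html` output string embedding PNG bytes as a data URI.

    Hypothetical helper: width and height are passed through as CSS values.
    """
    style = ""
    if width:
        style += "width:{};".format(width)
    if height:
        style += "height:{};".format(height)
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return '%html <img src="data:image/png;base64,{}" style="{}">'.format(
        encoded, style
    )


# Placeholder bytes, not a real image.
print(html_image(b"not a real image", width="50px"))
```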


Pandas integration

The Zeppelin Display System provides a simple API to visualize data in Pandas DataFrames, using the same z.show() function as for Matplotlib.

Example:

import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
z.show(rates)
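
Zeppelin's Table Display system reads output that starts with `%table`, with columns separated by tabs and rows by newlines. The sketch below shows how tabular data could be turned into that format with a row limit such as zeppelin.python.maxResult. The helper name `to_table_output` and the sample rows are hypothetical, and this is not the code behind z.show().

```python
def to_table_output(columns, rows, max_rows=1000):
    """Format tabular data for Zeppelin's %table display system.

    Columns are tab-separated and rows newline-separated; only the first
    `max_rows` rows are kept, mirroring the zeppelin.python.maxResult limit.
    """
    lines = ["\t".join(str(c) for c in columns)]
    for row in rows[:max_rows]:
        lines.append("\t".join(str(v) for v in row))
    return "%table " + "\n".join(lines)


# Hypothetical sample data, not taken from bank.csv.
print(to_table_output(["age", "job"], [(30, "clerk"), (41, "teacher")]))
```

With a DataFrame `df`, the same helper could be fed `list(df.columns)` and `df.values.tolist()`.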

Technical description

For in-depth technical details on the current implementation, please refer to python/README.md.