Commit graph

12 commits

Author SHA1 Message Date
Alex Goodman
582981677c ZEPPELIN-1328 - z.show in python interpreter does not display PNG images in python 3
### What is this PR for?
Support for plotting PNG images via matplotlib inline for the python interpreter was recently added (#1329). However, these changes did not work for python3 since it handles strings differently. This PR aims to make the inline plotting compatible with both python 2 and 3.

### What type of PR is it?
Bug Fix

### What is the Jira issue?
* [ZEPPELIN-1328](https://issues.apache.org/jira/browse/ZEPPELIN-1328)

### How should this be tested?
In a python interpreteter cell, make sure the following produce an image:
```python
%python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
plt.plot(x)
z.show(plt, fmt='png') # Repeat for fmt='svg'
```
This should be tested for both python2 and 3 interpreters (via the interpreter settings page).

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Alex Goodman <agoodm@users.noreply.github.com>

Closes #1343 from agoodm/ZEPPELIN-1328 and squashes the following commits:

772313f [Alex Goodman] Redo io import structure to make z.show() work for both matplotlib plots and pandas dataframes in python2/3
6a8f3ab [Alex Goodman] Add python3 support for matplotlib inline plotting in python interpreter
2016-08-22 22:37:43 +09:00
Alex Goodman
8b40268d16 ZEPPELIN-1318 - Add support for matplotlib displaying png images in python interpreter
### What is this PR for?
This PR adds support for plotting png images using the matplotlib helper function within a python interpreter (eg `z.show()`). The primary motivation for this is due to the overhead incurred from svg images, which can lag the notebooks if multiple, complicated images are generated (for example, multiple filled contour plots). png images are more lightweight, but of course come at a cost of image quality due to them being raster rather than vector like svg. The support for png images is incorporated through the use of a new optional argument to `z.show` called `fmt` which can be one of `'svg'` or `'png'`. The same code that is currently used in show is used for svg images while the code for png images relies on converting the image directly to a byte array and then entering the decoded byte string directly into an HTML image tag. Currently `fmt` defaults to `'png'` but I think we should consider discussing the pros and cons of each option in this PR.

### What type of PR is it?
Improvement

### What is the Jira issue?
[ZEPPELIN-1318](https://issues.apache.org/jira/browse/ZEPPELIN-1318)

### How should this be tested?
In a notebook cell, enter:
```python
%python
import matplotlib.pyplot as plt
import numpy as np

plt.figure()
plt.plot(np.arange(10))
z.show(plt, fmt=fmt)
```
Where `fmt` may be one of `'svg'` or `'png'`, and any other input should result in a `ValueError`. I would also recommend testing the example in the screenshot below.

### Screenshots (if appropriate)
![zeppelin plot](https://puu.sh/qzc4t/b8fcfe856e.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes (if the changes to the `help()` docstring are not sufficient)

Author: Alex Goodman <agoodm@users.noreply.github.com>

Closes #1329 from agoodm/ZEPPELIN-1318 and squashes the following commits:

2e9ce4c [Alex Goodman] Update python.md
1efa0c9 [Alex Goodman] ZEPPELIN-1318 - Add support for png images in z.show()
2016-08-14 11:28:38 +09:00
Jeff Zhang
3dec4d7006 ZEPPELIN-1287. No need to call print to display output in PythonInterpreter
### What is this PR for?
It is not necessary to call print to display output in PythonInterpreter. 2 main changes:
* the root cause is the displayhook in bootstrap.py
* also did some code refactoring on PythonInterpreter

### What type of PR is it?
[Bug Fix]

### Todos
* [ ] - Task

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1287

### How should this be tested?
Verify it manually

### Screenshots (if appropriate)
![2016-08-04_1404](https://cloud.githubusercontent.com/assets/164491/17392006/090279d2-5a4d-11e6-840b-4cddb595a42e.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Jeff Zhang <zjffdu@apache.org>

Closes #1278 from zjffdu/ZEPPELIN-1287 and squashes the following commits:

b48b56f [Jeff Zhang] fix unit test fail
3e9f169 [Jeff Zhang] address comments
0eade71 [Jeff Zhang] ZEPPELIN-1287. No need to call print to display output in PythonInterpreter
2016-08-11 10:39:22 +02:00
Alexander Bezzubov
a922fd28c1 Small refactoring of Python interpreter
### What is this PR for?
Small refactoring of Python interpreter, that is what it is.

### What type of PR is it?
Refactoring

### Todos
* [x] refactor `help()`
* [x] impl `maxResult` fetch from JVM

### How should this be tested?
`cd python && mvn -Dpython.test.exclude='' test ` pass (given that `pip install pandasql` and `pip install py4j`)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1275 from bzz/python/refactoring and squashes the following commits:

15a35c8 [Alexander Bezzubov] Make .help() method a single string literal
e800fd7 [Alexander Bezzubov] Make Python fetch maxResults from JVM
2016-08-05 12:09:35 +09:00
Paul Bustios
9eac20d08a [ZEPPELIN-1261] Bug fix in z.show() for matplotlib graphs
### What is this PR for?
Bug fix in z.show() for matplotlib graphs and code refactoring

### What type of PR is it?
Bug Fix

### What is the Jira issue?
[ZEPPELIN-1261](https://issues.apache.org/jira/browse/ZEPPELIN-1261)

### How should this be tested?
```
%python
import matplotlib.pyplot as plt

x = [1,2,3,4,5]
y = [6,7,8,9,0]

plt.plot(x, y, marker="o")
z.show(plt, height="20em")
plt.close()
```
```
%python
import matplotlib.pyplot as plt

x = [1,2,3,4,5]
y = [6,7,8,9,0]

plt.plot(x, y, marker="o")
z.show(plt, height="300px")
plt.close()
```

### Screenshots (if appropriate)
![plot](https://dl.dropboxusercontent.com/u/20947972/z.show.height.example.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Paul Bustios <pbustios@gmail.com>

Closes #1267 from bustios/ZEPPELIN-1261 and squashes the following commits:

82c631a [Paul Bustios] Add 100% as a default value for width and height z.show() parameters
3e0ef22 [Paul Bustios] Bug fix in show_matplotlib
2016-08-03 13:52:04 +09:00
paulbustios
6f867ceb0c [ZEPPELIN-1255] Add cast to string in z.show() for Pandas DataFrame
### What is this PR for?
Casting data types in Pandas DataFrame to string in z.show()

### What type of PR is it?
Bug Fix

### What is the Jira issue?
[ZEPPELIN-1255](https://issues.apache.org/jira/browse/ZEPPELIN-1255)

### How should this be tested?
```
%python

import pandas as pd

df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)
df.columns=[1, 2, 3, 'PetalWidth', 'Name']
z.show(df)

%python.sql

SELECT * FROM df  LIMIT 10
```

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: paulbustios <pbustios@gmail.com>

Closes #1249 from bustios/ZEPPELIN-1255 and squashes the following commits:

82c1412 [paulbustios] Add test case for z.show() Pandas DataFrame
4a8c0a9 [paulbustios] [ZEPPELIN-1255] Add cast to string in z.show() for Pandas DataFrame
2016-08-01 21:50:25 +09:00
Sangmin Yoon
80d910049b [ZEPPELIN-1206] fix "name 'z' is not defined" with python3
### What is this PR for?
PythonInterpreter can not use dynamic form with python3.
Fix this problem.

### What type of PR is it?
Bug Fix

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1206

### How should this be tested?
```
%python
print(z.input("test"))
```

Make a note above, and run it.

### Questions:
* Does the licenses files need update? NO
* Is there breaking changes for older versions? NO
* Does this needs documentation? NO

Author: Sangmin Yoon <sangmin.yoon@croquis.com>

Closes #1213 from sixmen/fix_python3 and squashes the following commits:

be6f68b [Sangmin Yoon] fix "name 'z' is not defined" with python3
2016-07-27 07:52:17 +09:00
Alexander Bezzubov
d8b54cf76d ZEPPELIN-1115: Python - interpreter for SQL over DataFrame
### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] make Python-dependant tests, excluded from CI
   * PythonInterpreterWithPythonInstalledTest
   * PythonPandasSqlInterpreterTest
   * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after #747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass or manually run
 - Given the DataFrame i.e

  ```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
  ```
 - SQL query it like

  ```
%python.sql
SELECT * FROM rates LIMIT 10
  ```

### Screenshots (if appropriate)
![screen shot 2016-07-11 at 23 56 04](https://cloud.githubusercontent.com/assets/5582506/16735171/1ebb9354-47c3-11e6-9354-6364e9374a20.png)

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs 
158ba6a [Alexander Bezzubov] Remove third-party dependant test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menue
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames
2016-07-15 18:37:18 +09:00
Alexander Bezzubov
230d890142 ZEPPELIN-1048: Pandas support for python interpreter
### What is this PR for?
Display Pandas DataFrame using Zeppelin's Table Display system.

### What type of PR is it?
Feature

### Todos
* [x] fix NPE in logs on empty paragraph execution
* [x] matplotlib: refactor `zeppelin_show(plt)` -> `z.show(plt)`
* [x] pandas: support `z.show(df)`
* [x] update docs

### What is the Jira issue?
[ZEPPELIN-1048](https://issues.apache.org/jira/browse/ZEPPELIN-1048)

### How should this be tested?
"Zeppelin Tutorial: Python - matplotlib basic" should work, and

```python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
z.show(rates)
```
### Screenshots (if appropriate)
![screen shot 2016-06-23 at 10 29 00](https://cloud.githubusercontent.com/assets/5582506/16289133/85f0ddbc-392d-11e6-86a3-28d10e73f68d.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1067 from bzz/python/pandas-support and squashes the following commits:

3b1ad36 [Alexander Bezzubov] Python: update docs to reffer new API
ee6668b [Alexander Bezzubov] Python: update docs, add Pandas integration
71be418 [Alexander Bezzubov] Python: limit 1000 for table display system on DataFrame
52e787d [Alexander Bezzubov] Python: pandas DataFrame using Table display system
bc91b86 [Alexander Bezzubov] Python: skip interpreting empty paragraphs
a7248cd [Alexander Bezzubov] Python: draft of pandas support
15646a1 [Alexander Bezzubov] Python: refactoring to z.show()
2016-06-23 22:22:19 +09:00
Mina Lee
ff4973d435 [ZEPPELIN-1045] Apply new mechanism to PythonInterpreter
### What is this PR for?
This PR is applying new interpreter register mechanism to python interpreter.

### What type of PR is it?
Improvement

### What is the Jira issue?
[ZEPPELIN-1045](https://issues.apache.org/jira/browse/ZEPPELIN-1045)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Mina Lee <minalee@apache.org>

Closes #1063 from minahlee/ZEPPELIN-1045 and squashes the following commits:

66b8f73 [Mina Lee] Add zeppelin.python.maxResult property to python interpreter
5013890 [Mina Lee] Apply new mechanism to PythonInterpreter
2016-06-23 22:16:39 +09:00
Alexander Bezzubov
c806d4a4c7 Python interpreter and doc cleanup
### What is this PR for?
This is first step improving current Python interpreter implementation.
It has just a cleanup, style and docs improvements.

### What type of PR is it?
Improvement

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1021 from bzz/improve/python and squashes the following commits:

3cafa22 [Alexander Bezzubov] Python: make interpreter logs less verbose
744f7d2 [Alexander Bezzubov] Python: move technical details from doc to README.md
0f74e5d [Alexander Bezzubov] Python: return first running job
08c2bb4 [Alexander Bezzubov] Python: upd Java code to conform project conventions
e84cd5c [Alexander Bezzubov] Python: normalize newlines in python
58efcc9 [Alexander Bezzubov] Python: normalize newlines in Java
2016-06-17 10:30:58 +09:00
Hervé RIVIERE
34734b9c8a [ZEPPELIN-502] Python interpreter group
### What is this PR for?
Adding a python 2 &3 interpreter. It's a basic implementation (no py4j for example), with a java ProcessBuilder object used to instantiate a python REPL.

The interpreter doesn't bring it own python binary but uses the python specified by python.path configutation. Thus, you can still use your specific installed python modules (scikit-learn, matplotlib...) and the interpreter is able to work with python 2 & 3 without change.

I had a python helper  function (zeppelin_show() ) to easily display matplotlib graph as SVG.

### What type of PR is it?
[Feature]

### Todos
* [x] - Code review
* [x] - Improve bootstrap.py : choose available helper functions and their names
* [x] - Unit / IT tests ?
* [x] documentation updates needed, that AhyoungRyu pointed out
* [X] LICENSE needs to be updated to include all non-apache licensed dependencies (i.e AFAIK Py4j is BSD ) in bin-license
* [x]  double-check that code formatting conforms project style guide
* [x]  the branch need to be rebased on latest master.

### What is the Jira issue?
[ZEPPELIN-502](https://issues.apache.org/jira/browse/ZEPPELIN-502?jql=project%20%3D%20ZEPPELIN%20AND%20text%20~%20%22python%22)

### How should this be tested?

1. In interpreter screen, in Python section, specify in python.path the python binary you want to use
2. In a paragraph, you can use the interpreter with **_%python_**. Calling help() will describe you the interpreter functionnalities.
3. Install py4j (pip install py4j) if you want to use input form

### Screenshots
![image](https://cloud.githubusercontent.com/assets/12515751/14936724/5108fb60-0ef4-11e6-93ea-232a037f7957.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14943716/98a75c4a-0fe0-11e6-9d4b-e10c39d53a15.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14936715/0eec90de-0ef4-11e6-811b-7ebe46f0d279.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14943722/b89b7824-0fe0-11e6-9c73-c12f7372d487.png)

### Questions:
* Does the licenses files need update? Yes, only bin-license (py4j)
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Hervé RIVIERE <hriviere@users.noreply.github.com>

Closes #869 from hriviere/PR_interpreter_python and squashes the following commits:

80b6e75 [Hervé RIVIERE] [ZEPPELIN-502] move BSD py4j license to zeppelin-distribution/src/bin_license/license
a4b82a5 [Hervé RIVIERE] [ZEPPELIN-502]Improving doc following @AhyoungRyu review
3252353 [Hervé RIVIERE] [ZEPPELIN-502] Formatting code to respect project convention
54ec4f1 [Hervé RIVIERE] [ZEPPELIN-502]Improving doc following @AhyoungRyu review
6a831bc [Hervé RIVIERE] [ZEPPELIN-502] Add BSD py4j license
11e1b9c [Hervé RIVIERE] [ZEPPELIN-502] minor changes in python.md
e5d0bdb [Hervé RIVIERE] [ZEPPELIN-502] change PYTHON_PATH to ZEPPELIN_PYTHON
c62ac98 [Hervé RIVIERE] [ZEPPELIN-502] Improve python.md
5008125 [Hervé RIVIERE] [ZEPPELIN-502] Improve python.md with features not yet supported and technical description
7d533e1 [Hervé RIVIERE] [ZEPPELIN-502] Add tests and reformating code to help tests writing
fecaf25 [Hervé RIVIERE] [ZEPPELIN-502] Rename python.path to python and default from /usr/bin/python to python
02d1320 [Hervé RIVIERE] [ZEPPELIN-502] Input form, change from simple input form to native (pyspark syntax)
60d2956 [Hervé RIVIERE] [ZEPPELIN-502] Indent as pep8 convention
9bdb192 [Hervé RIVIERE] [ZEPPELIN-502] Add python.md to _navigation.html
7142aa5 [Hervé RIVIERE] [ZEPPELIN-502] Catch exception in logger.error
1a86ad7 [Hervé RIVIERE] [ZEPPELIN-502] Python interpreter group
2016-05-31 23:34:05 +09:00