Commit graph

8 commits

Author SHA1 Message Date
Alexander Bezzubov
d8b54cf76d ZEPPELIN-1115: Python - interpreter for SQL over DataFrame
### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] exclude Python-dependent tests from CI
   * PythonInterpreterWithPythonInstalledTest
   * PythonPandasSqlInterpreterTest
   * run them manually with `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully when Pandas or PandaSQL is not installed
* [x] after #747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles
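
The "fail gracefully" item above could be approached with a dependency check like the following. This is a minimal sketch, not the interpreter's actual code: the helper names `missing_modules` and `check_sql_dependencies` and the message text are hypothetical; `pandas` and `pandasql` are the import names of the two packages.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_sql_dependencies(required=("pandas", "pandasql")):
    """Return a readable error message, or None when every module is available."""
    missing = missing_modules(required)
    if missing:
        # Hypothetical wording; the real interpreter may report this differently.
        return "%python.sql needs these Python packages: " + ", ".join(missing)
    return None
```

Checking with `find_spec` avoids importing the packages just to learn whether they exist, so the check stays cheap and has no side effects.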

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass. Alternatively, test manually:
 - Create a DataFrame, e.g.

  ```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
  ```
 - Query it with SQL, e.g.

  ```
%python.sql
SELECT * FROM rates LIMIT 10
  ```
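
For readers without Pandas at hand, the idea of running SQL over in-memory tabular data can be sketched with the standard library alone. PandaSQL uses SQLite syntax, so the sketch below loads rows into an in-memory SQLite table and queries it. The helper `query_rows`, the column names, and the sample rows are all hypothetical stand-ins for `bank.csv`; this is not how `%python.sql` is implemented.

```python
import sqlite3


def query_rows(table, columns, rows, sql):
    """Load rows into an in-memory SQLite table and run one SQL query against it."""
    conn = sqlite3.connect(":memory:")
    try:
        column_list = ", ".join('"{}"'.format(c) for c in columns)
        placeholders = ", ".join("?" for _ in columns)
        conn.execute('CREATE TABLE "{}" ({})'.format(table, column_list))
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Hypothetical sample rows standing in for the contents of bank.csv.
rates = [(30, "unemployed", 1787), (33, "services", 4789), (35, "management", 1350)]
result = query_rows(
    "rates", ["age", "job", "balance"], rates, "SELECT * FROM rates LIMIT 2"
)
```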

### Screenshots (if appropriate)
![screen shot 2016-07-11 at 23 56 04](https://cloud.githubusercontent.com/assets/5582506/16735171/1ebb9354-47c3-11e6-9354-6364e9374a20.png)

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs 
158ba6a [Alexander Bezzubov] Remove third-party dependent test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menu
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames
2016-07-15 18:37:18 +09:00
AhyoungRyu
6bd4ede7e5 [DOC][MINOR] Add shell interpreter docs to _navigation.html
### What is this PR for?
After #1087 was merged, a new doc, `shell.md`, was added. However, on the docs website the Shell interpreter link still points to `pleasecontribute.html`. This PR updates that link, applies the TOC, and adds more description.

### What type of PR is it?
Documentation

### Todos
* [x] - Change `pleasecontribute.html` -> `shell.html`
* [x] - Apply TOC(table of contents)
* [x] - Add more description to `shell.md`

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: AhyoungRyu <fbdkdud93@hanmail.net>

Closes #1138 from AhyoungRyu/improve/shell-docs and squashes the following commits:

69d567d [AhyoungRyu] Address @felixcheung feedback
fca76a6 [AhyoungRyu] Apply TOC to rest-credential.md
c8e988b [AhyoungRyu] Change docs group manual -> interpreter
a0bf1d5 [AhyoungRyu] Add shell.html to _navigation.html
2016-07-13 00:08:06 +09:00
AhyoungRyu
5975125f18 [ZEPPELIN-1018] Apply auto "Table of Contents" generator to Zeppelin docs website
### What is this PR for?
I added an automatic TOC (Table of Contents) generator for the Zeppelin documentation website. A TOC helps readers see a page's contents at a glance and quickly find what they want.

I just added `<div id="toc"></div>` to the header of each documentation page. [`toc`](https://github.com/apache/zeppelin/compare/master...AhyoungRyu:ZEPPELIN-1018?expand=1#diff-85af09fb498a5667ea455391533f945dR3) recognizes `<h2>` and `<h3>` as titles in the docs and automatically generates the TOC from them. So I set the following rule for headings. (I'll write this rule in `docs/CONTRIBUTING.md` or [docs/howtocontributewebsite](https://zeppelin.apache.org/docs/0.6.0-SNAPSHOT/development/howtocontributewebsite.html).)

```
# Level-1 Heading  <- Use only for the main title of the page
## Level-2 Heading <- Start with this one
### Level-3 heading <- Only use this one for child of Level-2

toc only recognizes Level-2 & Level-3
```
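
The heading rule can be illustrated with a small sketch. The actual generator is a JavaScript file (`toc.js`); the Python below only mimics the stated behavior of collecting `<h2>` and `<h3>` elements and ignoring every other level. The names `HeadingCollector` and `build_toc` are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for <h2> and <h3> elements only."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._level = None  # heading level currently open, or None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == "h{}".format(self._level):
            self.entries.append((self._level, "".join(self._text).strip()))
            self._level = None


def build_toc(html):
    """Return an indented plain-text outline; <h3> entries nest under <h2>."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 2) + "- " + text for level, text in collector.entries]
```

With this rule, a page whose sections start at `<h4>` or that uses `<h1>` for section titles would produce an empty or incomplete outline, which is why the heading levels matter.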

Please see the screenshot attached below.

### What type of PR is it?
Improvement & Documentation

### Todos
* [x] - Add TOC generator
* [x] - Apply TOC(`<div id="toc"></div>`) to every documentation page and reorganize the headers (apply the above rule)
* [x] - Fix some broken code blocks in several docs
* [x] - Apply TOC to `r.md` (Currently the R docs have some duplicated info since [this one](d5e87fb8ba) and [this one](7d6cc7e991))
* [x] - Apply TOC to `install.md` after #1010 merged
* [x] - Apply TOC to `interpreterinstallation.md` after #1042 merged

### What is the Jira issue?
[ZEPPELIN-1018](https://issues.apache.org/jira/browse/ZEPPELIN-1018)

### How should this be tested?
1. Apply this patch and build `docs/` with [this guide](https://github.com/apache/zeppelin/tree/master/docs#build-documentation)
2. Visit any docs page. You should see the TOC at the top of the page.

### Screenshots (if appropriate)
 - Automatically generated TOC in Spark interpreter docs page
<img width="831" alt="screen shot 2016-06-16 at 9 37 18 pm" src="https://cloud.githubusercontent.com/assets/10060731/16140902/945b9c7a-340a-11e6-91f3-b6174738bed0.png">

### Questions:
* Does the licenses files need update?
No. Actually I used [jekyll-table-of-contents#copyright](https://github.com/ghiculescu/jekyll-table-of-contents#copyright). But I don't need to add a license for this :)
* Is there breaking changes for older versions? No
* Does this needs documentation? Maybe

Author: AhyoungRyu <fbdkdud93@hanmail.net>

Closes #1031 from AhyoungRyu/ZEPPELIN-1018 and squashes the following commits:

e66397b [AhyoungRyu] Apply TOC to interpreterinstallation.md
009579b [AhyoungRyu] Add more info to 'What is the next?' in install.md
04cf501 [AhyoungRyu] Revert 'where to start' section
b7cbe5f [AhyoungRyu] Fix typo
cf0911c [AhyoungRyu] Rename license file
388f35a [AhyoungRyu] Add jekyll-table-of-contents license info
6394c70 [AhyoungRyu] Fix image path in python.md
d00e4b1 [AhyoungRyu] Move interpreter/screenshot/ -> asset/../img/docs-img/
3ffb383 [AhyoungRyu] Remove duplicated info in r.md & apply toc
a03ca99 [AhyoungRyu] Exclude toc.js from pom.xml
3fae7df [AhyoungRyu] Apply auto generated toc to install.md
d114a9d [AhyoungRyu] Address @felixcheung feedback
6a788fe [AhyoungRyu] Resize TOC tab indent
6760c00 [AhyoungRyu] Apply auto TOC to all of docs under docs/storage/
fbde57f [AhyoungRyu] Apply auto TOC to all of docs under docs/quickstart/
db76eb6 [AhyoungRyu] Apply auto TOC to all of docs under docs/install/
f35db47 [AhyoungRyu] Apply auto TOC to all of docs under docs/displaysystem/
b05365f [AhyoungRyu] Apply auto TOC to all of docs under docs/rest-api/
163691c [AhyoungRyu] Apply auto TOC to all of docs under docs/manual/
bef398e [AhyoungRyu] Apply auto TOC to all of docs under docs/development/
9c5f76b [AhyoungRyu] Apply auto TOC to all of docs under docs/interpreter/
587d4ba [AhyoungRyu] Apply auto TOC to all of docs under docs/security/
1f10b97 [AhyoungRyu] Change toc configuration
78dca9e [AhyoungRyu] Add toc.js for auto generating TOC
2016-06-25 22:57:44 -07:00
Mina Lee
df7dd5c373 [HOTFIX] Fix python test case and resolve rat license issue
### What is this PR for?
Update the `testPy4jIsNotInstalled` and `testPy4jIsInstalled` tests
 - `z.show` -> `def show` to check that the `show` function is defined
 - check whether `bootstrap_input.py` was executed by looking for `z = Py4jZeppelinContext` instead of `z = PyZeppelinContext`
 - add license header in `__init__.py` file

### What type of PR is it?
Hot Fix

Author: Mina Lee <minalee@apache.org>

Closes #1075 from minahlee/adjustPythonTest and squashes the following commits:

d46c5e1 [Mina Lee] Update api name in docs
6d82e9f [Mina Lee] Add license to __init__.py
f66e9dc [Mina Lee] Fix python test case
2016-06-23 18:43:49 -07:00
Alexander Bezzubov
230d890142 ZEPPELIN-1048: Pandas support for python interpreter
### What is this PR for?
Display Pandas DataFrame using Zeppelin's Table Display system.

### What type of PR is it?
Feature

### Todos
* [x] fix NPE in logs on empty paragraph execution
* [x] matplotlib: refactor `zeppelin_show(plt)` -> `z.show(plt)`
* [x] pandas: support `z.show(df)`
* [x] update docs

### What is the Jira issue?
[ZEPPELIN-1048](https://issues.apache.org/jira/browse/ZEPPELIN-1048)

### How should this be tested?
"Zeppelin Tutorial: Python - matplotlib basic" should work, and

```python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
z.show(rates)
```
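
Zeppelin's table display system renders output that starts with `%table`, followed by tab-separated column names and one tab-separated line per row. The sketch below shows how tabular data could be turned into that text, including a row cap like the limit of 1000 mentioned in the squashed commits. The function name `to_table_output` is hypothetical, and whether the real `z.show(df)` truncates in exactly this way is an assumption.

```python
def to_table_output(columns, rows, max_rows=1000):
    """Render rows as table-display text: a %table header line, then data lines."""
    lines = ["%table " + "\t".join(str(c) for c in columns)]
    # Only the first max_rows rows are emitted (assumed truncation behavior).
    for row in rows[:max_rows]:
        lines.append("\t".join(str(v) for v in row))
    return "\n".join(lines) + "\n"
```

Printing the returned string from a paragraph is what lets the front end draw a table instead of plain text.
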
### Screenshots (if appropriate)
![screen shot 2016-06-23 at 10 29 00](https://cloud.githubusercontent.com/assets/5582506/16289133/85f0ddbc-392d-11e6-86a3-28d10e73f68d.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1067 from bzz/python/pandas-support and squashes the following commits:

3b1ad36 [Alexander Bezzubov] Python: update docs to refer to new API
ee6668b [Alexander Bezzubov] Python: update docs, add Pandas integration
71be418 [Alexander Bezzubov] Python: limit 1000 for table display system on DataFrame
52e787d [Alexander Bezzubov] Python: pandas DataFrame using Table display system
bc91b86 [Alexander Bezzubov] Python: skip interpreting empty paragraphs
a7248cd [Alexander Bezzubov] Python: draft of pandas support
15646a1 [Alexander Bezzubov] Python: refactoring to z.show()
2016-06-23 22:22:19 +09:00
Mina Lee
ff4973d435 [ZEPPELIN-1045] Apply new mechanism to PythonInterpreter
### What is this PR for?
This PR applies the new interpreter registration mechanism to the Python interpreter.

### What type of PR is it?
Improvement

### What is the Jira issue?
[ZEPPELIN-1045](https://issues.apache.org/jira/browse/ZEPPELIN-1045)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Mina Lee <minalee@apache.org>

Closes #1063 from minahlee/ZEPPELIN-1045 and squashes the following commits:

66b8f73 [Mina Lee] Add zeppelin.python.maxResult property to python interpreter
5013890 [Mina Lee] Apply new mechanism to PythonInterpreter
2016-06-23 22:16:39 +09:00
Alexander Bezzubov
c806d4a4c7 Python interpreter and doc cleanup
### What is this PR for?
This is the first step in improving the current Python interpreter implementation.
It contains only cleanup, style, and docs improvements.

### What type of PR is it?
Improvement

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1021 from bzz/improve/python and squashes the following commits:

3cafa22 [Alexander Bezzubov] Python: make interpreter logs less verbose
744f7d2 [Alexander Bezzubov] Python: move technical details from doc to README.md
0f74e5d [Alexander Bezzubov] Python: return first running job
08c2bb4 [Alexander Bezzubov] Python: upd Java code to conform project conventions
e84cd5c [Alexander Bezzubov] Python: normalize newlines in python
58efcc9 [Alexander Bezzubov] Python: normalize newlines in Java
2016-06-17 10:30:58 +09:00
Hervé RIVIERE
34734b9c8a [ZEPPELIN-502] Python interpreter group
### What is this PR for?
Adds a Python 2 & 3 interpreter. It's a basic implementation (no py4j, for example) that uses a Java ProcessBuilder object to instantiate a Python REPL.

The interpreter doesn't bring its own Python binary but uses the Python specified by the python.path configuration. Thus, you can still use your own installed Python modules (scikit-learn, matplotlib...) and the interpreter works with Python 2 & 3 without change.

I added a Python helper function (zeppelin_show()) to easily display matplotlib graphs as SVG.
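
The process-based approach described above can be sketched as follows. The PR itself uses a Java ProcessBuilder with a long-lived REPL; this Python analog is a simplification that starts the configured binary once per snippet and feeds it code on stdin. The function name `run_python` and the timeout value are hypothetical.

```python
import subprocess
import sys


def run_python(python_path, code, timeout=30):
    """Run `code` with the given Python binary; return (stdout, stderr, exit code)."""
    completed = subprocess.run(
        [python_path, "-"],  # "-" makes the interpreter read the program from stdin
        input=code,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.stdout, completed.stderr, completed.returncode


# The current interpreter stands in for the binary configured in python.path.
out, err, status = run_python(sys.executable, "print(1 + 1)")
```

Because the binary is only a path, pointing it at a different installation is enough to pick up that installation's modules, which matches the motivation given for not bundling Python.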

### What type of PR is it?
[Feature]

### Todos
* [x] - Code review
* [x] - Improve bootstrap.py : choose available helper functions and their names
* [x] - Unit / IT tests ?
* [x] documentation updates needed, that AhyoungRyu pointed out
* [X] LICENSE needs to be updated to include all non-apache licensed dependencies (i.e AFAIK Py4j is BSD ) in bin-license
* [x]  double-check that code formatting conforms project style guide
* [x]  the branch needs to be rebased on latest master.

### What is the Jira issue?
[ZEPPELIN-502](https://issues.apache.org/jira/browse/ZEPPELIN-502?jql=project%20%3D%20ZEPPELIN%20AND%20text%20~%20%22python%22)

### How should this be tested?

1. On the interpreter screen, in the Python section, set python.path to the Python binary you want to use
2. In a paragraph, you can use the interpreter with **_%python_**. Calling help() describes the interpreter's functionality.
3. Install py4j (pip install py4j) if you want to use input forms

### Screenshots
![image](https://cloud.githubusercontent.com/assets/12515751/14936724/5108fb60-0ef4-11e6-93ea-232a037f7957.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14943716/98a75c4a-0fe0-11e6-9d4b-e10c39d53a15.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14936715/0eec90de-0ef4-11e6-811b-7ebe46f0d279.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14943722/b89b7824-0fe0-11e6-9c73-c12f7372d487.png)

### Questions:
* Does the licenses files need update? Yes, only bin-license (py4j)
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Hervé RIVIERE <hriviere@users.noreply.github.com>

Closes #869 from hriviere/PR_interpreter_python and squashes the following commits:

80b6e75 [Hervé RIVIERE] [ZEPPELIN-502] move BSD py4j license to zeppelin-distribution/src/bin_license/license
a4b82a5 [Hervé RIVIERE] [ZEPPELIN-502]Improving doc following @AhyoungRyu review
3252353 [Hervé RIVIERE] [ZEPPELIN-502] Formatting code to respect project convention
54ec4f1 [Hervé RIVIERE] [ZEPPELIN-502]Improving doc following @AhyoungRyu review
6a831bc [Hervé RIVIERE] [ZEPPELIN-502] Add BSD py4j license
11e1b9c [Hervé RIVIERE] [ZEPPELIN-502] minor changes in python.md
e5d0bdb [Hervé RIVIERE] [ZEPPELIN-502] change PYTHON_PATH to ZEPPELIN_PYTHON
c62ac98 [Hervé RIVIERE] [ZEPPELIN-502] Improve python.md
5008125 [Hervé RIVIERE] [ZEPPELIN-502] Improve python.md with features not yet supported and technical description
7d533e1 [Hervé RIVIERE] [ZEPPELIN-502] Add tests and reformat code to ease test writing
fecaf25 [Hervé RIVIERE] [ZEPPELIN-502] Rename python.path to python and default from /usr/bin/python to python
02d1320 [Hervé RIVIERE] [ZEPPELIN-502] Input form, change from simple input form to native (pyspark syntax)
60d2956 [Hervé RIVIERE] [ZEPPELIN-502] Indent as pep8 convention
9bdb192 [Hervé RIVIERE] [ZEPPELIN-502] Add python.md to _navigation.html
7142aa5 [Hervé RIVIERE] [ZEPPELIN-502] Catch exception in logger.error
1a86ad7 [Hervé RIVIERE] [ZEPPELIN-502] Python interpreter group
2016-05-31 23:34:05 +09:00