Commit graph

9 commits

Author SHA1 Message Date
1ambda
ee309066a4 [ZEPPELIN-1695] Use shared versions in test libraries (maven)
### What is this PR for?

Use shared test library versions in the Maven config so that library versions are not fragmented.
Previously we used multiple versions of:

- Junit (4.11, 4.12)
- mockito (1.9.0, 1.10.8, ...)
- powermock (...)

### What type of PR is it?
[Improvement]

### What is the Jira issue?

[ZEPPELIN-1695](https://issues.apache.org/jira/browse/ZEPPELIN-1695)

### How should this be tested?

Use this command to check whether the test libraries share versions:

```
$ mvn org.apache.maven.plugins:maven-help-plugin:2.2:effective-pom | vim -
```
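
Reading the effective POM by eye is error-prone, so the check can also be scripted. The following is a minimal sketch, not part of this PR: it assumes the effective POM has been saved as plain XML (without Maven's log lines) and that a multi-module build wraps each module in a `<project>` element. The sample XML, the module names and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily trimmed stand-in for saved effective-pom output.
SAMPLE_EFFECTIVE_POM = """\
<projects>
  <project xmlns="http://maven.apache.org/POM/4.0.0">
    <artifactId>module-a</artifactId>
    <dependencies>
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
      </dependency>
      <dependency>
        <groupId>org.mockito</groupId>
        <artifactId>mockito-all</artifactId>
        <version>1.10.8</version>
      </dependency>
    </dependencies>
  </project>
  <project xmlns="http://maven.apache.org/POM/4.0.0">
    <artifactId>module-b</artifactId>
    <dependencies>
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
      </dependency>
      <dependency>
        <groupId>org.mockito</groupId>
        <artifactId>mockito-all</artifactId>
        <version>1.10.8</version>
      </dependency>
    </dependencies>
  </project>
</projects>
"""


def local_name(tag):
    """Strip an XML namespace such as '{http://...}dependency' down to 'dependency'."""
    return tag.rsplit("}", 1)[-1]


def fragmented_versions(xml_text):
    """Map 'groupId:artifactId' to its versions, keeping only artifacts with more than one."""
    versions = defaultdict(set)
    for elem in ET.fromstring(xml_text).iter():
        if local_name(elem.tag) != "dependency":
            continue
        # Collect the direct children (groupId, artifactId, version, ...) of this dependency.
        fields = {local_name(child.tag): (child.text or "").strip() for child in elem}
        if fields.get("version"):
            key = f"{fields.get('groupId')}:{fields.get('artifactId')}"
            versions[key].add(fields["version"])
    return {key: sorted(found) for key, found in versions.items() if len(found) > 1}


print(fragmented_versions(SAMPLE_EFFECTIVE_POM))
```

An empty result means every artifact resolves to a single version. Note that this sketch counts every `<dependency>` element it finds, including those under `dependencyManagement` or plugin sections.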

### Questions:
* Does the licenses files need update? - YES, I updated JUnit version to 4.12 from 4.11
* Is there breaking changes for older versions? - NO
* Does this needs documentation? - NO

Author: 1ambda <1amb4a@gmail.com>

Closes #1727 from 1ambda/feat/centralise-testing-libraries and squashes the following commits:

b6fd336 [1ambda] chore: Shared mockito, powermock version
941f1ba [1ambda] chore: Update junit to 4.12
2016-12-07 10:22:52 +09:00
Alex Goodman
438dbca686 ZEPPELIN-1345 - Create a custom matplotlib backend that natively supports inline plotting in a python interpreter cell
### What is this PR for?

This PR is the first of two major steps needed to improve matplotlib integration in Zeppelin (ZEPPELIN-1344). The latter, which is a plotting backend with fully interactive tools enabled, will be done afterwards in a separate PR. This PR is specifically for automatically displaying output from calls to matplotlib plotting functions inline with each paragraph. Thanks to the addition of post-execute hooks (ZEPPELIN-1423), there is no need to call any `show()` function to display an inline plot, just like in Jupyter.
### What type of PR is it?

Improvement
### Todos

The main code has been written and anyone who reads this is encouraged to test it, but there are a few minor todos:
- [x] - Add unit tests
- [x] - Add documentation
- [x] - Add screenshot showing iterative plotting with angular mode
### What is the Jira issue?

[ZEPPELIN-1345](https://issues.apache.org/jira/browse/ZEPPELIN-1345)
### How should this be tested?

In a pyspark or python paragraph, enter and run

``` python
import matplotlib.pyplot as plt
plt.plot([1, 2, 3])
```

The plot should be displayed automatically without calling any `show()` function whatsoever. A special method called `configure_mpl()` can also be used to modify the inline plotting behavior. For example,

``` python
z.configure_mpl(close=False, angular=True)
plt.plot([1, 2, 3])
```

allows for iterative updates to the plot provided you have Py4J installed for your Python installation (which is always the case if you use pyspark). To clarify, this feature currently works only with pyspark (not python, as there are no `angularBind()` and `angularUnbind()` methods yet). Doing something like:

```
plt.plot([3, 2, 1])
```

will update the plot that was generated by the previous paragraph by leveraging Zeppelin's Angular Display System. However, with `close=False`, matplotlib no longer closes figures automatically, so it is up to the user to explicitly close each figure instance they create. There are quite a few more options for `z.configure_mpl()`, but I will save that discussion for the documentation.
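
To illustrate how a post-execute hook removes the need for `show()` and what the `close` option changes, here is a toy model in plain Python. It is only a sketch of the idea: `InlineBackend`, `run_paragraph` and the `plot` function are hypothetical stand-ins, and none of this is the actual backend code or the matplotlib API.

```python
class InlineBackend:
    """Toy stand-in for a plotting backend; it only tracks open figures."""

    def __init__(self, close=True):
        self.close = close
        self.open_figures = []
        self.displayed = []

    def plot(self, data):
        # Each call opens a new "figure".
        self.open_figures.append(list(data))

    def post_execute(self):
        # Display every open figure, as an explicit show() call would.
        self.displayed.extend(self.open_figures)
        if self.close:
            # Default behaviour: figures are closed once they are displayed.
            self.open_figures.clear()


def run_paragraph(code, namespace, hooks):
    """Execute paragraph code, then run every registered post-execute hook."""
    exec(code, namespace)
    for hook in hooks:
        hook()


backend = InlineBackend(close=True)
run_paragraph("plot([1, 2, 3])", {"plot": backend.plot}, [backend.post_execute])
print(backend.displayed, backend.open_figures)

keep_open = InlineBackend(close=False)
run_paragraph("plot([1, 2, 3])", {"plot": keep_open.plot}, [keep_open.post_execute])
print(keep_open.displayed, keep_open.open_figures)
```

In the second case the figure stays open after it is displayed, which is why closing it becomes the user's job.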
### Screenshots (if appropriate)
![img](http://i.imgur.com/e1xHKnV.gif)

### Questions:
- Does the licenses files need update? No
- Is there breaking changes for older versions? No
- Does this needs documentation? Yes

Author: Alex Goodman <agoodm@users.noreply.github.com>

Closes #1534 from agoodm/ZEPPELIN-1345 and squashes the following commits:

9ef6ff7 [Alex Goodman] Move mpl backend files to /interpreter
24f89c6 [Alex Goodman] Catch potential NullPointerExceptions from hook registry
bdb584e [Alex Goodman] Make sure expressions are printed when no plots are shown
22b6fe4 [Alex Goodman] Remove unused variable
d3d1aa0 [Alex Goodman] Fix CI test failure
c90d204 [Alex Goodman] Update spark.md
bcf0bf3 [Alex Goodman] Update python.md for new matplotlib integration
c9b65a5 [Alex Goodman] Add iterative plotting example image
8029a05 [Alex Goodman] Update python/README.md
f2d9e86 [Alex Goodman] Exclude tests are excluded in python/pom.xml
86b1c90 [Alex Goodman] Fix tutorial notebook not loading
c37b00f [Alex Goodman] Fix legend in tutorial notebook
a321d79 [Alex Goodman] Update python.md
82350e3 [Alex Goodman] Update matplotlib tutorial notebook
9792f97 [Alex Goodman] Add unit tests
8b9b973 [Alex Goodman] Fix NullPointerExceptions in unit tests
82135ad [Alex Goodman] Removed unused variable
f9c9498 [Alex Goodman] Added support for Angular Display System
edf750a [Alex Goodman] Add new matplotlib backend for python/pyspark interpreters
2016-11-08 07:20:21 -08:00
Mina Lee
04da56403b [MINOR] Change url in pom.xml files
### What is this PR for?
Set project url to `http://zeppelin.apache.org` in pom.xml files

### What type of PR is it?
Refactoring

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Mina Lee <minalee@apache.org>

Closes #1221 from minahlee/pom_url and squashes the following commits:

10de8cb [Mina Lee] Remove child url
ef0ef04 [Mina Lee] Change main class package name
ead4064 [Mina Lee] Use consistent url in pom.xml
2016-07-31 16:41:56 +09:00
Alexander Bezzubov
d8b54cf76d ZEPPELIN-1115: Python - interpreter for SQL over DataFrame
### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] exclude Python-dependent tests from CI
   * PythonInterpreterWithPythonInstalledTest
   * PythonPandasSqlInterpreterTest
   * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after #747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles
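
The "fail gracefully" item can be pictured as a dependency check that runs before any query. This is a minimal Python sketch under assumptions: the function names, the status strings and the message text are hypothetical and are not taken from the interpreter's code.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this Python installation."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_dependencies(required=("pandas", "pandasql")):
    """Return a (status, message) pair instead of raising when a dependency is absent."""
    missing = missing_modules(required)
    if missing:
        return "ERROR", "Required Python module(s) not installed: " + ", ".join(missing)
    return "SUCCESS", ""


# 'json' ships with Python; the second name is a deliberately bogus module.
print(check_dependencies(("json", "surely_not_installed_module_xyz")))
```

Using `importlib.util.find_spec` avoids actually importing the modules, so the check itself cannot fail with an `ImportError` for top-level names.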

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass or manually run
 - Given a DataFrame, e.g.

  ```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
  ```
 - Query it with SQL, e.g.

  ```
%python.sql
SELECT * FROM rates LIMIT 10
  ```
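
For readers without Pandas at hand, the idea of querying tabular data by variable name can be approximated with the standard library alone. The sketch below loads CSV text into an in-memory SQLite table and runs the same kind of query. It is an illustration only and not how `%python.sql` is implemented; the column names and rows are made up, and `query_csv` is a hypothetical helper.

```python
import csv
import io
import sqlite3

# Hypothetical sample in the same ';'-separated layout as the bank.csv example.
BANK_CSV = "age;job;balance\n30;unemployed;1787\n33;services;4789\n35;management;1350\n"


def query_csv(csv_text, table, sql, sep=";"):
    """Load CSV text into an in-memory SQLite table and return the rows selected by sql."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=sep))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f'"{name}"' for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


print(query_csv(BANK_CSV, "rates", "SELECT * FROM rates LIMIT 2"))
```

Because the columns are created without types, every value comes back as a string in this sketch.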

### Screenshots (if appropriate)
![screen shot 2016-07-11 at 23 56 04](https://cloud.githubusercontent.com/assets/5582506/16735171/1ebb9354-47c3-11e6-9354-6364e9374a20.png)

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs 
158ba6a [Alexander Bezzubov] Remove third-party dependant test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menue
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames
2016-07-15 18:37:18 +09:00
Mina Lee
e0f77d68e8 Bump up version to 0.7.0-SNAPSHOT
### What is this PR for?
Bump up version to 0.7.0-SNAPSHOT

Author: Mina Lee <minalee@apache.org>

Closes #1016 from minahlee/0.7.0-SNAPSHOT and squashes the following commits:

541e1b3 [Mina Lee] Bump up zeppelin-examples version to 0.7.0-SNAPSHOT
ea8c0ad [Mina Lee] Bump up version to 0.7.0-SNAPSHOT
2016-07-06 04:45:48 +09:00
Alexander Bezzubov
2ee7f48cff ZEPPELIN-1105: Python - add paragraph ERROR status
### What is this PR for?
Implement paragraph ERROR status for Python interpreter in case of Error or Exception in the output.

### What type of PR is it?
Improvement

### What is the Jira issue?
[ZEPPELIN-1105](https://issues.apache.org/jira/browse/ZEPPELIN-1105)

### How should this be tested?
CI should pass, or

```
mvn "-Dtest=org.apache.zeppelin.python.PythonInterpreterWithPythonInstalledTest" test -pl python
```

should pass, or paragraph status should be ERROR for something like

```
import some-thing
```
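
The rule "ERROR status when the output contains an Error or Exception" can be sketched as a scan over the interpreter output. This is an assumed approximation written in Python for illustration; the pattern, the function name and the status strings are hypothetical and do not reproduce the interpreter's actual check.

```python
import re

# Matches a traceback header or a line starting with a name ending in Error/Exception.
ERROR_PATTERN = re.compile(
    r"^(Traceback \(most recent call last\):|\w*(Error|Exception)\b)",
    re.MULTILINE,
)


def paragraph_status(output):
    """Classify paragraph output as 'ERROR' or 'SUCCESS' based on its text."""
    return "ERROR" if ERROR_PATTERN.search(output) else "SUCCESS"


failing_output = (
    '  File "<stdin>", line 1\n'
    "    import some-thing\n"
    "               ^\n"
    "SyntaxError: invalid syntax\n"
)
print(paragraph_status(failing_output))
print(paragraph_status("hello world\n"))
```

A text-based check like this can misfire when a successful paragraph prints a line that happens to start with a word such as `Error`, which is a known limitation of the approach.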

### Screenshots (if appropriate)
![screen shot 2016-07-04 at 21 30 23](https://cloud.githubusercontent.com/assets/5582506/16560453/8fd0dddc-422e-11e6-9977-c3aea052db39.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1124 from bzz/ZEPPELIN-1105/python/add-paragraph-error-statu and squashes the following commits:

a7bf8f3 [Alexander Bezzubov] Python: add missing license header
b585982 [Alexander Bezzubov] Python: include Python-dependant tests to 1 CI profile
e7d5371 [Alexander Bezzubov] Python: add ERROR paragraph status on Error and Exception in output
4c1107b [Alexander Bezzubov] Refactoring: rename and extract var assignment
2016-07-05 17:57:22 +09:00
Mina Lee
85d70579f5 [ZEPPELIN-986] Create publish release script
### What is this PR for?
This PR is to automate release publish to maven repository.
We used to use maven-deploy-plugin and maven-release-plugin for release but somehow it didn't work well with Zeppelin so 0.5.5 and 0.5.6 haven't been published to maven repository.

Publishing releases to the Maven repository will eventually help Zeppelin reduce its binary package size by encouraging users to use dynamic interpreter loading (#908).
Originally, the modules below were skipped in the Maven release:
 - all interpreters(except spark)
 - zeppelin-display
 - zeppelin-server
 - zeppelin-distribution

This PR, on the other hand, skips only:
 - zeppelin-distribution

### What type of PR is it?
Infra

### Todos
- [x] Include SparkR/R interpreter in release
- [x] Create common_release.sh to remove build configuration duplication
- [x] Check curl networking failure

### What is the Jira issue?
[ZEPPELIN-986](https://issues.apache.org/jira/browse/ZEPPELIN-986)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes, https://cwiki.apache.org/confluence/display/ZEPPELIN/Preparing+Zeppelin+Release will be updated accordingly once this pr is merged.

Author: Mina Lee <minalee@apache.org>

Closes #994 from minahlee/ZEPPELIN-986 and squashes the following commits:

b0e8e67 [Mina Lee] Revert "Add geode, scalding profile in maven artifact build"
cd4cbcd [Mina Lee] curl failure check
c0ea07c [Mina Lee] Fix wrong indentation
a88bc1d [Mina Lee] Add geode, scalding profile in maven artifact build
2cced61 [Mina Lee] Add r to binary package and maven build
903bc12 [Mina Lee] Move duplicate code to common_release.sh
a3eb676 [Mina Lee] Include zeppelin-server module in publish artifiact
48d338f [Mina Lee] Rollback mistakenly removed plugin
aafaf42 [Mina Lee] Follow google shell  style guide
30dcc65 [Mina Lee] remove deploy plugin from pom since custom script will be used instead for deploy
cd1f08c [Mina Lee] Refactor create release script
e764f5f [Mina Lee] Add maven publish release script
2016-06-22 21:22:07 -07:00
Mina Lee
5af7747798 Remove incubating from pom files
### What is this PR for?
Remove `incubating` term from pom files

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Mina Lee <minalee@nflabs.com>

Closes #942 from minahlee/tlp/removeIncubating and squashes the following commits:

e605b54 [Mina Lee] Remove incubating from pom files
2016-06-02 12:51:44 -07:00
Hervé RIVIERE
34734b9c8a [ZEPPELIN-502] Python interpreter group
### What is this PR for?
Adds a Python 2 & 3 interpreter. It's a basic implementation (no py4j, for example), with a Java ProcessBuilder object used to instantiate a Python REPL.

The interpreter doesn't bring its own Python binary but uses the one specified by the python.path configuration. Thus, you can still use your own installed Python modules (scikit-learn, matplotlib...) and the interpreter is able to work with Python 2 & 3 without change.
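
The process-based design can be pictured with a short Python analogue of the Java ProcessBuilder approach. This is a sketch under assumptions: it runs one snippet per process rather than keeping a REPL alive, and `run_python` with its `python_binary` parameter is a hypothetical stand-in for whatever binary the python.path setting points to.

```python
import subprocess
import sys


def run_python(code, python_binary=sys.executable, timeout=30):
    """Run code in a separate Python process and return (exit code, stdout, stderr)."""
    result = subprocess.run(
        [python_binary, "-c", code],
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode the output to str
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr


exit_code, out, err = run_python("print(1 + 1)")
print(exit_code, out.strip())
```

Because the child process is whichever binary is configured, any module installed for that binary is available to the code being run.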

I added a Python helper function (`zeppelin_show()`) to easily display matplotlib graphs as SVG.

### What type of PR is it?
[Feature]

### Todos
* [x] - Code review
* [x] - Improve bootstrap.py : choose available helper functions and their names
* [x] - Unit / IT tests ?
* [x] documentation updates needed, that AhyoungRyu pointed out
* [X] LICENSE needs to be updated to include all non-apache licensed dependencies (i.e AFAIK Py4j is BSD ) in bin-license
* [x]  double-check that code formatting conforms project style guide
* [x]  the branch need to be rebased on latest master.

### What is the Jira issue?
[ZEPPELIN-502](https://issues.apache.org/jira/browse/ZEPPELIN-502?jql=project%20%3D%20ZEPPELIN%20AND%20text%20~%20%22python%22)

### How should this be tested?

1. In interpreter screen, in Python section, specify in python.path the python binary you want to use
2. In a paragraph, you can use the interpreter with **_%python_**. Calling help() will describe the interpreter's functionality.
3. Install py4j (pip install py4j) if you want to use input forms

### Screenshots
![image](https://cloud.githubusercontent.com/assets/12515751/14936724/5108fb60-0ef4-11e6-93ea-232a037f7957.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14943716/98a75c4a-0fe0-11e6-9d4b-e10c39d53a15.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14936715/0eec90de-0ef4-11e6-811b-7ebe46f0d279.png)

![image](https://cloud.githubusercontent.com/assets/12515751/14943722/b89b7824-0fe0-11e6-9c73-c12f7372d487.png)

### Questions:
* Does the licenses files need update? Yes, only bin-license (py4j)
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Hervé RIVIERE <hriviere@users.noreply.github.com>

Closes #869 from hriviere/PR_interpreter_python and squashes the following commits:

80b6e75 [Hervé RIVIERE] [ZEPPELIN-502] move BSD py4j license to zeppelin-distribution/src/bin_license/license
a4b82a5 [Hervé RIVIERE] [ZEPPELIN-502]Improving doc following @AhyoungRyu review
3252353 [Hervé RIVIERE] [ZEPPELIN-502] Formatting code to respect project convention
54ec4f1 [Hervé RIVIERE] [ZEPPELIN-502]Improving doc following @AhyoungRyu review
6a831bc [Hervé RIVIERE] [ZEPPELIN-502] Add BSD py4j license
11e1b9c [Hervé RIVIERE] [ZEPPELIN-502] minor changes in python.md
e5d0bdb [Hervé RIVIERE] [ZEPPELIN-502] change PYTHON_PATH to ZEPPELIN_PYTHON
c62ac98 [Hervé RIVIERE] [ZEPPELIN-502] Improve python.md
5008125 [Hervé RIVIERE] [ZEPPELIN-502] Improve python.md with features not yet supported and technical description
7d533e1 [Hervé RIVIERE] [ZEPPELIN-502] Add tests and reformating code to help tests writing
fecaf25 [Hervé RIVIERE] [ZEPPELIN-502] Rename python.path to python and default from /usr/bin/python to python
02d1320 [Hervé RIVIERE] [ZEPPELIN-502] Input form, change from simple input form to native (pyspark syntax)
60d2956 [Hervé RIVIERE] [ZEPPELIN-502] Indent as pep8 convention
9bdb192 [Hervé RIVIERE] [ZEPPELIN-502] Add python.md to _navigation.html
7142aa5 [Hervé RIVIERE] [ZEPPELIN-502] Catch exception in logger.error
1a86ad7 [Hervé RIVIERE] [ZEPPELIN-502] Python interpreter group
2016-05-31 23:34:05 +09:00