mirror of
https://github.com/apache/zeppelin
synced 2026-05-24 09:38:26 +00:00
18 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
32517c9d9f |
[ZEPPELIN-2753] Basic Implementation of IPython Interpreter
### What is this PR for?
This is the first step toward implementing an IPython interpreter in Zeppelin. It uses `jupyter_client` to create and manage the IPython kernel, so Zeppelin does not need to handle Python compilation and execution itself; all of that is delegated to the IPython kernel. Ideally, all IPython features should be available in Zeppelin as well.
For now, users can use `%python.ipython` for the IPython interpreter. If IPython is available, the default Python interpreter will use it, but users can still set `zeppelin.python.useIPython` to `false` to force the old Python interpreter implementation.
Main features:
* IPython interpreter support
  * All IPython features are available, including visualization and IPython magics.
* ZeppelinContext support
* Streaming output support
* IPython support in PySpark

Regarding visualization, ideally every visualization library that works in Jupyter should also work here. The unit tests only verify the following 3 popular visualization libraries; more could be added later.
* matplotlib
* bokeh
* ggplot
### What type of PR is it?
[Feature]
### Todos
* [ ] - Task
### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-2753
### How should this be tested?
Unit test is added.
### Screenshots (if appropriate)
Verify bokeh in IPython Interpreter
Verify matplotlib
Verify ZeppelinContext
Verify Streaming
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No
Author: Jeff Zhang <zjffdu@apache.org>
Closes #2474 from zjffdu/ZEPPELIN-2753 and squashes the following commits:
|
||
|
|
1c23f21388 |
[ZEPPELIN-2707][DOCS][HOTFIX] fix: broken image URLs in 0.8.0-SNAPSHOT doc
### What is this PR for?
fix: broken image URLs in 0.8.0-SNAPSHOT doc
Using the absolute path `/asset` for image URLs is invalid, because each documentation version has its own image directory. Image URLs should therefore use the relative path `{{BASE_PATH}}`.
```
➜ asf-zeppelin tree site | grep asset
├── assets # root asset, we shouldn't use it in versioned doc.
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ │ ├── assets
│ ├── assets
```
### What type of PR is it?
[Bug Fix]
### Todos
DONE
### What is the Jira issue?
[ZEPPELIN-2707](https://issues.apache.org/jira/browse/ZEPPELIN-2707)
### How should this be tested?
1. cd `docs/`
2. build: `bundle exec jekyll build --safe`
3. check whether links in `_site` include `/docs/0.8.0-SNAPSHOT` as prefix or not
### Screenshots (if appropriate)
#### Current
http://zeppelin.apache.org/docs/0.8.0-SNAPSHOT/usage/interpreter/overview.html

#### After

### Questions:
* Does the licenses files need update? - NO
* Is there breaking changes for older versions? - NO
* Does this needs documentation? - NO
Author: 1ambda <1amb4a@gmail.com>
Closes #2450 from 1ambda/ZEPPELIN-2707/should-use-its-own-asset-directory and squashes the following commits:
|
||
|
|
4b6d3e5574 |
[ZEPPELIN-2596] Improving documentation page
### What is this PR for?
Improving documentation page. Please check the *TODO* and *Screenshots* sections for details. The motivation is described in [the JIRA ticket](https://issues.apache.org/jira/browse/ZEPPELIN-2583), and discussion is ongoing on the mailing list.
### What type of PR is it?
[Improvement | Documentation]
### Todos
* [x] - improved the navbar style
* [x] - improved the main page
* [x] - re-organized content structure
* [x] - added tutorial pages: `spark_with_zeppelin.md`, `python_with_zeppelin.md`, `sql_with_zeppelin.md` for overview
* [x] - added `multi_user_support.md` page to provide overview
* [x] - added the empty `interpreter_binding_mode` page. This will be handled in a different issue: [ZEPPELIN-2582](https://issues.apache.org/jira/browse/ZEPPELIN-2582)
* [x] - added the empty `trouble_shooting` page. This can be filled in by the following PRs.
* [x] - added the empty `useful_developer_tools` page. This can be filled in by the following PRs.
### What is the Jira issue?
[ZEPPELIN-2596](https://issues.apache.org/jira/browse/ZEPPELIN-2596)
### How should this be tested?
1. checkout
2. `cd docs`
3. `bundle install` (make sure that you have ruby 2.1.0+ and bundle)
4. `bundle exec jekyll serve --watch`
5. open `localhost:4000`
### Screenshots (if appropriate)
#### better navbar: before
#### better navbar: after
#### improved main page: before
#### improved main page: after
#### organized content structure: before
#### organized content structure: after
### Questions:
* Does the licenses files need update? - NO
* Is there breaking changes for older versions? - NO
* Does this needs documentation? - related with docs
Author: 1ambda <1amb4a@gmail.com>
Closes #2371 from 1ambda/updating-version-doc and squashes the following commits:
|
||
|
|
caa664d6ee |
[ZEPPELIN-1683] Run python process in docker container
### What is this PR for?
Inspired by the ZEPPELIN-1671 conda interpreter. Docker can provide a kind of virtual environment for Python, as conda does. This PR implements a `%python.docker` interpreter that helps run the Python process in a Docker container.
This PR builds on https://github.com/apache/zeppelin/pull/1645
### What type of PR is it?
Feature
### Todos
* [x] - basic feature
* [x] - unittest
* [x] - documentation
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1683
### How should this be tested?
see screenshot
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? yes
Author: Lee moon soo <moon@apache.org>
Closes #1654 from Leemoonsoo/pydocker and squashes the following commits:
|
||
|
|
3665901504 |
[ZEPPELIN-1671] Conda interpreter
### What is this PR for?
Conda interpreter that manages conda environment for PythonInterpreter
### What type of PR is it?
Feature
### Todos
* [x] - Basic impl
* [x] - update doc
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1671
### How should this be tested?
Recreate (or create new) your python interpreter setting in the GUI.
List all conda env
```
%python.conda
```
Activate env
```
%python.conda activate [name]
```
Deactivate env
```
%python.conda deactivate
```
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? yes
Author: Lee moon soo <moon@apache.org>
Closes #1645 from Leemoonsoo/conda and squashes the following commits:
|
||
|
|
5b1b811540 |
[ZEPPELIN-1644] make document easier to follow key instructions
### What is this PR for?
The docs should deliver key features and recommended usage in a simpler, easier way.
- docs/install/install.md has many sections duplicated from README.md.
- docs/install/install.md covers installing from binary as well as building from source. I've seen this lead some beginners to download the binary and then build it again from source.
- Recommended and key usage needs to be highlighted.
- Be less verbose in key instructions. Move optional, additional info from the middle of key instructions to the end of each page.
### What type of PR is it?
Improvement
### Todos
* [x] - improve doc
### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1644
### How should this be tested?
Run doc locally
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: Lee moon soo <moon@apache.org>
Closes #1615 from Leemoonsoo/ZEPPELIN-1644 and squashes the following commits:
|
||
|
|
438dbca686 |
ZEPPELIN-1345 - Create a custom matplotlib backend that natively supports inline plotting in a python interpreter cell
### What is this PR for?
This PR is the first of two major steps needed to improve matplotlib integration in Zeppelin (ZEPPELIN-1344). The latter, which is a plotting backend with fully interactive tools enabled, will be done afterwards in a separate PR. This PR is specifically for automatically displaying output from calls to matplotlib plotting functions inline with each paragraph. Thanks to the addition of post-execute hooks (ZEPPELIN-1423), there is no need to call any `show()` function to display an inline plot, just like in Jupyter.
### What type of PR is it?
Improvement
### Todos
The main code has been written and anyone who reads this is encouraged to test it, but there are a few minor todos:
- [x] - Add unit tests
- [x] - Add documentation
- [x] - Add screenshot showing iterative plotting with angular mode
### What is the Jira issue?
[ZEPPELIN-1345](https://issues.apache.org/jira/browse/ZEPPELIN-1345)
### How should this be tested?
In a pyspark or python paragraph, enter and run
``` python
import matplotlib.pyplot as plt
plt.plot([1, 2, 3])
```
The plot should be displayed automatically without calling any `show()` function whatsoever. A special method called `configure_mpl()` can also be used to modify the inline plotting behavior. For example,
``` python
z.configure_mpl(close=False, angular=True)
plt.plot([1, 2, 3])
```
allows for iterative updates to the plot, provided you have PY4J installed for your Python installation (which of course is always the case if you use pyspark). To clarify, this feature currently only works with pyspark (not python, as there are no `angularBind()` and `angularUnbind()` methods yet). Doing something like:
```
plt.plot([3, 2, 1])
```
will update the plot that was generated by the previous paragraph by leveraging Zeppelin's Angular Display System. However, with `close=False` set, matplotlib no longer closes figures automatically, so it is up to the user to explicitly close each figure instance they create.
There are quite a few more options for `z.configure_mpl()`, but I will save that discussion for the documentation.
### Screenshots (if appropriate)
### Questions:
- Does the licenses files need update? No
- Is there breaking changes for older versions? No
- Does this needs documentation? Yes
Author: Alex Goodman <agoodm@users.noreply.github.com>
Closes #1534 from agoodm/ZEPPELIN-1345 and squashes the following commits:
|
||
|
|
8f344db93e |
[ZEPPELIN-1421] Fix dead link in docs/README.md
### What is this PR for?
There is a dead link in [docs/README.md](https://github.com/apache/zeppelin/blob/master/docs/README.md). It should be `https://zeppelin.apache.org/docs/latest/` not `https://zeppelin.apache.org/docs/latest`
### What type of PR is it?
Bug Fix
### What is the Jira issue?
[ZEPPELIN-1421](https://issues.apache.org/jira/browse/ZEPPELIN-1421)
### How should this be tested?
- Before [https://zeppelin.apache.org/docs/latest](https://zeppelin.apache.org/docs/latest)
- After [https://zeppelin.apache.org/docs/latest/](https://zeppelin.apache.org/docs/latest/)
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: AhyoungRyu <fbdkdud93@hanmail.net>
Closes #1420 from AhyoungRyu/ZEPPELIN-1421 and squashes the following commits:
|
||
|
|
8b40268d16 |
ZEPPELIN-1318 - Add support for matplotlib displaying png images in python interpreter
### What is this PR for?
This PR adds support for plotting png images using the matplotlib helper function within a python interpreter (e.g. `z.show()`). The primary motivation is the overhead incurred by svg images, which can make notebooks lag when multiple complicated images are generated (for example, multiple filled contour plots). png images are more lightweight, but of course come at a cost in image quality, since they are raster rather than vector like svg.
Support for png images is provided through a new optional argument to `z.show` called `fmt`, which can be one of `'svg'` or `'png'`. The code currently used in show is reused for svg images, while the code for png images converts the image directly to a byte array and then inserts the decoded byte string directly into an HTML image tag. Currently `fmt` defaults to `'png'`, but I think we should consider discussing the pros and cons of each option in this PR.
### What type of PR is it?
Improvement
### What is the Jira issue?
[ZEPPELIN-1318](https://issues.apache.org/jira/browse/ZEPPELIN-1318)
### How should this be tested?
In a notebook cell, enter:
```python
%python
import matplotlib.pyplot as plt
import numpy as np
plt.figure()
plt.plot(np.arange(10))
z.show(plt, fmt=fmt)
```
Where `fmt` may be one of `'svg'` or `'png'`, and any other input should result in a `ValueError`. I would also recommend testing the example in the screenshot below.
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes (if the changes to the `help()` docstring are not sufficient)
Author: Alex Goodman <agoodm@users.noreply.github.com>
Closes #1329 from agoodm/ZEPPELIN-1318 and squashes the following commits:
|
||
|
|
85d4df4f0c |
[ZEPPELIN-1219] Add searching feature to Zeppelin docs site
### What is this PR for?
As more and more documentation pages are added, it is getting really hard to find specific pages. So I added a search feature to the Zeppelin documentation site (a [jekyll](https://jekyllrb.com/) based site) using [lunr.js](http://lunrjs.com/).
- **How does it work?** I created [`search_data.json`](
|
||
|
|
d8b54cf76d |
ZEPPELIN-1115: Python - interpreter for SQL over DataFrame
### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support
### What type of PR is it?
Improvement
### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] make Python-dependent tests excluded from CI
  * PythonInterpreterWithPythonInstalledTest
  * PythonPandasSqlInterpreterTest
  * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after #747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles
### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)
### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass, or manually run
- Given the DataFrame i.e
```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
```
- SQL query it like
```
%python.sql
SELECT * FROM rates LIMIT 10
```
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes
Author: Alexander Bezzubov <bzz@apache.org>
Closes #1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:
|
||
|
|
6bd4ede7e5 |
[DOC][MINOR] Add shell interpreter docs to _navigation.html
### What is this PR for?
After #1087 was merged, a new doc, `shell.md`, was added. But on the docs website, the Shell interpreter link still points to `pleasecontribute.html`. So I changed this link, applied the TOC and added more description.
### What type of PR is it?
Documentation
### Todos
* [x] - Change `pleasecontribute.html` -> `shell.html`
* [x] - Apply TOC (table of contents)
* [x] - Add more description to `shell.md`
### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no
Author: AhyoungRyu <fbdkdud93@hanmail.net>
Closes #1138 from AhyoungRyu/improve/shell-docs and squashes the following commits:
|
||
|
|
5975125f18 |
[ZEPPELIN-1018] Apply auto "Table of Contents" generator to Zeppelin docs website
### What is this PR for?
I added an automatic TOC (Table of Contents) generator to the Zeppelin documentation website. A TOC helps people look through the whole contents at a glance and quickly find what they want.
I just added `<div id="toc"></div>` to each documentation header. [`toc`](https://github.com/apache/zeppelin/compare/master...AhyoungRyu:ZEPPELIN-1018?expand=1#diff-85af09fb498a5667ea455391533f945dR3) recognizes `<h2>` & `<h3>` as titles in the docs and automatically generates the TOC. So I set a rule for this work. (I'll write this rule in `docs/CONTRIBUTING.md` or [docs/howtocontributewebsite](https://zeppelin.apache.org/docs/0.6.0-SNAPSHOT/development/howtocontributewebsite.html)).
```
# Level-1 Heading <- Use only for the main title of the page
## Level-2 Heading <- Start with this one
### Level-3 heading <- Only use this one for child of Level-2
toc only recognize Level-2 & Level-3
```
Please see the screenshot attached below.
### What type of PR is it?
Improvement & Documentation
### Todos
* [x] - Add TOC generator
* [x] - Apply TOC (`<div id="toc"></div>`) to every documentation page and reorganize each header (apply the above rule)
* [x] - Fix some broken code blocks in several docs
* [x] - Apply TOC to `r.md` (Currently R docs has some duplicated info since [this one](
|
||
|
|
df7dd5c373 |
[HOTFXI] Fix python test case and resolve rat license issue
### What is this PR for?
Update `testPy4jIsNotInstalled`, `testPy4jIsInstalled` test
- `z.show` -> `def show` to check that the `show` function is defined
- check whether `bootstrap_input.py` was executed by checking for `z = Py4jZeppelinContext` instead of `z = PyZeppelinContext`
- add license header in `__init__.py` file
### What type of PR is it?
Hot Fix
Author: Mina Lee <minalee@apache.org>
Closes #1075 from minahlee/adjustPythonTest and squashes the following commits:
|
||
|
|
230d890142 |
ZEPPELIN-1048: Pandas support for python interpreter
### What is this PR for?
Display Pandas DataFrame using Zeppelin's Table Display system.
### What type of PR is it?
Feature
### Todos
* [x] fix NPE in logs on empty paragraph execution
* [x] matplotlib: refactor `zeppelin_show(plt)` -> `z.show(plt)`
* [x] pandas: support `z.show(df)`
* [x] update docs
### What is the Jira issue?
[ZEPPELIN-1048](https://issues.apache.org/jira/browse/ZEPPELIN-1048)
### How should this be tested?
"Zeppelin Tutorial: Python - matplotlib basic" should work, and
```python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
z.show(rates)
```
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes
Author: Alexander Bezzubov <bzz@apache.org>
Closes #1067 from bzz/python/pandas-support and squashes the following commits:
|
||
|
|
ff4973d435 |
[ZEPPELIN-1045] Apply new mechanism to PythonInterpreter
### What is this PR for?
This PR applies the new interpreter registration mechanism to the Python interpreter.
### What type of PR is it?
Improvement
### What is the Jira issue?
[ZEPPELIN-1045](https://issues.apache.org/jira/browse/ZEPPELIN-1045)
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No
Author: Mina Lee <minalee@apache.org>
Closes #1063 from minahlee/ZEPPELIN-1045 and squashes the following commits:
|
||
|
|
c806d4a4c7 |
Python interpreter and doc cleanup
### What is this PR for?
This is the first step in improving the current Python interpreter implementation. It contains only cleanup, style and docs improvements.
### What type of PR is it?
Improvement
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No
Author: Alexander Bezzubov <bzz@apache.org>
Closes #1021 from bzz/improve/python and squashes the following commits:
|
||
|
|
34734b9c8a |
[ZEPPELIN-502] Python interpreter group
### What is this PR for?
Adding a Python 2 & 3 interpreter. It's a basic implementation (no py4j, for example), with a Java ProcessBuilder object used to instantiate a Python REPL. The interpreter doesn't bring its own Python binary but uses the Python specified by the `python.path` configuration. Thus, you can still use your own installed Python modules (scikit-learn, matplotlib...), and the interpreter is able to work with Python 2 & 3 without change.
I added a Python helper function (`zeppelin_show()`) to easily display matplotlib graphs as SVG.
### What type of PR is it?
[Feature]
### Todos
* [x] - Code review
* [x] - Improve bootstrap.py: choose available helper functions and their names
* [x] - Unit / IT tests ?
* [x] documentation updates needed, that AhyoungRyu pointed out
* [X] LICENSE needs to be updated to include all non-apache licensed dependencies (i.e. AFAIK Py4j is BSD) in bin-license
* [x] double-check that code formatting conforms to the project style guide
* [x] the branch needs to be rebased on latest master.
### What is the Jira issue?
[ZEPPELIN-502](https://issues.apache.org/jira/browse/ZEPPELIN-502?jql=project%20%3D%20ZEPPELIN%20AND%20text%20~%20%22python%22)
### How should this be tested?
1. In the interpreter screen, in the Python section, specify in `python.path` the Python binary you want to use
2. In a paragraph, you can use the interpreter with **_%python_**. Calling `help()` describes the interpreter's functionality.
3. Install py4j (`pip install py4j`) if you want to use input forms
### Screenshots
### Questions:
* Does the licenses files need update? Yes, only bin-license (py4j)
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes
Author: Hervé RIVIERE <hriviere@users.noreply.github.com>
Closes #869 from hriviere/PR_interpreter_python and squashes the following commits:
|
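The approach described in the ZEPPELIN-502 entry above, a long-lived Python REPL child process driven over its standard streams, can be sketched roughly as follows. This is only an illustration written in Python with `subprocess`, not the Java `ProcessBuilder` code from that PR. The `PythonRepl` class, the sentinel string and the `python_path` parameter (standing in for the `python.path` setting) are hypothetical names, and the sketch assumes single-line statements only.

```python
import subprocess
import sys

# Hypothetical marker printed after each statement so the reader knows
# where that statement's output ends.
SENTINEL = "__END_OF_PARAGRAPH__"


class PythonRepl:
    """Keeps one Python child process alive and feeds it statements."""

    def __init__(self, python_path=sys.executable):
        # -i forces interactive mode even though stdin is a pipe,
        # -q suppresses the startup banner, -u disables output buffering.
        self.proc = subprocess.Popen(
            [python_path, "-i", "-q", "-u"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # prompts and tracebacks are dropped here
            text=True,
            bufsize=1,
        )

    def run(self, statement):
        """Send one single-line statement and return what it wrote to stdout."""
        self.proc.stdin.write(statement + "\n")
        self.proc.stdin.write(f"print({SENTINEL!r})\n")
        self.proc.stdin.flush()
        lines = []
        for line in self.proc.stdout:
            line = line.rstrip("\n")
            if line == SENTINEL:
                break
            lines.append(line)
        return "\n".join(lines)

    def close(self):
        # Closing stdin sends EOF, which ends the interactive session.
        self.proc.stdin.close()
        self.proc.stdout.close()
        self.proc.wait()


if __name__ == "__main__":
    repl = PythonRepl()
    repl.run("x = 20")        # state persists between calls
    print(repl.run("x + 1"))
    repl.close()
```

Because the child process stays alive between calls, variables defined by one statement remain visible to later ones, which is the property a notebook paragraph model relies on. A fuller version would also need to capture stderr and handle multi-line blocks.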