zeppelin/python/src/main/resources/bootstrap_sql.py
Alexander Bezzubov d8b54cf76d ZEPPELIN-1115: Python - interpreter for SQL over DataFrame
### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] make Python-dependant tests, excluded from CI
   * PythonInterpreterWithPythonInstalledTest
   * PythonPandasSqlInterpreterTest
   * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after #747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass or manually run
 - Given the DataFrame i.e

  ```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
  ```
 - SQL query it like

  ```
%python.sql
SELECT * FROM rates LIMIT 10
  ```

### Screenshots (if appropriate)
![screen shot 2016-07-11 at 23 56 04](https://cloud.githubusercontent.com/assets/5582506/16735171/1ebb9354-47c3-11e6-9354-6364e9374a20.png)

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs 
158ba6a [Alexander Bezzubov] Remove third-party dependant test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menue
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames
2016-07-15 18:37:18 +09:00

28 lines
No EOL
1.2 KiB
Python

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Setup SQL over Pandas DataFrames
# It requires next dependencies to be installed:
# - pandas
# - pandasql
from __future__ import print_function
try:
from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())
except ImportError:
pysqldf = lambda q: print("Can not run SQL over Pandas DataFrame" +
"Make sure 'pandas' and 'pandasql' libraries are installed")