zeppelin/docs/interpreter/postgresql.md
AhyoungRyu 85d4df4f0c [ZEPPELIN-1219] Add searching feature to Zeppelin docs site
### What is this PR for?
As more and more document pages are added, it's really hard to find specific pages. So I added searching feature to Zeppelin documentation site([jekyll](https://jekyllrb.com/) based site) using [lunr.js](http://lunrjs.com/).

 - **How does it work?**

  I created [`search_data.json`](6e02423f54/docs/search_data.json) which is used for docs info template. `lunr.js` combines all of the text from all of the docs in `docs/` into `_site/search_data.json`. It looks like below.
![screen shot 2016-08-03 at 4 49 59 am](https://cloud.githubusercontent.com/assets/10060731/17342828/f2908be8-5935-11e6-8eee-b189677c0531.png)
All of this info comes from [Jekyll YAML front matter](https://jekyllrb.com/docs/frontmatter/) variables (i.e. title, group, description; that's why I rewrote all docs' title and description).
[search.js](6e02423f54/docs/assets/themes/zeppelin/js/search.js) will do this job using this data!

### What type of PR is it?
Improvement & Feature

### Todos
* [x] - Keep consistency for all docs pages' `Title`
* [x] - Add some overview sentences to all docs pages' `Description` section (this will be used as the result preview)
* [x] - Add apache license header to all docs page (some pages are missing the license header currently)
* [x] - Add LICENSE for `lunr.min.js`

### What is the Jira issue?
[ZEPPELIN-1219](https://issues.apache.org/jira/browse/ZEPPELIN-1219)

### How should this be tested?
1. Apply this patch and build `ZEPPELIN_HOME/docs` dir -> please see [docs/README.md#build-documentation](https://github.com/apache/zeppelin/tree/master/docs#build-documentation)
2. Click `search` icon in navbar and go to `search.html` page
3. Type anything you want to search in the search bar (i.e. type `python`, `spark`, `dynamic` ... )

### Screenshots (if appropriate)
![screen shot 2016-08-03 at 4 42 28 pm](https://cloud.githubusercontent.com/assets/10060731/17357851/d092e2ca-5999-11e6-9917-a3d4113e6e43.png)

![search](https://cloud.githubusercontent.com/assets/10060731/17357828/b2486cd6-5999-11e6-873b-121fac033b03.gif)

### Questions:
* Does the licenses files need update? Yes, for `lunr.min.js`
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: AhyoungRyu <fbdkdud93@hanmail.net>

Closes #1266 from AhyoungRyu/ZEPPELIN-1219 and squashes the following commits:

7ec8854 [AhyoungRyu] Modify 'no result' sentence
91b71a7 [AhyoungRyu] Remove Apache license header since JSON doesn't allow comment
34afd5d [AhyoungRyu] Add Apache license header to search_data.json
6784282 [AhyoungRyu] Minor search page UI update
0389d28 [AhyoungRyu] Make index.md not to be searched
9f1ba42 [AhyoungRyu] Disable enterkey press & change icon
bd4956a [AhyoungRyu] Add docs.js & search.js to exclude list in pom.xml
624b051 [AhyoungRyu] Add Apache license header to search.js
1381152 [AhyoungRyu] Fix search result skipping issue
6e775f5 [AhyoungRyu] Make pleasecontribute.md not to be searched
ee11136 [AhyoungRyu] Fix some typos
fa01299 [AhyoungRyu] Refine 'description' in some docs as @bzz suggested
da0cff9 [AhyoungRyu] Exclude lunr.min.js
36ba7f1 [AhyoungRyu] Add lunr.min.js license info
f6a05a6 [AhyoungRyu] Apply css style for the search results
68eb997 [AhyoungRyu] Attach 'Apache Zeppelin ZEPPELIN_VERSION Documentation: ' to title
d908c37 [AhyoungRyu] Add searching page
a951fa6 [AhyoungRyu] Add search icon to navbar
0688a79 [AhyoungRyu] Keep consistency all docs' front matter for the right search result
040f532 [AhyoungRyu] Add template for storing docs info based on jekyll front matter
0705bd6 [AhyoungRyu] Add js files: lunr.min.js & search.js
2016-08-10 12:39:22 +09:00


---
layout: page
title: "PostgreSQL, Apache HAWQ (incubating) Interpreter for Apache Zeppelin"
description: "Apache Zeppelin supports PostgreSQL, Apache HAWQ(incubating) and Greenplum SQL data processing engines."
group: interpreter
---
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
{% include JB/setup %}
# PostgreSQL, Apache HAWQ (incubating) Interpreter for Apache Zeppelin
<div id="toc"></div>
## Important Notice
The PostgreSQL Interpreter will be deprecated and merged into the JDBC Interpreter. You can get the same functionality by using PostgreSQL through the JDBC Interpreter. The settings and dependencies below show an example configuration.
### Properties
<table class="table-configuration">
<tr>
<th>Property</th>
<th>Value</th>
</tr>
<tr>
<td>psql.driver</td>
<td>org.postgresql.Driver</td>
</tr>
<tr>
<td>psql.url</td>
<td>jdbc:postgresql://localhost:5432/</td>
</tr>
<tr>
<td>psql.user</td>
<td>psqlUser</td>
</tr>
<tr>
<td>psql.password</td>
<td>psqlPassword</td>
</tr>
</table>
### Dependencies
<table class="table-configuration">
<tr>
<th>Artifact</th>
<th>Exclude</th>
</tr>
<tr>
<td>org.postgresql:postgresql:9.4-1201-jdbc41</td>
<td></td>
</tr>
</table>
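With the properties and dependency above saved on the JDBC Interpreter, a paragraph might look like the following minimal sketch. It assumes that the `psql` prefix of the property names is selected with the JDBC Interpreter's `%jdbc(prefix)` notation; the query itself is only an illustration and is not part of the original settings.

```sql
%jdbc(psql)
-- Assumption: the psql.* properties listed above are defined on the JDBC Interpreter.
-- Returns the server version string, which confirms that the connection works.
select version();
```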
---
## Overview
[<img align="right" src="http://img.youtube.com/vi/wqXXQhJ5Uk8/0.jpg" alt="zeppelin-view" hspace="10" width="250"></img>](https://www.youtube.com/watch?v=wqXXQhJ5Uk8)
This interpreter seamlessly supports the following SQL data processing engines:
* [PostgreSQL](http://www.postgresql.org/) - OSS, Object-relational database management system (ORDBMS)
* [Apache HAWQ (incubating)](http://hawq.incubator.apache.org/) - Powerful open source SQL-On-Hadoop engine.
* [Greenplum](http://pivotal.io/big-data/pivotal-greenplum-database) - MPP database built on open source PostgreSQL.
This [Video Tutorial](https://www.youtube.com/watch?v=wqXXQhJ5Uk8) illustrates some of the features provided by the `Postgresql Interpreter`.
<table class="table-configuration">
<tr>
<th>Name</th>
<th>Class</th>
<th>Description</th>
</tr>
<tr>
<td>%psql.sql</td>
<td>PostgreSqlInterpreter</td>
<td>Provides SQL environment for PostgreSQL, HAWQ and Greenplum</td>
</tr>
</table>
## Create Interpreter
By default Zeppelin creates one `PSQL` instance. You can remove it or create new instances.
Multiple PSQL instances can be created, each configured to the same or different backend databases. However, a `Notebook` can have only one PSQL interpreter instance `bound` at any given time. That means you _cannot_ connect to different databases in the same `Notebook`. This is a known Zeppelin limitation.
To create a new PSQL instance, open the `Interpreter` section and click the `+Create` button. Pick a `Name` of your choice and select `psql` from the `Interpreter` drop-down. Then follow the configuration instructions and `Save` the new instance.
> Note: The `Name` of the instance is used only to distinguish the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use the `%psql.sql` tag.
## Bind to Notebook
In the `Notebook` click on the `settings` icon in the top right corner. Then select/deselect the interpreters to be bound to the `Notebook`.
## Configuration
You can modify the configuration of the PSQL interpreter from the `Interpreter` section. The PSQL interpreter exposes the following properties:
<table class="table-configuration">
<tr>
<th>Property Name</th>
<th>Description</th>
<th>Default Value</th>
</tr>
<tr>
<td>postgresql.url</td>
<td>JDBC URL to connect to </td>
<td>jdbc:postgresql://localhost:5432</td>
</tr>
<tr>
<td>postgresql.user</td>
<td>JDBC user name</td>
<td>gpadmin</td>
</tr>
<tr>
<td>postgresql.password</td>
<td>JDBC password</td>
<td></td>
</tr>
<tr>
<td>postgresql.driver.name</td>
<td>JDBC driver name. In this version the driver name is fixed and should not be changed</td>
<td>org.postgresql.Driver</td>
</tr>
<tr>
<td>postgresql.max.result</td>
<td>Maximum number of SQL result rows to display, to prevent browser overload</td>
<td>1000</td>
</tr>
</table>
## How to use
```
Tip: Use (CTRL + .) for SQL auto-completion.
```
### DDL and SQL commands
Start each paragraph with the full `%psql.sql` prefix tag! The short notation `%psql` can still run the queries, but syntax highlighting and auto-completion will be disabled.
You can use the standard CREATE / DROP / INSERT commands to create or modify the data model:
```sql
%psql.sql
drop table if exists mytable;
create table mytable (i int);
insert into mytable select generate_series(1, 100);
```
Then in a separate paragraph run the query.
```sql
%psql.sql
select * from mytable;
```
> Note: You can have multiple queries in the same paragraph, but only the result from the first one is displayed. [[1](https://issues.apache.org/jira/browse/ZEPPELIN-178)], [[2](https://issues.apache.org/jira/browse/ZEPPELIN-212)].
For example, the following executes both queries, but only the count result is displayed. If you reverse the order of the queries, the content of `mytable` is shown instead.
```sql
%psql.sql
select count(*) from mytable;
select * from mytable;
```
### PSQL command line tools
Use the Shell Interpreter (`%sh`) to access the command line [PSQL](http://www.postgresql.org/docs/9.4/static/app-psql.html) interactively:
```bash
%sh
psql -h phd3.localdomain -U gpadmin -p 5432 <<EOF
\dn
\q
EOF
```
This will produce output like this:
```
        Name        |  Owner
--------------------+---------
 hawq_toolkit       | gpadmin
 information_schema | gpadmin
 madlib             | gpadmin
 pg_catalog         | gpadmin
 pg_toast           | gpadmin
 public             | gpadmin
 retail_demo        | gpadmin
```
### Apply Zeppelin Dynamic Forms
You can leverage [Zeppelin Dynamic Form](../manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parametrization features:
```sql
%psql.sql
SELECT ${group_by}, count(*) as count
FROM retail_demo.order_lineitems_pxf
GROUP BY ${group_by=product_id,product_id|product_name|customer_id|store_id}
ORDER BY count ${order=DESC,DESC|ASC}
LIMIT ${limit=10};
```
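As a smaller illustration of the `text input` form on its own, the sketch below filters the `mytable` table created earlier on this page. The form name `max_i` and its default value are hypothetical choices made for this example.

```sql
%psql.sql
-- The placeholder below renders a text input named max_i with a default value of 10.
-- Assumes mytable (i int) from the DDL example above already exists.
select * from mytable where i <= ${max_i=10};
```

Changing the value in the rendered input field and re-running the paragraph should re-execute the query with the new value substituted for the placeholder.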
### Example HAWQ PXF/HDFS Tables
Create a HAWQ external table that reads tab-separated-value data from HDFS:
```sql
%psql.sql
CREATE EXTERNAL TABLE retail_demo.payment_methods_pxf (
payment_method_id smallint,
payment_method_code character varying(20)
) LOCATION ('pxf://${NAME_NODE_HOST}:50070/retail_demo/payment_methods.tsv.gz?profile=HdfsTextSimple') FORMAT 'TEXT' (DELIMITER = E'\t');
```
Then retrieve the content:
```sql
%psql.sql
select * from retail_demo.payment_methods_pxf
```
## Auto-completion
The PSQL Interpreter provides basic auto-completion functionality. On `(Ctrl+.)` it lists the most relevant suggestions in a pop-up window. In addition to SQL keywords, the interpreter also provides suggestions for schema, table, and column names.