Interpreter documentation merge with commit #578

Jesang Yoon 2016-01-18 03:49:23 +09:00
parent af55811b54
commit 781954b82c
10 changed files with 669 additions and 751 deletions

File diff suppressed because it is too large


@ -6,12 +6,10 @@ group: manual
---
{% include JB/setup %}
## Elasticsearch Interpreter for Apache Zeppelin
[Elasticsearch](https://www.elastic.co/products/elasticsearch) is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time. It is generally used as the underlying engine/technology that powers applications that have complex search features and requirements.
## Configuration
<table class="table-configuration">
<tr>
<th>Property</th>
@ -44,7 +42,6 @@ group: manual
![Interpreter configuration](../assets/themes/zeppelin/img/docs-img/elasticsearch-config.png)
</center>
> **Note #1 :** You can add more properties to configure the Elasticsearch client.
> **Note #2 :** If you use Shield, you can add a property named `shield.user` with a value containing the name and the password (format: `username:password`). For more details about Shield configuration, consult the [Shield reference guide](https://www.elastic.co/guide/en/shield/current/_using_elasticsearch_java_clients_with_shield.html). Do not forget to copy the Shield client jar into the interpreter directory (`ZEPPELIN_HOME/interpreters/elasticsearch`).
@ -56,8 +53,9 @@ In a notebook, to enable the **Elasticsearch** interpreter, click the **Gear** i
In a paragraph, use `%elasticsearch` to select the Elasticsearch interpreter and then input all commands. To get the list of available commands, use `help`.
```bash
| %elasticsearch
| help
%elasticsearch
help
Elasticsearch interpreter:
General format: <command> /<indices>/<types>/<id> <option> <JSON>
- indices: list of indices separated by commas (depends on the command)
@ -83,8 +81,8 @@ Commands:
With the `get` command, you can find a document by id. The result is a JSON document.
```bash
| %elasticsearch
| get /index/type/id
%elasticsearch
get /index/type/id
```
Example:
@ -100,16 +98,16 @@ With the `search` command, you can send a search query to Elasticsearch. There a
* See [Elasticsearch query string syntax](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax) for more details about the content of such a query.
```bash
| %elasticsearch
| search /index1,index2,.../type1,type2,... <JSON document containing the query or query_string elements>
%elasticsearch
search /index1,index2,.../type1,type2,... <JSON document containing the query or query_string elements>
```
If you want to change the size of the result set, add a line that sets the size before your search command.
```bash
| %elasticsearch
| size 50
| search /index1,index2,.../type1,type2,... <JSON document containing the query or query_string elements>
%elasticsearch
size 50
search /index1,index2,.../type1,type2,... <JSON document containing the query or query_string elements>
```
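The command layout above (an optional `size` line, then `search` followed by comma-separated indices and types and a JSON query) can be assembled programmatically. The following is a minimal Python sketch for illustration only; the helper name `build_search_paragraph` is hypothetical and not part of Zeppelin or Elasticsearch.

```python
import json


def build_search_paragraph(indices, types, query, size=None):
    """Return the text of an %elasticsearch paragraph for a search command.

    Hypothetical helper: it only formats text in the layout shown above.
    """
    # Indices and types are comma-separated, as in /index1,index2/type1,type2
    path = "/" + ",".join(indices)
    if types:
        path += "/" + ",".join(types)

    lines = ["%elasticsearch"]
    if size is not None:
        # The size line must come before the search command
        lines.append(f"size {size}")
    lines.append(f"search {path} {json.dumps(query)}")
    return "\n".join(lines)


paragraph = build_search_paragraph(
    ["logs"], [], {"query": {"match_all": {}}}, size=50
)
print(paragraph)
```

The printed text can be pasted into a notebook paragraph as is.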
> A search query can also contain [aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html). If there is at least one aggregation, the result of the first aggregation is shown; otherwise, you get the search hits.
@ -119,30 +117,30 @@ Examples:
* With a JSON query:
```bash
| %elasticsearch
| search / { "query": { "match_all": { } } }
|
| %elasticsearch
| search /logs { "query": { "query_string": { "query": "request.method:GET AND status:200" } } }
|
| %elasticsearch
| search /logs { "aggs": {
| "content_length_stats": {
| "extended_stats": {
| "field": "content_length"
| }
| }
| } }
%elasticsearch
search / { "query": { "match_all": { } } }
%elasticsearch
search /logs { "query": { "query_string": { "query": "request.method:GET AND status:200" } } }
%elasticsearch
search /logs { "aggs": {
"content_length_stats": {
"extended_stats": {
"field": "content_length"
}
}
} }
```
* With query_string elements:
```bash
| %elasticsearch
| search /logs request.method:GET AND status:200
|
| %elasticsearch
| search /logs (404 AND (POST OR DELETE))
%elasticsearch
search /logs request.method:GET AND status:200
%elasticsearch
search /logs (404 AND (POST OR DELETE))
```
> **Important**: a document in Elasticsearch is a JSON document, so it is hierarchical, not flat like a row in a SQL table.
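To make the difference concrete, the sketch below turns a nested JSON document into a single flat row by joining nested keys with dots (the same dotted style as `request.method` in the query examples above). It is an illustrative Python sketch; the sample document and the `flatten` helper are assumptions, and the interpreter's own flattening may differ in detail.

```python
def flatten(doc, prefix=""):
    """Flatten a nested dict into one level, joining keys with dots."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the dotted prefix along
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical log document, for illustration only
doc = {
    "date": "2015-12-08",
    "request": {"method": "GET", "url": "/index.html"},
    "status": 200,
}
print(flatten(doc))
```

Each key of the flattened result corresponds to one column of a table row.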
@ -193,8 +191,8 @@ Examples:
With the `count` command, you can count documents available in some indices and types. You can also provide a query.
```bash
| %elasticsearch
| count /index1,index2,.../type1,type2,... <JSON document containing the query OR a query string>
%elasticsearch
count /index1,index2,.../type1,type2,... <JSON document containing the query OR a query string>
```
Examples:
@ -209,27 +207,26 @@ Examples:
With the `index` command, you can insert/update a document in Elasticsearch.
```bash
| %elasticsearch
| index /index/type/id <JSON document>
|
| %elasticsearch
| index /index/type <JSON document>
%elasticsearch
index /index/type/id <JSON document>
%elasticsearch
index /index/type <JSON document>
```
### Delete
With the `delete` command, you can delete a document.
```bash
| %elasticsearch
| delete /index/type/id
%elasticsearch
delete /index/type/id
```
### Apply Zeppelin Dynamic Forms
You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features.
```bash
| %elasticsearch
| size ${limit=10}
| search /index/type { "query": { "match_all": { } } }
%elasticsearch
size ${limit=10}
search /index/type { "query": { "match_all": { } } }
```
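The `${limit=10}` placeholder above declares a text input named `limit` with a default of `10`. The Python sketch below shows the idea of the substitution, under the assumption that only the simple `${name=default}` form is used; it is not Zeppelin's implementation and does not handle select forms.

```python
import re

# Matches ${name} or ${name=default}; a simplified, assumed grammar
FORM_PATTERN = re.compile(r"\$\{(\w+)(?:=([^}]*))?\}")


def fill_forms(text, values=None):
    """Replace each ${name=default} with the supplied value or its default."""
    values = values or {}

    def replace(match):
        name, default = match.group(1), match.group(2) or ""
        return str(values.get(name, default))

    return FORM_PATTERN.sub(replace, text)


template = 'size ${limit=10}\nsearch /index/type { "query": { "match_all": { } } }'
print(fill_forms(template))                 # uses the default value
print(fill_forms(template, {"limit": 25}))  # uses the value typed into the form
```

The JSON braces in the query are left untouched because only `${...}` sequences match the pattern.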


@ -6,7 +6,6 @@ group: manual
---
{% include JB/setup %}
## Flink interpreter for Apache Zeppelin
[Apache Flink](https://flink.apache.org) is an open source platform for distributed stream and batch data processing. Flink's core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink also builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program optimization.


@ -6,9 +6,7 @@ group: manual
---
{% include JB/setup %}
## Geode/Gemfire OQL Interpreter for Apache Zeppelin
<table class="table-configuration">
<tr>
<th>Name</th>
@ -36,7 +34,6 @@ This interpreter supports the [Geode](http://geode.incubator.apache.org/) [Objec
This [Video Tutorial](https://www.youtube.com/watch?v=zvzzA9GXu3Q) illustrates some of the features provided by the `Geode Interpreter`.
### Create Interpreter
By default Zeppelin creates one `Geode/OQL` instance. You can remove it or create more instances.
Multiple Geode instances can be created, each configured to the same or different backend Geode cluster. But at any given time a `Notebook` can have only one Geode interpreter instance `bound`. That means you _cannot_ connect to different Geode clusters in the same `Notebook`. This is a known Zeppelin limitation.
@ -46,11 +43,9 @@ To create new Geode instance open the `Interpreter` section and click the `+Crea
> Note: The `Name` of the instance is used only to distinguish the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use the `%geode.oql` tag.
### Bind to Notebook
In the `Notebook` click on the `settings` icon in the top right corner. Then select/deselect the interpreters to be bound with the `Notebook`.
### Configuration
You can modify the configuration of Geode from the `Interpreter` section. The Geode interpreter exposes the following properties:
<table class="table-configuration">
@ -77,13 +72,11 @@ You can modify the configuration of the Geode from the `Interpreter` section. T
</table>
### How to use
> *Tip 1: Use (CTRL + .) for OQL auto-completion.*
> *Tip 2: Always start the paragraphs with the full `%geode.oql` prefix tag! The short notation `%geode` would still be able to run the OQL queries, but syntax highlighting and auto-completion will be disabled.*
#### Create / Destroy Regions
The OQL specification does not support [Geode Regions](https://cwiki.apache.org/confluence/display/GEODE/Index#Index-MainConceptsandComponents) mutation operations. To `create`/`destroy` regions, use the [GFSH](http://geode-docs.cfapps.io/docs/tools_modules/gfsh/chapter_overview.html) shell tool instead. The following assumes that GFSH is colocated with the Zeppelin server.
```bash
@ -104,8 +97,7 @@ EOF
The above snippet re-creates two regions: `regionEmployee` and `regionCompany`. Note that you have to explicitly specify the locator host and port. The values should match those you have used in the Geode Interpreter configuration. See the comprehensive list of [GFSH Commands by Functional Area](http://geode-docs.cfapps.io/docs/tools_modules/gfsh/gfsh_quick_reference.html).
#### Basic OQL
#### Basic OQL
```sql
%geode.oql
SELECT count(*) FROM /regionEmployee
@ -145,7 +137,6 @@ SELECT e.key, e.value FROM /regionEmployee.entrySet e
> Note: You can have multiple queries in the same paragraph but only the result from the first is displayed. [[1](https://issues.apache.org/jira/browse/ZEPPELIN-178)], [[2](https://issues.apache.org/jira/browse/ZEPPELIN-212)].
#### GFSH Commands From The Shell
Use the Shell Interpreter (`%sh`) to run OQL commands from the command line:
```bash
@ -155,7 +146,6 @@ gfsh -e "connect" -e "list members"
```
#### Apply Zeppelin Dynamic Forms
You can leverage [Zeppelin Dynamic Form](../manual/dynamicform.html) inside your OQL queries. You can use both the `text input` and `select form` parameterization features.
```sql


@ -6,12 +6,10 @@ group: manual
---
{% include JB/setup %}
## Hive Interpreter for Apache Zeppelin
The [Apache Hive](https://hive.apache.org/) ™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.
### Configuration
<table class="table-configuration">
<tr>
<th>Property</th>
@ -73,7 +71,6 @@ The [Apache Hive](https://hive.apache.org/) ™ data warehouse software facilita
This interpreter supports multiple configurations through `${prefix}`. You can define multiple sets of connection properties with this prefix and select one with `%hive(${prefix})`.
## How to use
Basically, you can use
```sql
@ -92,7 +89,6 @@ select * from my_table;
You can also run multiple queries, up to 10 by default. Changing this setting is not implemented yet.
### Apply Zeppelin Dynamic Forms
You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features.
```sql


@ -6,11 +6,9 @@ group: manual
---
{% include JB/setup %}
## Ignite Interpreter for Apache Zeppelin
### Overview
[Apache Ignite](https://ignite.apache.org/) In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.
![Apache Ignite](../assets/themes/zeppelin/img/docs-img/ignite-logo.png)
@ -18,64 +16,60 @@ group: manual
You can use Zeppelin to retrieve distributed data from a cache using the Ignite SQL interpreter. Moreover, the Ignite interpreter allows you to execute any Scala code in cases when SQL doesn't fit your requirements. For example, you can populate data into your caches or execute distributed computations.
### Installing and Running Ignite example
In order to use Ignite interpreters, you can install Apache Ignite in a few simple steps:
1. Download the Ignite [source release](https://ignite.apache.org/download.html#sources) or [binary release](https://ignite.apache.org/download.html#binaries), whichever you prefer. You must download the same Ignite version that Zeppelin uses; otherwise you can't use Scala code on Zeppelin. You can find the Ignite version used by Zeppelin in the pom.xml located at `path/to/your-Zeppelin/ignite/pom.xml` (in the Zeppelin source release). Check the `ignite.version` property.<br>Currently, Zeppelin provides Ignite only in the Zeppelin source release. So, if you download the Zeppelin binary release (`zeppelin-0.5.0-incubating-bin-spark-xxx-hadoop-xx`), you cannot use the Ignite interpreter on Zeppelin. We are planning to include Ignite in a future binary release.
2. Examples are shipped as a separate Maven project, so to start running them you simply need to import the provided `<dest_dir>/apache-ignite-fabric-1.2.0-incubating-bin/pom.xml` file into your favourite IDE, such as Eclipse.
1. Download the Ignite [source release](https://ignite.apache.org/download.html#sources) or [binary release](https://ignite.apache.org/download.html#binaries), whichever you prefer. You must download the same Ignite version that Zeppelin uses; otherwise you can't use Scala code on Zeppelin. You can find the Ignite version used by Zeppelin in the pom.xml located at `path/to/your-Zeppelin/ignite/pom.xml` (in the Zeppelin source release). Check the `ignite.version` property.<br>Currently, Zeppelin provides Ignite only in the Zeppelin source release. So, if you download the Zeppelin binary release (`zeppelin-0.5.0-incubating-bin-spark-xxx-hadoop-xx`), you cannot use the Ignite interpreter on Zeppelin. We are planning to include Ignite in a future binary release.
2. Examples are shipped as a separate Maven project, so to start running them you simply need to import the provided `<dest_dir>/apache-ignite-fabric-1.2.0-incubating-bin/pom.xml` file into your favourite IDE, such as Eclipse.
* In case of Eclipse, Eclipse -> File -> Import -> Existing Maven Projects
* Set examples directory path to Eclipse and select the pom.xml.
* Then start `org.apache.ignite.examples.ExampleNodeStartup` (or whichever example you want) to run at least one Ignite node. When you run the example code, you may notice that the number of nodes increases one by one.
> **Tip. If you want to run Ignite examples from the command line rather than an IDE, you can export an executable Jar file from the IDE and then run it with the command below.**
```
$ nohup java -jar </path/to/your Jar file name>
```
* In case of Eclipse, Eclipse -> File -> Import -> Existing Maven Projects
* Set examples directory path to Eclipse and select the pom.xml.
* Then start `org.apache.ignite.examples.ExampleNodeStartup` (or whichever example you want) to run at least one Ignite node. When you run the example code, you may notice that the number of nodes increases one by one.
> **Tip. If you want to run Ignite examples from the command line rather than an IDE, you can export an executable Jar file from the IDE and then run it with the command below.**
```
$ nohup java -jar </path/to/your Jar file name>
```
### Configuring Ignite Interpreter
In the "Interpreters" menu, you can edit the Ignite interpreter or create a new one. Zeppelin provides these properties for Ignite.
<table class="table-configuration">
<table class="table-configuration">
<tr>
<th>Property Name</th>
<th>value</th>
<th>Description</th>
<th>Property Name</th>
<th>value</th>
<th>Description</th>
</tr>
<tr>
<td>ignite.addresses</td>
<td>127.0.0.1:47500..47509</td>
<td>Comma-separated list of Ignite cluster hosts. See [Ignite Cluster Configuration](https://apacheignite.readme.io/v1.2/docs/cluster-config) section for more details.</td>
<td>ignite.addresses</td>
<td>127.0.0.1:47500..47509</td>
<td>Comma-separated list of Ignite cluster hosts. See [Ignite Cluster Configuration](https://apacheignite.readme.io/v1.2/docs/cluster-config) section for more details.</td>
</tr>
<tr>
<td>ignite.clientMode</td>
<td>true</td>
<td>You can connect to the Ignite cluster as client or server node. See [Ignite Clients vs. Servers](https://apacheignite.readme.io/v1.2/docs/clients-vs-servers) section for details. Use true or false values in order to connect in client or server mode respectively.</td>
<td>ignite.clientMode</td>
<td>true</td>
<td>You can connect to the Ignite cluster as client or server node. See [Ignite Clients vs. Servers](https://apacheignite.readme.io/v1.2/docs/clients-vs-servers) section for details. Use true or false values in order to connect in client or server mode respectively.</td>
</tr>
<tr>
<td>ignite.config.url</td>
<td></td>
<td>Configuration URL. Overrides all other settings.</td>
</tr
<tr>
<td>ignite.jdbc.url</td>
<td>jdbc:ignite:cfg://default-ignite-jdbc.xml</td>
<td>Ignite JDBC connection URL.</td>
</tr>
<tr>
<td>ignite.peerClassLoadingEnabled</td>
<td>true</td>
<td>Enables peer-class-loading. See [Zero Deployment](https://apacheignite.readme.io/v1.2/docs/zero-deployment) section for details. Use true or false values in order to enable or disable P2P class loading respectively.</td>
<td>ignite.config.url</td>
<td></td>
<td>Configuration URL. Overrides all other settings.</td>
</tr>
</table>
<tr>
<td>ignite.jdbc.url</td>
<td>jdbc:ignite:cfg://default-ignite-jdbc.xml</td>
<td>Ignite JDBC connection URL.</td>
</tr>
<tr>
<td>ignite.peerClassLoadingEnabled</td>
<td>true</td>
<td>Enables peer-class-loading. See [Zero Deployment](https://apacheignite.readme.io/v1.2/docs/zero-deployment) section for details. Use true or false values in order to enable or disable P2P class loading respectively.</td>
</tr>
</table>
![Configuration of Ignite Interpreter](../assets/themes/zeppelin/img/docs-img/ignite-interpreter-setting.png)
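The default `ignite.addresses` value, `127.0.0.1:47500..47509`, packs several endpoints into one entry. The Python sketch below expands such a value into individual `host:port` pairs, assuming that `..` denotes an inclusive port range and that entries are separated by commas; the `expand_addresses` helper is hypothetical and only illustrates how the value reads.

```python
def expand_addresses(spec):
    """Expand 'host:first..last' entries into a list of 'host:port' strings.

    Assumes '..' marks an inclusive port range (an interpretation of the
    default value shown in the table, not code taken from Ignite).
    """
    addresses = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, ports = entry.partition(":")
        if ".." in ports:
            first, last = ports.split("..")
            addresses.extend(
                f"{host}:{port}" for port in range(int(first), int(last) + 1)
            )
        else:
            # Single port (or no port at all): keep the entry unchanged
            addresses.append(entry)
    return addresses


endpoints = expand_addresses("127.0.0.1:47500..47509")
print(len(endpoints), endpoints[0], endpoints[-1])
```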
### Interpreter Binding for Zeppelin Notebook
After configuring the Ignite interpreter, create your own notebook. Then you can bind interpreters as shown in the image below.
![Binding Interpreters](../assets/themes/zeppelin/img/docs-img/ignite-interpreter-binding.png)
@ -83,38 +77,37 @@ After configuring Ignite interpreter, create your own notebook. Then you can bin
For more interpreter binding information see [here](http://zeppelin.incubator.apache.org/docs/manual/interpreters.html).
### How to use Ignite SQL interpreter
To execute a SQL query, use the `%ignite.ignitesql` prefix. <br>
Supposing you are running `org.apache.ignite.examples.streaming.wordcount.StreamWords`, you can use the "words" cache (you have to specify this cache name in the `ignite.jdbc.url` property of the Ignite interpreter settings in Zeppelin).
For example, you can select the top 10 words in the words cache using the following query:
```
%ignite.ignitesql
select _val, count(_val) as cnt from String group by _val order by cnt desc limit 10
```
![IgniteSql on Zeppelin](../assets/themes/zeppelin/img/docs-img/ignite-sql-example.png)
```
%ignite.ignitesql
select _val, count(_val) as cnt from String group by _val order by cnt desc limit 10
```
![IgniteSql on Zeppelin](../assets/themes/zeppelin/img/docs-img/ignite-sql-example.png)
As long as your Ignite version and Zeppelin's Ignite version are the same, you can also use Scala code. Please check Zeppelin's Ignite version before you download your own Ignite.
```
%ignite
import org.apache.ignite._
import org.apache.ignite.cache.affinity._
import org.apache.ignite.cache.query._
import org.apache.ignite.configuration._
```
%ignite
import org.apache.ignite._
import org.apache.ignite.cache.affinity._
import org.apache.ignite.cache.query._
import org.apache.ignite.configuration._
import scala.collection.JavaConversions._
import scala.collection.JavaConversions._
val cache: IgniteCache[AffinityUuid, String] = ignite.cache("words")
val cache: IgniteCache[AffinityUuid, String] = ignite.cache("words")
val qry = new SqlFieldsQuery("select avg(cnt), min(cnt), max(cnt) from (select count(_val) as cnt from String group by _val)", true)
val qry = new SqlFieldsQuery("select avg(cnt), min(cnt), max(cnt) from (select count(_val) as cnt from String group by _val)", true)
val res = cache.query(qry).getAll()
val res = cache.query(qry).getAll()
collectionAsScalaIterable(res).foreach(println _)
```
![Using Scala Code](../assets/themes/zeppelin/img/docs-img/ignite-scala-example.png)
collectionAsScalaIterable(res).foreach(println _)
```
![Using Scala Code](../assets/themes/zeppelin/img/docs-img/ignite-scala-example.png)
Apache Ignite also provides a guide for Zeppelin: ["Ignite with Apache Zeppelin"](https://apacheignite.readme.io/docs/data-analysis-with-apache-zeppelin).


@ -16,69 +16,70 @@ group: manual
### Installing and Running Lens
In order to use Lens interpreters, you can install Apache Lens in a few simple steps:
1. Download the latest version of Lens from [the ASF](http://www.apache.org/dyn/closer.lua/lens/2.3-beta). Older releases can be found [in the Archives](http://archive.apache.org/dist/lens/).
2. Before running Lens, you have to set HIVE_HOME and HADOOP_HOME. For more information, refer to the [installation guide](http://lens.apache.org/lenshome/install-and-run.html#Installation). Lens also provides a pseudo-distributed mode. The [Lens pseudo-distributed setup](http://lens.apache.org/lenshome/pseudo-distributed-setup.html) uses [docker](https://www.docker.com/). The Hive server and Hadoop daemons run as separate processes in the Lens pseudo-distributed setup.
3. Now you can start (or stop) the Lens server.
```
./bin/lens-ctl start (or stop)
```
1. Download the latest version of Lens from [the ASF](http://www.apache.org/dyn/closer.lua/lens/2.3-beta). Older releases can be found [in the Archives](http://archive.apache.org/dist/lens/).
2. Before running Lens, you have to set HIVE_HOME and HADOOP_HOME. For more information, refer to the [installation guide](http://lens.apache.org/lenshome/install-and-run.html#Installation). Lens also provides a pseudo-distributed mode. The [Lens pseudo-distributed setup](http://lens.apache.org/lenshome/pseudo-distributed-setup.html) uses [docker](https://www.docker.com/). The Hive server and Hadoop daemons run as separate processes in the Lens pseudo-distributed setup.
3. Now you can start (or stop) the Lens server.
```
./bin/lens-ctl start (or stop)
```
### Configuring Lens Interpreter
In the "Interpreters" menu, you can edit the Lens interpreter or create a new one. Zeppelin provides these properties for Lens.
<table class="table-configuration">
<tr>
<th>Property Name</th>
<th>value</th>
<th>Description</th>
<th>Property Name</th>
<th>value</th>
<th>Description</th>
</tr>
<tr>
<td>lens.client.dbname</td>
<td>default</td>
<td>The database schema name</td>
<td>lens.client.dbname</td>
<td>default</td>
<td>The database schema name</td>
</tr>
<tr>
<td>lens.query.enable.persistent.resultset</td>
<td>false</td>
<td>Whether to enable persistent resultset for queries. When enabled, server will fetch results from driver, custom format them if any and store in a configured location. The file name of query output is queryhandle-id, with configured extensions</td>
<td>lens.query.enable.persistent.resultset</td>
<td>false</td>
<td>Whether to enable persistent resultset for queries. When enabled, server will fetch results from driver, custom format them if any and store in a configured location. The file name of query output is queryhandle-id, with configured extensions</td>
</tr>
<tr>
<td>lens.server.base.url</td>
<td>http://hostname:port/lensapi</td>
<td>The base URL for the Lens server. Edit "hostname" and "port" to match your setup (e.g. http://0.0.0.0:9999/lensapi)</td>
<td>lens.server.base.url</td>
<td>http://hostname:port/lensapi</td>
<td>The base URL for the Lens server. Edit "hostname" and "port" to match your setup (e.g. http://0.0.0.0:9999/lensapi)</td>
</tr>
<tr>
<td>lens.session.cluster.user </td>
<td>default</td>
<td>Hadoop cluster username</td>
<td>lens.session.cluster.user </td>
<td>default</td>
<td>Hadoop cluster username</td>
</tr>
<tr>
<td>zeppelin.lens.maxResult</td>
<td>1000</td>
<td>Max number of rows to display</td>
<td>zeppelin.lens.maxResult</td>
<td>1000</td>
<td>Max number of rows to display</td>
</tr>
<tr>
<td>zeppelin.lens.maxThreads</td>
<td>10</td>
<td>If concurrency is true then how many threads?</td>
<td>zeppelin.lens.maxThreads</td>
<td>10</td>
<td>If concurrency is true then how many threads?</td>
</tr>
<tr>
<td>zeppelin.lens.run.concurrent</td>
<td>true</td>
<td>Run concurrent Lens Sessions</td>
<td>zeppelin.lens.run.concurrent</td>
<td>true</td>
<td>Run concurrent Lens Sessions</td>
</tr>
<tr>
<td>xxx</td>
<td>yyy</td>
<td>anything else from [Configuring lens server](https://lens.apache.org/admin/config-server.html)</td>
<td>xxx</td>
<td>yyy</td>
<td>anything else from [Configuring lens server](https://lens.apache.org/admin/config-server.html)</td>
</tr>
</table>
![Apache Lens Interpreter Setting](../assets/themes/zeppelin/img/docs-img/lens-interpreter-setting.png)
### Interpreter Binding for Zeppelin Notebook
After configuring the Lens interpreter, create your own notebook. Then you can bind interpreters as shown in the image below.
After configuring the Lens interpreter, create your own notebook. Then you can bind interpreters as shown in the image below.
![Zeppelin Notebook Interpreter Binding](../assets/themes/zeppelin/img/docs-img/lens-interpreter-binding.png)
For more interpreter binding information see [here](http://zeppelin.incubator.apache.org/docs/manual/interpreters.html).
@ -90,80 +91,79 @@ As you can see in this video, they are using Lens Client Shell(./bin/lens-cli.sh
<li> Create and Use(Switch) Databases.
```
create database newDb
```
```
use newDb
```
```
create database newDb
```
```
use newDb
```
<li> Create Storage.
```
create storage your/path/to/lens/client/examples/resources/db-storage.xml
```
```
create storage your/path/to/lens/client/examples/resources/db-storage.xml
```
<li> Create Dimensions, Show fields and join-chains of them.
```
create dimension your/path/to/lens/client/examples/resources/customer.xml
```
```
dimension show fields customer
```
```
dimension show joinchains customer
```
```
create dimension your/path/to/lens/client/examples/resources/customer.xml
```
```
dimension show fields customer
```
```
dimension show joinchains customer
```
<li> Create Cubes, Show fields and join-chains of them.
```
create cube your/path/to/lens/client/examples/resources/sales-cube.xml
```
```
cube show fields sales
```
```
cube show joinchains sales
```
```
create cube your/path/to/lens/client/examples/resources/sales-cube.xml
```
```
cube show fields sales
```
```
cube show joinchains sales
```
<li> Create Dimtables and Fact.
```
create dimtable your/path/to/lens/client/examples/resources/customer_table.xml
```
```
create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml
```
```
create dimtable your/path/to/lens/client/examples/resources/customer_table.xml
```
```
create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml
```
<li> Add partitions to Dimtable and Fact.
```
dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml
```
```
fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml
```
```
dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml
```
```
fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml
```
<li> Now, you can run queries on cubes.
```
query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00')
```
![Lens Query Result](../assets/themes/zeppelin/img/docs-img/lens-result.png)
```
query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00')
```
![Lens Query Result](../assets/themes/zeppelin/img/docs-img/lens-result.png)
These are just examples provided in advance by Lens. If you want to explore the full Lens tutorial, see the [tutorial video](https://cwiki.apache.org/confluence/display/LENS/2015/07/13/20+Minute+video+demo+of+Apache+Lens+through+examples).
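The cube query above bounds the result with `time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00')`. Judging only from that example, the timestamps appear to follow a year-month-day-hour layout. The Python sketch below builds such a clause from `datetime` values under that assumption; the helper names are hypothetical and are not part of the Lens client.

```python
from datetime import datetime


def lens_time(moment):
    """Format a datetime as year-month-day-hour, e.g. 2015-04-11-00.

    The layout is inferred from the example query, not from a Lens spec.
    """
    return moment.strftime("%Y-%m-%d-%H")


def time_range_clause(column, start, end):
    """Build a time_range_in(...) clause for a cube query."""
    return f"time_range_in({column}, '{lens_time(start)}', '{lens_time(end)}')"


clause = time_range_clause(
    "delivery_time", datetime(2015, 4, 11), datetime(2015, 4, 13)
)
print(clause)
```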
### Lens UI Service
Lens also provides a web UI service. Once the server starts up, you can open the service at http://serverhost:19999/index.html and browse. There you can also inspect the structures you created and run queries easily.
![Lens UI Service](../assets/themes/zeppelin/img/docs-img/lens-ui-service.png)
![Lens UI Service](../assets/themes/zeppelin/img/docs-img/lens-ui-service.png)


@ -6,9 +6,7 @@ group: manual
---
{% include JB/setup %}
## PostgreSQL, HAWQ Interpreter for Apache Zeppelin
<table class="table-configuration">
<tr>
<th>Name</th>
@ -30,11 +28,9 @@ This interpreter seamlessly supports the following SQL data processing engines:
* [Apache HAWQ](http://pivotal.io/big-data/pivotal-hawq) - Powerful [Open Source](https://wiki.apache.org/incubator/HAWQProposal) SQL-On-Hadoop engine.
* [Greenplum](http://pivotal.io/big-data/pivotal-greenplum-database) - MPP database built on open source PostgreSQL.
This [Video Tutorial](https://www.youtube.com/watch?v=wqXXQhJ5Uk8) illustrates some of the features provided by the `Postgresql Interpreter`.
### Create Interpreter
By default Zeppelin creates one `PSQL` instance. You can remove it or create new instances.
Multiple PSQL instances can be created, each configured to the same or different backend databases. But at any given time a `Notebook` can have only one PSQL interpreter instance `bound`. That means you _cannot_ connect to different databases in the same `Notebook`. This is a known Zeppelin limitation.
@ -44,54 +40,50 @@ To create new PSQL instance open the `Interpreter` section and click the `+Creat
> Note: The `Name` of the instance is used only to distinguish the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use the `%psql.sql` tag.
### Bind to Notebook
In the `Notebook` click on the `settings` icon in the top right corner. Then select/deselect the interpreters to be bound with the `Notebook`.
### Configuration
You can modify the configuration of PSQL from the `Interpreter` section. The PSQL interpreter exposes the following properties:
<table class="table-configuration">
<tr>
<th>Property Name</th>
<th>Description</th>
<th>Default Value</th>
</tr>
<tr>
<td>postgresql.url</td>
<td>JDBC URL to connect to </td>
<td>jdbc:postgresql://localhost:5432</td>
</tr>
<tr>
<td>postgresql.user</td>
<td>JDBC user name</td>
<td>gpadmin</td>
</tr>
<tr>
<td>postgresql.password</td>
<td>JDBC password</td>
<td></td>
</tr>
<tr>
<td>postgresql.driver.name</td>
<td>JDBC driver name. In this version the driver name is fixed and should not be changed</td>
<td>org.postgresql.Driver</td>
</tr>
<tr>
<td>postgresql.max.result</td>
<td>Max number of SQL results to display, to prevent browser overload</td>
<td>1000</td>
</tr>
<tr>
<th>Property Name</th>
<th>Description</th>
<th>Default Value</th>
</tr>
<tr>
<td>postgresql.url</td>
<td>JDBC URL to connect to </td>
<td>jdbc:postgresql://localhost:5432</td>
</tr>
<tr>
<td>postgresql.user</td>
<td>JDBC user name</td>
<td>gpadmin</td>
</tr>
<tr>
<td>postgresql.password</td>
<td>JDBC password</td>
<td></td>
</tr>
<tr>
<td>postgresql.driver.name</td>
<td>JDBC driver name. In this version the driver name is fixed and should not be changed</td>
<td>org.postgresql.Driver</td>
</tr>
<tr>
<td>postgresql.max.result</td>
<td>Max number of SQL results to display, to prevent browser overload</td>
<td>1000</td>
</tr>
</table>
### How to use
```
Tip: Use (CTRL + .) for SQL auto-completion.
```
#### DDL and SQL commands
Start each paragraph with the full `%psql.sql` prefix tag! The short notation `%psql` can still run the queries, but syntax highlighting and auto-completion will be disabled.
You can use the standard CREATE / DROP / INSERT commands to create or modify the data model:
@ -121,7 +113,6 @@ select * from mytable;
```
#### PSQL command line tools
Use the Shell Interpreter (`%sh`) to access the command line [PSQL](http://www.postgresql.org/docs/9.4/static/app-psql.html) interactively:
```bash
@ -147,7 +138,6 @@ This will produce output like this:
```
#### Apply Zeppelin Dynamic Forms
You can leverage [Zeppelin Dynamic Form](../manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features.
```sql
@ -160,7 +150,6 @@ LIMIT ${limit=10};
```
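
To make the placeholder mechanism more concrete, here is a minimal Python sketch of how a `${name=default}` form could be expanded into a plain query. This is only an illustration under assumptions: `fill_forms` is a hypothetical helper, the regular expression is a simplification, and it is not how Zeppelin itself implements dynamic forms. For select forms, the sketch simply ignores the option list after the comma and uses the default value.

```python
import re

# Matches ${name} or ${name=default}; anything after a comma
# (the option list of a select form) is ignored in this sketch.
_FORM = re.compile(r"\$\{(?P<name>\w+)(?:=(?P<default>[^,}]*))?(?:,[^}]*)?\}")


def fill_forms(query, values=None):
    """Replace form placeholders with supplied values or their defaults."""
    values = values or {}

    def substitute(match):
        name = match.group("name")
        default = match.group("default") or ""
        return str(values.get(name, default))

    return _FORM.sub(substitute, query)


if __name__ == "__main__":
    query = "SELECT * FROM mytable LIMIT ${limit=10};"
    print(fill_forms(query))                 # uses the default
    print(fill_forms(query, {"limit": 25}))  # uses the supplied value
```

In the notebook, the value would come from the form rendered above the paragraph; here it is passed in as a dictionary for simplicity.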
#### Example HAWQ PXF/HDFS Tables
Create a HAWQ external table that reads tab-separated-value data from HDFS.
```sql
@ -179,5 +168,4 @@ select * from retail_demo.payment_methods_pxf
```
### Auto-completion
The PSQL Interpreter provides basic auto-completion functionality. On `(Ctrl+.)` it lists the most relevant suggestions in a pop-up window. In addition to SQL keywords, the interpreter also provides suggestions for schema, table, and column names.
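
The general idea of combining keywords with database object names can be sketched in Python as a simple case-insensitive prefix match. This is an assumption-laden illustration: `suggest`, `SQL_KEYWORDS`, and the ordering rule (object names before keywords) are hypothetical, and the interpreter's real ranking of "most relevant" suggestions is not described in this document.

```python
# A tiny, illustrative keyword list; the real interpreter knows many more.
SQL_KEYWORDS = ["SELECT", "SET", "FROM", "WHERE", "LIMIT"]


def suggest(prefix, keywords, names, limit=5):
    """Return up to `limit` case-insensitive prefix matches.

    Object names (schemas, tables, columns) are listed before SQL
    keywords; each group is sorted alphabetically.
    """
    needle = prefix.lower()
    candidates = sorted(names) + sorted(keywords)
    matches = [c for c in candidates if c.lower().startswith(needle)]
    return matches[:limit]


if __name__ == "__main__":
    print(suggest("se", SQL_KEYWORDS, ["sessions", "sales"]))
```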


@ -6,13 +6,10 @@ group: manual
---
{% include JB/setup %}
## Scalding Interpreter for Apache Zeppelin
[Scalding](https://github.com/twitter/scalding) is an open source Scala library for writing MapReduce jobs.
### Building the Scalding Interpreter
You first have to build the Scalding interpreter by enabling the **scalding** profile as follows:
```
@ -20,9 +17,8 @@ mvn clean package -Pscalding -DskipTests
```
### Enabling the Scalding Interpreter
In a notebook, to enable the **Scalding** interpreter, click on the **Gear** icon, select **Scalding**, and hit **Save**.
<center>
![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/scalding-InterpreterBinding.png)
@ -32,11 +28,9 @@ In a notebook, to enable the **Scalding** interpreter, click on the **Gear** ico
</center>
### Configuring the Interpreter
Zeppelin comes with a pre-configured Scalding interpreter in local mode, so you do not need to install anything.
### Testing the Interpreter
As an example, using the [Alice in Wonderland](https://gist.github.com/johnynek/a47699caa62f4f38a3e2) tutorial, we will count words (of course!) and plot a graph of the top 10 words in the book.
```
@ -78,7 +72,6 @@ If you click on the icon for the pie chart, you should be able to see a chart li
![Scalding - Pie - Chart](../assets/themes/zeppelin/img/docs-img/scalding-pie.png)
### Current Status & Future Work
The current implementation of the Scalding interpreter does not support canceling jobs or fine-grained progress updates.
The pre-configured Scalding interpreter only supports Scalding in local mode. Hadoop mode for Scalding is currently unsupported and is left for future work (contributions welcome!).


@ -19,14 +19,12 @@ limitations under the License.
-->
{% include JB/setup %}
## Interpreters in Zeppelin
In this section, we will explain the role of interpreters, interpreter groups and interpreter settings in Zeppelin.
The concept of Zeppelin interpreter allows any language/data-processing-backend to be plugged into Zeppelin.
Currently, Zeppelin supports many interpreters such as Scala ( with Apache Spark ), Python ( with Apache Spark ), SparkSQL, Hive, Markdown, Shell and so on.
## What is Zeppelin interpreter?
A Zeppelin interpreter is a plug-in which enables Zeppelin users to use a specific language/data-processing-backend. For example, to use Scala code in Zeppelin, you need the `%spark` interpreter.
When you click the ```+Create``` button in the interpreter page, the interpreter drop-down list box will show all the available interpreters on your server.
@ -34,13 +32,11 @@ When you click the ```+Create``` button in the interpreter page, the interpreter
<img src="/assets/themes/zeppelin/img/screenshots/interpreter_create.png">
## What is Zeppelin Interpreter Setting?
A Zeppelin interpreter setting is the configuration of a given interpreter on the Zeppelin server. For example, these are the properties required for the Hive JDBC interpreter to connect to the Hive server.
<img src="/assets/themes/zeppelin/img/screenshots/interpreter_setting.png">
## What is Zeppelin Interpreter Group?
Every interpreter belongs to an **Interpreter Group**, which is the unit in which interpreters are started and stopped.
By default, every interpreter belongs to a single group, but a group may contain more than one interpreter. For example, the Spark interpreter group includes Spark support, pySpark,
SparkSQL and the dependency loader.
@ -51,7 +47,6 @@ Each interpreters is belonged to a single group and registered together. All of
<img src="/assets/themes/zeppelin/img/screenshots/interpreter_setting_spark.png">
## Programming Languages for Interpreter
If the interpreter uses a specific programming language ( like Scala, Python, SQL ), it is generally recommended to add syntax highlighting support for that language to the notebook paragraph editor.
To check out the list of languages supported, see the `mode-*.js` files under `zeppelin-web/bower_components/ace-builds/src-noconflict` or from [github.com/ajaxorg/ace-builds](https://github.com/ajaxorg/ace-builds/tree/master/src-noconflict).
@ -61,4 +56,3 @@ If you want to add a new set of syntax highlighting,
1. Add the `mode-*.js` file to `zeppelin-web/bower.json` ( when built, `zeppelin-web/src/index.html` will be changed automatically. ).
2. Add to the list of `editorMode` in `zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js` - it follows the pattern 'ace/mode/x' where x is the name.
3. Add to the code that checks for `%` prefix and calls `session.setMode(editorMode.x)` in `setParagraphMode` located in `zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js`.
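
The logic behind steps 2 and 3 can be sketched as a lookup from the paragraph's `%` prefix to an `ace/mode/x` string. The real code lives in JavaScript in `paragraph.controller.js`; the Python below is only a language-neutral illustration, and the names `EDITOR_MODE`, `PREFIX_TO_MODE` and `paragraph_mode`, the specific prefix mappings, and the choice of default mode are all assumptions made for this example.

```python
# Step 2 (illustrative): mode names following the 'ace/mode/x' pattern.
EDITOR_MODE = {
    "scala": "ace/mode/scala",
    "sql": "ace/mode/sql",
    "markdown": "ace/mode/markdown",
}

# Step 3 (illustrative): which paragraph prefix selects which mode.
PREFIX_TO_MODE = {
    "%psql.sql": "sql",
    "%md": "markdown",
}


def paragraph_mode(text, default="scala"):
    """Pick an editor mode from the first token of the paragraph text."""
    tokens = text.split(None, 1)
    tag = tokens[0] if tokens else ""
    return EDITOR_MODE[PREFIX_TO_MODE.get(tag, default)]


if __name__ == "__main__":
    print(paragraph_mode("%psql.sql\nselect * from mytable;"))
    print(paragraph_mode("val x = 1"))
```

Adding a new language in this sketch means adding one entry to each dictionary, which mirrors adding an entry to `editorMode` and a prefix check in `setParagraphMode`.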