zeppelin/beam
mingmxu b87bcf5a99 [ZEPPELIN-2865] upgrade Beam interpreter to latest version
### What is this PR for?
Upgrade the Beam interpreter to use the latest version of Apache Beam (2.0.0 at the time of this change).

### What type of PR is it?
[Improvement]

### Todos
*

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-2865

### How should this be tested?
* Start the Zeppelin server
* In a notebook paragraph, use the `%beam` interpreter prefix, then write your code, including the required imports and the runner

Refer to `docs/interpreter/beam.md` for an example.

### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? Yes, updated `docs/interpreter/beam.md` and `README.md`

Author: mingmxu <mingmxu@ebay.com>

Closes #2541 from XuMingmin/ZEPPELIN-2865 and squashes the following commits:

520f0fd7 [mingmxu] restore the notice message of scala-2.10
93b3e24d [mingmxu] upgrade to Apache Beam 2.0.0
2017-08-20 20:16:32 +09:00
src/main [ZEPPELIN-2403] interpreter property widgets 2017-07-06 15:54:55 +09:00
pom.xml [ZEPPELIN-2865] upgrade Beam interpreter to latest version 2017-08-20 20:16:32 +09:00
README.md [ZEPPELIN-2865] upgrade Beam interpreter to latest version 2017-08-20 20:16:32 +09:00

Overview

Beam interpreter for Apache Zeppelin

Architecture

The current interpreter implementation supports a static REPL: it compiles the code in memory, executes it, and redirects the output to Zeppelin.

Building the Beam Interpreter

First build the Beam interpreter by enabling the beam Maven profile as follows:

mvn clean package -Pbeam -DskipTests -Pscala-2.10

Notice

  • The Flink runner ships with binaries compiled for Scala 2.10, so only Scala 2.10 is currently supported.

Technical overview

  • Upon starting an interpreter, an instance of JavaCompiler is created.

  • When the user runs a paragraph with the beam prefix, the JavaParser goes through the code to find the class that contains the main method.

  • It then replaces that class name with a random class name so the new class does not clash with previously compiled ones. It also creates new out and err streams so that output is captured there instead of on the console and can be redirected to Zeppelin.

  • If an error occurs during compilation, the interpreter catches it and redirects the error message to Zeppelin.
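
The steps above can be sketched with the standard `javax.tools` API. This is a minimal illustration under stated assumptions, not the interpreter's actual source: `StaticReplSketch`, `StringSource`, `ByteCodeSink`, and `run` are hypothetical names, the caller passes in the name of the class that holds `main` (the real interpreter finds it with a parser), and a JDK must be available at runtime so that a system Java compiler exists.

```java
import javax.tools.*;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.lang.reflect.Method;
import java.net.URI;
import java.util.*;
import java.util.regex.Pattern;

final class StaticReplSketch {

    /** Java source kept in a String instead of a file on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + Kind.SOURCE.extension), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Compiled bytecode kept in a byte array instead of a .class file. */
    static final class ByteCodeSink extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ByteCodeSink(String className) {
            super(URI.create("bytes:///" + className.replace('.', '/') + Kind.CLASS.extension), Kind.CLASS);
        }
        @Override public OutputStream openOutputStream() { return bytes; }
    }

    /** Renames mainClass, compiles in memory, runs main, and returns the captured output or errors. */
    static String run(String source, String mainClass) throws Exception {
        // Random class name so repeated runs never reuse an earlier class name.
        String generated = "C" + UUID.randomUUID().toString().replace("-", "");
        String renamed = source.replaceAll("\\b" + Pattern.quote(mainClass) + "\\b", generated);

        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) throw new IllegalStateException("No system Java compiler (JDK required)");
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        Map<String, ByteCodeSink> compiled = new HashMap<>();
        JavaFileManager manager = new ForwardingJavaFileManager<StandardJavaFileManager>(
                compiler.getStandardFileManager(diagnostics, null, null)) {
            @Override
            public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String className,
                                                       JavaFileObject.Kind kind, FileObject sibling) {
                ByteCodeSink sink = new ByteCodeSink(className);
                compiled.put(className, sink);
                return sink;
            }
        };

        boolean ok = compiler.getTask(null, manager, diagnostics, null, null,
                Collections.singletonList(new StringSource(generated, renamed))).call();
        if (!ok) {
            // Compilation errors are returned to the caller instead of being printed.
            StringBuilder sb = new StringBuilder("ERROR\n");
            for (Diagnostic<? extends JavaFileObject> d : diagnostics.getDiagnostics()) {
                sb.append("line ").append(d.getLineNumber()).append(": ").append(d.getMessage(null)).append('\n');
            }
            return sb.toString();
        }

        // Class loader that defines classes from the in-memory bytecode.
        ClassLoader loader = new ClassLoader(StaticReplSketch.class.getClassLoader()) {
            @Override protected Class<?> findClass(String name) throws ClassNotFoundException {
                ByteCodeSink sink = compiled.get(name);
                if (sink == null) throw new ClassNotFoundException(name);
                byte[] b = sink.bytes.toByteArray();
                return defineClass(name, b, 0, b.length);
            }
        };

        // Swap System.out and System.err for capturing streams, then restore them.
        PrintStream oldOut = System.out;
        PrintStream oldErr = System.err;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream capture = new PrintStream(captured, true, "UTF-8");
        try {
            System.setOut(capture);
            System.setErr(capture);
            Method main = loader.loadClass(generated).getMethod("main", String[].class);
            main.setAccessible(true);
            main.invoke(null, (Object) new String[0]);
        } finally {
            System.setOut(oldOut);
            System.setErr(oldErr);
        }
        return captured.toString("UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String src = "public class Demo { public static void main(String[] a) { System.out.println(\"hello\"); } }";
        System.out.print(run(src, "Demo"));
    }
}
```

Calling `StaticReplSketch.run(source, "Demo")` returns whatever the compiled program wrote to standard output or standard error, or a string starting with `ERROR` that lists the compiler diagnostics. The textual rename is deliberately naive (it would also rewrite the name inside string literals), which is one reason a real implementation would rely on a parser instead.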