109 commits

Author SHA1 Message Date
Prabhjyot Singh
31f584cfee [ZEPPELIN-1320] Run zeppelin interpreter process as web front end user
Have recreated this from https://github.com/apache/zeppelin/pull/1322
### What is this PR for?

When a notebook runs shell, Spark, or Python paragraphs, the interpreter runs as the same user as the Zeppelin server. This means these interpreters have the same file system permissions as the Zeppelin server.
IMO, for the security model to be complete, users should be able to run interpreters as themselves (impersonation).
### What type of PR is it?

[Improvement]
### Todos
- [x] - Update doc
- [x] - FIX NPEs
- [x] - FIX CI
### What is the Jira issue?
- [ZEPPELIN-1320](https://issues.apache.org/jira/browse/ZEPPELIN-1320)
### How should this be tested?
- Enable shiro auth in shiro.ini
- Add ssh key for the same user you want to try and impersonate (say user1).

```
adduser user1
ssh-keygen
ssh user1@localhost mkdir -p .ssh
cat ~/.ssh/id_rsa.pub | ssh user1@localhost 'cat >> .ssh/authorized_keys'
```
- Start the Zeppelin server.
- Go to the interpreter settings page and enable "User Impersonate" for any interpreter (in my example, the shell interpreter).
- Run the following in a notebook paragraph:

```
%sh
whoami
```

Check that it runs as the new user, i.e. "user1".
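
The flow in this test can be pictured as a launcher that prefixes the interpreter command with an SSH hop to the target user. This is only a minimal sketch, not the code from this PR: the helper name `impersonate_prefix` is hypothetical, and it assumes that `ZEPPELIN_SSH_COMMAND` (which the squashed commits mention as configurable) falls back to plain `ssh`.

```shell
# Hypothetical helper, for illustration only: prints the prefix that would
# run a command as another local user over SSH.
# Assumption: ZEPPELIN_SSH_COMMAND falls back to plain "ssh" when unset.
impersonate_prefix() {
  local user="$1"
  local ssh_cmd="${ZEPPELIN_SSH_COMMAND:-ssh}"
  if [ -z "${user}" ]; then
    # No user given: no prefix, so the command runs as the server's own user.
    return 0
  fi
  printf '%s %s@localhost' "${ssh_cmd}" "${user}"
}

# Show (without executing) what would be run for user1.
echo "$(impersonate_prefix user1) whoami"
```

The sketch only prints the command line, so it can be tried without the SSH key setup.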
### Screenshots (if appropriate)

![user impersonate](https://cloud.githubusercontent.com/assets/674497/20213127/f32fdc52-a82c-11e6-8e33-aebd6a943c5f.gif)

### Questions:
- Does the licenses files need update? no
- Is there breaking changes for older versions? no
- Does this needs documentation? yes

Author: Prabhjyot Singh <prabhjyotsingh@gmail.org>

Closes #1554 from prabhjyotsingh/ZEPPELIN-1320-2 and squashes the following commits:

dc69c9d [Prabhjyot Singh] @Leemoonsoo review comment: making ZEPPELIN_SSH_COMMAND configurable
1b26cc0 [Prabhjyot Singh] add doc
5a76839 [Prabhjyot Singh] show User Impersonate only when interpreter setting is "per user" and "isolated"
02c3084 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-1320-2
03b2f20 [Prabhjyot Singh] use user instead of ""
0ff80ec [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-1320-2
dd0731d [Prabhjyot Singh] fix missing test cases
aff1bf0 [Prabhjyot Singh] user should have option to run these interpreters as different user.
2016-11-17 19:07:29 -08:00
Jeff Zhang
465c51a419 ZEPPELIN-335. Pig Interpreter
### What is this PR for?
Based on #338, I refactored most of the Pig interpreter, as I don't think the approach in #338 is the best one. #338 uses the script `bin/pig` to launch Pig scripts, which makes the job difficult to control (hard to kill, and hard to get progress and stats info). In this PR, I use the Pig API to launch Pig scripts. Besides that, I implement another interpreter type, `%pig.query`, to leverage Zeppelin's display system. For details, see `pig.md`.

### What type of PR is it?
[Feature]

### Todos
* Syntax Highlight
* New interpreter type `%pig.udf`, so that users can write Pig UDFs in Zeppelin directly without building a UDF jar manually.

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-335

### How should this be tested?
Unit test is added and also manual test is done

### Screenshots (if appropriate)

![image](https://cloud.githubusercontent.com/assets/164491/18986649/54217b4c-8730-11e6-9e33-25f98a98a9b6.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Jeff Zhang <zjffdu@apache.org>
Author: Ali Bajwa <abajwa@hortonworks.com>
Author: AhyoungRyu <ahyoungryu@apache.org>
Author: Jeff Zhang <zjffdu@gmail.com>

Closes #1476 from zjffdu/ZEPPELIN-335 and squashes the following commits:

73a07f0 [Jeff Zhang] minor update
a1b742b [Jeff Zhang] minor update on doc
e858301 [Jeff Zhang] address comments
c85a090 [Jeff Zhang] add license
58b4b2f [Jeff Zhang] minor update of docs
1ae7db2 [Jeff Zhang] Merge pull request #2 from AhyoungRyu/ZEPPELIN-335/docs
fe014a7 [AhyoungRyu] Fix docs title in front matter
df7a6db [AhyoungRyu] Add pig.md to dropdown menu
5e2e222 [AhyoungRyu] Minor update for pig.md
39f161a [Jeff Zhang] address comments
05a3b9b [Jeff Zhang] add pig.md
a09a7f7 [Jeff Zhang] refactor pig Interpreter
c28beb5 [Ali Bajwa] Updated based on comments: 1. Documentation: added pig.md with interpreter documentation and added pig entry to index.md 2. Added test junit test based on passwd file parsing example here https://pig.apache.org/docs/r0.10.0/start.html#run 3. Removed author tag from comment (this was copied from shell interpreter https://github.com/apache/incubator-zeppelin/blob/master/shell/src/main/java/org/apache/zeppelin/shell/ShellInterpreter.java#L42) 4. Implemented cancel functionality 5. Display output stream in case of error
2586336 [Ali Bajwa] exposed timeout and pig executable via interpreter and added comments
7abad20 [Ali Bajwa] initial commit of pig interpreter
2016-10-15 12:26:50 -07:00
Mina Lee
abd95fa5e4 [HOTFIX] Set default ZEPPELIN_INTP_MEM
### What is this PR for?
This PR sets a default value for ZEPPELIN_INTP_MEM to avoid an OOM exception in SparkInterpreter when Zeppelin runs with zero configuration. This PR should be merged into both branch-0.6 and master.
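
The idea of a default that never overrides a user setting can be sketched as follows. The function name `set_default_intp_mem` is hypothetical and the memory values are placeholders, not necessarily the defaults chosen by this PR.

```shell
# Hypothetical sketch: apply a default only when ZEPPELIN_INTP_MEM is
# unset or empty, so a value from zeppelin-env.sh always wins.
# Assumption: the values below are placeholders, not this PR's defaults.
set_default_intp_mem() {
  if [ -z "${ZEPPELIN_INTP_MEM:-}" ]; then
    ZEPPELIN_INTP_MEM="-Xms1024m -Xmx1024m"
  fi
  export ZEPPELIN_INTP_MEM
}

set_default_intp_mem
echo "interpreter JVM memory options: ${ZEPPELIN_INTP_MEM}"
```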

### What type of PR is it?
Bug Fix

### How should this be tested?
1. Build with:
```
mvn clean package -DskipTests -pl '!zeppelin-distribution,!file,!alluxio,!livy,!hbase,!bigquery,!python,!jdbc,!ignite,!lens,!postgresql,!cassandra,!kylin,!elasticsearch,!flink,!markdown,!shell,!angular'
```
2. Unset SPARK_HOME in conf/zeppelin-env.sh if it is set.
3. Run Zeppelin with Java 1.7.
4. Run the tutorial and check that it doesn't hang.

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Mina Lee <minalee@apache.org>

Closes #1505 from minahlee/hotfix/default_intp_jvm and squashes the following commits:

0dfda4f [Mina Lee] Set default ZEPPELIN_INTP_MEM
2016-10-12 16:44:53 +09:00
rajarajan-g
eb01bddd98 [ZEPPELIN-1182] Zeppelin should have Startup and Shutdown message
### What is this PR for?
PR is for logging configuration details of Zeppelin server.

### What type of PR is it?
Improvement

### Todos
* [ ] - Task

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1182

### How should this be tested?
Please check the log if information on Server host, server path, context path, zeppelin version, class path is available in that file.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Note:
    Configuration details such as the Zeppelin version, server host, port, and context path are available as part of the Zeppelin configuration.

Logging information

```
 INFO [2016-09-02 14:35:25,218] ({main} ZeppelinConfiguration.java[create]:107) - Server Host: 0.0.0.0
 INFO [2016-09-02 14:35:25,218] ({main} ZeppelinConfiguration.java[create]:108) - Server Port: 8080
 INFO [2016-09-02 14:35:25,218] ({main} ZeppelinConfiguration.java[create]:109) - Context Path: /
 INFO [2016-09-02 14:35:25,224] ({main} ZeppelinConfiguration.java[create]:110) - Zeppelin Version: 0.7.0-SNAPSHOT
```

For the Zeppelin class path and restart information, I am logging from the shell script to the log file, as shown below, so the logging format is different. I know this is not per standards; please let me know if you have any suggestions.
```
Zeppelin is restarted

ZEPPELIN_CLASSPATH: /home/rajarajang/Workspace/stsWorksapce/zeppelin/zeppelin-server/target/lib/*:/home/rajarajang/Workspace/stsWorksapce/zeppelin/zeppelin-zengine/target/lib/*:/home/rajarajang/Workspace/stsWorksapce/zeppelin/zeppelin-interpreter/target/lib/*:/home/rajarajang/Workspace/stsWorksapce/zeppelin/*::/home/rajarajang/Workspace/stsWorksapce/zeppelin/conf:/home/rajarajang/Workspace/stsWorksapce/zeppelin/zeppelin-interpreter/target/classes:/home/rajarajang/Workspace/stsWorksapce/zeppelin/zeppelin-zengine/target/classes:/home/rajarajang/Workspace/stsWorksapce/zeppelin/zeppelin-server/target/classes
```

Author: rajarajan-g <rajarajan.ganesan@imaginea.com>

Closes #1399 from rajarajan-g/ZEPPELIN-1182 and squashes the following commits:

f4f7f44 [rajarajan-g] updated as per review comments
da69b16 [rajarajan-g] Added log statements for configuration
2016-09-21 08:18:05 -07:00
Jeff Zhang
93e37620c4 ZEPPELIN-1185. ZEPPELIN_INTP_JAVA_OPTS should not use ZEPPELIN_JAVA_OPTS
### What is this PR for?

Don't use ZEPPELIN_JAVA_OPTS as the default value of ZEPPELIN_INTP_JAVA_OPTS

### What type of PR is it?
Improvement

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1185

### How should this be tested?
Tested manually. By exporting the following variable, I can debug the Zeppelin server correctly and the remote interpreter process can run successfully. (Before this PR, the remote interpreter process would fail to launch because it would also listen on the same debug port.)
```
export ZEPPELIN_JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"
```
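
To illustrate why the two variables need to be independent, here is a small sketch that compares the JDWP ports of both option strings. The helper `jdwp_address` and the interpreter port 5006 are assumptions made for this example only.

```shell
# Print the JDWP listen address found in a JVM options string (empty if none).
jdwp_address() {
  printf '%s\n' "$1" | sed -n 's/.*address=\([0-9][0-9]*\).*/\1/p'
}

# Server options as in the export shown earlier; the interpreter options are a
# hypothetical value that uses a different port and does not suspend.
ZEPPELIN_JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"
ZEPPELIN_INTP_JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5006"

if [ "$(jdwp_address "${ZEPPELIN_JAVA_OPTS}")" = "$(jdwp_address "${ZEPPELIN_INTP_JAVA_OPTS}")" ]; then
  echo "conflict: server and interpreter would listen on the same debug port"
else
  echo "ok: server and interpreter use separate debug ports"
fi
```

If the interpreter inherited `ZEPPELIN_JAVA_OPTS` unchanged, both addresses would be 5005 and the check would report a conflict.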

Author: Jeff Zhang <zjffdu@apache.org>

Closes #1189 from zjffdu/ZEPPELIN-1185 and squashes the following commits:

9e48ad7 [Jeff Zhang] change for windows
3ff5561 [Jeff Zhang] update doc format
e82d889 [Jeff Zhang] add migration doc
ef5a360 [Jeff Zhang] ZEPPELIN-1185. ZEPPELIN_INTP_JAVA_OPTS should not use ZEPPELIN_JAVA_OPTS as default value
2016-09-01 09:30:45 +05:30
Lee moon soo
c1935e1e8d [ZEPPELIN-1264] [HOTFIX] Fix CI test failure with Failed to create interpreter: org.apache.zeppelin.interpreter.remote.mock.MockInterpreterA
### What is this PR for?
Fix CI test failure with error

```
14:05:27,226 ERROR org.apache.zeppelin.interpreter.remote.RemoteInterpreter:237 - Failed to create interpreter: org.apache.zeppelin.interpreter.remote.mock.MockInterpreterA
14:05:27,227 ERROR org.apache.zeppelin.interpreter.remote.RemoteInterpreter:264 - Failed to initialize interpreter: org.apache.zeppelin.interpreter.remote.mock.MockInterpreterA. Remove it from interpreterGroup
14:05:27,240  INFO org.apache.zeppelin.scheduler.SchedulerFactory:131 - Job jobName1 started by scheduler test
14:05:27,240  INFO org.apache.zeppelin.interpreter.remote.RemoteInterpreter:223 - Create remote interpreter org.apache.zeppelin.interpreter.remote.mock.MockInterpreterA
14:05:27,242 ERROR org.apache.zeppelin.interpreter.remote.RemoteInterpreter:237 - Failed to create interpreter: org.apache.zeppelin.interpreter.remote.mock.MockInterpreterA
14:05:27,243 ERROR org.apache.zeppelin.scheduler.Job:189 - Job failed
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.TApplicationException: Internal error processing createInterpreter
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:238)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:383)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:299)
	at org.apache.zeppelin.scheduler.RemoteSchedulerTest$2.jobRun(RemoteSchedulerTest.java:210)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.TApplicationException: Internal error processing createInterpreter
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(RemoteInterpreterService.java:196)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(RemoteInterpreterService.java:180)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:227)
	... 12 more
```

Some unit tests launch a remote interpreter process with a mock interpreter implementation, so the mock interpreter classes must be available on the interpreter's classpath during the test.

### What type of PR is it?
Hot Fix

### Todos
* [x] - Add necessary test-classes directory in interpreter process's classpath

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1264

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Lee moon soo <moon@apache.org>

Closes #1261 from Leemoonsoo/ZEPPELIN-1264 and squashes the following commits:

10ad928 [Lee moon soo] Add zeppelin-interpreter/target/test-classes, zeppelin-zengine/target/test-classes in classpath of interpreter
2016-08-02 16:12:48 -05:00
Jeff Zhang
1e478b2293 ZEPPELIN-1175. AM log is not available for yarn-client mode
### What is this PR for?
For now, the Zeppelin server and the remote interpreter process share the same classpath. This causes the AM log to be unavailable in yarn-client mode, because the YARN app also uses `ZEPPELIN_HOME/conf/log4j.properties`, which is only meant for the Zeppelin server. So this PR separates the CLASSPATH of the Zeppelin server from that of the remote interpreter process. I use `ZEPPELIN_INTP_CLASSPATH` to represent the classpath of the remote interpreter process and won't include `ZEPPELIN_HOME/conf/log4j.properties` in `ZEPPELIN_INTP_CLASSPATH`.
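
A rough sketch of the separation, under stated assumptions: the helper `add_to_path`, the default install path, and the directory entries are hypothetical, and only the variable names `ZEPPELIN_CLASSPATH` and `ZEPPELIN_INTP_CLASSPATH` come from the Zeppelin scripts.

```shell
# Hypothetical helper: prints $1 with entry $2 appended, colon-separated.
add_to_path() {
  if [ -z "$1" ]; then
    printf '%s' "$2"
  else
    printf '%s:%s' "$1" "$2"
  fi
}

ZEPPELIN_HOME="${ZEPPELIN_HOME:-/opt/zeppelin}"   # assumed install path

# Server classpath: includes conf/, so the server's log4j.properties is found.
ZEPPELIN_CLASSPATH=""
ZEPPELIN_CLASSPATH="$(add_to_path "${ZEPPELIN_CLASSPATH}" "${ZEPPELIN_HOME}/conf")"
ZEPPELIN_CLASSPATH="$(add_to_path "${ZEPPELIN_CLASSPATH}" "${ZEPPELIN_HOME}/lib/*")"

# Interpreter classpath: built separately and deliberately without conf/.
ZEPPELIN_INTP_CLASSPATH=""
ZEPPELIN_INTP_CLASSPATH="$(add_to_path "${ZEPPELIN_INTP_CLASSPATH}" "${ZEPPELIN_HOME}/interpreter/spark/*")"

echo "server:      ${ZEPPELIN_CLASSPATH}"
echo "interpreter: ${ZEPPELIN_INTP_CLASSPATH}"
```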

### What type of PR is it?
[Improvement]

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1175

### How should this be tested?
Tested manually.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? Yes, if users put a custom config file (e.g. hive-site.xml) under ZEPPELIN_HOME/conf, it won't take effect after this PR
* Does this needs documentation? Yes

Author: Jeff Zhang <zjffdu@apache.org>

Closes #1228 from zjffdu/ZEPPELIN-1175 and squashes the following commits:

0973477 [Jeff Zhang] ZEPPELIN-1175. AM log is not available for yarn-client mode
2016-08-01 11:42:03 +05:30
Lee moon soo
4efb39f450 [ZEPPELIN-1046] bin/install-interpreter.sh for netinst package
### What is this PR for?
Implementation of bin/install-interpreter.sh for the netinst package, as suggested in the [discussion](http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/Ask-opinion-regarding-0-6-0-release-package-tp3298p3314.html).

Example usage:

```
# download all interpreters provided by Apache Zeppelin project
bin/install-interpreter.sh --all

# download an interpreter with name (for example markdown interpreter)
bin/install-interpreter.sh --name md

# download an (3rd party) interpreter with specific maven artifact name
bin/install-interpreter.sh --name md -t org.apache.zeppelin:zeppelin-markdown:0.6.0-SNAPSHOT
```

If it looks fine, I'll continue the work (refactor code and add tests).

### What type of PR is it?
Feature

### Todos
* [x] - working implementation
* [x] - refactor
* [x] - add test

### What is the Jira issue?
* Open an issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN/
* Put link here, and add [ZEPPELIN-*Jira number*] in PR title, eg. [ZEPPELIN-533]

### How should this be tested?
Outline the steps to test the PR here.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update?
* Is there breaking changes for older versions?
* Does this needs documentation?

Author: Lee moon soo <moon@apache.org>
Author: AhyoungRyu <fbdkdud93@hanmail.net>

Closes #1042 from Leemoonsoo/netinst and squashes the following commits:

f81d16e [Lee moon soo] address mina's comment
049bc89 [Lee moon soo] Update docs
7307c67 [Lee moon soo] Merge remote-tracking branch 'AhyoungRyu/netinst-docs' into netinst
7e749ad [Lee moon soo] Address mina's comment
0eedd2a [AhyoungRyu] Address @minahlee feedback
13f2d04 [Lee moon soo] generate netinst package
03c664e [AhyoungRyu] Add a new line
5d0a971 [AhyoungRyu] Revert install.md to latest version
13899fb [AhyoungRyu] Reorganize interpreter installation docs
4c1f029 [Lee moon soo] Proxy support
9079580 [Lee moon soo] fix artifact name
1077296 [Lee moon soo] update test
aebca17 [Lee moon soo] Add docs
d547551 [Lee moon soo] Remove test entries
6ee06b8 [Lee moon soo] Make DependencyResolver in zeppelin-interpreter module not aware of ZEPPELIN_HOME
7b1b36a [Lee moon soo] update usage
49f0568 [Lee moon soo] Add conf/interpreter-list
1b558fd [Lee moon soo] update some text
ec7d152 [Lee moon soo] add tip
2c81a3f [Lee moon soo] update
78a7c52 [Lee moon soo] Refactor and add test
47f5706 [Lee moon soo] Install multiple interpreters at once
38e2556 [Lee moon soo] Initial implementation of install-interpreter.sh
2016-06-23 20:58:10 -07:00
Lee moon soo
fb4a76a20e Fix interpreter.sh classpath
### What is this PR for?
This PR applies the fix from #769 again, which was reverted by #208.
It also removes unnecessary code from interpreter.sh.

### What type of PR is it?
Bug Fix

### Todos
* [x] - Apply #769 again
* [x] - Remove unnecessary code

### What is the Jira issue?

### How should this be tested?

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Lee moon soo <moon@apache.org>

Closes #889 from Leemoonsoo/fix_interpreter_sh_classpath and squashes the following commits:

8468fd5 [Lee moon soo] Remove unnecessary construction of classpath
ea8fee8 [Lee moon soo] Apply pr769 again, reverted by pr208
2016-05-19 11:37:03 -07:00
Frank Rosner
54f4420f9f ZEPPELIN-815 don't create a sub shell for the runner
### What is this PR for?

This pull request is supposed to fix ZEPPELIN-815. The issue is that one cannot stop Zeppelin by sending a signal, because it starts the runner in a sub shell. The pull request starts the runner in the same process as the `zeppelin.sh`, making it react to signals.

### What type of PR is it?

Improvement

### Todos

None

### What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-815

### How should this be tested?
- Start `zeppelin.sh`
- Get PID of the `zeppelin.sh` process
- Send SIGINT (`kill -SIGINT <pid>`) to the `zeppelin.sh` process
- Observe that Zeppelin is stopped
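
The difference between the two launch styles can be observed through process IDs. This is a minimal sketch rather than the code from `zeppelin.sh`; the function names are made up for the example. Each function prints the PID of the launching shell followed by the PID of the "runner".

```shell
# Runner started inside a sub shell: it gets a PID different from the
# launching shell, so a signal sent to the launcher does not reach it directly.
pids_with_subshell() {
  sh -c 'echo "$$"; (sh -c "echo \$\$")'
}

# Runner started with exec: it replaces the launching shell and keeps its PID,
# so a signal sent to the launcher's PID is delivered to the runner.
pids_with_exec() {
  sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

echo "sub shell (launcher PID, runner PID):"
pids_with_subshell
echo "exec (launcher PID, runner PID):"
pids_with_exec
```

With `exec` the two printed PIDs are identical, which is why the PID obtained for `zeppelin.sh` can be used to stop Zeppelin.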

### Screenshots

none

### Questions:
* What is the reason to put an exec into a sub shell in the first place?

Author: Frank Rosner <frank@fam-rosner.de>

Closes #844 from FRosner/ZEPPELIN-815 and squashes the following commits:

45896d2 [Frank Rosner] ZEPPELIN-815 don't create a sub shell for the runner
2016-04-22 19:51:22 -07:00
Jongyoul Lee
53451e9124 [MINOR] Set log4j.configuration into JAVA_OPTS and JAVA_INTP_OPTS explicitly
### What is this PR for?
Set log4j into JVM option for enforcing logging configuration.
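
As a sketch of what setting the option explicitly looks like: the helper `with_log4j_opt` and the fallback conf directory are assumptions for illustration, while `-Dlog4j.configuration` is the standard log4j 1.x system property.

```shell
# Hypothetical helper: append an explicit log4j configuration to a JVM
# options string. The fallback conf directory is an assumed location.
with_log4j_opt() {
  local opts="$1"
  local conf_dir="${ZEPPELIN_CONF_DIR:-/opt/zeppelin/conf}"
  printf '%s -Dlog4j.configuration=file://%s/log4j.properties' "${opts}" "${conf_dir}"
}

# Apply the same explicit setting to the server and the interpreter options.
JAVA_OPTS="$(with_log4j_opt "${JAVA_OPTS:-}")"
JAVA_INTP_OPTS="$(with_log4j_opt "${JAVA_INTP_OPTS:-}")"
echo "JAVA_OPTS=${JAVA_OPTS}"
echo "JAVA_INTP_OPTS=${JAVA_INTP_OPTS}"
```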

### What type of PR is it?
[Bug Fix]

### Todos
* [x] - Set "-Dlog4j.configuration=..." into JVM option

### What is the Jira issue?
Minor issue

### How should this be tested?

1. Run Spark with the default log4j settings and check logs/zeppelin*.out. You can see logs from Spark there.
1. Apply this patch.
1. Run Spark with the default log4j settings and check logs/zeppelin-interpreter*.log. You can see logs from Spark there.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #830 from jongyoul/minor-set-log4j-explicitly and squashes the following commits:

2185284 [Jongyoul Lee] [MINOR] - Added "-Dlog4j.configuration" into JAVA_OPTS and JAVA_INTP_OPTS explicitly
2016-04-14 08:46:20 +09:00
Amos Elb
d5e87fb8ba R Interpreter for Zeppelin
This is the initial PR for an R Interpreter for Zeppelin.  There's still some work to be done (e.g., tests), but it's usable, and it brings to Zeppelin features from R such as its library of statistics and machine learning packages, as well as advanced interactive visualizations.  So I'd like to open it up for others to comment and/or become involved.

 Summary:

- There are two interpreters, one emulates a REPL, the other uses knitr to weave markdown and formatted R output.  The two interpreters share a single execution environment.

- Visualisations:  Besides R's own graphics, this also supports interactive visualizations with googleVis and rCharts.  I am working on htmlwidgets (almost done) with the author of that package, and a next-step project is to get Shiny/ggvis working.  Sometimes, a visualization won't load until the page is reloaded.  I'm not sure why this is.

- Licensing:  To talk to R, this integrates code forked from rScala.  rScala was released with a BSD-license option, and the author's permission was obtained.

- Spark:  Getting R to share a single spark context with the Spark interpreter group is going to be a project.  For right now, the R interpreters live in their own "r" interpreter group, and new spark contexts are created on startup.

- Zeppelin Context:  Not yet integrated, in significant part because there's no ZeppelinContext to talk to until it lives in the Spark interpreter group.

- Documentation:  A notebook is included that demonstrates what the interpreter does and how to use it.

- Tests:  Working on it...

P.S.:  This is my first PR on a project of this size; let me know what I messed up and I'll try to fix it ASAP.

Author: Amos Elb <amos.elberg@me.com>
Author: Amos B. Elberg <amos.elberg@me.com>

Closes #208 from elbamos/rinterpreter and squashes the following commits:

ffc1a25 [Amos Elb] Fix rat issue
a08ec5b [Amos B. Elberg] R Interpreter
2016-04-05 16:35:18 +09:00
Felix Cheung
26a2d641c7 [ZEPPELIN-767] HBase interpreter does not work with HBase on a remote cluster
### What is this PR for?
The HBase interpreter fails with the message "ERROR: KeeperErrorCode = ConnectionLoss for /hbase" when connecting to a remote HBase (for instance, HBase running in a CDH cluster).

Initially it was thought that the zkquorum settings were not being applied, but deeper investigation revealed that hbase-site.xml could not be loaded.

HBASE_HOME or HBASE_CONF_DIR is set by the `hbase` script when running the hbase shell. The interpreter needs to, at minimum, replicate that behavior and add the directory containing hbase-site.xml to the classpath in order to fix this issue.
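
A sketch of that lookup, with assumptions labeled: the helper `resolve_hbase_conf_dir`, the variable `INTP_CLASSPATH`, and the precedence of `HBASE_CONF_DIR` over `HBASE_HOME/conf` are illustrative choices, not taken from this PR's diff.

```shell
# Hypothetical helper: pick the directory expected to hold hbase-site.xml.
# Assumption: HBASE_CONF_DIR takes precedence over HBASE_HOME/conf.
resolve_hbase_conf_dir() {
  if [ -n "${HBASE_CONF_DIR:-}" ]; then
    printf '%s' "${HBASE_CONF_DIR}"
  elif [ -n "${HBASE_HOME:-}" ]; then
    printf '%s/conf' "${HBASE_HOME}"
  fi
}

# INTP_CLASSPATH is a placeholder name for the interpreter's classpath.
hbase_conf_dir="$(resolve_hbase_conf_dir)"
if [ -n "${hbase_conf_dir}" ]; then
  INTP_CLASSPATH="${INTP_CLASSPATH:+${INTP_CLASSPATH}:}${hbase_conf_dir}"
fi
echo "interpreter classpath: ${INTP_CLASSPATH:-<empty>}"
```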

### What type of PR is it?
Bug Fix

### Todos
* [x] - Bug fix
* [x] - Update documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-767

### How should this be tested?
(tested) Run HBase locally (standalone: https://hbase.apache.org/book.html#quickstart)
(tested) Set HBASE_HOME in env and work with HBASE on a Hadoop cluster
(tested) Set HBASE_CONF_DIR in env and work with HBASE on a Hadoop cluster

### Screenshots (if appropriate)
N/A

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Added

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #799 from felixcheung/hbaseconf and squashes the following commits:

ae90626 [Felix Cheung] fix test
a82c2a6 [Felix Cheung] fix bug, add doc, update text
eeb341f [Felix Cheung] set hbase conf dir to classpath
2016-04-01 14:53:04 -07:00
Minwoo Kang
566ffd0b9c [ZEPPELIN-715]Provide version information
### What is this PR for?
Provides version information from its REST API, GUI, and command line.

### What type of PR is it?
Feature

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-715

### How should this be tested?
- unit test
- runtime checking

### Screenshots (if appropriate)
- REST API

![zeppelinversionapi](https://cloud.githubusercontent.com/assets/10624086/13978232/f18b790a-f112-11e5-8361-d814b37184fb.png)

- GUI

![zeppelinversiongui](https://cloud.githubusercontent.com/assets/10624086/13978241/00ca2826-f113-11e5-86de-fcae1f95cbcc.png)

- Command line

![zeppelinversioncli](https://cloud.githubusercontent.com/assets/10624086/13978263/10801942-f113-11e5-93e9-8e122740ff00.png)

Author: Minwoo Kang <minwoo.kang@outlook.com>

Closes #792 from mwkang/ZEPPELIN-715 and squashes the following commits:

6829aaf [Minwoo Kang] [ZEPPELIN-715]Changed the command line arguments
d054227 [Minwoo Kang] [ZEPPELIN-715]Provide version information
d60f41a [Minwoo Kang] [ZEPPELIN-715]Provide version information (add ASF licenses, fix style)
1b34004 [Minwoo Kang] [ZEPPELIN-715]Provide version information (add ASF licenses)
c185a03 [Minwoo Kang] [ZEPPELIN-715]Provide version information (fix style)
f5b2e66 [Minwoo Kang] [ZEPPELIN-715]Provide version information
2016-03-25 21:26:26 -07:00
Silvio Fiorito
2dc464cfda [ZEPPELIN-647] - Native Windows support for startup scripts and configuration
### What is this PR for?
This is to give Windows first-class support for running Zeppelin without the need for Cygwin or other hacks.

### What type of PR is it?
Improvement

### Todos
* [x] - Fix notebook dir path handling which right now assumes URI compatible string (see https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/repo/VFSNotebookRepo.java#L63)
* [x] - Add documentation for configuring and running on Windows
* [x] - Independent code review of the CMD scripts to ensure they're correct

### Is there a relevant Jira issue?
ZEPPELIN-647

### How should this be tested?
* Pull this PR
* Build
* Override default ZEPPELIN_NOTEBOOK_DIR in zeppelin-env.cmd to be an absolute file URI such as file:///c:/notebook
* Start with bin\zeppelin.cmd
* If using any Hadoop system ensure you have winutils.exe in your HADOOP_HOME\bin, see (https://github.com/steveloughran/winutils)

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Silvio Fiorito <silvio.fiorito@granturing.com>
Author: Silvio Fiorito <Silvio Fiorito>

Closes #734 from granturing/windows-support and squashes the following commits:

8aadd45 [Silvio Fiorito] Fixes to handle spaces in paths properly, both for ZEPPELIN_HOME and CLASSPATH
73aaf4f [Silvio Fiorito] Default to the appropriate interpreter when running on Windows
db28fe9 [Silvio Fiorito] Support for running unit tests on Windows using the appropriate interpreter script
a1e3097 [Silvio Fiorito] Support for Windows CMD shell interpreter
82acdcf [Silvio Fiorito] Merge branch 'master' into windows-support
9e8b309 [Silvio Fiorito] Initital doc updates for running on Windows
03baf62 [Silvio Fiorito] Additional fix for embedded pyspark environment variables
2b9f01c [Silvio Fiorito] Fix for pyspark PYTHONPATH environment variable not being set properly due to delayed expansion
c700808 [Silvio Fiorito] Check for Windows path before creating URI to prevent URISyntaxExecption
d30e4b9 [Silvio Fiorito] And again fix indentations missed last time
5b49d3e [Silvio Fiorito] Cleaned up indentation
9e40482 [Silvio Fiorito] Initial support for Windows platform, startup scripts
2016-03-24 08:04:26 -07:00
AhyoungRyu
1a4e9ca229 Fix interpreter.sh to get Spark interpreter log file
### What is this PR for?
Currently, if users set their own `SPARK_HOME`, they can not get `zeppelin-interpreter-spark-xxxx.log` file. This PR is for fixing this issue.
(This issue is reported by weipuz)

### What type of PR is it?
Hot Fix

### Todos

### What is the Jira issue?
None

### How should this be tested?
After applying this PR,
1. Set your own `SPARK_HOME`.
2. Run `sc.version`(or whatever you want) with Spark interpreter.
3. Check under your `ZEPPELIN_HOME/logs/` directory, then you can find `zeppelin-interpreter-spark-xxx.log` file.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: AhyoungRyu <fbdkdud93@hanmail.net>

Closes #769 from AhyoungRyu/fix-spark-log and squashes the following commits:

dcdad56 [AhyoungRyu] Ping travis
8564f8c [AhyoungRyu] Fix interpreter.sh to get Spark interpreter log
2016-03-11 15:50:21 -08:00
Zhong Wang
cc24227bf0 Remove duplicated java option concats in common.sh
### What is this PR for?
There are some Java option concatenations in common.sh that are executed twice when an interpreter starts. This makes some of the options invalid, such as remote debugging options.

### What type of PR is it?
Bug Fix

### Todos

### Is there a relevant Jira issue?
[ZEPPELIN-686](https://issues.apache.org/jira/browse/ZEPPELIN-686)

### How should this be tested?
Set remote debug options:
```
export ZEPPELIN_INTP_JAVA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=n
```

Start the Zeppelin daemon, then create & run a job to trigger starting an interpreter. The job should fail without the fix.
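
This PR fixes the problem by removing the duplicated concatenations. For illustration only, a different and more defensive technique is an idempotent append, sketched here with a hypothetical helper `append_once`: running the same concatenation twice then leaves a single copy of the debug option.

```shell
# Hypothetical helper: append an option to an options string only if that
# exact option is not already present (options are space-separated).
append_once() {
  local opts="$1"
  local opt="$2"
  case " ${opts} " in
    *" ${opt} "*) printf '%s' "${opts}" ;;
    *) printf '%s' "${opts:+${opts} }${opt}" ;;
  esac
}

DEBUG_OPT="-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=n"
opts=""
opts="$(append_once "${opts}" "${DEBUG_OPT}")"
opts="$(append_once "${opts}" "${DEBUG_OPT}")"   # second pass changes nothing
echo "${opts}"
```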

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update?
NO

* Is there breaking changes for older versions?
NO

* Does this needs documentation?
NO

Author: Zhong Wang <wangzhong.neu@gmail.com>

Closes #749 from zhongneu/fix-duplicated-java-opts and squashes the following commits:

f89dbb6 [Zhong Wang] revert change for JAVA_OPTS for compatibility
75959ea [Zhong Wang] remove unneccessary concats in common.sh
2016-03-09 22:02:49 -08:00
Jeff Steinmetz
d2f9e6475e allows zeppelin to be run and managed as a service. ZEPPELIN-641
### What is this PR for?
Allows Zeppelin to be run and managed as a service. It does not start in the background via nohup;
the service manager handles the process instead.

### What type of PR is it?
Improvement

### Todos
* None, should work as is

### Is there a relevant Jira issue?
ZEPPELIN-641

### How should this be tested?
bin/zeppelin-daemon.sh upstart

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? updated

Author: Jeff Steinmetz <jeffrey.steinmetz@gmail.com>

Closes #722 from jeffsteinmetz/ZEPPELIN-641 and squashes the following commits:

205f8f0 [Jeff Steinmetz] add zeppelin.conf example to docs
1aab016 [Jeff Steinmetz] allows zeppelin to be run and managed as a service.  Jira Ticket ZEPPELIN-641
06ed0a3 [Jeff Steinmetz] allows zeppelin to be run and managed as a service.  Jira Ticket ZEPPELIN-641
2016-02-21 12:59:08 -08:00
Luciano Resende
831f426dba [ZEPPELIN-408] Properly honor notebook dir from xml configuration
This is a fork of #420 (stalled since December) that addresses the provided comments and also adds a minor test case for the property being addressed.

Author: Luciano Resende <lresende@apache.org>

Closes #731 from lresende/ZEPPELIN-408 and squashes the following commits:

d700872 [Luciano Resende] [ZEPPELIN-408] Properly honor notebook dir from xml configuration
2016-02-21 09:10:10 -08:00
Luciano Resende
44991ba04a [MINOR] Remove obsolete and old copyright notices in legal header
There is already a collective copyright notice in the NOTICE file

Author: Luciano Resende <lresende@apache.org>

Closes #732 from lresende/cleanup and squashes the following commits:

2b12ec5 [Luciano Resende] [MINOR] Remove obsolete and old copyright notices in legal header
2016-02-21 09:08:55 -08:00
Mina Lee
218a3b5bca [Zeppelin-630] Introduce new way of dependency loading to intepreter
### What is this PR for?
With this PR, users will be able to set external libraries to be loaded into a specific interpreter.

Note that the scope of this PR is downloading libraries to the local repository, not distributing them to other nodes. Only the Spark interpreter distributes loaded dependencies to worker nodes at the moment.

Here is a brief explanation of how the code works.
1. Receive a REST API request for the interpreter dependency setting from the front-end
2. Download the libraries into `ZEPPELIN_HOME/local-repo` and copy them to `ZEPPELIN_HOME/local-repo/{interpreterId}`
3. Add `ZEPPELIN_HOME/local-repo/{interpreterId}/*.jar` to the interpreter classpath when the interpreter process starts
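
Step 3 can be pictured with a short shell sketch. This is not the actual launcher code: the helper `add_jars_to_classpath`, the variable `CLASSPATH_SKETCH`, the temporary directory, and the interpreter id `spark-1` are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append every *.jar found directly under a directory
# to the colon-separated string held in CLASSPATH_SKETCH.
add_jars_to_classpath() {
  for jar in "$1"/*.jar; do
    # Skip the unexpanded pattern when the directory holds no jars.
    [ -e "$jar" ] || continue
    CLASSPATH_SKETCH="${CLASSPATH_SKETCH:+${CLASSPATH_SKETCH}:}${jar}"
  done
}

# Stand-in for ZEPPELIN_HOME/local-repo/{interpreterId}; the names are made up.
demo_home="$(mktemp -d)"
repo_dir="${demo_home}/local-repo/spark-1"
mkdir -p "${repo_dir}"
touch "${repo_dir}/a.jar" "${repo_dir}/b.jar" "${repo_dir}/notes.txt"

CLASSPATH_SKETCH=""
add_jars_to_classpath "${repo_dir}"
echo "${CLASSPATH_SKETCH}"
```

Only the two jar files end up in the string; `notes.txt` is ignored because it does not match the `*.jar` pattern.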

### What type of PR is it?
Improvement

### Todos
* [x] Add tests
* [x] Update docs

### Is there a relevant Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-630
And this PR will resolve [ZEPPELIN-194](https://issues.apache.org/jira/browse/ZEPPELIN-194) [ZEPPELIN-381](https://issues.apache.org/jira/browse/ZEPPELIN-381) [ZEPPELIN-609](https://issues.apache.org/jira/browse/ZEPPELIN-609)

### How should this be tested?
1. Add a repository (in the interpreter menu, click the gear button at the top right)

    ```
    id: spark-packages
    url: http://dl.bintray.com/spark-packages/maven
    snapshot: false
    ```
2. Set the dependency in the Spark interpreter (click the edit button of the Spark interpreter setting)

    ```
    artifact: com.databricks:spark-csv_2.10:1.3.0
    ```
3. Download the example CSV file

    ```
    $ wget https://github.com/databricks/spark-csv/raw/master/src/test/resources/cars.csv
    ```
4. Run the code below in a paragraph

    ```
    val df = sqlContext.read
        .format("com.databricks.spark.csv")
        .option("header", "true") // Use first line of all files as header
        .option("inferSchema", "true") // Automatically infer data types
        .load("file:///your/download/path/cars.csv")
    df.registerTempTable("cars")
    ```
    ```
    %sql select * from cars
    ```

### Screenshots (if appropriate)
* Toggle repository list
<img width="1146" alt="screen shot 2016-01-25 at 12 24 44 pm" src="https://cloud.githubusercontent.com/assets/8503346/12563475/52f060ac-c35f-11e5-8621-d8eb97b4d6a1.png">

* Add new repository
<img width="1146" alt="screen shot 2016-01-25 at 12 25 23 pm" src="https://cloud.githubusercontent.com/assets/8503346/12563472/52eb545e-c35f-11e5-9050-a5306d2765f1.png">

* Show repository info
<img width="1146" alt="screen shot 2016-01-25 at 12 25 28 pm" src="https://cloud.githubusercontent.com/assets/8503346/12563473/52ebab84-c35f-11e5-9acb-3a356c855dc7.png">

* Interpreter dependency
<img width="1146" alt="screen shot 2016-01-25 at 12 27 27 pm" src="https://cloud.githubusercontent.com/assets/8503346/12563471/52eadd9e-c35f-11e5-8e1a-f583ea8800aa.png">

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions?
  - Users who create or update interpreter settings through the REST API must add a `dependencies` object to the request payload.
  - The %dep interpreter is deprecated. The functionality is still there, but loading third-party dependencies via the interpreter menu is recommended.

* Does this need documentation? Yes

Author: Mina Lee <minalee@nflabs.com>

Closes #673 from minahlee/ZEPPELIN-630 and squashes the following commits:

62a75c9 [Mina Lee] Merge branch 'master' of https://github.com/apache/incubator-zeppelin into ZEPPELIN-630
545c173 [Mina Lee] Change variable name LOCAL_REPO_DIR -> LOCAL_INTERPRETER_REPO
1e3dd47 [Mina Lee] Fix docs indentation
320f400 [Mina Lee] Add documentation
6b90c3d [Mina Lee] Fix mislocated interpreter setting save/cancel button
e161b98 [Mina Lee] Add tests and split ZeppelinRestApiTest into two files
387e21e [Mina Lee] Close input tag
ee7532b [Mina Lee] Combine catch block for readability
eb4a78f [Mina Lee] Handle url with file protocol for repository URL input field
bae0c02 [Mina Lee] * Fix DependencyResolver addRepo/delRepo method * Manage repository information in `conf/interpreter.json` * Front-end modification to manage repository list * Add RestApi for adding/deleting repository * Fix tests
fe9cb92 [Mina Lee] Enable adding interpreter dependency via GUI
d5c931b [Mina Lee] Fix test after rebase
1b6a818 [Mina Lee] Remove test with unused ZeppelinContext load() method
37005c5 [Mina Lee] Remove unused methods and add deprecated message for dep interpreter
2cd715c [Mina Lee] Add env variable/property to configuration template files
848d931 [Mina Lee] Make external libraries to be added to interpreter process classpath
2016-02-01 11:10:43 +09:00
beeva-victorgarcia
404846f969 JDBC generic interpreter
You only need to add the JDBC connector jar to the classpath; the interpreter adds the particular properties for each database.
In the file zeppelin-daemon.sh, add:
ZEPPELIN_CLASSPATH+=":${ZEPPELIN_HOME}/jdbc/jdbc/connector jar"

Author: beeva-victorgarcia <victor.garcia@beeva.com>
Author: Victor <viktor.manuel.garcia@gmail.com>
Author: vgmartinez <viktor.manuel.garcia@gmail.com>

Closes #361 from vgmartinez/jdbc_interpreter and squashes the following commits:

2513c1b [vgmartinez] Merge branch 'master' into jdbc_interpreter
8046692 [vgmartinez] merged with master
e602621 [beeva-victorgarcia] remove spaces
37a4c1a [beeva-victorgarcia] remove dependency
4085849 [beeva-victorgarcia] rebase branch
bd20ac2 [beeva-victorgarcia] remove README.md
2f93406 [beeva-victorgarcia] clean commit
f0ad06d [beeva-victorgarcia] Merge branch 'master' of https://github.com/apache/incubator-zeppelin
a0e0d54 [beeva-victorgarcia] add some test
fe92f89 [beeva-victorgarcia] add multiple connections for interpreter
a06718c [beeva-victorgarcia] -a
09006f1 [Victor] fix test
462c3b1 [Victor] Merge branch 'master' of https://github.com/apache/incubator-zeppelin into jdbc_interpreter
a66a5b7 [Victor] add descriptions
710699c [Victor] deleted cassandra/.cache-main from commit
4f28f5a [Victor] change psql to jdbc
53d0a81 [Victor] generic interpreter for jdbc
2016-01-18 12:25:52 +09:00
Felix Cheung
69537f1412 [ZEPPELIN-395] Support Spark 1.6
Adding support for Spark 1.6

Status:
- [x] Spark/Scala
- [x] PySpark
- [x] Spark SQL - broken - fixed
![image](https://cloud.githubusercontent.com/assets/8969467/11355751/4b9caad8-920c-11e5-9392-7a92b34da582.png)
![image](https://cloud.githubusercontent.com/assets/8969467/11413943/0ea52fb2-93a4-11e5-973e-038982ea1f64.png)

TODO:
- [x] update pom when the artifacts are on central repo
- [x] update travis to build 1.6
- [x] update doc (updated readme)

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #463 from felixcheung/spark16 and squashes the following commits:

e2f444f [Felix Cheung] push readme update
0809031 [Felix Cheung] reduce test to run for spark 1.5.2
eaf3127 [Felix Cheung] change to final url for spark download, add to travis
52c1d75 [Felix Cheung] fix url for spark hist (this is one that works for now)
97e6b3b [Felix Cheung] 1.6 from maven
30844d7 [Felix Cheung] fix progressing result DataFrame - z.showDF and %sql work now
f0c2207 [Felix Cheung] Spark/PySpark working
2016-01-07 13:31:23 -08:00
Lee moon soo
f54b49b40a [ZEPPELIN-515] Hadoop libraries in ${HADOOP_HOME}/share folder not included in CLASSPATH
### What is this PR for?
Find and add jars under ${HADOOP_HOME}/share, recursively.
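
A recursive jar lookup of this kind can be sketched with `find`. The helper name `find_jars_recursively`, the variable `HADOOP_JARS`, and the fake directory layout below are assumptions made for the demo, not the code from this PR.

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated list of every *.jar found
# anywhere below the given directory.
find_jars_recursively() {
  find "$1" -type f -name '*.jar' | sort | tr '\n' ':' | sed 's/:$//'
}

# Fake layout standing in for ${HADOOP_HOME}/share; the jar names are made up.
demo_hadoop_home="$(mktemp -d)"
mkdir -p "${demo_hadoop_home}/share/hadoop/common/lib"
mkdir -p "${demo_hadoop_home}/share/hadoop/hdfs"
touch "${demo_hadoop_home}/share/hadoop/common/hadoop-common.jar"
touch "${demo_hadoop_home}/share/hadoop/common/lib/guava.jar"
touch "${demo_hadoop_home}/share/hadoop/hdfs/hadoop-hdfs.jar"

HADOOP_JARS="$(find_jars_recursively "${demo_hadoop_home}/share")"
echo "${HADOOP_JARS}"
```

Jars nested at any depth are picked up, which a single-level glob such as `share/*.jar` would miss.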

### What type of PR is it?
Bug Fix

### Is there a relevant Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-515

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: Lee moon soo <moon@apache.org>

Closes #551 from Leemoonsoo/ZEPPELIN-515 and squashes the following commits:

350aaa0 [Lee moon soo] find jar under share in recursive way
2015-12-21 09:19:34 +09:00
Lee moon soo
b3cca395b7 ZEPPELIN-305 Do not add jvm memory option when using spark-submit
https://issues.apache.org/jira/browse/ZEPPELIN-305

When `SPARK_HOME` is defined and `bin/interpreter.sh` launches the interpreter process using the `spark-submit` command, the JVM memory options from `bin/interpreter.sh` and from the `spark-submit` command conflict.
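
The idea can be sketched as a small shell function. It is only an illustration under assumptions: the helper `build_intp_java_opts`, its argument order, and the sample option values are hypothetical and are not taken from `bin/interpreter.sh`.

```shell
#!/bin/sh
# Hypothetical helper: print the JVM options for an interpreter process.
#   $1 = path to spark-submit (empty when spark-submit is not used)
#   $2 = memory options, standing in for ZEPPELIN_INTP_MEM
#   $3 = the remaining Java options
build_intp_java_opts() {
  if [ -n "$1" ]; then
    # spark-submit is in use: leave the memory options out to avoid a conflict.
    printf '%s' "$3"
  else
    printf '%s' "$3 $2"
  fi
}

OPTS_PLAIN="$(build_intp_java_opts "" "-Xmx1024m" "-Dfile.encoding=UTF-8")"
OPTS_SUBMIT="$(build_intp_java_opts "/opt/spark/bin/spark-submit" "-Xmx1024m" "-Dfile.encoding=UTF-8")"
echo "plain:  ${OPTS_PLAIN}"
echo "submit: ${OPTS_SUBMIT}"
```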

Author: Lee moon soo <moon@apache.org>

Closes #526 from Leemoonsoo/ZEPPELIN-305 and squashes the following commits:

446b596 [Lee moon soo] Not apply ${ZEPPELIN_INTP_MEM} when using SPARK_SUBMIT
2015-12-13 07:09:53 +09:00
Mina Lee
b22fe2fa8e [ZEPPELIN-495] Enable running interpreters with distribution package
In distribution package, zeppelin-interpreter module is not added to classpath when Zeppelin starts interpreter process.
This PR packages zeppelin-interpreter classes and dependencies into one jar file, and adds this jar to classpath.

Author: Mina Lee <minalee@nflabs.com>

Closes #524 from minahlee/ZEPPELIN-495 and squashes the following commits:

efc5f31 [Mina Lee] [ZEPPELIN-495] Exclude dependency-reduced-pom.xml from license check
401a6cb [Mina Lee] [ZEPPELIN-495] Enable running interpreters with distribution package
2015-12-11 11:30:58 +09:00
Lee moon soo
ddde27a7ab ZEPPELIN-469 Interpreter process loads unnecessary classes
Addresses issue https://issues.apache.org/jira/browse/ZEPPELIN-469

This PR fixes the problem by removing `export ZEPPELIN_CLASSPATH`, so the classpath from bin/zeppelin-daemon.sh is not propagated to bin/interpreter.sh.
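
The effect of `export` on child processes can be seen in isolation with a few lines of shell. The variable `DEMO_CLASSPATH` is a stand-in invented for this sketch; the real scripts are not involved.

```shell
#!/bin/sh
# Make sure the demo variable is not inherited from the caller's environment.
unset DEMO_CLASSPATH

# A variable that is set but not exported stays in the current shell.
DEMO_CLASSPATH="/zeppelin/lib/*"
SEEN_WITHOUT_EXPORT="$(sh -c 'printf "%s" "${DEMO_CLASSPATH:-unset}"')"

# Once exported, every child process inherits it.
export DEMO_CLASSPATH
SEEN_WITH_EXPORT="$(sh -c 'printf "%s" "${DEMO_CLASSPATH:-unset}"')"

echo "without export: ${SEEN_WITHOUT_EXPORT}"
echo "with export:    ${SEEN_WITH_EXPORT}"
```

Without the `export`, a child script has to build its own classpath instead of silently inheriting the parent's.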

It can be verified by printing the URLs of the system classloader inside a notebook, like

```scala
val cl = ClassLoader.getSystemClassLoader()
val ucl = cl.asInstanceOf[java.net.URLClassLoader]
ucl.getURLs.foreach(u=>println(u))
```

Result is

Before

```
cl: ClassLoader = sun.misc.Launcher$AppClassLoader@36c51089
ucl: java.net.URLClassLoader = sun.misc.Launcher$AppClassLoader@36c51089
file:/zeppelin/
file:/zeppelin/
file:/zeppelin/interpreter/spark/dep/datanucleus-api-jdo-3.2.6.jar
file:/zeppelin/interpreter/spark/dep/datanucleus-core-3.2.10.jar
file:/zeppelin/interpreter/spark/dep/datanucleus-rdbms-3.2.9.jar
file:/zeppelin/interpreter/spark/dep/zeppelin-spark-dependencies-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/interpreter/spark/zeppelin-spark-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/lib/asm-3.1.jar
file:/zeppelin/lib/aws-java-sdk-core-1.10.1.jar
file:/zeppelin/lib/aws-java-sdk-kms-1.10.1.jar
file:/zeppelin/lib/aws-java-sdk-s3-1.10.1.jar
...
...
...
file:/zeppelin/lib/regexp-1.3.jar
file:/zeppelin/lib/scala-library-2.10.4.jar
file:/zeppelin/lib/slf4j-api-1.7.10.jar
file:/zeppelin/lib/slf4j-log4j12-1.7.10.jar
file:/zeppelin/lib/stax2-api-3.1.1.jar
file:/zeppelin/lib/woodstox-core-asl-4.2.0.jar
file:/zeppelin/lib/wsdl4j-1.6.3.jar
file:/zeppelin/lib/xml-apis-1.4.01.jar
file:/zeppelin/lib/xmlschema-core-2.0.3.jar
file:/zeppelin/lib/zeppelin-interpreter-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/lib/zeppelin-zengine-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/zeppelin-server-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/
file:/zeppelin/conf/
file:/zeppelin/conf/
file:/zeppelin/conf/
```

After

```
cl: ClassLoader = sun.misc.Launcher$AppClassLoader@338bd37a
ucl: java.net.URLClassLoader = sun.misc.Launcher$AppClassLoader@338bd37a
file:/zeppelin/
file:/zeppelin/
file:/zeppelin/interpreter/spark/dep/datanucleus-api-jdo-3.2.6.jar
file:/zeppelin/interpreter/spark/dep/datanucleus-core-3.2.10.jar
file:/zeppelin/interpreter/spark/dep/datanucleus-rdbms-3.2.9.jar
file:/zeppelin/interpreter/spark/dep/zeppelin-spark-dependencies-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/interpreter/spark/zeppelin-spark-0.6.0-incubating-SNAPSHOT.jar
file:/zeppelin/
file:/zeppelin/conf/
file:/zeppelin/conf/
```

Author: Lee moon soo <moon@apache.org>

Closes #485 from Leemoonsoo/ZEPPELIN-469 and squashes the following commits:

63dcaaf [Lee moon soo] do not export ZEPPELIN_CLASSPATH
2015-11-29 07:50:13 +09:00
Chae-Sung Lim
96b720239d [ZEPPELIN-383-Additional_modifications] Typo fixes
#412

fix :: bin/interpreter.sh

ZEPPELIN_CLASSPATH_OVERRIDE**SS** -> ZEPPELIN_CLASSPATH_OVERRIDE**S**

Author: Chae-Sung Lim <estail7s@gmail.com>

Closes #422 from cloverhearts/ZEPPELIN-383-Additional_modifications and squashes the following commits:

90a267e [Chae-Sung Lim] ZEPPELIN-383-Additional_modifications
2015-11-13 09:55:38 +09:00
Eric Charles
748533b2a5 ZEPPELIN-383 Override classpath with ZEPPELIN_CLASSPATH_OVERRIDES
Prepend ZEPPELIN_CLASSPATH_OVERRIDES environment variable when building CLASSPATH in the shell scripts.
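
Prepending matters because the JVM searches classpath entries in order, so earlier entries win. A minimal sketch, assuming a hypothetical helper `prepend_overrides` and made-up paths:

```shell
#!/bin/sh
# Hypothetical helper: put the overrides in front of the base classpath.
#   $1 = overrides (may be empty), $2 = base classpath
prepend_overrides() {
  if [ -n "$1" ]; then
    printf '%s:%s' "$1" "$2"
  else
    # No overrides: return the base classpath without a stray leading colon.
    printf '%s' "$2"
  fi
}

CP_WITH_OVERRIDES="$(prepend_overrides "/opt/patches/*" "/zeppelin/lib/*")"
CP_WITHOUT_OVERRIDES="$(prepend_overrides "" "/zeppelin/lib/*")"
echo "${CP_WITH_OVERRIDES}"
echo "${CP_WITHOUT_OVERRIDES}"
```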

This PR replaces the closed #398 and #386. Sorry for the mess...

Author: Eric Charles <eric@datalayer.io>

Closes #412 from echarles/ZEPPELIN-383-CLASSPATH_OVERRIDES and squashes the following commits:

8572ec5 [Eric Charles] Use ZEPPELIN_CLASSPATH_OVERRIDES instead of CLASSPATH_OVERRIDES
d63cc55 [Eric Charles] Prepend CLASSPATH_OVERRIDES environment variable when building CLASSPATH in the shell scripts
2015-11-12 11:49:35 +09:00
Rohit Agarwal
b929b34b50 ZEPPELIN-238: Remove unused swagger code
Author: Rohit Agarwal <rohita@qubole.com>

Closes #226 from mindprince/ZEPPELIN-238 and squashes the following commits:

a2df01d [Rohit Agarwal] ZEPPELIN-238: Remove unused swagger code
2015-09-10 16:00:08 -07:00
Lee moon soo
b4b4f5521a ZEPPELIN-262 Use spark-submit to run spark interpreter process
https://issues.apache.org/jira/browse/ZEPPELIN-262

This patch makes Zeppelin use spark-submit to run the Spark interpreter process when SPARK_HOME is defined. This will potentially solve all the configuration problems related to the Spark interpreter.

#### How to use?

Define SPARK_HOME env variable in conf/zeppelin-env.sh
Then it'll use your SPARK_HOME/bin/spark-submit, so you will not need any additional configuration :-)
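
The selection logic can be sketched as follows. The helper `choose_runner` is hypothetical, and falling back to a plain `java` command when SPARK_HOME is empty is an assumption about the old launch path, not something this description spells out.

```shell
#!/bin/sh
# Hypothetical helper: pick the command used to start the Spark interpreter.
#   $1 = value of SPARK_HOME (may be empty)
choose_runner() {
  if [ -n "$1" ]; then
    printf '%s' "$1/bin/spark-submit"
  else
    # Assumed fallback for the old way of launching the interpreter.
    printf '%s' "java"
  fi
}

RUNNER_WITH_SPARK="$(choose_runner "/opt/spark")"
RUNNER_WITHOUT_SPARK="$(choose_runner "")"
echo "${RUNNER_WITH_SPARK}"
echo "${RUNNER_WITHOUT_SPARK}"
```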

#### Backward compatibility

If you have not defined SPARK_HOME, you are still able to run the Spark interpreter in the old (current) way.
However, it is no longer encouraged.

Author: Lee moon soo <moon@apache.org>

Closes #270 from Leemoonsoo/spark_submit and squashes the following commits:

4eb0848 [Lee moon soo] export and check SPARK_SUBMIT
a8a3440 [Lee moon soo] handle spark.files correctly for pyspark when spark-submit is used
d4acd1b [Lee moon soo] Add PYTHONPATH
c9418c6 [Lee moon soo] Bring back some entries with more commments
cac2bb8 [Lee moon soo] Take care classpath of SparkIMain
5d3154e [Lee moon soo] Remove clean. otherwise mvn clean package will remove interpreter/spark/dep directory
2d27e9c [Lee moon soo] use spark-submit to run spark interpreter process when SPARK_HOME is defined
2015-09-07 21:40:58 -07:00
Lee moon soo
5de01c6800 ZEPPELIN-160 Working with provided Spark, Hadoop.
Zeppelin currently embeds all Spark dependencies under interpreter/spark and loads them at runtime.

This is useful because users can try Zeppelin + Spark in local mode without installing and configuring Spark.

However, when a user has an existing Spark and Hadoop installation, it is really helpful to just point to it instead of building Zeppelin with a specific Spark and Hadoop version combination.

This PR implements the ability to use an external Spark and Hadoop installation by doing the following:

* The spark-dependencies module packages Spark/Hadoop dependencies under interpreter/spark/dep, to support local mode (current behavior)
* When SPARK_HOME and HADOOP_HOME are defined, bin/interpreter.sh excludes interpreter/spark/dep from the classpath and includes the system-installed Spark and Hadoop in the classpath.

This patch makes the Zeppelin binary independent of the Spark version. Once Zeppelin has been built, SPARK_HOME can point to any version of Spark.

Author: Lee moon soo <moon@apache.org>

Closes #244 from Leemoonsoo/spark_provided and squashes the following commits:

654c378 [Lee moon soo] use consistant, simpler expressions
57b3f96 [Lee moon soo] Add comment
eb4ec09 [Lee moon soo] fix reading spark-*.conf file
bacfd93 [Lee moon soo] Update readme
3a88c77 [Lee moon soo] Test use explicitly %spark
5a17d9c [Lee moon soo] Call sqlContext.sql using reflection
615c395 [Lee moon soo] get correct method
0c28561 [Lee moon soo] call listenerBus() using reflection
62b8c45 [Lee moon soo] Print all logs
5edb6fd [Lee moon soo] Use reflection to call addListener
af7a925 [Lee moon soo] add pyspark flag
5f8a734 [Lee moon soo] test -> package
a0150cf [Lee moon soo] not use travis-install for mvn test
cd4519c [Lee moon soo] try sys.stdout.write instead of print
6304180 [Lee moon soo] enable 1.2.x test
797c0e2 [Lee moon soo] enable 1.3.x test
8de7add [Lee moon soo] trying to find why travis is not closing the test
cf0a61e [Lee moon soo] rm -rf only interpreter directory instead of mvn clean
2606c04 [Lee moon soo] bringing travis-install.sh back
df8f0ba [Lee moon soo] test more efficiently
9d6b40f [Lee moon soo] Update .travis
2ca3d95 [Lee moon soo] set SPARK_HOME
2a61ecd [Lee moon soo] Clear interpreter directory on mvn clean
f1e8789 [Lee moon soo] update travis config
9e812e7 [Lee moon soo] Use reflection not to use import org.apache.spark.scheduler.Stage
c3d96c1 [Lee moon soo] Handle ZEPPELIN_CLASSPATH proper way
0f9598b [Lee moon soo] py4j version as a property
1b7f951 [Lee moon soo] Add dependency for compile and test
b1d62a5 [Lee moon soo] Add scala-library in test scope
c49be62 [Lee moon soo] Add hadoop jar and spark jar from HADOOP_HOME, SPARK_HOME when they are defined
2052aa3 [Lee moon soo] Load interpreter/spark/dep only when SPARK_HOME is undefined
54fdf0d [Lee moon soo] Separate spark-dependency into submodule
2015-09-01 10:05:41 -07:00
Lee moon soo
29a7f8e742 ZEPPELIN-165 Correct PYTHONPATH when SPARK_HOME is defined
https://issues.apache.org/jira/browse/ZEPPELIN-165

When SPARK_HOME is defined, PYTHONPATH is defined as
```
${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip
```
instead of
```
${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip
```

Author: Lee moon soo <moon@apache.org>

Closes #151 from Leemoonsoo/ZEPPELIN-156 and squashes the following commits:

4c222f8 [Lee moon soo] Add pyspark.zip
e74fe7f [Lee moon soo] Correct PYTHONPATH when SPARK_HOME is defined
2015-07-22 14:31:40 +09:00
Jongyoul Lee
3bd2b2122a [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node
- Spark supports pyspark on yarn cluster without deploying python libraries from Spark 1.4
 - https://issues.apache.org/jira/browse/SPARK-6869
 - apache/spark#5580, apache/spark#5478

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #118 from jongyoul/ZEPPELIN-18 and squashes the following commits:

a47e27c [Jongyoul Lee] - Fixed test script for spark 1.4.0
72a65fd [Jongyoul Lee] - Fixed test script for spark 1.4.0
ee6d100 [Jongyoul Lee] - Cleanup codes
47fd9c9 [Jongyoul Lee] - Cleanup codes
248e330 [Jongyoul Lee] - Cleanup codes
4cd10b5 [Jongyoul Lee] - Removed meaningless codes comments
c9cda29 [Jongyoul Lee] - Removed setting SPARK_HOME - Changed the location of pyspark's directory into interpreter/spark
ef240f5 [Jongyoul Lee] - Fixed typo
06002fd [Jongyoul Lee] - Fixed typo
4b35c8d [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - Dummy for trigger
682986e [Jongyoul Lee] rebased
8a7bf47 [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - rebasing
ad610fb [Jongyoul Lee] rebased
94bdf30 [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - Fixed checkstyle
929333d [Jongyoul Lee] rebased
64b8195 [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - rebasing
0a2d90e [Jongyoul Lee] rebased
b05ae6e [Jongyoul Lee] [ZEPPELIN-18] Remove setting SPARK_HOME for PySpark - Excludes python/** from apache-rat
71e2a92 [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - Removed verbose setting
0ddb436 [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - Followed spark's way to support pyspark - https://issues.apache.org/jira/browse/SPARK-6869 - https://github.com/apache/spark/pull/5580 - https://github.com/apache/spark/pull/5478/files
1b192f6 [Jongyoul Lee] [ZEPPELIN-18] Remove setting SPARK_HOME for PySpark - Removed redundant dependency setting
32fd9e1 [Jongyoul Lee] [ZEPPELIN-18] Running pyspark without deploying python libraries to every yarn node - rebasing
2015-07-05 10:49:14 -07:00
Lee moon soo
12e5abf280 ZEPPELIN-79 Zeppelin does not kill some interpreters when server is stopped
https://issues.apache.org/jira/browse/ZEPPELIN-79

Zeppelin sometimes leaves interpreter processes running after it is stopped.
This PR solves the problem by increasing the timeout for graceful shutdown.
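
As a generic illustration of a graceful shutdown with a timeout (not the change made in this PR), the sketch below asks a process to stop, waits for a bounded time, and only then force-kills it. The helper `stop_with_timeout` and the 5-second limit are assumptions for the demo.

```shell
#!/bin/sh
# Hypothetical helper: ask a process to stop, wait up to N seconds,
# then force-kill it if it is still running.
#   $1 = process id, $2 = timeout in seconds
# Returns 0 if the process exited in time, 1 if it had to be force-killed.
stop_with_timeout() {
  pid="$1"
  timeout_secs="$2"
  kill "$pid" 2>/dev/null
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout_secs" ]; then
      kill -9 "$pid" 2>/dev/null
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Demo: a background sleep stands in for an interpreter process.
sleep 30 &
stop_with_timeout "$!" 5
STOP_STATUS=$?
echo "exit status: ${STOP_STATUS}"
```

A timeout that is too short makes the force-kill branch fire before a slow process has finished cleaning up, while a longer one gives it time to exit on its own.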

Author: Lee moon soo <moon@apache.org>

Closes #135 from Leemoonsoo/ZEPPELIN-79 and squashes the following commits:

d2b1fa6 [Lee moon soo] Close and destroy interpreters in parallel
4558417 [Lee moon soo] Increase graceful shutdown timeout
2015-07-05 10:45:01 -07:00
Damien Corneau
8c7424a191 Zeppelin-web Spring Cleaning
After so much time in the Wild Wild West of the Internet, it's time for the spring cleaning of Zeppelin-web.
This PR takes care of cleaning the code and architecture, cutting the code into smaller pieces, etc.

* [x] - Change Code Folder Structure to a Folder Tree Style
* [x] - Change original code and compiled code folder names
* [x] - Update Contributing README.md to explain most of changes
* [x] - Organize well the components and app folders

We will do a first merge and handle this part in a different PR:
* [ ] - Replace as much code as possible by their lodash.js counterpart
* [ ] - Cut the code into smaller components (who said paragraph.js?)
* [ ] - Move jQuery code out of the controllers (into directives when possible, or somewhere else)

Needs to make sure that:
* [x] - #127 is handled

Author: Damien Corneau <corneadoug@gmail.com>
Author: CORNEAU Damien <corneadoug@gmail.com>

Closes #56 from corneadoug/improvement/SpringCleaning and squashes the following commits:

453af1a [Damien Corneau] Merge Master and Fix ports
678c0fa [Damien Corneau] Fix RAT excluded and add Apache licenses in zeppelin-web
0addb80 [Damien Corneau] Change AppScriptServlet configuration
ef764fc [Damien Corneau] Improve uglifyjs options
15cc7b1 [CORNEAU Damien] Fix README
e3ca174 [CORNEAU Damien] Small fix in doc
775f3ca [Damien Corneau] Remove unused ngdoc comments
25a3a63 [Damien Corneau] Fix Interpreter Create form
bdde389 [Damien Corneau] Set looknfeel to default for every page, and change only if looknfeel is different
bdf3a8e [Damien Corneau] Include lodash + Interpreter Web refactoring Part1: reducing code
931067a [Damien Corneau] Align form label to form input + improve form disable opacity
75d12c3 [Damien Corneau] Fix CSS of paragraph forms
e3f3016 [Damien Corneau] Fix ZEPPELIN-102
a6ec901 [Damien Corneau] Fix ZEPPELIN-103
7eccca8 [Damien Corneau] Fix navbar selected menu + small code improvement
a1fe1c1 [Damien Corneau] Refactoring of Websocket
a36adf9 [Damien Corneau] Move all websocket calls to a service
b21cc69 [Damien Corneau] Refactor Navbar controller to controller pattern + data factory
5a40c4c [Damien Corneau] Separate navbar to its own html file
2dac138 [Damien Corneau] Move directives to solo directory
9201360 [Damien Corneau] Fix project after git clean
9b249ea [Damien Corneau] Clean JSHint errors except for already defined and configuration functions related errors
ef6baa0 [CORNEAU Damien] Update Zeppelin-web CONTRIBUTING.md
411df6a [Damien Corneau] Create Zeppelin-web CONTRIBUTING.md
d3c22cf [CORNEAU Damien] Update Zeppelin-web README.md
48eed51 [Damien Corneau] Move the font css
3e28c3c [Damien Corneau] Change Zeppelin-web code and compiled code folders
0ee04a2 [Damien Corneau] Fix Grunt watch
15b502c [Damien Corneau] Change Zeppelin Folder Structure and its GruntFile
9f9059e [Damien Corneau] Add spark dependency reduced pom to gitignore
2015-07-01 23:55:11 -07:00
Mina Lee
ec7a59c097 Start zeppelin only with java not spark-submit
Since Spark args are supported through the interpreter setting menu in the UI,
launching Zeppelin with spark-submit is no longer needed.

Author: Mina Lee <minalee@nflabs.com>

Closes #90 from minahlee/rm/spark-submit_runner and squashes the following commits:

6c0937c [Mina Lee] Start zeppelin only with java not spark-submit
2015-06-09 00:03:34 -07:00
Lee moon soo
dcb03a9fe6 Simplify classpath.
When Zeppelin constructs the JVM CLASSPATH, it adds the path of every single jar. That creates a really long CLASSPATH, which may cause problems like https://issues.apache.org/jira/browse/ZEPPELIN-68.
This PR solves it by constructing the CLASSPATH with wildcards.

Another problem is that Zeppelin constructs the CLASSPATH with duplicated or unrelated entries.
This PR solves it by constructing the CLASSPATH not in common.sh but in each script (zeppelin-daemon.sh, interpreter.sh, zeppelin.sh), since they need different CLASSPATH entries.
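
The difference between the two styles can be shown with a throwaway directory. The variable names and the three empty jar files are invented for this sketch; it only compares the resulting strings and does not start a JVM.

```shell
#!/bin/sh
# Throwaway directory with three made-up jars.
demo_lib="$(mktemp -d)"
touch "${demo_lib}/a.jar" "${demo_lib}/b.jar" "${demo_lib}/c.jar"

# Per-jar style: one classpath entry for every jar in the directory.
PER_JAR_CP=""
for jar in "${demo_lib}"/*.jar; do
  PER_JAR_CP="${PER_JAR_CP:+${PER_JAR_CP}:}${jar}"
done

# Wildcard style: a single entry. The asterisk is quoted so the shell leaves
# it alone; the JVM expands "dir/*" to the jars in that directory.
WILDCARD_CP="${demo_lib}/*"

echo "per-jar:  ${#PER_JAR_CP} characters"
echo "wildcard: ${#WILDCARD_CP} characters"
```

The per-jar string grows with every jar added, while the wildcard entry stays the same length.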

Author: Lee moon soo <moon@apache.org>

Closes #65 from Leemoonsoo/ZEPPELIN-68 and squashes the following commits:

ae3ea5a [Lee moon soo] Print rat.txt when build failed
4eee837 [Lee moon soo] ZEPPELIN-68 simplify classpath
2015-05-13 14:16:19 +09:00
rahul agarwal
f4bc662d39 Fixes a typo in conf_dir check
bin/zeppelin.sh checks for conf_dir's existence. There's a typo as {$config_dir}.

Author: rahul agarwal <rahul@ragarwal.me>

Closes #51 from rahul67/hotfix/zeppelin-script-typo and squashes the following commits:

4e4f7d1 [rahul agarwal] Fixes a typo in config_dir check
2015-04-28 07:04:13 +09:00
Lee moon soo
669d408dc9 Rename package/groupId to org.apache and apply rat plugin.
This PR handles https://issues.apache.org/jira/browse/ZEPPELIN-12.

* groupId in the pom.xml file is changed from com.nflabs.zeppelin to org.apache.zeppelin
* package name is changed from com.nflabs.zeppelin to org.apache.zeppelin
* apache-rat plugin is applied (license header is added to every file) and NOTICE is updated (https://www.apache.org/legal/src-headers.html)
* removed the sphinx doc because it was outdated (it was for 0.3.0)

Please review the changes. In particular, check the NOTICE file in case I missed something.

Author: Lee moon soo <moon@apache.org>

Closes #13 from Leemoonsoo/rat and squashes the following commits:

892695f [Lee moon soo] hive interpreter module com.nflabs -> org.apache. Add license to the hive/pom.xml
c9a07c9 [Lee moon soo] Use correct package name
06a802a [Lee moon soo] One file is missed while renaming it
9902997 [Lee moon soo] Add missing import
643530a [Lee moon soo] Exclude .log from rat
fb15d0b [Lee moon soo] Exclude dependency-reduced-pom.xml from rat plugin
5faf7b1 [Lee moon soo] Apply rat plugin and com.nflabs -> org.apache
5edc77b [Lee moon soo] Update license of ScreenCaptureHtmlUnitDriver.java
1bfef1f [Lee moon soo] Update notice file
d7172fe [Lee moon soo] Add source file license header
92eb87f [Lee moon soo] Remove old sphinx doc
be06c43 [Lee moon soo] Remove unused erb
1ffca75 [Lee moon soo] Remove unused file
2015-04-06 13:05:04 +09:00
Jongyoul Lee
f11cdf699e [HOTFIX] Code convention
- Fixed style guide.
- Followed by #20

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #25 from jongyoul/ZEPPELIN-13-HOTFIX and squashes the following commits:

1c4ac3e [Jongyoul Lee] [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set - Fixed style guide.
1348463 [Jongyoul Lee] [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set - Fixed style guide.
2015-04-05 21:59:24 +09:00
Jongyoul Lee
c335c6e886 [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set
bin/common.sh tries to find and set ZEPPELIN_CONF_DIR in order to read zeppelin-env.sh, but ZEPPELIN_CONF_DIR is defined in zeppelin-env.sh, so a different ZEPPELIN_CONF_DIR cannot be used.
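
One way out of this circular dependency is to take the configuration directory from the command line before any env file is read, which is what the `--config` option mentioned in the squashed commits below suggests. The sketch is a guess at the shape of such logic: the helper `resolve_conf_dir`, the `<home>/conf` default, and the sample paths are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: resolve the configuration directory.
#   $1   = installation home
#   rest = command-line arguments; "--config <dir>" overrides the default
resolve_conf_dir() {
  home="$1"
  shift
  # Assumed default when no option is given.
  conf_dir="${home}/conf"
  if [ "${1:-}" = "--config" ] && [ -n "${2:-}" ]; then
    conf_dir="$2"
  fi
  printf '%s' "${conf_dir}"
}

DEFAULT_CONF="$(resolve_conf_dir /opt/zeppelin)"
CUSTOM_CONF="$(resolve_conf_dir /opt/zeppelin --config /etc/zeppelin)"
echo "${DEFAULT_CONF}"
echo "${CUSTOM_CONF}"
```

Because the directory is known before zeppelin-env.sh is sourced, the env file no longer needs to define it.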

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #20 from jongyoul/ZEPPELIN-13 and squashes the following commits:

f998c4e [Jongyoul Lee] [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set - Fixed wrong if statements
6490755 [Jongyoul Lee] [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set - Fix the orders of checking configuration between zeppelin-daemon.sh and zeppelin.sh
a61d28a [Jongyoul Lee] [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set - Reverted note.json
29619d3 [Jongyoul Lee] [ZEPPELIN-13] ZEPPELIN_CONF_DIR cannot be reached until ZEPPELIN_CONF_DIR become set - Added option of --config on zeppelin{-daemon}.sh - Removed ZEPPELIN_CONF_DIR from zeppelin-env.sh
2015-04-02 22:26:18 +09:00
Lee moon soo
6f100f5374 Merge pull request #364 from NFLabs/new/separate_process_interpreter
Run interpreter on separate JVM
2015-03-14 02:26:49 +09:00
Lee moon soo
a3df56a265 Take ZEPPELIN_JAVA_OPTS as default value of ZEPPELIN_INTP_JAVA_OPTS
2015-03-13 17:34:13 +09:00
Jongyoul Lee
e1bbbaa35d #369 Package Zeppelin to DEB
- Fixed indentations
2015-03-11 15:41:07 +09:00
Jongyoul Lee
cb75dbb7db #369 Package Zeppelin to DEB
- Changed two shell script for recognizing symbolic link
- Used maven assembly plugin for structuring Zeppelin dependencies
- Added init.d script
- Added Debian prerm script
2015-03-11 15:40:01 +09:00
Lee moon soo
927a5e2b78 Merge branch 'master' into new/separate_process_interpreter
Conflicts:
	spark/src/main/java/com/nflabs/zeppelin/spark/SparkSqlInterpreter.java
2015-03-08 10:13:09 +09:00
Lee moon soo
2250a5c478 stop zeppelin server first, and then force interpreter process stop, if there're anything left
2015-03-08 01:59:16 +09:00
Lee moon soo
835bbc9959 User friendly log file name for interpreter process
2015-03-08 01:53:48 +09:00
Lee moon soo
32b6333ef0 Add some test for remote interpreter
2015-03-07 15:49:51 +09:00