### What is this PR for?
There have been issues with downloading/caching Spark, especially in #1689 and #1696.
This is a hotfix for the Spark download on CI.
### What type of PR is it?
Hot Fix
### Todos
- [x] do not use distrs.apache.org
- [x] leverage `download-maven-plugin` cache for Spark download
- [x] set a 1 min timeout and 5 retries on download
- [x] un-pack them under `/target/` so `mvn clean` works as expected
- [x] mute logs for `./testing/install_external_dependencies.sh`
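
The timeout-and-retry item above can be illustrated with a minimal Python sketch. It is not the `download-maven-plugin` configuration from this PR: the function name `download_with_retries`, the injected `fetch` callable, and the reading of "5 re-tries" as "up to 5 attempts" are all assumptions made for illustration.

```python
import time
from typing import Callable


def download_with_retries(fetch: Callable[[float], bytes],
                          retries: int = 5,
                          timeout_s: float = 60.0,
                          pause_s: float = 0.0) -> bytes:
    """Call fetch(timeout_s) up to `retries` times; return the first success.

    `fetch` is a hypothetical callable that performs one download attempt
    with the given timeout and raises OSError on failure.
    """
    last_error = None
    for _ in range(retries):
        try:
            return fetch(timeout_s)
        except OSError as err:  # URLError and socket timeouts subclass OSError
            last_error = err
            time.sleep(pause_s)  # optional pause before the next attempt
    raise RuntimeError(f"download failed after {retries} attempts") from last_error
```

In real use, `fetch` could wrap something like `urllib.request.urlopen(url, timeout=t).read()`; it is injected here so the retry logic can be exercised without network access.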
### How should this be tested?
In CI logs, Spark should be downloaded by `spark-dependencies` and cached under `${HOME}/.m2/repository/.cache/maven-download-plugin`
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Alexander Bezzubov <bzz@apache.org>
Closes #1709 from bzz/make-ci-stabel and squashes the following commits:
06c031c [Alexander Bezzubov] Move logging config to MAVEN_OPTS
702dcdd [Alexander Bezzubov] Spark download\cached, using download-maven-plugin
7040b09 [Alexander Bezzubov] Switch Spark download dir
1d85b5c [Alexander Bezzubov] Mute dependency install logs
78109af [Alexander Bezzubov] Set readTimeOut for download-maven-plugin
7a64690 [Alexander Bezzubov] Bump download-maven-plugin version to lastes 1.3.0
605dea9 [Alexander Bezzubov] Spark 2.0.1 on CI, same as in pom.xml
9ee9c04 [Alexander Bezzubov] Direct Spark download url for CI as INFRA-12996
### What is this PR for?
Take 2 of #1618 because I had some earlier problems with rebasing. Since then, I have added some new features, namely:
- Matplotlib integration tests for pyspark
- `install_external_dependencies.sh` which conditionally installs the R or python dependencies based on the specified build profile in `.travis.yml`. This saves several minutes of time for a few of the build profiles since the R dependencies are compiled from source and therefore take quite a bit of time to install.
- The extra python unit tests which require external dependencies (`matplotlib` and `pandas`) are now relegated to two separate build profiles. This is done primarily to efficiently test both Python 2 and 3.
- Some minor bugs in the python and pyspark interpreters (mostly with respect to python 3 compatibility) were caught as a result of these tests, and are also fixed in this PR.
### What type of PR is it?
Improvement and Bugfix
### What is the Jira issue?
[ZEPPELIN-1639](https://issues.apache.org/jira/browse/ZEPPELIN-1639)
### How should this be tested?
CI tests should be green!
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Alex Goodman <agoodm@users.noreply.github.com>
Closes #1632 from agoodm/ZEPPELIN-1639 and squashes the following commits:
01380c2 [Alex Goodman] Make sure python 3 profile uses scala 2.11
363019e [Alex Goodman] Use spark 2.0 with python 3
a4f43af [Alex Goodman] Update comments in .travis.yml
5a60181 [Alex Goodman] Isolate python tests
73663f6 [Alex Goodman] Update tests for new InterpreterContext constructor
5709c5d [Alex Goodman] Re-add pyspark to build profile
ee95d67 [Alex Goodman] Move python 3 tests to all modules with spark 2.0
3a76958 [Alex Goodman] Travis
42da31c [Alex Goodman] Shorten python version
b6b88be [Alex Goodman] Add python dependencies to .travis.yml
### What is this PR for?
* Do not download Spark for the first profile, which only does the license check
* Always download Spark from the archive, not a mirror
  - Otherwise we need to check which Spark versions are on the mirror and update [this line](https://github.com/apache/zeppelin/blob/master/testing/downloadSpark.sh#L79), which is unsustainable
  - Sometimes the mirror site has problems such as `Not Found`, like the issue we have in CI right now.
* Remove unused variable `SPARK_VER_RANGE` from `testing/downloadSpark.sh` (https://github.com/apache/zeppelin/pull/1578#issuecomment-258122087)
> Note: CI will still fail until #1595 is merged
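
As an illustration of the "always download from the archive" choice, here is a small Python sketch that builds an archive URL from a version pair. The helper name `spark_archive_url` is hypothetical, and the URL layout is an assumption based on the public layout of `archive.apache.org`; the actual logic in `testing/downloadSpark.sh` is not reproduced here.

```python
def spark_archive_url(spark_version: str, hadoop_version: str) -> str:
    """Build the archive download URL for a Spark binary package.

    Assumes the archive layout
    <base>/spark-<ver>/spark-<ver>-bin-hadoop<hadoop>.tgz.
    """
    base = "https://archive.apache.org/dist/spark"
    package = f"spark-{spark_version}-bin-hadoop{hadoop_version}"
    return f"{base}/spark-{spark_version}/{package}.tgz"
```

Because the archive keeps old releases, a single URL pattern like this does not need a per-version check of what a mirror currently hosts.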
### What type of PR is it?
Hot Fix
### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no
Author: Mina Lee <minalee@apache.org>
Closes #1599 from minahlee/downloadSparkFromArchive and squashes the following commits:
89a46ca [Mina Lee] Always download spark binary package from archive
### What is this PR for?
Remove support for old versions of Spark, including testing and building them.
### What type of PR is it?
[Feature]
### Todos
* [x] - Remove .travis.yml
* [x] - Remove pom.xml
* [x] - Remove some docs
### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1599
### How should this be tested?
No test. Check that the Travis configuration is simplified.
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? Yes, Spark 1.1 to 1.3 can no longer be used
* Does this need documentation? Yes, some docs should be removed
Removed some profiles concerning old versions of Spark
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #1578 from jongyoul/ZEPPELIN-1599 and squashes the following commits:
acf514f [Jongyoul Lee] Fixed the script not for recognizing old versions
4bc11d6 [Jongyoul Lee] Added some docs for the deprecation on support for old versions of Spark
207502d [Jongyoul Lee] Removed some tests for old versions of Spark Removed some profiles concerning old versions of Spark
### What is this PR for?
Simplify the Travis tests to reduce resource usage
### What type of PR is it?
[Improvement]
### Todos
* [x] - Remove start-up/stop SparkCluster
### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1520
### How should this be tested?
Travis will pass without any error
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes #1487 from jongyoul/ZEPPELIN-1520 and squashes the following commits:
3bccf66 [Jongyoul Lee] Removed some unused scripts anymore
15a3711 [Jongyoul Lee] Cleaned up commented lines
1237658 [Jongyoul Lee] Removed checking mechanism
f37dacf [Jongyoul Lee] Changed master to local[2]
2aac444 [Jongyoul Lee] Remove scripts of start/stop SparkCluster
### What is this PR for?
Older Apache Spark releases seem to have been removed from the mirrors, so the build scripts need to be updated to download older releases from the archives.
### What type of PR is it?
[Bug Fix]
### What is the Jira issue?
[ZEPPELIN-956](https://issues.apache.org/jira/browse/ZEPPELIN-956)
### How should this be tested?
Existing build tests
Author: Luciano Resende <lresende@apache.org>
Closes #967 from lresende/download and squashes the following commits:
4fcbf7b [Luciano Resende] [ZEPPELIN-956] Download old spark versions direct from archive
### What is this PR for?
Fix Spark download on CI
### What type of PR is it?
Hot Fix
### What is the Jira issue?
[ZEPPELIN-783](https://issues.apache.org/jira/browse/ZEPPELIN-783)
### How should this be tested?
CI must be green
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Alexander Bezzubov <bzz@apache.org>
Closes #818 from bzz/ZEPPELIN-783-stable-ci-part-2 and squashes the following commits:
b4d66b3 [Alexander Bezzubov] ZEPPELIN-783: advanced Spark download failover procedure
### What is this PR for?
Improve CI by hardening the Spark download against the failures that are responsible for the recent CI red on `master`.
### What type of PR is it?
Bug Fix | Hot Fix
### Todos
- [x] cleanup on spark download attempts
- [x] leverage Travis CI [caching](https://docs.travis-ci.com/user/caching) for spark and pyspark binaries under `.spark-dist`
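
The two items above (cleanup on failed attempts, and a cache directory that CI preserves between runs) can be sketched in Python as follows. This is not the shell logic from this PR: `cached_download` and the injected `fetch` callable are hypothetical names, and the `.part` suffix is an assumption used only to show the idea.

```python
import os
from typing import Callable


def cached_download(name: str, cache_dir: str,
                    fetch: Callable[[str], None]) -> str:
    """Return the path of `name` in cache_dir, fetching only on a cache miss.

    `fetch(path)` is a hypothetical callable that writes the download to `path`.
    """
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, name)
    if os.path.exists(target):
        return target  # cache hit: skip the download entirely
    partial = target + ".part"
    try:
        fetch(partial)               # write under a temporary name first
        os.replace(partial, target)  # publish only a complete file
    except Exception:
        if os.path.exists(partial):
            os.remove(partial)       # cleanup: never leave a broken archive cached
        raise
    return target
```

Writing to a temporary name and renaming on success means a cached entry is either complete or absent, so a later run never picks up a truncated archive.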
### What is the Jira issue?
[ZEPPELIN-783](https://issues.apache.org/jira/browse/ZEPPELIN-783)
### How should this be tested?
CI must be green
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Alexander Bezzubov <bzz@apache.org>
Closes #810 from bzz/ZEPPELIN-783-fix-ci-spark-download and squashes the following commits:
9d59646 [Alexander Bezzubov] ZEPPELIN-783: consistent download timeout
b6310f0 [Alexander Bezzubov] ZEPPELIN-783: add debug info: download, Zepeplin config
5d0eb2d [Alexander Bezzubov] ZEPPELIN-783: pyspark&spark cache under .spark-distr, but unpack to root
d4ef96d [Alexander Bezzubov] ZEPPELIN-783: exclude .spark-dist cache from RAT
388d76b [Alexander Bezzubov] ZEPPELIN-783: backport from Spark download to start\stop scripts
fa8b516 [Alexander Bezzubov] ZEPPELIN-783: reconcile CI-time and build-time Spark download locations
542a305 [Alexander Bezzubov] ZEPPELIN-783: use TravisCI caching for relieable Spark download
bd1d5e2 [Alexander Bezzubov] ZEPPELIN-783: add cleanup on download failure
b413743 [Alexander Bezzubov] ZEPPELIN-783: refactoring - extract SPARK_ARCHIVE var
346e075 [Alexander Bezzubov] ZEPPELIN-783: upd shell style
### What is this PR for?
Refactor the download and use Travis' built-in mechanism to retry on failure, in the hope that a retry hits a different Apache mirror. This should mitigate the intermittent download failures that result in test failures.
### What type of PR is it?
Improvement
### Todos
* [x] - Separate download script
* [x] - Travis to retry download step
* [x] - Add timeout to kill download if not complete in 5 min (need to tune this)
* [x] - Add timer to log how long download takes
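
The timeout and timer items above can be sketched in Python like this. It is only an illustration, not the separated download script from this PR: `run_with_timeout` is a hypothetical helper, and returning exit code 124 on timeout is an assumption borrowed from the coreutils `timeout` convention.

```python
import subprocess
import time


def run_with_timeout(cmd, timeout_s=300.0):
    """Run cmd, kill it after timeout_s seconds, return (exit_code, elapsed)."""
    start = time.monotonic()
    try:
        code = subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        code = 124  # assumed convention, same as coreutils `timeout`
    elapsed = time.monotonic() - start
    # Log how long the step took so the timeout can be tuned later.
    print(f"{cmd[0]} exited with code {code} after {elapsed:.1f}s")
    return code, elapsed
```

A non-zero exit code from such a step is what lets the CI retry mechanism notice the failure and run the download again.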
### Is there a relevant Jira issue?
N/A
### How should this be tested?
Travis
### Screenshots (if appropriate)
N/A
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #727 from felixcheung/retrydownload and squashes the following commits:
5bfebea [Felix Cheung] up timeout duration
653c5bf [Felix Cheung] fix start spark script
44ac10f [Felix Cheung] change permission
1e9e642 [Felix Cheung] fix version check
e9b2272 [Felix Cheung] change timeout value
baf73a3 [Felix Cheung] separate download, travis to retry download
### What is this PR for?
There have been a few cases where Travis fails to download the Spark release but doesn't stop. Stopping would make it easier to track down the failure and check the exit code.
### What type of PR is it?
Improvement
### Todos
* [x] - Update spark download & start/stop script
### Is there a relevant Jira issue?
N/A
### How should this be tested?
Run Travis CI
### Screenshots (if appropriate)
N/A
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #710 from felixcheung/checkdownload and squashes the following commits:
d89cde9 [Felix Cheung] fix version check
3c64db8 [Felix Cheung] spark script stop on error
### What is this PR for?
Add paragraph scope for angular objects. Since this changes some internal API and the ZeppelinServer-Interpreter process protocol (Thrift), it is better merged after the 0.5.6 release branch is created.
### What type of PR is it?
Improvement
### Is there a relevant Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-551
### How should this be tested?
Creating an AngularObject now takes `paragraphId` as a parameter in addition to `noteId`.
When `paragraphId` is null, the AngularObject has notebook scope; otherwise it has paragraph scope.
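
The scoping rule described above can be modelled with a short Python sketch. This is a hypothetical illustration only, not Zeppelin's Java API: the class `AngularObjectKey` and its fields are invented here to show how a null paragraph id selects notebook scope.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AngularObjectKey:
    """Hypothetical identity of an angular object (not Zeppelin's real class)."""
    name: str
    note_id: str
    paragraph_id: Optional[str] = None  # None plays the role of a null paragraphId

    @property
    def scope(self) -> str:
        # No paragraph id: visible to the whole notebook; otherwise to one paragraph.
        return "notebook" if self.paragraph_id is None else "paragraph"
```

For example, `AngularObjectKey("x", "note1").scope` evaluates to `"notebook"`, while `AngularObjectKey("x", "note1", "p1").scope` evaluates to `"paragraph"`.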
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? Yes
Incompatible with interpreter binaries built with older versions, because this PR updates the Thrift IDL
* Does this need documentation? Internal API change
Author: Lee moon soo <moon@apache.org>
Closes #588 from Leemoonsoo/ZEPPELIN-551 and squashes the following commits:
11d27a8 [Lee moon soo] Merge branch 'master' into ZEPPELIN-551
b9a55fe [Lee moon soo] Add javadoc
917c1ca [Lee moon soo] Reduce build log message
1bba810 [Lee moon soo] Remove unused var
25aea61 [Lee moon soo] Handle scope correctly
8d7c07d [Lee moon soo] Add more tests
7d7fe2c [Lee moon soo] Fix test
f2fa347 [Lee moon soo] Take care paragraphs scope angular object
9d24a3b [Lee moon soo] Update ZeppelinContext
f35fe8e [Lee moon soo] Update zeppelin-server and zeppelin-zengine
8b13c1e [Lee moon soo] Add paragraph scope of angular object
Address https://issues.apache.org/jira/browse/ZEPPELIN-377.
This patch changes the Spark package download location from the Apache archive to a mirror, so that the download completes within 10 min.
It also adds the missing test for 1.5.1 and changes the test version from 1.4.0 to 1.4.1.
Author: Lee moon soo <moon@apache.org>
Closes #380 from Leemoonsoo/fix_spark_test and squashes the following commits:
142583a [Lee moon soo] Add test for 1.5.1
b8323e6 [Lee moon soo] Use mirror for 1.3.x and later version of spark
Author: caofangkun <caofangkun@gmail.com>
Closes #186 from caofangkun/zeppelin-207 and squashes the following commits:
a2d155a [caofangkun] ZEPPELIN-207: travis-ci build log is too long to be displayed
713ce5a [caofangkun] ZEPPELIN-207: travis-ci build log is too long to be displayed
3ee6a15 [caofangkun] ZEPPELIN-207: travis-ci build log is too long to be displayed
aa2568d [caofangkun] ZEPPELIN-207: travis-ci build log is too long to be displayed
e5b0067 [caofangkun] ZEPPELIN-207: travis-ci build log is too long to be displayed
This PR makes Spark 1.4 Zeppelin's default build profile. https://issues.apache.org/jira/browse/ZEPPELIN-104
It also enables CI testing of Zeppelin with Spark version 1.4.
Author: Lee moon soo <moon@apache.org>
Closes #99 from Leemoonsoo/ZEPPELIN-104 and squashes the following commits:
675301d [Lee moon soo] start spark worker correctly for version 1.4
19f29c0 [Lee moon soo] Avoid travis log message size limit 4MB
220a7fb [Lee moon soo] Let CI test with spark 1.4
d7d6ba5 [Lee moon soo] Make default spark version 1.4
Move the Spark-specific dependencyManagement and properties from pom.xml to spark/pom.xml,
because they interfere with other interpreters' dependency versions.
Author: Lee moon soo <moon@apache.org>
Closes #88 from Leemoonsoo/pom_refactor and squashes the following commits:
9916875 [Lee moon soo] automated ci test not only spark-1.3 but also spark-1.2, spark-1.1
aa6d1fd [Lee moon soo] Test pyspark with spark cluster
be0b7c4 [Lee moon soo] Remove unnecessary #
40698f3 [Lee moon soo] Make default version 1.3.1
18cb474 [Lee moon soo] Parse version correctly
b5f7343 [Lee moon soo] Make hadoop version configurable in test
bb47e81 [Lee moon soo] Add license header
8b6d3f5 [Lee moon soo] Gracefully shutdown ZeppelinServer in test
80698e9 [Lee moon soo] Add test against spark cluster
654d761 [Lee moon soo] Move spark specific dependencyManagement and properties block from pom.xml to spark/pom.xml
This PR handles https://issues.apache.org/jira/browse/ZEPPELIN-12.
* groupId at pom.xml file is changed from com.nflabs.zeppelin to org.apache.zeppelin
* package name is changed from com.nflabs.zeppelin to org.apache.zeppelin
* apache-rat plugin is applied (license header is added to every file) and NOTICE is updated (https://www.apache.org/legal/src-headers.html)
* removed the Sphinx doc because it was outdated (it was for 0.3.0)
Please review the changes. In particular, check the NOTICE file in case I missed something.
Author: Lee moon soo <moon@apache.org>
Closes #13 from Leemoonsoo/rat and squashes the following commits:
892695f [Lee moon soo] hive interpreter module com.nflabs -> org.apache. Add license to the hive/pom.xml
c9a07c9 [Lee moon soo] Use correct package name
06a802a [Lee moon soo] One file is missed while renaming it
9902997 [Lee moon soo] Add missing import
643530a [Lee moon soo] Exclude .log from rat
fb15d0b [Lee moon soo] Exclude dependency-reduced-pom.xml from rat plugin
5faf7b1 [Lee moon soo] Apply rat plugin and com.nflabs -> org.apache
5edc77b [Lee moon soo] Update license of ScreenCaptureHtmlUnitDriver.java
1bfef1f [Lee moon soo] Update notice file
d7172fe [Lee moon soo] Add source file license header
92eb87f [Lee moon soo] Remove old sphinx doc
be06c43 [Lee moon soo] Remove unused erb
1ffca75 [Lee moon soo] Remove unused file