expanded build instructions to include pyspark, and clarified maven and node.js requirements

Jeff Steinmetz 2015-11-07 19:31:47 -08:00
parent a4db1688c9
commit 1eb27eb072


@ -27,75 +27,106 @@ To know more about Zeppelin, visit our web site [http://zeppelin.incubator.apach
### Before Build
If you don't have the requirements installed, install them first.
(The installation method varies by environment; the example below is for Ubuntu.)
```
sudo apt-get update
sudo apt-get install openjdk-7-jdk
sudo apt-get install git
sudo apt-get install maven
sudo apt-get install nodejs
sudo apt-get install npm
sudo ln -s /usr/bin/nodejs /usr/bin/node
sudo apt-get install libfontconfig
```
_Note:_
Ensure node is installed by running `node --version`
Ensure maven is running version 3.1.x or higher with `mvn -version`
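The two checks in the note above can be scripted. The following is only a minimal sketch: the helper name `version_ge` is hypothetical, it relies on GNU `sort -V` being available, and it assumes `mvn -version` prints a first line of the form `Apache Maven <version> ...`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 is >= version $2.
# Relies on GNU `sort -V` (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Check that `node` is on the PATH (the symlink above provides it on Ubuntu).
if command -v node >/dev/null 2>&1; then
  echo "node found: $(node --version)"
else
  echo "node not found; install nodejs or create the /usr/bin/node symlink" >&2
fi

# Check that Maven is 3.1.x or higher.
if command -v mvn >/dev/null 2>&1; then
  # Assumption: the version is the third field of the "Apache Maven ..." line.
  mvn_version="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
  if version_ge "$mvn_version" "3.1.0"; then
    echo "maven $mvn_version is new enough"
  else
    echo "maven '$mvn_version' is older than 3.1.0 (or could not be parsed)" >&2
  fi
else
  echo "mvn not found" >&2
fi
```

If the packaged Maven is too old, a newer release has to be installed by other means; the script only reports what it finds.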
### Build
To build Zeppelin from source, first clone this repository, then run:
```
mvn clean package -DskipTests
```
Build with specific Spark version
To build with a specific Spark version, Hadoop version or specific features, define one or more of the `spark`, `pyspark`, `hadoop` and `yarn` profiles, such as:
-Pspark-1.5 [Version to run in local spark mode]
-Ppyspark [optional: enable PYTHON support in spark via the %pyspark interpreter]
-Pyarn [optional: enable YARN support]
-Dhadoop.version=2.2.0 [hadoop distribution]
-Phadoop-2.2 [hadoop version]
The current default build command runs as:
mvn clean package -Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark
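As an illustration of how the profiles combine, here is a small sketch that assembles the command line from its parts and prints it without running Maven. The function name `zeppelin_build_cmd` and its argument order are hypothetical and not part of the Zeppelin build; only the profile names come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Maven command line for a chosen set of profiles.
# Usage: zeppelin_build_cmd <spark-profile> <hadoop-profile> [yarn] [pyspark]
zeppelin_build_cmd() {
  local spark_profile="$1" hadoop_profile="$2"
  shift 2
  local cmd="mvn clean package -P${spark_profile} -P${hadoop_profile}"
  local feature
  for feature in "$@"; do
    case "$feature" in
      # Optional feature profiles named in the list above.
      yarn|pyspark) cmd="${cmd} -P${feature}" ;;
      *) echo "unknown feature: ${feature}" >&2; return 1 ;;
    esac
  done
  echo "$cmd"
}

# Reproduces the default build command shown above.
zeppelin_build_cmd spark-1.5 hadoop-2.4 yarn pyspark
```

Dropping `yarn` or `pyspark` from the argument list omits the corresponding profile; properties such as `-Dhadoop.version=...` or `-DskipTests` can be appended to the printed command by hand.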
Spark 1.5.x
```
mvn clean package -Pspark-1.5 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests
```
Spark 1.4.x
```
mvn clean package -Pspark-1.4 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests
```
Spark 1.3.x
```
mvn clean package -Pspark-1.3 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests
```
Spark 1.2.x
```
mvn clean package -Pspark-1.2 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests
```
Spark 1.1.x
```
mvn clean package -Pspark-1.1 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests
```
CDH 5.X
```
mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests
```
Yarn (Hadoop 2.7.x)
```
mvn clean package -Pspark-1.4 -Dspark.version=1.4.1 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests
```
Yarn (Hadoop 2.6.x)
```
mvn clean package -Pspark-1.1 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests
```
Yarn (Hadoop 2.4.x)
```
mvn clean package -Pspark-1.1 -Dhadoop.version=2.4.0 -Phadoop-2.4 -Pyarn -DskipTests
```
Yarn (Hadoop 2.3.x)
```
mvn clean package -Pspark-1.1 -Dhadoop.version=2.3.0 -Phadoop-2.3 -Pyarn -DskipTests
```
Yarn (Hadoop 2.2.x)
```
mvn clean package -Pspark-1.1 -Dhadoop.version=2.2.0 -Phadoop-2.2 -Pyarn -DskipTests
```
Ignite (1.1.0-incubating and later)
```
mvn clean package -Dignite.version=1.1.0-incubating -DskipTests
```
### Configure
To configure Zeppelin options (such as the port number), edit the following files:
```
./conf/zeppelin-env.sh
./conf/zeppelin-site.xml
@ -140,9 +171,15 @@ Yarn
For configuration details check __./conf__ subdirectory.
### Package
To package final distribution do:
To package the final distribution including the compressed archive, run:
mvn clean package -P build-distr
mvn clean package -Pbuild-distr
To build a distribution with specific profiles, run:
mvn clean package -Pbuild-distr -Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark
The profiles `-Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark` can be adjusted if you wish to build against a specific Spark version, or to omit support for a feature such as `yarn`.
The archive is generated under the _zeppelin-distribution/target_ directory.
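To confirm that the package step produced an archive, something like the following could be run from the repository root. It is only a sketch: the function name `list_archives` is hypothetical, and the file extensions it looks for are an assumption, since the exact archive name and format are not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list archive files directly under the given directory.
# Assumption: the distribution archive ends in .tar.gz, .tgz or .zip.
list_archives() {
  local dist_dir="$1"
  if [ ! -d "$dist_dir" ]; then
    echo "no ${dist_dir} directory; run the package step first" >&2
    return 1
  fi
  find "$dist_dir" -maxdepth 1 -type f \
    \( -name '*.tar.gz' -o -name '*.tgz' -o -name '*.zip' \) -print
}

# Look in the output directory named above; do not fail if it is missing.
list_archives "zeppelin-distribution/target" || true
```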