zeppelin/docs/install/virtual_machine.md
AhyoungRyu 5ddc1ef87e [ZEPPELIN-996] Improve first page and dropdown menu in documentation site
2016-06-14 23:47:46 -07:00


layout title description group
page Install A Zeppelin ready Virtual Machine install

{% include JB/setup %}

Vagrant Virtual Machine for Apache Zeppelin

Apache Zeppelin distribution includes a scripts directory

scripts/vagrant/zeppelin-dev

This script creates a virtual machine that launches a repeatable, known set of core dependencies required for developing Zeppelin. It can also be used to run an existing Zeppelin build if you don't plan to build from source. For PySpark users, this script includes several helpful Python Libraries. For SparkR users, this script includes several helpful R Libraries.

Installing the required components to launch a virtual machine

This script requires three applications: Ansible, Vagrant and VirtualBox. All of them are freely available as open source projects and are easy to set up on most operating systems.

Create a Zeppelin Ready VM in 4 Steps (5 on Windows)

If you are running Windows and don't yet have Python installed, install Python 2.7.x first.

  1. Download and Install Vagrant: Vagrant Downloads

  2. Install Ansible: Ansible Python pip install

    sudo easy_install pip
    sudo pip install ansible
    ansible --version
    

    Then check that the last command reports Ansible version 1.9.2 or higher.

  3. Install VirtualBox: VirtualBox Downloads

  4. Type vagrant up from within the /scripts/vagrant/zeppelin-dev directory

That's it! You can now run vagrant ssh, which will place you at the guest machine's terminal prompt.
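The Ansible version check from step 2 can also be scripted. The following is a minimal sketch in Python, assuming that the output of ansible --version contains a dotted version number such as "ansible 1.9.2"; the function names parse_version, meets_minimum and installed_ansible_ok are hypothetical helpers, not part of Ansible or Zeppelin.

```python
import re
import subprocess

# Minimum Ansible version mentioned in step 2.
MIN_ANSIBLE = (1, 9, 2)


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum=MIN_ANSIBLE):
    """True if the version found in text is at least the given minimum."""
    return parse_version(text) >= minimum


def installed_ansible_ok():
    """Run `ansible --version` on this machine and compare against the minimum."""
    output = subprocess.check_output(["ansible", "--version"])
    return meets_minimum(output.decode("utf-8", "replace"))
```

Calling installed_ansible_ok() on the host should return True when a sufficiently recent Ansible is on the PATH; it raises an error if the ansible command is not installed.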

If you don't wish to build Zeppelin from scratch, run the z-manager installer script while running in the guest VM:

curl -fsSL https://raw.githubusercontent.com/NFLabs/z-manager/master/zeppelin-installer.sh | bash

Building Zeppelin

You can now git clone git://git.apache.org/zeppelin.git into a directory on your host machine, or directly in your virtual machine.

Cloning Zeppelin into the /scripts/vagrant/zeppelin-dev directory from the host will allow the directory to be shared between your host and the guest machine.

Cloning the project again may seem counterintuitive, since this script likely originated from the project repository. Consider copying just the vagrant/zeppelin-dev script from the Zeppelin project into a standalone directory, then cloning the specific branch you wish to build into it.

Synced folders enable Vagrant to sync a folder on the host machine to the guest machine, allowing you to continue working on your project's files on your host machine, but use the resources in the guest machine to compile or run your project. (1) Synced Folder Description from Vagrant Up

By default, Vagrant shares your project directory (the directory containing the Vagrantfile) with the guest as /vagrant. This means you should be able to build within the guest machine after you cd /vagrant/zeppelin
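To illustrate how a path on the host corresponds to a path in the guest under the default /vagrant share, here is a small Python sketch. The function name guest_path and the example host directory are hypothetical; the mapping only holds for the default synced folder described above.

```python
import posixpath


def guest_path(host_project_dir, host_path, guest_root="/vagrant"):
    """Map a host path inside the project directory to its path in the guest.

    host_project_dir is the directory that contains the Vagrantfile.
    """
    rel = posixpath.relpath(host_path, host_project_dir)
    if rel == ".." or rel.startswith("../"):
        raise ValueError("path is outside the synced project directory")
    return posixpath.normpath(posixpath.join(guest_root, rel))


# Example with an assumed host location for the zeppelin-dev directory.
print(guest_path("/home/me/zeppelin-dev", "/home/me/zeppelin-dev/zeppelin"))
```

Files outside the project directory are not shared by default, which is why the sketch rejects them.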

What's in this VM?

Running the following commands in the guest machine should display these expected versions:

  • node --version should report v0.12.7
  • mvn --version should report Apache Maven 3.3.3 and Java version: 1.7.0_85

The virtual machine consists of:

  • Ubuntu Server 14.04 LTS
  • Node.js 0.12.7
  • npm 2.11.3
  • ruby 1.9.3 + rake, make and bundler (only required if building jekyll documentation)
  • Maven 3.3.3
  • Git
  • Unzip
  • libfontconfig to avoid PhantomJS missing dependency issues
  • openjdk-7-jdk
  • Python addons: pip, matplotlib, scipy, numpy, pandas
  • R and R Packages required to run the R Interpreter and the related R tutorial notebook, including: Knitr, devtools, repr, rCharts, ggplot2, googleVis, mplot, htmltools, base64enc, data.table

How to build & run Zeppelin

This assumes you've already cloned the project either on the host machine in the zeppelin-dev directory (to be shared with the guest machine) or cloned directly into a directory while running inside the guest machine. The following build steps will also include Python and R support via PySpark and SparkR:

cd /zeppelin
mvn clean package -Pspark-1.6 -Ppyspark -Phadoop-2.4 -Psparkr -DskipTests
./bin/zeppelin-daemon.sh start

On your host machine browse to http://localhost:8080/

If you turned off port forwarding in the Vagrantfile, browse to http://192.168.51.52:8080
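If the page does not load, it can help to check from the host whether anything is listening on the expected port. The following Python sketch attempts a plain TCP connection; is_port_open is a hypothetical helper name, and a successful connection only shows that the port accepts connections, not that Zeppelin has finished starting.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    conn.close()
    return True
```

For example, is_port_open("localhost", 8080) checks the forwarded port, and is_port_open("192.168.51.52", 8080) checks the private network address.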

Tweaking the Virtual Machine

If you plan to run this virtual machine alongside other Vagrant images, you may wish to bind the virtual machine to a specific IP address, and not use port forwarding from your local host.

Comment out the forwarded_port line, and uncomment the private_network line in the Vagrantfile. The subnet that works best for your local network will vary, so adjust 192.168.*.* accordingly.

#config.vm.network "forwarded_port", guest: 8080, host: 8080
config.vm.network "private_network", ip: "192.168.51.52"

vagrant halt followed by vagrant up will restart the guest machine bound to the IP address 192.168.51.52. This approach is typically required when running other virtual machines that discover each other directly by IP address, such as Spark Masters and Slaves as well as Cassandra Nodes, Elasticsearch Nodes, and other Spark data sources. You may wish to launch nodes in virtual machines with IP addresses in a subnet that works for your local network, such as 192.168.51.53, 192.168.51.54, etc.
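When planning addresses for several such nodes, a short script can list consecutive addresses and confirm they stay inside one subnet. This is only a sketch using Python 3's standard ipaddress module; the function name node_addresses and the /24 prefix length are assumptions for illustration, so adjust them to your network.

```python
import ipaddress


def node_addresses(first_ip, count, prefix_len=24):
    """Return `count` consecutive IPv4 addresses starting at first_ip.

    Raises ValueError if an address would leave the subnet or hit its
    broadcast address.
    """
    first = ipaddress.ip_address(first_ip)
    network = ipaddress.ip_network("%s/%d" % (first_ip, prefix_len), strict=False)
    addresses = []
    for offset in range(count):
        candidate = first + offset
        if candidate not in network or candidate == network.broadcast_address:
            raise ValueError("%s is not a usable host address in %s" % (candidate, network))
        addresses.append(str(candidate))
    return addresses


# The Zeppelin VM address from above plus two further nodes.
print(node_addresses("192.168.51.52", 3))
```

Each returned address could then be used in the private_network line of a separate Vagrantfile.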

Python Extras

With Zeppelin running, NumPy, SciPy, Pandas and Matplotlib will be available. Create a pyspark notebook, and try the code below.

%pyspark

import numpy
import scipy
import pandas
import matplotlib

# print(...) with a single argument works on both Python 2 and Python 3
print("numpy " + numpy.__version__)
print("scipy " + scipy.__version__)
print("pandas " + pandas.__version__)
print("matplotlib " + matplotlib.__version__)

To test plotting with Matplotlib, rendered as an %html SVG image, try:

%pyspark

import matplotlib
matplotlib.use('Agg')   # turn off interactive charting so this works for server side SVG rendering
import matplotlib.pyplot as plt
import numpy as np
import StringIO

# clear out any previous plots on this notebook
plt.clf()

def show(p):
    # Render the current figure to an in-memory SVG and emit it as %html output
    img = StringIO.StringIO()
    p.savefig(img, format='svg')
    print("%html <div style='width:600px'>" + img.getvalue() + "</div>")

# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
error = np.random.rand(len(people))

plt.barh(y_pos, performance, xerr=error, align='center', alpha=0.4)
plt.yticks(y_pos, people)
plt.xlabel('Performance')
plt.title('How fast do you want to go today?')

show(plt)

R Extras

With Zeppelin running, an R Tutorial notebook will be available. The R packages required to run the examples and graphs in this tutorial notebook are installed by this virtual machine. The installed R packages include: Knitr, devtools, repr, rCharts, ggplot2, googleVis, mplot, htmltools, base64enc, data.table