Increase readability and delete useless <br> tags in tutorial.md

Date: 2015-12-29 02:04:41 -08:00
commit c63e59fbf3
parent e867df3a35
1 changed files with 13 additions and 15 deletions
--- a/docs/tutorial/tutorial.md
+++ b/docs/tutorial/tutorial.md
@@ -17,20 +17,20 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
-### Zeppelin Tutorial
+## Zeppelin Tutorial

-We will assume you have Zeppelin installed already. If that's not the case, see [Install](../install/install.html).
+This tutorial walks you through some of the fundamental Zeppelin concepts. We will assume you have already installed Zeppelin. If not, please see [here](../install/install.html) first.

-Zeppelin's current main backend processing engine is [Apache Spark](https://spark.apache.org). If you're new to the system, you might want to start by getting an idea of how it processes data to get the most out of Zeppelin.
+Current main backend processing engine of Zeppelin is [Apache Spark](https://spark.apache.org). If you're new to this system, you might want to start by getting an idea of how it processes data to get the most out of Zeppelin.

 <br />
-### Tutorial with Local File
+## Tutorial with Local File

-#### Data Refine
+### 1. Data Refine

 Before you start Zeppelin tutorial, you will need to download [bank.zip](http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip). 

-First, to transform data from csv format into RDD of `Bank` objects, run following script. This will also remove header using `filter` function.
+First, to transform csv format data into RDD of `Bank` objects, run following script. This will also remove header using `filter` function.

 ```scala

@@ -38,7 +38,7 @@ val bankText = sc.textFile("yourPath/bank/bank-full.csv")

 case class Bank(age:Integer, job:String, marital : String, education : String, balance : Integer)

-// split each line, filter out header (starts with "age"), and map it into Bank case class  
+// split each line, filter out header (starts with "age"), and map it into Bank case class
 val bank = bankText.map(s=>s.split(";")).filter(s=>s(0)!="\"age\"").map(
    s=>Bank(s(0).toInt, 
            s(1).replaceAll("\"", ""),
@@ -52,8 +52,7 @@ val bank = bankText.map(s=>s.split(";")).filter(s=>s(0)!="\"age\"").map(
 bank.toDF().registerTempTable("bank")
 ```

-<br />
-#### Data Retrieval
+### 2. Data Retrieval

 Suppose we want to see age distribution from `bank`. To do this, run:

@@ -74,9 +73,9 @@ Now we want to see age distribution with certain marital status and add combo bo
 ```

 <br />
-### Tutorial with Streaming Data 
+## Tutorial with Streaming Data 

-#### Data Refine
+### 1. Data Refine

 Since this tutorial is based on Twitter's sample tweet stream, you must configure authentication with a Twitter account. To do this, take a look at [Twitter Credential Setup](https://databricks-training.s3.amazonaws.com/realtime-processing-with-spark-streaming.html#twitter-credential-setup). After you get API keys, you should fill out credential related values(`apiKey`, `apiSecret`, `accessToken`, `accessTokenSecret`) with your API keys on following script.

@@ -136,12 +135,11 @@ twt.print
 ssc.start()
 ```

-<br />
-#### Data Retrieval
+### 2. Data Retrieval

 For each following script, every time you click run button you will see different result since it is based on real-time data.

-Let's begin by extracting maximum 10 tweets which contain the word "girl".
+Let's begin by extracting maximum 10 tweets which contain the word **girl**.

 ```sql
 %sql select * from tweets where text like '%girl%' limit 10
@@ -154,7 +152,7 @@ This time suppose we want to see how many tweets have been created per sec durin
 ```


-You can make user-defined function and use it in Spark SQL. Let's try it by making function named `sentiment`. This function will return one of the three attitudes(positive, negative, neutral) towards the parameter.
+You can make user-defined function and use it in Spark SQL. Let's try it by making function named `sentiment`. This function will return one of the three attitudes( positive, negative, neutral ) towards the parameter.

 ```scala
 def sentiment(s:String) : String = {