Notebooks are quickly becoming the go-to way of running and developing code in data science. Zeppelin isn't the only notebook available, but it's certainly popular and is an Apache Incubating Project. In this tutorial, we'll walk through how to get a Zeppelin notebook set up on your machine or cluster for data science development.
This article will walk you through how to resolve the somewhat common "java.net.BindException: Address already in use" exception, which can occur when you're trying to start Spark.
In this tutorial, we're going to set up a complete predictive modeling pipeline in Spark using DataFrames, Pipelines, and MLlib. The first part of this tutorial will explain some of the basic concepts we'll need to build this model, walk you through how to download the data we'll use, and lastly create our Spark cluster on AWS and read from and write to S3.