In this tutorial we'll build a full-stack machine learning project, going all the way from data manipulation to feature creation and, finally, serving predictions.
Graphs are a natural way to represent relationships in data, and Apache Spark's GraphX library makes it straightforward to create and manipulate them. This tutorial walks you through the basics of GraphX using Scala. You'll analyze flight data from 2008 and run algorithms like PageRank to better understand the flights that took place!
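To give a feel for what PageRank computes on a flight graph, here is a minimal pure-Python sketch. It is not GraphX and not the tutorial's code: the airport codes and routes are made up for illustration, the `pagerank` function is a hypothetical helper, and it uses the textbook normalized formulation (scores sum to 1), which may differ from how GraphX scales its results.

```python
# Toy directed graph: an edge (src, dst) means "a flight goes from src to dst".
# These routes are invented for illustration; they are not the 2008 flight data.
flights = [
    ("SFO", "JFK"), ("JFK", "SFO"),
    ("ORD", "JFK"), ("SFO", "ORD"),
    ("ORD", "SFO"), ("LAX", "SFO"),
]

def pagerank(edges, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a list of (src, dst) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start with rank spread evenly across all nodes.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is shared by everyone.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        # Each node passes its rank in equal shares along its outgoing edges.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

ranks = pagerank(flights)
for airport, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{airport}: {score:.3f}")
```

In this toy graph the airport with the most incoming routes ends up with the highest score, which is the kind of "importance" ranking the tutorial applies to real airports.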
This introductory tutorial walks you through the basic RDD abstraction in Spark. It includes code samples in both Scala and Python (PySpark). We'll answer the question: what is an RDD?
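As a rough preview of the idea, the sketch below mimics two core RDD properties in plain Python: data is split into partitions, and transformations are only recorded until an action asks for a result. `ToyRDD` is a hypothetical single-machine stand-in, not the Spark API. Real RDDs also distribute partitions across a cluster and can recompute lost partitions from their lineage. The method names mirror PySpark's `parallelize`, `map`, `filter`, `collect`, and `count`.

```python
class ToyRDD:
    """A toy, single-machine stand-in for an RDD (not the Spark API)."""

    def __init__(self, partitions, steps=()):
        self._partitions = partitions  # list of lists; never mutated
        self._steps = steps            # recorded transformations (the "lineage")

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        # Split the input into contiguous chunks, one per partition.
        data = list(data)
        size = -(-len(data) // num_partitions) or 1  # ceiling division
        return cls([data[i:i + size] for i in range(0, len(data), size)])

    # Transformations: return a new ToyRDD and compute nothing yet.
    def map(self, fn):
        return ToyRDD(self._partitions, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._partitions, self._steps + (("filter", pred),))

    def _compute(self, partition):
        # Replay the recorded steps over one partition.
        items = iter(partition)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    # Actions: trigger the actual computation.
    def collect(self):
        return [x for part in self._partitions for x in self._compute(part)]

    def count(self):
        return sum(len(self._compute(part)) for part in self._partitions)


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=2)
evens_squared = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

print(evens_squared.collect())  # the work happens here, not in map/filter
print(numbers.collect())        # the original dataset is unchanged
```

Because each transformation returns a new object instead of modifying the old one, `numbers` can be reused for other computations, which is the same immutability that RDDs rely on.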