Struggling to conquer Apache Spark?

Learning is hard enough as it is, but when you bring in distributed computing frameworks in sophisticated programming languages, things don't get any easier. Self-study can certainly help, but without a good guide things are always more difficult than they should be. That's why I created Spark Tutorials: to make it easier to learn and use Apache Spark. Spark Tutorials is here to provide simple, easy-to-follow tutorials to help you get up and running quickly. You'll learn the foundational abstractions in Apache Spark, from RDDs to DataFrames and MLlib. Start off with some of the articles below.

Opening CSV Files in Apache Spark - The Spark Data Sources API and Spark-CSV

This guide will show you how to read CSV files in Apache Spark. We'll walk through how to use the Spark-CSV package in both Python and Scala.

Visit Article »

Getting Started with Apache Spark DataFrames in Python and Scala

In this easy-to-follow tutorial, learn the basics of Spark DataFrames, how they're composed of RDDs, and what they allow you to do in Python and Scala. They're a similar abstraction to pandas DataFrames or R's data frames.

Visit Article »

Spark Will Not Start with Error java.lang.OutOfMemoryError: PermGen space

This article will walk you through how to resolve the java.lang.OutOfMemoryError: PermGen space exception that can occur when you're trying to start Spark.

Visit Article »