This document provides an introduction to and overview of Spark:
- Spark is an open-source, in-memory data processing engine that handles large datasets across clusters of computers through APIs in Scala, Python, or R.
- IBM is heavily committed to Spark: it contributes the most code and fixes the most issues reported by other organizations, with the aim of continually improving the full analytics stack.
- An example uses Spark to predict hospital readmissions from diabetes patient data, obtaining AUC scores comparable to those of other published models.
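
For readers unfamiliar with the AUC metric used in the readmission example, the following is a minimal sketch of how it can be computed. It is not the method used in the document, and it uses plain Python rather than Spark. The function name `auc` and the toy labels and scores are hypothetical and chosen only for illustration. AUC is the probability that a randomly chosen positive case (here, a readmitted patient) receives a higher predicted score than a randomly chosen negative case.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the diabetes dataset.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auc = auc(labels, scores)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of examples, so it suits small illustrations only; on cluster-sized data a library evaluator would be the practical choice.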