This document discusses using Alluxio with Spark to improve performance. It begins with an overview of Alluxio and where it fits in the data ecosystem. Common use cases with Spark include accelerating I/O to remote storage, sharing data across jobs at memory speed, and managing data across multiple storage systems. Using Alluxio with Spark consolidates memory, keeps data resilient to crashes, and gives Spark easy access to data stored in Alluxio. Performance evaluations show speedups of 2x to 17x over Spark alone when reading RDDs and DataFrames from both local SSD and remote S3 storage.
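
To make the "easy access" point concrete, here is a minimal PySpark sketch of writing a DataFrame to Alluxio and reading it back through an `alluxio://` URI. It is an illustration under assumptions, not the setup used in the evaluations: the host name `alluxio-master` and the helper names `alluxio_uri` and `round_trip_dataframe` are hypothetical, port 19998 is assumed to be the master's RPC port (Alluxio's default), and the Alluxio client jar is assumed to be on the Spark driver and executor classpaths.

```python
def alluxio_uri(path, master_host="alluxio-master", master_port=19998):
    """Build an alluxio:// URI for a path in the Alluxio namespace.

    master_host is a hypothetical host name; 19998 is assumed to be
    the Alluxio master RPC port (the default).
    """
    if not path.startswith("/"):
        path = "/" + path
    return f"alluxio://{master_host}:{master_port}{path}"


def round_trip_dataframe(spark, path):
    """Write a small DataFrame to Alluxio as Parquet, then read it back.

    `spark` is an existing SparkSession whose classpath is assumed to
    include the Alluxio client jar.
    """
    uri = alluxio_uri(path)
    df = spark.range(1000)                    # single-column DataFrame of ids
    df.write.mode("overwrite").parquet(uri)   # stored in Alluxio
    return spark.read.parquet(uri)            # later jobs can read the same URI
```

Because the data is addressed by URI rather than held in one application's cache, a second Spark job could call `spark.read.parquet(alluxio_uri("/some/path"))` on the same path, which is the cross-job sharing described above.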