This document provides an overview of Apache Spark: what it is, how it has evolved, its main features and components, and how it differs from Hadoop. Spark was originally developed in 2009 at UC Berkeley's AMPLab as a fast, general-purpose engine for large-scale data processing. It has since become a top-level Apache project, and it is reported to run some workloads up to 100 times faster than Hadoop MapReduce in memory and up to 10 times faster on disk. Spark supports SQL, streaming, machine learning, and graph processing through components built on its core engine: Spark SQL, Spark Streaming, MLlib, and GraphX.