This document discusses Spark MLlib's goal of making practical machine learning easy and scalable. It introduces key concepts in Spark ML Pipelines including DataFrames, transformers, estimators, and evaluators. DataFrames provide a structured dataset with named columns. Transformers, estimators, and evaluators are abstractions that define different steps in a machine learning workflow. Pipelines allow assembling these abstractions into reusable workflows. The document demonstrates an example text classification pipeline and discusses ongoing work to improve parameter tuning and feature engineering support.