- Spark ML pipelines involve estimators that are trained on datasets to produce immutable transformers. - A transformer must define transformSchema() to validate the input schema, transform() to do the work, and copy() for cloning. - Configurable transformers take parameters like inputCol and outputCol to allow configuration for meta algorithms. - Estimators are similar but fit() returns a model instead of directly transforming.