This document summarizes best practices for engineering production-ready software with Apache Spark. Data scientists tend to focus on iterative experimentation and model performance, while engineers focus on stability, maintainability, and system performance. Moving a prototype straight into production without re-engineering it can therefore cause problems. The document demonstrates how to build well-engineered Spark software that still supports research and experimentation, combining the strengths of data science and engineering. It emphasizes practices such as writing modular, testable code and designing interfaces that allow experimentation on engineered artifacts.
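As a rough illustration of what "modular, testable code" and an interface that allows experimentation might look like, the sketch below composes small, independently testable steps into a pipeline. It is not taken from the summarized document: the names `drop_missing`, `scale`, and `build_pipeline` are hypothetical, and a plain list of dictionaries stands in for a Spark DataFrame so the example runs without a cluster. In PySpark, each step would instead be a function from DataFrame to DataFrame, which can be chained with `DataFrame.transform`.

```python
from functools import reduce
from typing import Callable, Dict, List, Optional, Sequence

# Assumption: a list of dicts stands in for a Spark DataFrame in this sketch.
Rows = List[Dict[str, Optional[float]]]
Step = Callable[[Rows], Rows]


def drop_missing(column: str) -> Step:
    """Build a step that removes rows where `column` is absent or None."""
    def step(rows: Rows) -> Rows:
        return [r for r in rows if r.get(column) is not None]
    return step


def scale(column: str, factor: float) -> Step:
    """Build a step that multiplies `column` by `factor` in new row copies."""
    def step(rows: Rows) -> Rows:
        return [{**r, column: r[column] * factor} for r in rows]
    return step


def build_pipeline(steps: Sequence[Step]) -> Step:
    """Compose steps left to right into a single step."""
    def pipeline(rows: Rows) -> Rows:
        return reduce(lambda acc, s: s(acc), steps, rows)
    return pipeline


# The engineered pipeline and an experimental variant share the same steps;
# only the parameter under study changes.
production = build_pipeline([drop_missing("amount"), scale("amount", 0.5)])
experiment = build_pipeline([drop_missing("amount"), scale("amount", 2.0)])

sample: Rows = [{"amount": 200.0}, {"amount": None}, {"amount": 50.0}]
print(production(sample))
print(experiment(sample))
```

Because every step has the same signature, each one can be unit-tested on a handful of rows, and an experiment can swap or re-parameterize a single step without touching the rest of the pipeline.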