This document discusses using Jupyter notebooks, Pandas, and Spark to build analytics pipelines for both small and large datasets. It summarizes the challenges of working with different data volumes and timeframes. Small mobile transaction data is analyzed in notebooks with Pandas and R, while larger retail data is analyzed with Spark ML and scikit-learn in notebooks running in Docker containers. Future work includes applying Spark to additional domains and building forecasting and streaming capabilities.