This document discusses the challenges of deploying large machine learning models at scale for near-real-time scoring. It describes an initial solution built on AWS Lambda and the limitations it ran into. It then proposes a better alternative, MLeap, which exports trained Spark ML models to a serialized format that can be loaded into a Scala pipeline for fast serving. A demo builds a model in PySpark, exports it with MLeap, and loads it into a Docker container for real-time scoring, which takes 50 ms compared with 1.5 seconds in Spark.
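
To put the two latency figures quoted above in perspective, the following is a minimal sketch of the arithmetic. The function names are hypothetical, and the throughput estimate assumes a single worker handling one request at a time with no batching or concurrency, which the source does not state.

```python
# Latency figures quoted in the summary, expressed in seconds.
SPARK_LATENCY_S = 1.5
MLEAP_LATENCY_S = 0.050


def speedup(baseline_s: float, improved_s: float) -> float:
    """Return how many times faster the improved path is than the baseline."""
    return baseline_s / improved_s


def max_sequential_throughput(latency_s: float) -> float:
    """Requests per second for one worker serving requests one at a time.

    Assumption (not from the source): no batching and no concurrency.
    """
    return 1.0 / latency_s


if __name__ == "__main__":
    print(f"speedup: {speedup(SPARK_LATENCY_S, MLEAP_LATENCY_S):.0f}x")
    print(f"Spark: {max_sequential_throughput(SPARK_LATENCY_S):.2f} req/s per worker")
    print(f"MLeap: {max_sequential_throughput(MLEAP_LATENCY_S):.2f} req/s per worker")
```

Under these assumptions the 50 ms path is roughly 30 times faster per request. Actual throughput in a deployed service would depend on concurrency, payload size, and hardware, none of which are specified here.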