Recently, there has been increased interest in running analytics and machine learning workloads on top of serverless frameworks in the cloud. The serverless execution model provides fine-grained scaling and unburdens users from having to manage servers, but also adds substantial performance overheads due to the fact that all data and intermediate state of compute task is stored on remote shared storage.
In this talk I first provide a detailed performance breakdown from a machine learning workload using Spark on AWS Lambda. I show how the intermediate state of tasks — such as model updates or broadcast messages — is exchanged using remote storage and what the performance overheads are. Later, I illustrate how the same workload performs on-premise using Apache Spark and Apache Crail deployed on a high-performance cluster (100Gbps network, NVMe Flash, etc.). Serverless computing simplifies the deployment of machine learning applications. The talk shows that performance does not need to be sacrificed.