Significant amount of effort is required to transform a machine learning (ML) model into a useful machine learning product. The incorporation of ML into real world applications almost feels like "1% algorithm and 99% perspiration". I will share with you my team experience in building 3 ML products at Zendesk. I will also discuss some real-world problems and scaling complexities you may encounter when building these products at web scale. Close collaboration with different groups including product, engineering and data science is imperative to strike the balance between model performance, scalability and computational efficiency. The talk mainly focuses on scaling our model building infrastructure with an aim to build at least 50,000 models a day. This is achieved as part of our efforts to deliver a ML product called Content Cues. In a nutshell, Content Cues summarizes text from customers support tickets to form insightful topics. It combines multiple ML algorithms including deep learning, clustering and other natural language processing approaches. These ML algorithms are then run through tens of thousands of eligible Zendesk customer data every day. My talk will cover the following topics: How we implement a horizontally scalable model building and model serving pipeline by combining AWS EMR, AWS Batch and Kubernetes How we tune the model building pipeline to optimize cost and efficiency without compromising resiliency Challenges in model monitoring, model versioning evolution and capturing of user feedback Speaker: Wai Chee Yau