When a machine learning model needs to be served for interactive use cases, it is typically either wrapped inside a Flask server or deployed using an external service like SageMaker. Both methods come with flaws. In this talk, you will learn how Ray Serve uses Ray to address the limitations of these approaches and enable scalable model serving.