This document discusses building machine learning models for inference on edge devices such as mobile phones and IoT devices. It outlines five approaches: 1) calling application services, 2) building and training models in the cloud and deploying them to devices, 3) hosting models behind APIs in the cloud, 4) using on-device platform APIs, and 5) passing input data directly to AWS Lambda. It also covers the challenges posed by the limited compute, memory, and power of edge devices, and the benefits of performing ML at the edge, such as lower latency, reduced bandwidth usage, and improved privacy. Finally, it introduces Amazon SageMaker, a fully managed service for training and hosting custom models.
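As a minimal sketch of the cloud-hosted approach (3), a device can send input data over HTTPS to a model deployed behind a SageMaker endpoint. `invoke_endpoint` is the real SageMaker runtime API exposed by boto3; the endpoint name and CSV payload format below are assumptions for illustration.

```python
import json


def serialize_features(features):
    """Encode a numeric feature vector as a CSV payload, the format
    accepted by SageMaker's built-in containers for text/csv requests."""
    return ",".join(str(x) for x in features)


def predict(features, endpoint_name="my-edge-model-endpoint"):
    """Invoke a cloud-hosted model from a device and return its prediction.

    "my-edge-model-endpoint" is a hypothetical endpoint name; calling this
    function requires AWS credentials and a deployed endpoint.
    """
    import boto3  # imported lazily so the sketch loads without the AWS SDK

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=serialize_features(features),
    )
    return json.loads(response["Body"].read())
```

The trade-off this sketch illustrates is the one the document raises: every prediction costs a network round trip, which is why latency-, bandwidth-, or privacy-sensitive applications may prefer running the model on the device itself.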