Algorithms and frameworks form a fundamental part of machine learning (ML). These critical components enable developers and data scientists to easily and quickly build ML models with well-defined interfaces for a range of use cases. The most commonly used algorithms and frameworks, built-in with Amazon SageMaker, make ML easier to address these use cases. In this session, we discuss the built-in algorithms and frameworks and how you can leverage them for your ML models. We also discuss the flexibility of bringing your own algorithm into Amazon SageMaker depending on your needs.
14. • XGBoost
• Factorization
Machines
• Linear methods
• Autoregressive
models
• K-Means Clustering
• PCA
Image classification
with convolutional
neural networks
• Spectral LDA
• Neural Topic Models
• Seq2Seq for
translation and similar
problems
• BlazingText
Built-in Algorithms
15. Streaming datasets,
for cheaper training
Train faster, in a
single pass
Greater reliability on
extremely large
datasets
Choice of several ML
algorithms
Algorithms Designed for Huge Datasets
38. … explore and refine
models in a single
notebook instance
TensorFlow, Apache MXNet, PyTorch, Chainer Containers
… deploy to
production
Sample your data… Use the same code
to train on the full
dataset in a cluster
of GPU instances …
39. Bring Your Own Algorithm
... add algorithm
code to a Docker
container ...
Choose your own framework
... publish to Amazon
ECR
Amazon ECS
40. Custom Docker
• Amazon SageMaker runs “docker run image train” for training
and “docker run image serve” for inference
• tini initializer gets confused by this
• The exec form of ENTRYPOINT is recommended
• Include only CUDA Toolkit, don’t include drivers
• /opt/ml used for providing data, hyperparameters, etc.
• Inference web server must respond to /invocations and /ping
on port 8080