© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Effective Distributed Training and
Model Optimization for Deep Learning Models
김무현, Data Scientist
AWS ML Solutions Lab
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon ML Solutions Lab
Brainstorming | Modeling | Teaching
Leverage Amazon experts with decades of ML experience with technologies
like Amazon Echo, Amazon Alexa, Prime Air, and Amazon Go.
The Amazon ML Solutions Lab provides ML expertise.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Put machine learning in the hands of every developer.
Now let’s make it as fast, efficient, and inexpensive as possible.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The Amazon ML Stack: Broadest & Deepest Set of Capabilities

AI SERVICES
- Vision: Rekognition Image, Rekognition Video
- Speech: Polly, Transcribe
- Language: Translate, Comprehend, Comprehend Medical, Textract
- Chatbots: Lex
- Forecasting: Forecast
- Recommendations: Personalize

ML SERVICES
- Amazon SageMaker
  - Build: pre-built algorithms & notebooks, data labeling (Ground Truth), algorithms & models (AWS Marketplace)
  - Train: one-click model training & tuning, models without training data (reinforcement learning, RL Coach)
  - Deploy: optimization (Neo), one-click deployment & hosting

ML FRAMEWORKS & INFRASTRUCTURE
- Frameworks, interfaces, and infrastructure: EC2 P3 & P3dn, EC2 C5, FPGAs, Greengrass, Elastic Inference
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
• Optimizing Infrastructure and Frameworks
• Distributed training for TensorFlow, MXNet, Keras, PyTorch
• Let’s tune models using Amazon SageMaker HPO
• Optimizing the trained model for deployment
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Where to train and deploy deep learning models
- Amazon SageMaker
- Amazon Elastic Container Service for Kubernetes (EKS)
- Amazon Elastic Container Service (ECS)
- AWS Deep Learning Containers
- Amazon EC2
- AWS Deep Learning AMIs
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Making TensorFlow faster
Training a ResNet-50 benchmark with the synthetic ImageNet dataset
using our optimized build of TensorFlow 1.11 on a c5.18xlarge instance
type is 11x faster than training on the stock binaries.
https://aws.amazon.com/about-aws/whats-new/2018/10/chainer4-4_theano_1-0-2_launch_deep_learning_ami/
October 2018
Available with Amazon SageMaker,
AWS Deep Learning AMIs, and AWS Deep Learning Containers
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon EC2 P3dn
https://aws.amazon.com/blogs/aws/new-ec2-p3dn-gpu-instances-with-100-gbps-networking-local-nvme-storage-for-faster-machine-learning-p3-price-reduction/
Reduce machine learning training time, get better GPU utilization, and
support larger, more complex models.

KEY FEATURES
- 100 Gbps of networking bandwidth
- 8 NVIDIA Tesla V100 GPUs
- 32 GB of memory per GPU (2x more than P3)
- 96 Intel Skylake vCPUs (50% more than P3) with AVX-512
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The Amazon EC2 P3 instance type has the most powerful GPU, the NVIDIA V100.
But are you fully utilizing your GPUs?
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Tensor Core and mixed-precision training
https://arxiv.org/abs/1710.03740
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How to port training scripts for mixed precision
https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
Port the model to use the FP16 data type where appropriate:
1. Use the float16 data type in models containing convolutions or matrix
multiplications
2. Check that trainable variables are float32 before converting them to float16
3. Use float32 for the softmax calculation
Add loss scaling to preserve small gradient values:
1. Multiply the loss by a scale factor before computing gradients
2. Divide the calculated gradients by the same scale factor
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Code snippet for mixed-precision training in TensorFlow

MLP normal implementation:

x = tf.placeholder(tf.float32, [None, 784])
W1 = tf.Variable(tf.truncated_normal([784, FLAGS.num_hunits]))
b1 = tf.Variable(tf.zeros([FLAGS.num_hunits]))
z = tf.nn.relu(tf.matmul(x, W1) + b1)
W2 = tf.Variable(tf.truncated_normal([FLAGS.num_hunits, 10]))
b2 = tf.Variable(tf.zeros([10]))
y = tf.matmul(z, W2) + b2
y_ = tf.placeholder(tf.int64, [None])
cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

MLP mixed-precision implementation:

# Inputs and variables are float16 so matmuls can run on Tensor Cores
data = tf.placeholder(tf.float16, shape=(None, 784))
W1 = tf.get_variable('w1', (784, FLAGS.num_hunits), tf.float16)
b1 = tf.get_variable('b1', (FLAGS.num_hunits), tf.float16,
                     initializer=tf.zeros_initializer())
z = tf.nn.relu(tf.matmul(data, W1) + b1)
W2 = tf.get_variable('w2', (FLAGS.num_hunits, 10), tf.float16)
b2 = tf.get_variable('b2', (10), tf.float16,
                     initializer=tf.zeros_initializer())
y = tf.matmul(z, W2) + b2
y_ = tf.placeholder(tf.int64, shape=(None))
# The softmax/loss is computed in float32 for numerical stability
loss = tf.losses.sparse_softmax_cross_entropy(y_, tf.cast(y, tf.float32))

* Source code from https://github.com/khcs/fp16-demo-tf
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Code snippet for mixed-precision training in TensorFlow (cont.)

MLP normal implementation:

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
# Train
for _ in range(3000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

MLP mixed-precision implementation:

def gradients_with_loss_scaling(loss, variables, loss_scale):
    # Scale the loss up before differentiation, then scale gradients back down
    return [grad / loss_scale
            for grad in tf.gradients(loss * loss_scale, variables)]

# Store master weights in float32 via a custom variable getter
with tf.device('/gpu:0'), \
     tf.variable_scope('fp32_storage',
                       custom_getter=float32_variable_storage_getter):
    data, target, logits, loss = create_model(nbatch, nin, nout, dtype)

variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
grads = gradients_with_loss_scaling(loss, variables, loss_scale)
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum)
training_step_op = optimizer.apply_gradients(zip(grads, variables))
init_op = tf.global_variables_initializer()

sess.run(init_op)
for step in range(6000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    np_loss, _ = sess.run([loss, training_step_op],
                          feed_dict={data: batch_xs, target: batch_ys})

* Source code from https://github.com/khcs/fp16-demo-tf
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
For other deep learning frameworks such as Apache MXNet and PyTorch,
please refer to:
AWS Deep Learning AMI Developer Guide
https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-gpu-opt-training.html
NVIDIA Deep Learning SDK
https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Scaling TensorFlow near-linearly to 256 GPUs
https://aws.amazon.com/about-aws/whats-new/2018/11/tensorflow-scalability-to-256-gpus/
Stock TensorFlow: 65% scaling efficiency with 256 GPUs, 30m training time
AWS-Optimized TensorFlow: 90% scaling efficiency with 256 GPUs, 14m training time
Available with Amazon SageMaker and the AWS Deep Learning AMIs
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
I also have a huge amount of data or a large model to train.
How do I scale deep learning training tasks?
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Infra for distributed training - scale up
(Diagram: a single Amazon EC2 instance with multiple GPUs, backed by
Amazon Elastic Block Store (EBS))
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Infra for distributed training - scale out
(Diagram: multiple Amazon EC2 instances, each backed by Amazon Elastic
Block Store (EBS))
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Multi-GPU and multi-node options
Using the DL framework’s own features
• TensorFlow
- Multi-tower approach for multi-GPU training
- Parameter server for multi-node training
• Apache MXNet
- Multi-GPU by defining the context as a list of GPUs (see the sketch after this list)
- Parameter server for multi-node training
Using Horovod
• https://eng.uber.com/horovod/
• Open source distributed training framework based on the Message Passing Interface (MPI)
• Builds on Baidu’s draft implementation of the TensorFlow ring-allreduce algorithm
• Supports popular deep learning frameworks such as TensorFlow, MXNet, Keras, and PyTorch
(Chart: performance scalability using Horovod)
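A minimal sketch of the MXNet multi-GPU option above (the network, batch,
and GPU count are illustrative placeholders, not from the original deck):

import mxnet as mx
from mxnet import gluon, autograd

# Assumption: 4 GPUs are available; replace with the actual count
ctx = [mx.gpu(i) for i in range(4)]

net = gluon.nn.Dense(10)              # placeholder network
net.initialize(ctx=ctx)               # copy parameters to every GPU
trainer = gluon.Trainer(net.collect_params(), 'sgd')
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

data = mx.nd.random.uniform(shape=(128, 784))            # placeholder batch
label = mx.nd.random.randint(0, 10, shape=(128,)).astype('float32')

# split_and_load slices the batch and places one shard on each GPU
data_parts = gluon.utils.split_and_load(data, ctx)
label_parts = gluon.utils.split_and_load(label, ctx)

with autograd.record():
    losses = [loss_fn(net(x), y) for x, y in zip(data_parts, label_parts)]
for l in losses:
    l.backward()
trainer.step(batch_size=128)          # gradients are aggregated across GPUs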
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod
1. Install Horovod and related packages
   → the AWS Deep Learning AMIs and Deep Learning Containers already include everything
2. Modify your training code to use Horovod
3. Run multi-GPU or distributed training using Horovod’s mpirun command
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod with TensorFlow

import tensorflow as tf
import horovod.tensorflow as hvd

# Initialize Horovod
hvd.init()

# Pin GPU to be used to process local rank (one GPU per process)
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Build model...
loss = ...
opt = tf.train.AdagradOptimizer(0.01 * hvd.size())

# Add Horovod Distributed Optimizer
opt = hvd.DistributedOptimizer(opt)

# Add hook to broadcast variables from rank 0 to all other processes
# during initialization.
hooks = [hvd.BroadcastGlobalVariablesHook(0)]

# Make training operation
train_op = opt.minimize(loss)

# Save checkpoints only on worker 0 to prevent other workers from
# corrupting them.
checkpoint_dir = '/tmp/train_logs' if hvd.rank() == 0 else None

# The MonitoredTrainingSession takes care of session initialization,
# restoring from a checkpoint, saving to a checkpoint, and closing
# when done or an error occurs.
with tf.train.MonitoredTrainingSession(checkpoint_dir=checkpoint_dir,
                                       config=config,
                                       hooks=hooks) as mon_sess:
    while not mon_sess.should_stop():
        # Perform synchronous training.
        mon_sess.run(train_op)

( source code from https://github.com/horovod/horovod )
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod with Apache MXNet

import mxnet as mx
import horovod.mxnet as hvd
from mxnet import autograd

# Initialize Horovod
hvd.init()

# Pin GPU to be used to process local rank
context = mx.gpu(hvd.local_rank())
num_workers = hvd.size()

# Build model
model = ...
model.hybridize()

# Create optimizer
optimizer_params = ...
opt = mx.optimizer.create('sgd', **optimizer_params)

# Initialize parameters
model.initialize(initializer, ctx=context)

# Fetch and broadcast parameters
params = model.collect_params()
if params is not None:
    hvd.broadcast_parameters(params, root_rank=0)

# Create DistributedTrainer, a subclass of gluon.Trainer
trainer = hvd.DistributedTrainer(params, opt)

# Create loss function
loss_fn = ...

# Train model
for epoch in range(num_epoch):
    train_data.reset()
    for nbatch, batch in enumerate(train_data, start=1):
        data = batch.data[0].as_in_context(context)
        label = batch.label[0].as_in_context(context)
        with autograd.record():
            output = model(data.astype(dtype, copy=False))
            loss = loss_fn(output, label)
        loss.backward()
        trainer.step(batch_size)

( source code from https://github.com/horovod/horovod )
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod with Keras

import math
import keras
import tensorflow as tf
import horovod.keras as hvd

# Horovod: initialize Horovod.
hvd.init()

# Horovod: pin GPU to be used to process local rank (one GPU per process)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Horovod: adjust number of epochs based on number of GPUs.
epochs = int(math.ceil(12.0 / hvd.size()))

model = ...

# Horovod: adjust learning rate based on number of GPUs.
opt = keras.optimizers.Adadelta(1.0 * hvd.size())

# Horovod: add Horovod Distributed Optimizer.
opt = hvd.DistributedOptimizer(opt)

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=opt, metrics=['accuracy'])

callbacks = [
    # Horovod: broadcast initial variable states from rank 0 to all other
    # processes. This is necessary to ensure consistent initialization of
    # all workers when training is started with random weights or restored
    # from a checkpoint.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

# Horovod: save checkpoints only on worker 0 to prevent other workers
# from corrupting them.
if hvd.rank() == 0:
    callbacks.append(keras.callbacks.ModelCheckpoint('./checkpoint-{epoch}.h5'))

model.fit(x_train, y_train,
          batch_size=batch_size,
          callbacks=callbacks,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

( source code from https://github.com/horovod/horovod )
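Horovod also supports PyTorch, which the deck lists but does not show.
A minimal sketch following the same pattern (the model, learning rate,
and data loader are placeholder assumptions, not from the original talk):

import torch
import torch.nn as nn
import torch.optim as optim
import horovod.torch as hvd

# Initialize Horovod and pin each process to one GPU
hvd.init()
torch.cuda.set_device(hvd.local_rank())

model = nn.Linear(784, 10).cuda()        # placeholder model
loss_fn = nn.CrossEntropyLoss()

# Scale the learning rate by the number of workers, as in the other examples
opt = optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across workers
opt = hvd.DistributedOptimizer(opt, named_parameters=model.named_parameters())

# Broadcast initial parameters and optimizer state from rank 0
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(opt, root_rank=0)

for data, target in train_loader:        # assumes a DistributedSampler-backed loader
    data, target = data.cuda(), target.cuda()
    opt.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    opt.step()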
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod in Amazon EC2
https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-horovod-tensorflow.html
STEP 1. Configure Horovod Hosts file
172.100.1.200 slots=8
172.200.8.99 slots=8
172.48.3.124 slots=8
localhost slots=8
STEP 2. Configure the nodes to skip StrictHostKeyChecking
STEP 3. Execute the training script using the mpirun command

~/anaconda3/envs/tensorflow_p36/bin/mpirun -np $gpus -hostfile ~/hosts -mca plm_rsh_no_tree_spawn 1 \
  -bind-to socket -map-by slot \
  -x HOROVOD_HIERARCHICAL_ALLREDUCE=1 -x HOROVOD_FUSION_THRESHOLD=16777216 \
  -x NCCL_MIN_NRINGS=4 -x LD_LIBRARY_PATH -x PATH -mca pml ob1 -mca btl ^openib \
  -x NCCL_SOCKET_IFNAME=$INTERFACE -mca btl_tcp_if_exclude lo,docker0 \
  -x TF_CPP_MIN_LOG_LEVEL=0 \
  python -W ignore ~/examples/horovod/tensorflow/train_imagenet_resnet_hvd.py \
    --data_dir ~/data/tf-imagenet/ --num_epochs 90 --increased_aug -b $BATCH_SIZE \
    --mom 0.977 --wdecay 0.0005 --loss_scale 256. --use_larc \
    --lr_decay_mode linear_cosine --warmup_epochs 5 --clear_log
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod in Amazon EKS
https://docs.aws.amazon.com/dlami/latest/devguide/deep-learning-containers-eks-tutorials-distributed-gpu-training.html
STEP 1. Install Kubeflow to set up a cluster for distributed training
STEP 2. Set the app name and initialize it
STEP 3. Install the mpi-operator from Kubeflow
STEP 4. Create an MPI Job template and define the number of nodes (replicas)
and the number of GPUs each node has (gpusPerReplica)
STEP 5. Apply the manifest to the default environment; the MPI Job will
create a launcher pod
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Horovod in Amazon SageMaker
from sagemaker.tensorflow import TensorFlow

distributions = {'mpi': {'enabled': True, "processes_per_host": 2}}

# METHOD 1 - Using the Amazon SageMaker provided VPC
estimator = TensorFlow(entry_point=train_script,
                       role=sagemaker_iam_role,
                       train_instance_count=2,
                       train_instance_type='ml.p3.8xlarge',
                       script_mode=True,
                       framework_version='1.12',
                       distributions=distributions)

# METHOD 2 - Using your own VPC for training performance improvement
estimator = TensorFlow(entry_point=train_script,
                       role=sagemaker_iam_role,
                       train_instance_count=2,
                       train_instance_type='ml.p3.8xlarge',
                       script_mode=True,
                       framework_version='1.12',
                       distributions=distributions,
                       security_group_ids=['sg-0919a36a89a15222f'],
                       subnets=['subnet-0c07198f3eb022ede', 'subnet-055b2819caae2fd1f'])

estimator.fit({"train": s3_train_path, "test": s3_test_path})
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Examples of hyperparameters

Neural networks: number of layers, hidden layer width, learning rate,
embedding dimensions, dropout, …
Decision trees: tree depth, max leaf nodes, gamma, eta, lambda, alpha, …
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Automatic Model Tuning
Finding the optimal set of hyperparameters
1. Manual Search (“I know what I’m doing”)
2. Grid Search (“X marks the spot”)
• Typically training hundreds of models
• Slow and expensive
3. Random Search (“Spray and pray”)
• Works better and faster than Grid Search
• But… but… but… it’s random!
4. HPO: use Machine Learning
• Training fewer models
• Gaussian Process Regression and Bayesian Optimization
• You can now resume from a previous tuning job
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How to use Amazon SageMaker HPO
(Diagram: define an Estimator and the tuning configuration; SageMaker
launches the training jobs and collects the resulting models)
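A minimal sketch of this flow with the SageMaker Python SDK (the ranges,
metric name, and regex below are illustrative assumptions, not the talk’s
actual configuration):

from sagemaker.tuner import (HyperparameterTuner, ContinuousParameter,
                             IntegerParameter)

# Assumption: `estimator` is a SageMaker Estimator such as the TensorFlow
# estimator defined earlier; the ranges and objective metric are examples.
hyperparameter_ranges = {
    'learning_rate': ContinuousParameter(0.0001, 0.1),
    'batch_size': IntegerParameter(32, 512),
}

tuner = HyperparameterTuner(estimator,
                            objective_metric_name='validation:accuracy',
                            hyperparameter_ranges=hyperparameter_ranges,
                            metric_definitions=[{'Name': 'validation:accuracy',
                                                 'Regex': 'val_acc: ([0-9\\.]+)'}],
                            max_jobs=20,           # total training jobs
                            max_parallel_jobs=2)   # jobs run concurrently

tuner.fit({'train': s3_train_path, 'test': s3_test_path})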
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Hardware optimization is extremely complex
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Neo is a compiler and runtime for machine learning
- Compiler: processor vendors can integrate hardware-specific optimizations
- Runtime: device makers can embed the runtime into edge devices and IoT
Open source at github.com/neo-ai (Apache Software License)
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How to compile a model
https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation-cli.html
Configure the compilation job (config.json):

{
  "RoleArn": $ROLE_ARN,
  "InputConfig": {
    "S3Uri": "s3://jsimon-neo/model.tar.gz",
    "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
    "Framework": "MXNET"
  },
  "OutputConfig": {
    "S3OutputLocation": "s3://jsimon-neo/",
    "TargetDevice": "rasp3b"
  },
  "StoppingCondition": {
    "MaxRuntimeInSeconds": 300
  }
}

Compile the model:

$ aws sagemaker create-compilation-job \
    --cli-input-json file://config.json \
    --compilation-job-name resnet50-mxnet-pi
$ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
$ gtar tfz model-rasp3b.tar.gz
compiled.params
compiled_model.json
compiled.so

Predict with the compiled model:

from dlr import DLRModel

model = DLRModel('resnet50', input_shape, output_shape, device)
out = model.run(input_data)
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Model compilation using AWS console
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Performance improvement result

Image file name | MXNet model (seconds) | Neo-compiled model (seconds) | Improvement (MXNet / Neo-compiled)
input_001       | 0.0299                | 0.0128                       | 233.59%
input_002       | 0.0223                | 0.0129                       | 172.86%
input_003       | 0.0275                | 0.0125                       | 220.00%
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Do I really need
such complex and deep
neural networks
to meet the required accuracy?
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Compressing deep learning models
• Compression is the process of reducing the size of a trained network,
either by removing certain layers or by shrinking layers, while
maintaining accuracy.
• A smaller model will predict faster and require less memory.
• The number of possible combinations makes it difficult to perform this
task manually, or even programmatically.
• Reinforcement learning to the rescue!
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Defining the problem
• Objective: find the smallest possible network architecture from a
pre-trained network architecture, while producing the best accuracy.
• Environment: a custom-developed environment that accepts a Boolean
array of layers to remove from the RL agent and produces an observation
describing the layers.
• State: the layers.
• Action: a Boolean array, one entry per layer.
• Reward: a combination of compression ratio and accuracy (a sketch
follows this list).
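A hedged sketch of how such a reward might combine the two signals (the
weighting and exact form are assumptions for illustration; the actual
environment in the sample notebook may differ):

def compression_reward(orig_params, new_params, orig_acc, new_acc, alpha=0.5):
    """Toy reward mixing compression ratio and accuracy retention.

    alpha trades off size reduction against accuracy; both terms lie in
    [0, 1] when the compressed model is smaller and no more accurate
    than the original.
    """
    compression_ratio = 1.0 - (new_params / orig_params)  # fraction of weights removed
    accuracy_ratio = new_acc / orig_acc                   # accuracy retained
    return alpha * compression_ratio + (1.0 - alpha) * accuracy_ratio

# Example: pruning ~60% of weights while keeping ~95% of accuracy
print(compression_reward(25.6e6, 10.2e6, 0.76, 0.72))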
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon SageMaker RL
Reinforcement learning for every developer and data scientist

KEY FEATURES
- Broad support for frameworks
- Broad support for simulation environments: 2D & 3D physics environments
  and OpenAI Gym support
- Support for Amazon Sumerian, AWS RoboMaker, and the open source Robot
  Operating System (ROS) project
- Fully managed
- Example notebooks and tutorials
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_network_compression_ray_custom
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Predictions drive complexity and cost in production
(Chart: Training ~10%, Inference ~90% of cost)
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Are you making the most of your infrastructure?
Low utilization and high costs. One size does not fit all.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Elastic Inference
https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/
KEY FEATURES
- Match capacity to demand: available from 1 to 32 TFLOPS
- Integrated with Amazon EC2, Amazon SageMaker, and the AWS Deep Learning AMIs
- Support for TensorFlow, Apache MXNet, and ONNX, with PyTorch coming soon
- Single and mixed-precision operations
- Lower inference costs by up to 75%
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Elastic Inference with TensorFlow
OPTION 1 - Using Elastic Inference TensorFlow Serving

$ amazonei_tensorflow_model_server --model_name=ssdresnet \
    --model_base_path=/tmp/ssd_resnet50_v1_coco --port=9000

OPTION 2 - Using the Elastic Inference TensorFlow Predictor

import numpy as np
import matplotlib.image as mpimg
from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

img = mpimg.imread(FLAGS.image)
img = np.expand_dims(img, axis=0)
ssd_resnet_input = {'inputs': img}

eia_predictor = EIPredictor(model_dir='/tmp/ssd_resnet50_v1_coco/1/')
pred = eia_predictor(ssd_resnet_input)
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Elastic Inference with Apache MXNet
OPTION 1 - Use EI with the MXNet Symbol API

import mxnet as mx

data = mx.sym.var('data', shape=(1,))
sym = mx.sym.exp(data)

# Pass mx.eia() as context during the simple bind operation
executor = sym.simple_bind(ctx=mx.eia(), grad_req='null')

# The forward call is performed on the remote accelerator
executor.forward(data=mx.nd.ones((1,)))
print('Output = %s' % executor.outputs[0])

OPTION 2 - Use EI with the Module API

ctx = mx.eia()
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
# Bind and load parameters before running inference
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params)
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Other tips

SageMaker Pipe Mode using the TensorFlow PipeModeDataset extension
(see the sketch below)
https://github.com/aws/sagemaker-tensorflow-extensions

Apache MXNet can read training data from Amazon S3 directly
https://mxnet.incubator.apache.org/versions/master/faq/s3_integration.html

* Dataset - a 3.9 GB CSV file - contained 2 million records, each record
having 100 comma-separated, single-precision floating-point values.
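A minimal sketch of the Pipe Mode tip above (assumes a TFRecord-encoded
'train' channel; the feature schema is a placeholder, not the actual
benchmark dataset):

import tensorflow as tf
from sagemaker_tensorflow import PipeModeDataset

# Read the 'train' channel streamed by SageMaker Pipe Mode as TFRecords
ds = PipeModeDataset(channel='train', record_format='TFRecord')

def parse(record):
    # Placeholder schema: a 100-float feature vector and an integer label
    features = tf.parse_single_example(record, {
        'data': tf.FixedLenFeature([100], tf.float32),
        'label': tf.FixedLenFeature([], tf.int64),
    })
    return features['data'], features['label']

ds = ds.map(parse).prefetch(10).batch(64)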
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Summary
Training
• Make sure to utilize Tensor Cores by using mixed-precision training
• Learn to use Horovod for efficient multi-GPU or multi-node distributed
training
• Find the optimal hyperparameters using SageMaker HPO
Deployment
• Compile your model using Amazon SageMaker Neo
• Use Amazon Elastic Inference to reduce inference costs where applicable
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Dive into Deep Learning
An interactive deep learning book
with code, math, and discussions
http://d2l.ai/
http://ko.d2l.ai/
STAT 157 course at UC Berkeley, Spring 2019
The Korean version of the first 4 chapters is available now.
• GitHub pull requests for corrections are welcome
• Raise issues at https://github.com/d2l-ai/d2l-ko/issues
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Getting started
https://ml.aws
https://aws.amazon.com/blogs/machine-learning
https://aws.amazon.com/sagemaker
https://github.com/awslabs/amazon-sagemaker-examples
https://medium.com/@julsimon
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
UiPathCommunity
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 

Deep Learning 모델의 효과적인 분산 트레이닝과 모델 최적화 방법 - 김무현 데이터 사이언티스트, AWS :: AWS Summit Seoul 2019

  • 10. The Amazon EC2 P3 instance type offers the most powerful GPU available, the NVIDIA V100. But are you fully utilizing your GPUs?
  • 11. Tensor Core and mixed-precision training (https://arxiv.org/abs/1710.03740)
  • 12. How to port training scripts for mixed precision
  https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
  Port the model to use the FP16 data type where appropriate:
  1. Use float16 on models containing convolutions or matrix multiplications
  2. Check that trainable variables are float32 before converting them to float16
  3. Use float32 for the softmax calculation
  Add loss scaling to preserve small gradient values:
  1. Multiply the loss by a scale factor before computing gradients
  2. Divide the calculated gradients by the same scale factor
  (See the code snippets on the next two slides.)
  • 13. Code snippet for mixed-precision training in TensorFlow

  MLP, normal implementation:

    x = tf.placeholder(tf.float32, [None, 784])
    W1 = tf.Variable(tf.truncated_normal([784, FLAGS.num_hunits]))
    b1 = tf.Variable(tf.zeros([FLAGS.num_hunits]))
    z = tf.nn.relu(tf.matmul(x, W1) + b1)
    W2 = tf.Variable(tf.truncated_normal([FLAGS.num_hunits, 10]))
    b2 = tf.Variable(tf.zeros([10]))
    y = tf.matmul(z, W2) + b2
    y_ = tf.placeholder(tf.int64, [None])
    cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

  MLP, mixed-precision implementation:

    data = tf.placeholder(tf.float16, shape=(None, 784))
    W1 = tf.get_variable('w1', (784, FLAGS.num_hunits), tf.float16)
    b1 = tf.get_variable('b1', (FLAGS.num_hunits), tf.float16,
                         initializer=tf.zeros_initializer())
    z = tf.nn.relu(tf.matmul(data, W1) + b1)
    W2 = tf.get_variable('w2', (FLAGS.num_hunits, 10), tf.float16)
    b2 = tf.get_variable('b2', (10), tf.float16,
                         initializer=tf.zeros_initializer())
    y = tf.matmul(z, W2) + b2
    y_ = tf.placeholder(tf.int64, shape=(None))
    loss = tf.losses.sparse_softmax_cross_entropy(y_, tf.cast(y, tf.float32))

  (source code from https://github.com/khcs/fp16-demo-tf)
  • 14. Code snippet for mixed-precision training in TensorFlow

  MLP, normal implementation:

    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()
    # Train
    for _ in range(3000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

  MLP, mixed-precision implementation:

    def gradients_with_loss_scaling(loss, variables, loss_scale):
        return [grad / loss_scale
                for grad in tf.gradients(loss * loss_scale, variables)]

    with tf.device('/gpu:0'), tf.variable_scope(
            'fp32_storage', custom_getter=float32_variable_storage_getter):
        data, target, logits, loss = create_model(nbatch, nin, nout, dtype)
        variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
        grads = gradients_with_loss_scaling(loss, variables, loss_scale)
        optimizer = tf.train.MomentumOptimizer(learning_rate, momentum)
        training_step_op = optimizer.apply_gradients(zip(grads, variables))

    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for step in range(6000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        np_loss, _ = sess.run([loss, training_step_op],
                              feed_dict={data: batch_xs, target: batch_ys})

  (source code from https://github.com/khcs/fp16-demo-tf)
  • 15. For other deep learning frameworks such as Apache MXNet and PyTorch, please refer to:
  AWS Deep Learning AMI Developer Guide
  https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-gpu-opt-training.html
  NVIDIA Deep Learning SDK
  https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
  (A brief Gluon sketch follows.)
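  As a quick illustration of the same two ideas in MXNet Gluon (FP16 compute to engage Tensor Cores, plus an FP32 master copy of the weights), here is a minimal, hedged sketch; the model choice, shapes, and synthetic data are illustrative assumptions, not code from the talk:

    import mxnet as mx
    from mxnet import gluon

    ctx = mx.gpu(0)
    net = gluon.model_zoo.vision.resnet50_v1(classes=10)
    net.initialize(mx.init.Xavier(), ctx=ctx)
    net.cast('float16')  # run convolutions and matmuls in FP16 (Tensor Cores)

    # multi_precision=True keeps an FP32 master copy of the weights,
    # so small gradient updates are not lost to FP16 rounding.
    trainer = gluon.Trainer(net.collect_params(), 'sgd',
                            {'learning_rate': 0.1, 'multi_precision': True})
    loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

    # Synthetic batch, purely for illustration
    data = mx.nd.random.uniform(shape=(32, 3, 224, 224), ctx=ctx).astype('float16')
    label = mx.nd.zeros((32,), ctx=ctx)
    with mx.autograd.record():
        out = net(data)
        # cast logits back to FP32 for a numerically stable softmax
        loss = loss_fn(out.astype('float32'), label)
    loss.backward()
    trainer.step(32)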
  • 17. Scaling TensorFlow near-linearly to 256 GPUs
  https://aws.amazon.com/about-aws/whats-new/2018/11/tensorflow-scalability-to-256-gpus/
  Stock TensorFlow: 65% scaling efficiency with 256 GPUs, 30m training time
  AWS-Optimized TensorFlow: 90% scaling efficiency with 256 GPUs, 14m training time
  Available with Amazon SageMaker and the AWS Deep Learning AMIs
  • 18. I also have a huge amount of data or large models to train. How do I scale deep learning training tasks?
  • 19. Infrastructure for distributed training: scale up
  (Diagram: a single Amazon EC2 instance with eight GPUs, backed by Amazon Elastic Block Store (EBS).)
  • 20. Infrastructure for distributed training: scale out
  (Diagram: multiple Amazon EC2 instances, each backed by Amazon Elastic Block Store (EBS).)
  • 21. Multi-GPU and multi-node options
  Using a DL framework's built-in features:
  • TensorFlow: multi-tower approach for multi-GPU training; parameter server for multi-node training
  • Apache MXNet: multi-GPU by defining a context with a list of GPUs; parameter server for multi-node training
  Using Horovod (https://eng.uber.com/horovod/):
  • Open-source distributed training framework based on the Message Passing Interface (MPI)
  • Built on Baidu's draft implementation of the TensorFlow ring-allreduce algorithm
  • Supports popular deep learning frameworks such as TensorFlow, MXNet, Keras, and PyTorch
  (Chart: performance scalability using Horovod.)
  • 22. Using Horovod
  1. Install Horovod and related packages (the AWS Deep Learning AMI and AWS Deep Learning Containers already include them)
  2. Modify your training code to use Horovod (see the framework-specific snippets on the next slides)
  3. Run multi-GPU or distributed training with the mpirun command
  • 23. Using Horovod with TensorFlow

    import tensorflow as tf
    import horovod.tensorflow as hvd

    # Initialize Horovod
    hvd.init()

    # Pin GPU to be used to process local rank (one GPU per process)
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    # Build model...
    loss = ...
    opt = tf.train.AdagradOptimizer(0.01 * hvd.size())

    # Add Horovod Distributed Optimizer
    opt = hvd.DistributedOptimizer(opt)

    # Add hook to broadcast variables from rank 0 to all other processes during
    # initialization.
    hooks = [hvd.BroadcastGlobalVariablesHook(0)]

    # Make training operation
    train_op = opt.minimize(loss)

    # Save checkpoints only on worker 0 to prevent other workers from corrupting them.
    checkpoint_dir = '/tmp/train_logs' if hvd.rank() == 0 else None

    # The MonitoredTrainingSession takes care of session initialization,
    # restoring from a checkpoint, saving to a checkpoint, and closing when done
    # or an error occurs.
    with tf.train.MonitoredTrainingSession(checkpoint_dir=checkpoint_dir,
                                           config=config, hooks=hooks) as mon_sess:
        while not mon_sess.should_stop():
            # Perform synchronous training.
            mon_sess.run(train_op)

  (source code from https://github.com/horovod/horovod)
  • 24. Using Horovod with Apache MXNet

    import mxnet as mx
    import horovod.mxnet as hvd
    from mxnet import autograd

    # Initialize Horovod
    hvd.init()

    # Pin GPU to be used to process local rank
    context = mx.gpu(hvd.local_rank())
    num_workers = hvd.size()

    # Build model
    model = ...
    model.hybridize()

    # Create optimizer
    optimizer_params = ...
    opt = mx.optimizer.create('sgd', **optimizer_params)

    # Initialize parameters
    model.initialize(initializer, ctx=context)

    # Fetch and broadcast parameters
    params = model.collect_params()
    if params is not None:
        hvd.broadcast_parameters(params, root_rank=0)

    # Create DistributedTrainer, a subclass of gluon.Trainer
    trainer = hvd.DistributedTrainer(params, opt)

    # Create loss function
    loss_fn = ...

    # Train model
    for epoch in range(num_epoch):
        train_data.reset()
        for nbatch, batch in enumerate(train_data, start=1):
            data = batch.data[0].as_in_context(context)
            label = batch.label[0].as_in_context(context)
            with autograd.record():
                output = model(data.astype(dtype, copy=False))
                loss = loss_fn(output, label)
            loss.backward()
            trainer.step(batch_size)

  (source code from https://github.com/horovod/horovod)
  • 25. Using Horovod with Keras

    import keras
    import horovod.keras as hvd

    # Horovod: initialize Horovod.
    hvd.init()

    # Horovod: pin GPU to be used to process local rank (one GPU per process)
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    # Horovod: adjust number of epochs based on number of GPUs.
    epochs = int(math.ceil(12.0 / hvd.size()))

    model = ...

    # Horovod: adjust learning rate based on number of GPUs.
    opt = keras.optimizers.Adadelta(1.0 * hvd.size())

    # Horovod: add Horovod Distributed Optimizer.
    opt = hvd.DistributedOptimizer(opt)

    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=opt,
                  metrics=['accuracy'])

    callbacks = [
        # Horovod: broadcast initial variable states from rank 0 to all other
        # processes. This is necessary to ensure consistent initialization of all
        # workers when training is started with random weights or restored from a
        # checkpoint.
        hvd.callbacks.BroadcastGlobalVariablesCallback(0),
    ]

    # Horovod: save checkpoints only on worker 0 to prevent other workers from
    # corrupting them.
    if hvd.rank() == 0:
        callbacks.append(keras.callbacks.ModelCheckpoint('./checkpoint-{epoch}.h5'))

    model.fit(x_train, y_train,
              batch_size=batch_size,
              callbacks=callbacks,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))

  (source code from https://github.com/horovod/horovod)
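  Slide 21 notes that Horovod also supports PyTorch, but the deck shows no PyTorch snippet. A minimal sketch following the same pattern, with a toy model and synthetic data standing in for real training code (illustrative assumptions, not from the talk):

    import torch
    import torch.nn as nn
    import horovod.torch as hvd

    # Initialize Horovod and pin each process to one GPU
    hvd.init()
    torch.cuda.set_device(hvd.local_rank())

    model = nn.Linear(784, 10).cuda()  # toy model for illustration
    loss_fn = nn.CrossEntropyLoss()

    # Scale the learning rate by the number of workers
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

    # Wrap the optimizer so gradients are averaged across workers via allreduce
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters())

    # Broadcast initial parameters and optimizer state from rank 0
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)

    for step in range(100):
        data = torch.randn(32, 784).cuda()          # synthetic batch
        target = torch.randint(0, 10, (32,)).cuda()
        optimizer.zero_grad()
        loss = loss_fn(model(data), target)
        loss.backward()
        optimizer.step()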
  • 26. Using Horovod on Amazon EC2
  https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-horovod-tensorflow.html

  STEP 1. Configure the Horovod hosts file:
    172.100.1.200 slots=8
    172.200.8.99 slots=8
    172.48.3.124 slots=8
    localhost slots=8

  STEP 2. Configure the nodes to skip "StrictHostKeyChecking"

  STEP 3. Execute the training script using the mpirun command:
    ~/anaconda3/envs/tensorflow_p36/bin/mpirun -np $gpus -hostfile ~/hosts \
      -mca plm_rsh_no_tree_spawn 1 -bind-to socket -map-by slot \
      -x HOROVOD_HIERARCHICAL_ALLREDUCE=1 -x HOROVOD_FUSION_THRESHOLD=16777216 \
      -x NCCL_MIN_NRINGS=4 -x LD_LIBRARY_PATH -x PATH -mca pml ob1 -mca btl ^openib \
      -x NCCL_SOCKET_IFNAME=$INTERFACE -mca btl_tcp_if_exclude lo,docker0 \
      -x TF_CPP_MIN_LOG_LEVEL=0 \
      python -W ignore ~/examples/horovod/tensorflow/train_imagenet_resnet_hvd.py \
        --data_dir ~/data/tf-imagenet/ --num_epochs 90 --increased_aug \
        -b $BATCH_SIZE --mom 0.977 --wdecay 0.0005 --loss_scale 256. --use_larc \
        --lr_decay_mode linear_cosine --warmup_epochs 5 --clear_log
  • 27. Using Horovod on Amazon EKS
  https://docs.aws.amazon.com/dlami/latest/devguide/deep-learning-containers-eks-tutorials-distributed-gpu-training.html
  STEP 1. Install Kubeflow to set up a cluster for distributed training
  STEP 2. Set the app name and initialize it
  STEP 3. Install mpi-operator from Kubeflow
  STEP 4. Create an MPI Job template; define the number of nodes (replicas) and the number of GPUs each node has (gpusPerReplica)
  STEP 5. Apply the manifest to the default environment; the MPI Job will create a launch pod
  • 28. Using Horovod in Amazon SageMaker

    from sagemaker.tensorflow import TensorFlow

    distributions = {'mpi': {'enabled': True, "processes_per_host": 2}}

    # METHOD 1 - Using the Amazon SageMaker provided VPC
    estimator = TensorFlow(entry_point=train_script,
                           role=sagemaker_iam_role,
                           train_instance_count=2,
                           train_instance_type='ml.p3.8xlarge',
                           script_mode=True,
                           framework_version='1.12',
                           distributions=distributions)

    # METHOD 2 - Using your own VPC for training performance improvement
    estimator = TensorFlow(entry_point=train_script,
                           role=sagemaker_iam_role,
                           train_instance_count=2,
                           train_instance_type='ml.p3.8xlarge',
                           script_mode=True,
                           framework_version='1.12',
                           distributions=distributions,
                           security_group_ids=['sg-0919a36a89a15222f'],
                           subnets=['subnet-0c07198f3eb022ede',
                                    'subnet-055b2819caae2fd1f'])

    estimator.fit({"train": s3_train_path, "test": s3_test_path})
  • 31. Examples of hyperparameters
  Neural networks: number of layers, hidden layer width, learning rate, embedding dimensions, dropout, ...
  Decision trees: tree depth, max leaf nodes, gamma, eta, lambda, alpha, ...
  • 32. Automatic Model Tuning: finding the optimal set of hyperparameters
  1. Manual search ("I know what I'm doing")
  2. Grid search ("X marks the spot"): typically trains hundreds of models; slow and expensive
  3. Random search ("Spray and pray"): works better and faster than grid search, but... it's random!
  4. HPO, using machine learning: trains fewer models, via Gaussian Process Regression and Bayesian Optimization; you can now resume from a previous tuning job
  • 33. How to use Amazon SageMaker HPO
  (Diagram: an Estimator plus a tuning Configuration drives Training Jobs, producing the Resulting Models. A code sketch follows.)
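  The slide shows only the workflow diagram, so here is a minimal sketch of that flow with the SageMaker Python SDK of the era; the objective metric name, regex, and hyperparameter ranges are illustrative assumptions, and train_script, sagemaker_iam_role, and the S3 paths reuse the names from slide 28:

    from sagemaker.tensorflow import TensorFlow
    from sagemaker.tuner import (HyperparameterTuner,
                                 ContinuousParameter, IntegerParameter)

    # Estimator: the same kind of object used for a single training job
    estimator = TensorFlow(entry_point=train_script,
                           role=sagemaker_iam_role,
                           train_instance_count=1,
                           train_instance_type='ml.p3.2xlarge',
                           script_mode=True,
                           framework_version='1.12')

    # Configuration: which hyperparameters to explore, and over what ranges
    hyperparameter_ranges = {
        'learning_rate': ContinuousParameter(1e-5, 1e-1),
        'batch_size': IntegerParameter(32, 512),
    }

    # The tuner launches training jobs and picks the next candidates via
    # Bayesian optimization over the observed objective metric.
    tuner = HyperparameterTuner(estimator,
                                objective_metric_name='validation-accuracy',
                                metric_definitions=[
                                    {'Name': 'validation-accuracy',
                                     'Regex': 'val_acc: ([0-9\\.]+)'}],
                                hyperparameter_ranges=hyperparameter_ranges,
                                max_jobs=20,
                                max_parallel_jobs=2)

    tuner.fit({"train": s3_train_path, "test": s3_test_path})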
  • 36. Hardware optimization is extremely complex
  • 37. Neo is a compiler and runtime for machine learning
  Compiler: processor vendors can integrate hardware-specific optimizations
  Runtime: device makers can embed the runtime into edge devices and IoT
  Open source under the Apache Software License: github.com/neo-ai
  • 38. How to compile a model
  https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation-cli.html

  Configure the compilation job:
    {
      "RoleArn": $ROLE_ARN,
      "InputConfig": {
        "S3Uri": "s3://jsimon-neo/model.tar.gz",
        "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
        "Framework": "MXNET"
      },
      "OutputConfig": {
        "S3OutputLocation": "s3://jsimon-neo/",
        "TargetDevice": "rasp3b"
      },
      "StoppingCondition": {
        "MaxRuntimeInSeconds": 300
      }
    }

  Compile the model:
    $ aws sagemaker create-compilation-job --cli-input-json file://config.json \
        --compilation-job-name resnet50-mxnet-pi
    $ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
    $ gtar tfz model-rasp3b.tar.gz
    compiled.params
    compiled_model.json
    compiled.so

  Predict with the compiled model:
    from dlr import DLRModel
    model = DLRModel('resnet50', input_shape, output_shape, device)
    out = model.run(input_data)
  • 39. Model compilation using the AWS console
  • 40. Performance improvement result

    Image file name | MXNet model (seconds) | Neo-compiled model (seconds) | Improvement (MXNet / Neo-compiled)
    input_001       | 0.0299                | 0.0128                       | 233.59%
    input_002       | 0.0223                | 0.0129                       | 172.86%
    input_003       | 0.0275                | 0.0125                       | 220.00%
  • 41. Do I really need neural networks this complex and deep to meet the required accuracy?
  • 42. Compressing deep learning models
  • Compression is the process of reducing the size of a trained network, either by removing certain layers or by shrinking layers, while maintaining accuracy.
  • A smaller model will predict faster and require less memory.
  • The number of possible combinations makes it difficult to perform this task manually, or even programmatically.
  • Reinforcement learning to the rescue!
  • 43. Defining the problem
  • Objective: find the smallest possible network architecture derived from a pre-trained network architecture, while producing the best accuracy.
  • Environment: a custom-developed environment that accepts a Boolean array of layers to remove from the RL agent and produces an observation describing the layers.
  • State: the layers.
  • Action: a Boolean array, one entry per layer.
  • Reward: a combination of compression ratio and accuracy. (A toy sketch of such a reward follows.)
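  The slide does not spell out the reward formula. One plausible combination, weighting compression against retained accuracy, might look like the toy sketch below; the weighting and normalization are purely illustrative assumptions, not the formula used in the AWS example:

    def compression_reward(orig_params, new_params, baseline_acc, new_acc,
                           alpha=0.5):
        """Toy reward mixing compression ratio and relative accuracy.

        orig_params / new_params: parameter counts before and after pruning
        baseline_acc / new_acc:   validation accuracy before and after
        alpha:                    trade-off weight (illustrative assumption)
        """
        compression_ratio = 1.0 - (new_params / orig_params)  # 0 = no pruning
        relative_accuracy = new_acc / baseline_acc            # 1 = no loss
        return alpha * compression_ratio + (1.0 - alpha) * relative_accuracy

    # Example: pruning 60% of the parameters while keeping 97% of the accuracy
    print(compression_reward(25.5e6, 10.2e6, 0.76, 0.737))  # ~0.785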
  • 44. Amazon SageMaker RL: reinforcement learning for every developer and data scientist
  KEY FEATURES
  • Broad support for frameworks and simulation environments
  • 2D & 3D physics environments and OpenAI Gym support
  • Supports Amazon Sumerian, AWS RoboMaker, and the open-source Robot Operating System (ROS) project
  • Fully managed, with example notebooks and tutorials
  • 45. https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_network_compression_ray_custom
  • 46. Predictions drive complexity and cost in production
  (Chart: training accounts for roughly 10% of the cost, inference for 90%.)
  • 47. Are you making the most of your infrastructure? Low utilization means high costs, and one size does not fit all.
  • 48. Amazon Elastic Inference
  https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/
  KEY FEATURES
  • Match capacity to demand: available from 1 to 32 TFLOPS
  • Integrated with Amazon EC2, Amazon SageMaker, and the AWS Deep Learning AMIs
  • Support for TensorFlow, Apache MXNet, and ONNX, with PyTorch coming soon
  • Single- and mixed-precision operations
  • Lowers inference costs by up to 75%
  • 49. Using Elastic Inference with TensorFlow

  OPTION 1 - Using Elastic Inference TensorFlow Serving:
    $ amazonei_tensorflow_model_server --model_name=ssdresnet \
        --model_base_path=/tmp/ssd_resnet50_v1_coco --port=9000

  OPTION 2 - Using the Elastic Inference TensorFlow Predictor:
    from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

    img = mpimg.imread(FLAGS.image)
    img = np.expand_dims(img, axis=0)
    ssd_resnet_input = {'inputs': img}
    eia_predictor = EIPredictor(model_dir='/tmp/ssd_resnet50_v1_coco/1/')
    pred = eia_predictor(ssd_resnet_input)
  • 50. Using Elastic Inference with Apache MXNet

  OPTION 1 - Use EI with the MXNet Symbol API:
    import mxnet as mx

    data = mx.sym.var('data', shape=(1,))
    sym = mx.sym.exp(data)
    # Pass mx.eia() as context during the simple bind operation
    executor = sym.simple_bind(ctx=mx.eia(), grad_req='null')
    # Forward call is performed on the remote accelerator
    executor.forward(data=mx.nd.ones((1,)))
    print('Inference %d, output = %s' % (i, executor.outputs[0]))

  OPTION 2 - Use EI with the Module API:
    ctx = mx.eia()
    sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
    mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
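  The slide's Option 2 stops after creating the module. The remaining bind-and-predict steps, using the standard MXNet Module API and continuing from the variables above, would plausibly look like this; the 1x3x224x224 input shape is an illustrative assumption for resnet-152:

    from collections import namedtuple
    import mxnet as mx

    Batch = namedtuple('Batch', ['data'])

    # Bind the module for inference (input shape is an illustrative assumption)
    mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
    mod.set_params(arg_params, aux_params, allow_missing=True)

    # Run a forward pass on the Elastic Inference accelerator
    mod.forward(Batch([mx.nd.ones((1, 3, 224, 224))]))
    prob = mod.get_outputs()[0].asnumpy()
    print(prob.argmax())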
  • 51. Other tips
  • SageMaker Pipe mode, using the TensorFlow PipeModeDataset extension (a usage sketch follows): https://github.com/aws/sagemaker-tensorflow-extensions
  • Apache MXNet can read training data from Amazon S3 directly: https://mxnet.incubator.apache.org/versions/master/faq/s3_integration.html
  * Benchmark dataset: a 3.9 GB CSV file containing 2 million records, each with 100 comma-separated, single-precision floating-point values.
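  A minimal sketch of reading a Pipe-mode channel inside a SageMaker TensorFlow training script with the extension above; the channel name 'train', the TFRecord format, and the feature spec are assumptions about how the job and data were set up:

    import tensorflow as tf
    from sagemaker_tensorflow import PipeModeDataset

    # Stream records from the 'train' channel instead of downloading to disk
    ds = PipeModeDataset(channel='train', record_format='TFRecord')

    def parse(record):
        # Illustrative feature spec; adapt to your serialized examples
        features = {'image': tf.FixedLenFeature([], tf.string),
                    'label': tf.FixedLenFeature([], tf.int64)}
        return tf.parse_single_example(record, features)

    ds = ds.repeat().map(parse).batch(64).prefetch(10)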
  • 52. Summary
  Training
  • Make sure to utilize Tensor Cores by using mixed-precision training
  • Learn to use Horovod for efficient multi-GPU or multi-node distributed training
  • Find the optimal hyperparameters using SageMaker HPO
  Deployment
  • Compile your model using Amazon SageMaker Neo
  • Use Amazon Elastic Inference to reduce inference cost where applicable
  • 53. Dive into Deep Learning: an interactive deep learning book with code, math, and discussions
  http://d2l.ai/ and http://ko.d2l.ai/
  Used for the STAT 157 course at UC Berkeley, Spring 2019
  The Korean version of the first 4 chapters is available now
  • GitHub pull requests for any corrections are welcome
  • Raise issues at https://github.com/d2l-ai/d2l-ko/issues
  • 54. Getting started
  https://ml.aws
  https://aws.amazon.com/blogs/machine-learning
  https://aws.amazon.com/sagemaker
  https://github.com/awslabs/amazon-sagemaker-examples
  https://medium.com/@julsimon