Introduction to Running AI Workloads on PowerAI
OpenPOWER Workshop in Silicon Valley
March 16, 2019
Catherine Diep
Agenda
• IBM PowerAI Overview
• AI Workload Demos using TensorFlow
• PyTorch Hands-On Lab
IBM PowerAI
An enterprise software distribution that combines popular open source deep
learning frameworks, efficient AI development tools, and accelerated IBM®
Power Systems™ servers to take your deep learning projects to the next level.
• Download it from here: http://ibm.biz/download-powerai
• Connect to the PowerAI Conda repository
• Get the Docker container from here: https://hub.docker.com/r/ibmcom/powerai/
More Information
Visit the website: http://bit.ly/powericpp
PowerAI Docker Containers
https://hub.docker.com/r/ibmcom/powerai/tags
docker run -ti --env LICENSE=yes --name container_name ibmcom/powerai:1.6.0-all-ubuntu18.04-py3 bash
Using PowerAI for your workloads
Activate the virtual environment for your framework to get started!
Example:
source /opt/DL/tensorflow/bin/tensorflow-activate
source /opt/DL/pytorch/bin/pytorch-activate
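As a quick sanity check after activation (illustrative commands, assuming Python 3 inside the container), verify that each framework imports cleanly and report its version:

python -c "import tensorflow as tf; print(tf.__version__)"
python -c "import torch; print(torch.__version__)"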
Deep learning overview
Deep learning consists of algorithms that permit software to train itself by
exposing multilayered neural networks to vast amounts of data.
• Map x → y
• A neural net is a graph
• Data flows left to right
• The input(s) of the current cell are the output(s) of previous cells
• Tweak all weights until the output matches the expected outcome
Computation at a node
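The computation at a node is typically a weighted sum of the node's inputs plus a bias, followed by an activation function. A minimal NumPy sketch (illustrative only; the sigmoid is an assumed choice of activation, not specified by the deck):

import numpy as np

def node_output(inputs, weights, bias):
    # Weighted sum of the incoming values plus a bias ...
    z = np.dot(weights, inputs) + bias
    # ... passed through a nonlinearity (sigmoid, one common choice).
    return 1.0 / (1.0 + np.exp(-z))

print(node_output(np.array([0.5, -1.2, 3.0]), np.array([0.1, 0.4, -0.2]), 0.3))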
Training = tweak weights to minimize error (loss)
Repeat over and over and over ...
Training Flow
• Continuously feed input data into the model, comparing the output predictions against the actual labels in order to compute a loss.
• An optimization algorithm is used to minimize this loss by tweaking the weights and biases of the model.
• The model progressively improves and its predictions become more accurate as more data is fed in. (A minimal sketch of this loop follows below.)

The training cycle:
Input data → Compute (run input through model) → Output data (predictions) → Compute loss (compare output to label) → Adjust weights and biases (minimize loss) → repeat

Example optimization algorithms:
- Gradient Descent Algorithm
- Adam Optimization Algorithm
- …

Example loss functions:
- Cross Entropy
- Mean Squared Error
- …
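To make the cycle concrete, here is a minimal, self-contained sketch (plain NumPy; an illustration, not code from the deck) that trains one weight and one bias with gradient descent against a mean-squared-error loss:

import numpy as np

# Toy data: y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y_true = 2.0 * x + 1.0 + 0.05 * rng.normal(size=100)

w, b = 0.0, 0.0   # the weight and bias to be tweaked
lr = 0.1          # learning rate for gradient descent

for step in range(200):
    y_pred = w * x + b                            # Compute: run input through model
    loss = np.mean((y_pred - y_true) ** 2)        # Compute loss: compare output to label
    grad_w = np.mean(2 * (y_pred - y_true) * x)   # gradients of the loss
    grad_b = np.mean(2 * (y_pred - y_true))
    w -= lr * grad_w                              # Adjust weights and biases
    b -= lr * grad_b

print(w, b)  # approaches (2.0, 1.0)

Each pass performs exactly the three steps in the cycle above: compute, compute loss, adjust weights and biases.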
Let’s look at some TensorFlow workloads!
MNIST
The MNIST (Modified National Institute of Standards and Technology)
dataset consists of 60,000 training images of handwritten digits (plus a
10,000-image test set).
Each image has an associated label denoting which digit it is.
The four sample images shown on the slide would have labels 5, 0, 4, and 1.
Problem Description: Image Classification
We want to train a deep learning model on the MNIST dataset that can look at
images of handwritten digits and predict which digit each one is.
Computer Vision
How machines view images: as a grid of numeric pixel-intensity values
(illustrated by the figure on the slide).
Running the MNIST workload
1. Download the workload
=> cd
=> git clone https://github.com/pvaneck/tf_mnist
2. Training
=> cd tf_mnist
=> python ./train_basic_model.py
The training result will be saved in ~/tf_mnist/saved-model.
Running the MNIST workload (continued)
3. Prediction
Predict the class of an image using the model saved in the ~/tf_mnist/saved-model directory and a
sample image from the ~/tf_mnist/sample-images directory.
=> python ./classify_mnist.py sample-images/img_1.jpg
# Sample image (img_1.jpg) is shown on the slide.
# This is the program output:
2 (confidence = 0.99987)
3 (confidence = 0.00010)
0 (confidence = 0.00003)
8 (confidence = 0.00000)
5 (confidence = 0.00000)
The result shows that the answer with the most confidence is “2”.
Basic Softmax Regression on MNIST
Implement Placeholders
• Placeholders are the inputs
• x is a 2D array for the images:
• Each row is one flattened 28x28 image (784 values)
• The first dimension is "None", so a batch of images can be pulled in at a time
(more later)
• y_ is a 2D array for the labels:
• The second dimension is 10, for the one-hot representation

import tensorflow as tf

# Placeholder that will be fed image data.
x = tf.placeholder(tf.float32, [None, 784])
# Placeholder that will be fed the correct labels.
y_ = tf.placeholder(tf.float32, [None, 10])
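Placeholders carry no data themselves; values are supplied at run time through a feed_dict. A minimal illustration (the zero-filled batch is just a stand-in for real image data):

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
mean_pixel = tf.reduce_mean(x)  # any op that consumes the placeholder

with tf.Session() as sess:
    batch = np.zeros((32, 784), dtype=np.float32)  # a "batch" of 32 flattened images
    print(sess.run(mean_pixel, feed_dict={x: batch}))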
Implement Weight and Bias
• Weight and bias are variables: they get tweaked during training
• Weight is a 2D array: 784 x 10
• Bias is a vector of length 10
• They are initialized with small non-zero values, which matters for the optimization algorithm

def weight_variable(shape):
    """Generates a weight variable of a given shape."""
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    """Generates a bias variable of a given shape."""
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# Define weight and bias.
W = weight_variable([784, 10])
b = bias_variable([10])
Implement Regression and Loss Optimizer
• Neural network: regression + softmax
• Loss function: measures how far off the prediction is from the label
• Optimizer algorithm: determines how to tweak the variables

# Here we define our model, which utilizes softmax regression.
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Define our loss.
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Define our optimizer.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
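Putting the pieces together, training runs train_step repeatedly inside a session, feeding batches through the placeholders defined earlier. A sketch (the actual loop in train_basic_model.py may differ; input_data is the MNIST helper that shipped with TensorFlow 1.x):

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        # Pull in a batch at a time, as the "None" placeholder dimension allows.
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})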
Convolutional Neural Net
● A neural network with a series of convolutional, downsampling, and non-linear layers
● Deployed in many practical applications
○ Image and speech recognition, video analysis, drug discovery
● Most commonly used for image classification
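For illustration, a minimal convolutional model for MNIST using TensorFlow 1.x layers might look like this (a sketch, not the deck's exact model; the filter counts and kernel sizes are assumptions):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
images = tf.reshape(x, [-1, 28, 28, 1])  # restore the 2D image shape

# Convolution -> downsampling -> non-linearity, repeated, then a classifier.
conv1 = tf.layers.conv2d(images, filters=32, kernel_size=5, padding="same",
                         activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding="same",
                         activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)
flat = tf.layers.flatten(pool2)
logits = tf.layers.dense(flat, units=10)  # one score per digit class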
Let’s try some PyTorch!
PyTorch
PyTorch is a relatively new deep learning framework, yet it has gained
adoption rapidly, especially among researchers and data scientists.
▪ The goal is to build a flexible deep learning research platform that supports:
▪ Dynamic computation graphs
▪ Native Python packages
▪ Auto-differentiation (gradient computations)
▪ Open-sourced in January 2017
▪ Rapid adoption one year in …
▪ 3,983 GitHub repos mentioned PyTorch in their name or description
▪ Taught in university classes (Stanford, Carnegie Mellon University, ...)
▪ Fast-growing community
▪ 5,400 users wrote 21k posts discussing 5,200 topics on the PyTorch forums
▪ Merged with Caffe2 on March 30, 2018
▪ Currently at version 0.4.0
PyTorch Abstractions
● Tensor
○ Multi-dimensional arrays
○ Similar to NumPy (np) ndarrays, but can also be used on a GPU
○ Easy conversion from np arrays to torch tensors and vice versa
○ Supports automatic differentiation for all operations on tensors
○ Members include .data (the tensor) and .grad (the gradient w.r.t. the corresponding .data)
○ Package: "import torch"
○ Tensor tutorial
○ Autograd tutorial
● Module
○ Base class for building neural networks
○ Inputs & outputs for training are Tensors
○ Stores learnable weights & biases as parameters
○ May store state
○ Import: "from torch.nn import Module"
○ Module tutorial
● And many more … (a short sketch of Tensor and Module follows below)
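A minimal sketch illustrating both abstractions (the class and variable names here are illustrative, not from the deck):

import numpy as np
import torch
import torch.nn as nn

# Tensor: create from a NumPy array and convert back.
a = np.ones((2, 3), dtype=np.float32)
t = torch.from_numpy(a)   # np array -> torch tensor
back = t.numpy()          # torch tensor -> np array

# Autograd: track operations on a tensor, then compute gradients.
w = torch.tensor([2.0], requires_grad=True)
loss = (w * 3.0 - 1.0).pow(2).sum()
loss.backward()
print(w.grad)             # gradient of loss w.r.t. w

# Module: a tiny network storing learnable weights & biases as parameters.
class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        self.fc = nn.Linear(784, 10)  # holds weight and bias parameters

    def forward(self, x):
        return self.fc(x)

net = TinyNet()
out = net(torch.zeros(1, 784))  # inputs and outputs are Tensors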
Open Jupyter notebooks
● Consult the handouts for credentials and access information.
Backup

Softmax
$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$

Cross entropy
$H(y', y) = -\sum_i y'_i \log(y_i)$
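A quick numeric check of both formulas (plain NumPy, illustrative; the max subtraction is a standard trick for numerical stability, not part of the formula itself):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
y = softmax(logits)                   # approx. [0.659, 0.242, 0.099]

y_label = np.array([1.0, 0.0, 0.0])   # one-hot label
cross_entropy = -np.sum(y_label * np.log(y))
print(y, cross_entropy)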