- 1. November 17, 2017 Catherine Diep, Peter Tan Running An AI Workload With IBM PowerAI
- 2. Agenda
  • Deep learning overview
  • IBM PowerAI and deep learning TensorFlow frameworks
  • Image Classification example with the MNIST Dataset
  • Running the workload on IBM PowerAI
- 3. Deep Learning basic operations
  • Map x -> y
  • Neural net is a graph
  • Data flow: left to right
  • Input(s) of the current cell are the output(s) from previous cells
  • Tweak all weights until output matches expected outcome
  Deep learning consists of algorithms that permit software to train itself by exposing multilayered neural networks to vast amounts of data.
- 4. Computation at a node
- 5. Training Flow
  • Continuously feed input data to the model, comparing output predictions with the actual labels in order to compute the loss.
  • An optimization algorithm minimizes this loss by tweaking the weights and biases of the model.
  • The model progressively improves, and its predictions become more accurate, as more data is fed in.
  Flow: Input data -> Compute (run input through model) -> Output data (predictions) -> Compute loss (compare output to label) -> Adjust weights and biases (minimize loss) -> repeat
  Loss functions: Cross Entropy, Mean Squared Error, …
  Optimization algorithms: Gradient Descent, Adam, …
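The training flow can be sketched in plain Python, without TensorFlow. This is only an illustration under stated assumptions, not the workload's code: the model is a one-weight linear function, the loss is mean squared error, the optimizer is plain gradient descent, and the data, learning rate, and step count are made up for the example.

```python
# Toy data for y = 2x + 1 (made-up values, for illustration only).
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
labels = [1.0, 3.0, 5.0, 7.0, 9.0]

weight, bias = 0.0, 0.0   # parameters to be tweaked during training
learning_rate = 0.05      # hypothetical value


def mean_squared_error(w, b):
    """Compare predictions with labels and return the loss."""
    return sum((w * x + b - y) ** 2 for x, y in zip(inputs, labels)) / len(inputs)


initial_loss = mean_squared_error(weight, bias)

for step in range(2000):
    # 1. Compute: run the input through the model.
    predictions = [weight * x + bias for x in inputs]
    # 2. Compare output to label and derive the gradient of the loss.
    errors = [p - y for p, y in zip(predictions, labels)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, inputs)) / len(inputs)
    grad_b = 2 * sum(errors) / len(inputs)
    # 3. Adjust weight and bias in the direction that reduces the loss.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_loss = mean_squared_error(weight, bias)
print("weight=%.4f bias=%.4f loss=%.6f" % (weight, bias, final_loss))
```

With this toy data the weight and bias should settle near 2 and 1, the values used to generate the labels.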
- 6. PowerAI
  PowerAI is an enterprise software distribution of popular open-source deep learning frameworks
  • Enterprise-ready software distribution built on open source
  • Performance: faster training times
  • Tools for ease of development
  • More information can be found at the IBM Marketplace PowerAI Portal
  Visit IBM developerWorks for more learning materials and demonstrations
- 7. TensorFlow
  From Google: TensorFlow is an open source software library for numerical computation using data flow graphs
  • A framework for deep learning models
  • Open sourced November 2015
  • Re-designed for research and production
  Rapid adoption
  • 5,500 GitHub repositories with TensorFlow in the title
  • Taught in university classes (Stanford, Berkeley, Toronto, ...)
  Fast-growing community
  • 12K+ Q&A on Stack Overflow
  IBM support:
  • IBM PowerAI, Power System S822LC, record speed announced
  • IBM Data Science Experience
  • Watson Machine Learning Platform
- 8. TensorFlow Graph in Python
  1. Construct graph: use tf objects
     • Node: operation
     • Edge: data (tensor)
  2. Execute graph: connect to runtime
     • Initialize variables
     • Load data, feed through graph
     • Train model: compute parameters, loss
     • Save checkpoints
     • Distribute workload
  [Diagram: training data -> runtime (compute, loss function) -> prediction]
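The two phases, construct then execute, can be illustrated with a toy graph in plain Python. This is a hypothetical sketch of the idea only: the `Node` class and the `placeholder`, `constant`, `add`, `multiply`, and `run` helpers are invented for this example and are not TensorFlow's API or implementation.

```python
class Node:
    """One operation in the graph; its inputs are the incoming edges."""

    def __init__(self, op=None, inputs=(), name=None):
        self.op = op          # function that computes this node's value
        self.inputs = inputs  # upstream nodes whose outputs feed this node
        self.name = name      # set only for placeholders


def placeholder(name):
    return Node(name=name)


def constant(value):
    return Node(op=lambda: value)


def add(a, b):
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def multiply(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))


def run(node, feed):
    """Execute the graph: evaluate the inputs first, then the node itself."""
    if node.op is None:
        return feed[node.name]  # placeholder: take the value that was fed in
    return node.op(*(run(n, feed) for n in node.inputs))


# 1. Construct the graph: nothing is computed yet.
x = placeholder("x")
y = add(multiply(x, constant(3.0)), constant(1.0))

# 2. Execute the graph with concrete data.
result = run(y, {"x": 2.0})
print(result)
```

Separating construction from execution is what lets a runtime decide where and how to evaluate the same graph.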
- 9. MNIST Dataset
  The MNIST (Modified National Institute of Standards and Technology) dataset consists of 60,000 training images of handwritten digits, plus a separate 10,000-image test set [sample images not reproduced]. Each image has an associated label denoting which digit it is. The four sample images shown on the slide have labels 5, 0, 4, and 1.
- 10. Problem Description: Image Classification
  We want to train a deep learning model on the MNIST dataset so that it can look at images and predict which digits they show.
- 11. Computer Vision
  How machines view images: [figure not reproduced]
- 12. Tensors for Regression
  Placeholder:
  • 28x28 image flattened into a vector: [784]
  Variables:
  • Weight is a 2D array: [784, 10]
  • Bias is a vector: [10]
  Prediction is simply:
  • y = x * weight + bias
  Optimizer adjusts weight and bias to minimize loss (error)
  • Predicted y is close to label
  Graph: X (n x 784) and W (784 x 10) -> MatMul -> Add b (10) -> Softmax -> Y (n x 10)
  Deploy with a session to run on CPU or GPU
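The shapes can be checked with a small plain-Python sketch, assuming a hypothetical batch of n = 3 random "images". The `matmul_add` helper and the random values are made up for illustration; in the actual workload TensorFlow performs this step.

```python
import random

NUM_PIXELS = 784   # 28 x 28 image, flattened
NUM_CLASSES = 10   # digits 0-9


def matmul_add(x, weight, bias):
    """Return x * weight + bias for a batch of row vectors."""
    return [
        [sum(row[k] * weight[k][j] for k in range(len(weight))) + bias[j]
         for j in range(len(bias))]
        for row in x
    ]


random.seed(0)
n = 3  # hypothetical batch size
x = [[random.random() for _ in range(NUM_PIXELS)] for _ in range(n)]
W = [[random.gauss(0.0, 0.1) for _ in range(NUM_CLASSES)]
     for _ in range(NUM_PIXELS)]
b = [0.1] * NUM_CLASSES

y = matmul_add(x, W, b)
print(len(y), len(y[0]))  # n rows, one score per digit class
```

Each of the n input rows yields ten scores, one per digit, which softmax then turns into probabilities.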
- 13. What is Softmax?
  Normalized exponential function: a function that is good for assigning probabilities to an object being one of several things.
  A softmax regression has two steps:
  • Add up the evidence of our input being in certain classes
  • Convert that evidence into probabilities
  The sum of all outputs is equal to 1.0:
  $$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}}$$
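A minimal plain-Python sketch of the softmax function follows. The input scores are made up, and subtracting the maximum score is a common numerical-stability step added here as an assumption; it does not change the result and is not something the slide describes.

```python
import math


def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Shifting by the maximum leaves the result unchanged but avoids overflow.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])
print(probabilities)
print(sum(probabilities))
```

Larger scores receive larger probabilities, and the outputs always sum to 1.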
- 14. Implement: Read data

      from tensorflow.examples.tutorials.mnist import input_data
      import tensorflow as tf

      def main():
          # Download (if needed) and load MNIST with one-hot labels.
          mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

      if __name__ == '__main__':
          main()

  • input_data is a utility module provided with TensorFlow to retrieve the MNIST dataset
  • one_hot refers to how the labels will be represented: as one-hot vectors
  • A one-hot vector is 0 in most dimensions and 1 in a single dimension, e.g. 3 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
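The one-hot representation can be sketched in a few lines of plain Python. The `one_hot` and `from_one_hot` helpers are hypothetical names used only for this illustration; they are not part of TensorFlow.

```python
def one_hot(digit, num_classes=10):
    """Return a vector that is 1 at index `digit` and 0 elsewhere."""
    vector = [0] * num_classes
    vector[digit] = 1
    return vector


def from_one_hot(vector):
    """Recover the digit from its one-hot vector."""
    return vector.index(1)


encoded = one_hot(3)
print(encoded)                # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(from_one_hot(encoded))  # 3
```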
- 15. Implement Placeholders
  Placeholders are inputs.
  x is a 2D array for the images:
  • Each row is one flattened 28x28 image
  • First dimension is “None”, so that a batch of images can be pulled in at a time (more later)
  y_ is a 2D array for the labels:
  • Second dimension is 10 for the one-hot representation

      # Placeholder that will be fed image data.
      x = tf.placeholder(tf.float32, [None, 784])
      # Placeholder that will be fed the correct labels.
      y_ = tf.placeholder(tf.float32, [None, 10])
- 16. Implement Weight and Bias
  Weight and Bias are variables: to be tweaked during training
  • Weight is a 2D array: 784 x 10
  • Bias is a vector: 10
  Initialized with certain values: important for the optimization algorithm

      def weight_variable(shape):
          """Generates a weight variable of a given shape."""
          initial = tf.truncated_normal(shape, stddev=0.1)
          return tf.Variable(initial)

      def bias_variable(shape):
          """Generates a bias variable of a given shape."""
          initial = tf.constant(0.1, shape=shape)
          return tf.Variable(initial)

      # Define weight and bias.
      W = weight_variable([784, 10])
      b = bias_variable([10])
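The initial values can be approximated in plain Python. tf.truncated_normal draws from a normal distribution and re-picks any value more than two standard deviations from the mean; the sketch below mimics that behavior with a hypothetical `truncated_normal` helper and an arbitrary seed, and is not TensorFlow's implementation.

```python
import random


def truncated_normal(count, stddev=0.1, seed=None):
    """Draw `count` samples from N(0, stddev^2), re-drawing outliers."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < count:
        value = rng.gauss(0.0, stddev)
        # Keep only values within two standard deviations of the mean.
        if abs(value) <= 2 * stddev:
            samples.append(value)
    return samples


weights = truncated_normal(784 * 10, stddev=0.1, seed=0)
biases = [0.1] * 10  # constant initial value, as in bias_variable

print(min(weights), max(weights))
```

Small, bounded random weights give each class a slightly different starting point without any extreme initial values.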
- 17. Implement Regression and Loss Optimizer
  Neural network: Regression + Softmax
  Loss function: how far off is the prediction from the label
  • $\text{cross\_entropy} = -\frac{1}{N}\sum_{i=1}^{N} \mathtt{y\_}_i \cdot \log(y_i)$, where $\mathtt{y\_}_i$ is the label and $y_i$ is the prediction for example $i$ (in the code, the product is summed over the 10 classes of each example)
  Optimizer algorithm: how to tweak the variables

      # Here we define our model which utilizes the softmax regression.
      y = tf.nn.softmax(tf.matmul(x, W) + b)
      # Define our loss.
      cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
      # Define our optimizer.
      train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
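The cross-entropy loss can be worked through by hand in plain Python. The labels and predictions below are made-up values for a three-class problem, and skipping zero-valued label entries is an assumption added here to avoid evaluating log(0); it gives the same result because those terms contribute nothing.

```python
import math


def cross_entropy(labels, predictions):
    """Mean over the batch of -sum(label * log(prediction)) per example."""
    per_example = [
        -sum(l * math.log(p) for l, p in zip(label_row, pred_row) if l > 0)
        for label_row, pred_row in zip(labels, predictions)
    ]
    return sum(per_example) / len(per_example)


# Two examples with one-hot labels (made-up numbers).
labels = [[0, 1, 0], [1, 0, 0]]
predictions = [[0.1, 0.8, 0.1], [0.5, 0.3, 0.2]]

loss = cross_entropy(labels, predictions)
print(loss)
```

Only the probability assigned to the correct class affects the loss, so a confident correct prediction (0.8) costs less than a hesitant one (0.5).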
- 18. Create Session connecting to Runtime
  Create a session
  • You can connect to a runtime on a remote cluster for large scale training (Distributed TensorFlow)
  Different types of session:
  • Normal session to run full training
  • Interactive session for modifying the neural network on the fly

      # Launch session.
      sess = tf.InteractiveSession()
      # Initialize variables.
      tf.global_variables_initializer().run()
- 19. Train and Evaluate Model
  Here, we run our training step 1100 times, feeding in batches of data to replace the placeholders.
  The batches are random data points we retrieve from our image training set.
  We then check the model with the test data to get our accuracy.

      # Do the training.
      for i in range(1100):
          batch = mnist.train.next_batch(1)
          sess.run(train_step, feed_dict={x: batch[0], y_: batch[1]})

      # accuracy is not defined on the slides; this definition is an assumption
      # taken from the standard TensorFlow MNIST tutorial.
      correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
      accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

      # See how model did.
      print("Test Accuracy %g" % sess.run(accuracy, feed_dict={
          x: mnist.test.images, y_: mnist.test.labels}))
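The accuracy check can be sketched in plain Python as the fraction of examples whose most probable predicted class matches the label. The `argmax` and `accuracy` helpers and the sample values are hypothetical and serve only to illustrate the calculation.

```python
def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of rows where the predicted class equals the labeled class."""
    correct = sum(argmax(p) == argmax(l) for p, l in zip(predictions, labels))
    return correct / len(labels)


# Three made-up predictions for a two-class problem, with one-hot labels.
sample_predictions = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
sample_labels = [[0, 1], [1, 0], [1, 0]]

print("Test Accuracy %g" % accuracy(sample_predictions, sample_labels))
```

Here the first two predictions match their labels and the third does not, so two of the three examples count as correct.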
- 20. Demo on an IBM PowerAI Trial server
  • IBM has partnered with Nimbix to provide cognitive developers a trial account with 24 hours of free processing time on the PowerAI platform
  • Go to the IBM Marketplace PowerAI Portal
  • Click the “Request trial” button
  • Follow the instructions provided to register and access your IBM PowerAI Trial environment
  • The demo on the trial server can be found at http://localhost:8888/tree/demo
- 21. 1. Log on to Ubuntu as user “nimbix” with the password provided by Nimbix
  Bring up a terminal window and ssh to the Nimbix server (e.g. NAE-165-254-189-20.jarvice.com) as the nimbix user with the provided password.
  => ssh nimbix@NAE-165-254-189-20.jarvice.com
- 22. 2. Download and install Miniconda
  # Download Miniconda
  => cd
  => wget -c https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-ppc64le.sh
  # Install Miniconda
  => cd
  => chmod 744 Miniconda3-latest-Linux-ppc64le.sh
  => ./Miniconda3-latest-Linux-ppc64le.sh
  ## Follow the on-screen instructions to finish the install.
  ## Answer “yes” when asked to add the install location to your .bashrc file.
  # Log off and log back on to the Nimbix server, or run the following command
  => source ~/.bashrc
- 23. 3. Create a Miniconda environment with Python 2.7
  => conda create -n image_cls python=2.7
  4. Activate the Conda environment
  => source activate image_cls
  5. Activate the NVIDIA libraries
  => export PATH="/usr/lib/nvidia-361/bin:/usr/local/cuda-8.0/bin:$PATH"
  => export CUDA_HOME=/usr/local/cuda-8.0
  => export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
- 24. 6. Prepare TensorFlow
  # Install numpy (only need to run this step once)
  => pip install numpy
  # Activate tensorflow
  => source /opt/DL/tensorflow/bin/tensorflow-activate
  # Check to see if tensorflow is ready
  => pip list |grep tensorflow
  (image_cls) nimbix@JARVICENAE-0A0A1847:~$ pip list |grep tensorflow
  DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
  tensorflow (1.1.0)
  (image_cls) nimbix@JARVICENAE-0A0A1847:~$
- 25. 7. Install the digital image classification workload
  => cd
  # Download the digital image classification workload.
  => git clone https://github.com/pvaneck/tf_mnist
  8. Training
  => cd tf_mnist
  => python ./train_basic_model.py
  ## The training result will be saved in the ~/tf_mnist/saved-model directory
- 26. 9. Prediction
  # Predict the class of an image using the models saved in the ~/tf_mnist/saved-model directory, with a sample image from the ~/tf_mnist/sample-images directory
  => python ./classify_mnist.py sample-images/img_1.jpg
  # This is what the img_1.jpg image looks like: [image not reproduced]
  # This is the program output:
  2 (confidence = 0.99987)
  3 (confidence = 0.00010)
  0 (confidence = 0.00003)
  8 (confidence = 0.00000)
  5 (confidence = 0.00000)
  The result shows that the answer with the most confidence is “2”
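A ranked list like the program output can be produced from a vector of class probabilities, as in the plain-Python sketch below. This is an assumption about how such a ranking might be computed, not the code of classify_mnist.py; the `top_k` helper and the probability values are made up for illustration.

```python
def top_k(probabilities, k=5):
    """Return the k (digit, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up softmax output for one image: index = digit, value = probability.
probabilities = [0.05, 0.01, 0.6, 0.2, 0.02, 0.03, 0.04, 0.015, 0.025, 0.01]

for digit, confidence in top_k(probabilities):
    print("%d (confidence = %.5f)" % (digit, confidence))
```

The first line printed is the model's answer; the remaining lines show which other digits it considered plausible.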