
Getting started with GPU accelerated deep learning

Slides for Anthill Inside 2017 crisp talk proposal.


  1. GETTING STARTED WITH GPU ACCELERATED DEEP LEARNING: A CRISP TALK FOR ANTHILL INSIDE 2017
  2. DEEP LEARNING: OVERVIEW
     • Comprised of neurons and layers
     • Depth of the network = number of hidden layers
     • Number of parameters = number of weights
     • Learns from data over time
     • Output at every node: f(Σᵢ wᵢxᵢ + b), where the sum runs over the node’s n inputs (sketched below)
     • Activation function: f
     • Examples of activation functions: linear, sigmoid
     • Many different types of network:
       • Convolutional Neural Networks (CNNs)
       • Recurrent Neural Networks (RNNs)
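     A minimal NumPy sketch of the per-node computation above, assuming a sigmoid activation; the input, weight, and bias values are made-up examples:

        import numpy as np

        def node_output(x, w, b):
            # Weighted sum of inputs plus bias: z = sum_i(w_i * x_i) + b
            z = np.dot(w, x) + b
            # Sigmoid activation f(z) = 1 / (1 + e^(-z))
            return 1.0 / (1.0 + np.exp(-z))

        # Made-up values for a single node with three inputs
        x = np.array([0.5, -1.2, 3.0])   # inputs
        w = np.array([0.4, 0.1, -0.7])   # weights (learned from data)
        b = 0.2                          # bias
        print(node_output(x, w, b))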
  3. EXAMPLE DEEP LEARNING ARCHITECTURE: DEEPFACE
     • Nine layers “deep”
     • 120 million parameters
     • Trained on 4 million face images
  4. GPU HORSEPOWER ON THE CLOUD
     • “On-demand” GPU computing power with high reliability
     • Providers:
       • Amazon EC2
       • IBM Softlayer Bare Metal Servers
       • Nimbix
       • Google (Beta)
       • Microsoft (public availability in December)
  5. GETTING STARTED: CUDA TOOLKIT AND DRIVERS
     • Step 1: Download CUDA (https://developer.nvidia.com/cuda-downloads)
     • Step 2: Check the md5 sum and verify it before continuing
     • Step 3: Remove any other installation (“sudo apt-get purge nvidia-cuda*”; if you want to install the drivers too, then “sudo apt-get purge nvidia-*”)
     • Step 4: Follow the command-line prompts
     • Step 5: Verify using “nvcc --version”
  6. GETTING STARTED: CUDNN
     • Step 0: Install CUDA from the standard repositories
     • Step 1: Register an NVIDIA developer account and download cuDNN (https://developer.nvidia.com/cudnn)
     • Step 2: Check where your CUDA installation is, e.g. with “which nvcc” or “ldconfig -p | grep cuda”
     • Step 3: Copy the files:
       • $ cd folder/extracted/contents
       • $ sudo cp -P include/cudnn.h /usr/include
       • $ sudo cp -P lib64/libcudnn* /usr/lib/x86_64-linux-gnu/
       • $ sudo chmod a+r /usr/lib/x86_64-linux-gnu/libcudnn*
  7. GETTING STARTED: ANACONDA
     • Choose Anaconda if you:
       • Are new to conda or Python
       • Like the convenience of having Python and over 150 scientific packages automatically installed at once
       • Have the time and disk space (a few minutes and 3 GB), and/or
       • Don’t want to install each of the packages you want to use individually
  8. GETTING STARTED: CONDA ENVIRONMENTS
     • Isolated packages of libraries and dependencies
     • Create:
       • “conda create --name <name> <package>”
     • Activate/deactivate an environment:
       • “source activate <name>” / “source deactivate”
     • Any changes you make to Python libraries are contained within the active environment
     • Importing an environment:
       • “conda env create -f <environment.yml file>”
     • More about environments: https://conda.io/docs/using/envs.html
  9. GETTING STARTED: THE LIBRARIES
     • NumPy, SciPy (pre-installed with Anaconda)
     • Keras, scikit-learn (sklearn), Matplotlib, PIL:
       • “conda create --name anthill scikit-learn matplotlib keras pillow”
     • OpenCV:
       • After activating the environment, execute:
       • “conda install -c menpo opencv”
     • TensorFlow, CPU/GPU mode:
       • “pip install --upgrade tensorflow” / “pip install --upgrade tensorflow-gpu”
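     To confirm that the GPU build of TensorFlow actually sees the card, a minimal check using the TensorFlow 1.x API that was current at the time of these slides:

        import tensorflow as tf
        from tensorflow.python.client import device_lib

        # List the devices TensorFlow can see; a working tensorflow-gpu install
        # should report a device of type "GPU" alongside the CPU.
        print(device_lib.list_local_devices())

        # Run a tiny op with device placement logging to confirm it lands on the GPU.
        with tf.device('/gpu:0'):
            a = tf.constant([1.0, 2.0, 3.0])
            b = tf.constant([4.0, 5.0, 6.0])
            c = a + b
        with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
            print(sess.run(c))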
  10. DEEP LEARNING IN ACTION: WBC CLASSIFICATION
      For full details visit: https://blog.athelas.com/classifying-white-blood-cells-with-convolutional-neural-networks-2ca6da239331
  11. DEEP LEARNING IN ACTION: WBC CLASSIFICATION
      • Five layers “deep”
      • Almost 1 million parameters
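      The exact architecture is described in the linked Athelas post; purely as an illustration of what a small Keras CNN of roughly this scale looks like, here is a sketch whose layer types and sizes are assumptions (the 120x160 input shape anticipates the preprocessing two slides later):

         from keras.models import Sequential
         from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

         # Illustrative small CNN for 120x160 RGB inputs and a 2-class output;
         # not the exact Athelas architecture.
         model = Sequential()
         model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(120, 160, 3)))
         model.add(MaxPooling2D(pool_size=(2, 2)))
         model.add(Conv2D(64, (3, 3), activation='relu'))
         model.add(MaxPooling2D(pool_size=(2, 2)))
         model.add(Flatten())
         model.add(Dense(64, activation='relu'))
         model.add(Dense(2, activation='softmax'))

         model.compile(optimizer='adam', loss='categorical_crossentropy',
                       metrics=['accuracy'])
         model.summary()  # prints the parameter count per layer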
  12. DEEP LEARNING IN ACTION: WBC CLASSIFICATION
      • 352 images of size 640x480:
        • 21 Monocyte (Mononuclear)
        • 33 Lymphocyte (Mononuclear)
        • 207 Neutrophil (Polynuclear)
        • 88 Eosinophil (Polynuclear)
        • 3 Basophil (Mononuclear)
  13. DEEP LEARNING IN ACTION: WBC CLASSIFICATION
      • Basophils removed
      • From 352 to 10,000 images, with 2,500 of each category
      • Images downsampled to 120x160
      • Convert the 4-class problem to a 2-class problem (Mononuclear vs. Polynuclear)
      • Normalize pixel values from the 0-255 range to the 0-1 range
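      A minimal sketch of the downsample-and-normalize step using PIL and NumPy; the file path is hypothetical:

         import numpy as np
         from PIL import Image

         def load_sample(path):
             # Downsample a 640x480 image to 160x120 (PIL takes (width, height))
             img = Image.open(path).resize((160, 120))
             # Scale pixel values from [0, 255] to [0, 1]
             return np.asarray(img, dtype=np.float32) / 255.0

         # Hypothetical path; the actual dataset is described in the Athelas post.
         x = load_sample('images/neutrophil_001.jpg')
         print(x.shape)  # (120, 160, 3) for an RGB image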
  14. GPU VS CPU FOR DEEP LEARNING WORKLOADS
      • Training on 7965 samples and validating on 1992 samples
      • GPU epoch time: 12 seconds for the first, 9 seconds thereafter
      • CPU epoch time: 109 seconds for the first, 105-107 seconds thereafter
      • Configuration:
        • GPU: Tesla M60
        • CPU: Intel® Xeon® Processor E5-2620 v3 @ 2.40 GHz x2
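      Per-epoch timings like these can be collected with a small Keras callback; a sketch, assuming a compiled model and training arrays named as in the comments:

         import time
         from keras.callbacks import Callback

         class EpochTimer(Callback):
             # Records wall-clock time per epoch so CPU and GPU runs can be compared.
             def on_train_begin(self, logs=None):
                 self.times = []

             def on_epoch_begin(self, epoch, logs=None):
                 self._start = time.time()

             def on_epoch_end(self, epoch, logs=None):
                 self.times.append(time.time() - self._start)

         # Usage, assuming `model`, `x_train`, and `y_train` are defined elsewhere:
         # timer = EpochTimer()
         # model.fit(x_train, y_train, validation_split=0.2, epochs=5, callbacks=[timer])
         # print(timer.times)  # per-epoch durations in seconds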
  15. OUTPUT: TRAINING ON CPU (LEFT) VS GPU (RIGHT)
