
Introduction to Deep Learning

  1. Introduction to Deep Learning. Vishwas Lele (@vlele), Applied Information Sciences
  2. Agenda • Introduction to Deep Learning (from ML to deep learning) • Neural Network Basics (Dense, CNN, RNN) • Key Concepts • CNN Basics • X-ray Image Analysis • Optimization (using pre-trained models) • Visualizing the learning
  3. Demo • Why do I need to learn neural networks when I have Cognitive Services and other APIs?
  4. What is Deep Learning?
  5. Deep Learning Examples • Near-human-level image classification • Near-human-level speech recognition • Near-human-level handwriting transcription • Digital assistants (Google Now, Cortana) • Near-human-level autonomous driving
  6. Why now? What about past AI winters? • Hardware • Datasets and benchmarks • Algorithmic advances
  7. What about other ML techniques? • Probabilistic modeling (Naïve Bayes) • Kernel Methods • Support Vector Machines • Decision Trees • Random Forests • Gradient Boosting Machines
  8. “Deep” in deep learning
  9. How can a network help us? [figure: handwritten digits 0-9]
  10. Digit Classification
  11. Network of layers • Layers are like LEGO bricks of deep learning • A layer is a data-processing module that takes one or more tensors as input and outputs one or more tensors
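
     A minimal Keras sketch of stacking layers into a network (the layer sizes here are illustrative, not from the slides):

         from tensorflow.keras import models, layers

         # Layers as LEGO bricks: each Dense layer takes a tensor in
         # and returns a transformed tensor
         model = models.Sequential()
         model.add(layers.Dense(32, activation="relu", input_shape=(784,)))
         model.add(layers.Dense(10, activation="softmax"))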
  12. Tensor shape: (128, 256, 256, 1)
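
     One plausible reading of that shape, assuming the conventional (batch, height, width, channels) layout:

         import numpy as np

         # 128 grayscale images, each 256x256 pixels with a single channel
         batch = np.zeros((128, 256, 256, 1), dtype="float32")
         print(batch.ndim)    # 4 (a rank-4 tensor)
         print(batch.shape)   # (128, 256, 256, 1)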
  13. How do layers transform data? output = relu(dot(W, input) + b), where W is the weight matrix and b is the bias vector
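
     A minimal NumPy sketch of that transformation (shapes are illustrative):

         import numpy as np

         def dense_forward(x, W, b):
             # Affine transform followed by ReLU, exactly as in the formula above
             return np.maximum(np.dot(W, x) + b, 0.)

         W = np.random.rand(10, 784)   # weights: 10 outputs from 784 inputs
         b = np.zeros(10)              # bias: one value per output
         x = np.random.rand(784)       # one flattened input image
         output = dense_forward(x, W, b)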
  14. Rectified Linear Unit (ReLU)
  15. Power of parallelization:

         import numpy as np

         # Naive element-wise ReLU over a rank-2 tensor
         def naive_relu(x):
             for i in range(x.shape[0]):
                 for j in range(x.shape[1]):
                     x[i, j] = max(x[i, j], 0)
             return x

         # Vectorized equivalent: one optimized call instead of Python loops
         z = np.maximum(z, 0.)
  16. Sigmoid function
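
     For reference, the sigmoid is the standard logistic function, which squashes any real input into the interval (0, 1); a one-line sketch:

         import numpy as np

         def sigmoid(x):
             # 1 / (1 + e^-x): large negative inputs -> ~0, large positive -> ~1
             return 1. / (1. + np.exp(-x))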
  17. How does a network learn?
  18. Training Loop • Draw a batch of training samples x and corresponding targets y. • Run the network on x. • Compute the loss of the network. • Update all weights of the network in a way that slightly reduces the loss on this batch.
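
     In Keras, model.fit runs exactly this loop; a hedged end-to-end sketch with toy stand-in data (all sizes and settings are illustrative):

         import numpy as np
         from tensorflow.keras import models, layers

         # Toy stand-in data: 1000 samples, 20 features, 10 classes
         x = np.random.rand(1000, 20).astype("float32")
         y = np.random.randint(10, size=(1000,))

         model = models.Sequential([
             layers.Dense(64, activation="relu", input_shape=(20,)),
             layers.Dense(10, activation="softmax"),
         ])
         model.compile(optimizer="rmsprop",
                       loss="sparse_categorical_crossentropy",
                       metrics=["accuracy"])

         # fit() draws batches, runs the network, computes the loss, and
         # updates the weights: the four steps listed above
         model.fit(x, y, batch_size=128, epochs=5)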
  19. Key Concepts • Derivative • Gradient • Stochastic gradient descent
  20. Stochastic gradient descent • Draw a batch of training samples x and corresponding targets y. • Run the network on x to obtain predictions y_pred. • Compute the loss of the network on the batch, a measure of the mismatch between y_pred and y.
  21. Stochastic gradient descent • Compute the gradient of the loss with regard to the network’s parameters (a backward pass). • Move the parameters a little in the opposite direction from the gradient, for example W -= step * gradient, thus reducing the loss on the batch a bit.
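
     A minimal NumPy sketch of these steps for a linear model (toy data and an analytic gradient; purely illustrative):

         import numpy as np

         rng = np.random.default_rng(0)
         x = rng.normal(size=(32, 4))               # a batch of 32 samples
         true_w = np.array([1.0, -2.0, 0.5, 3.0])
         y = x @ true_w                             # targets

         W = np.zeros(4)
         step = 0.1
         for _ in range(100):
             y_pred = x @ W                             # forward pass
             grad = 2 * x.T @ (y_pred - y) / len(x)     # backward pass: dLoss/dW
             W -= step * grad                           # move against the gradient
         print(W)   # converges toward true_w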
  22. Back propagation [figure: handwritten digits 0-9]
  23. Demo • Digit Classification • Keras Library
  24. Training and validation loss
  25. Overfitting
  26. Weight Regularization • L1 regularization: the cost added is proportional to the absolute value of the weight coefficients • L2 regularization: the cost added is proportional to the square of the value of the weight coefficients
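
     In Keras, both penalties attach to a layer via kernel_regularizer; a sketch (the 0.001 factor is a common choice, not from the slides):

         from tensorflow.keras import layers, regularizers

         # Each weight coefficient adds 0.001 * |w| (L1) or 0.001 * w**2 (L2)
         # to the total loss
         dense_l1 = layers.Dense(16, activation="relu",
                                 kernel_regularizer=regularizers.l1(0.001))
         dense_l2 = layers.Dense(16, activation="relu",
                                 kernel_regularizer=regularizers.l2(0.001))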
  27. Dropout • Randomly dropping out (setting to zero) a number of output features of the layer during training
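
     In Keras this is itself a layer; a sketch with a typical rate (the slide does not specify one):

         from tensorflow.keras import layers

         # During training, randomly zeroes 50% of the incoming features;
         # at inference time it passes everything through
         dropout = layers.Dropout(0.5)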
  28. Feature Engineering • The process of using your own knowledge to make the algorithm work better by applying hardcoded (non-learned) transformations to the data before it goes into the model
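
     A hedged illustration (the raw-timestamp example is mine, not from the slides): rather than feeding a raw Unix timestamp to the model, encode what you already know matters:

         import numpy as np

         timestamps = np.array([1514764800, 1514851200])   # illustrative Unix times
         hour_of_day = (timestamps // 3600) % 24           # hand-crafted feature 1
         day_of_week = (timestamps // 86400 + 4) % 7       # feature 2 (the epoch began on a Thursday)
         features = np.stack([hour_of_day, day_of_week], axis=1)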
  29. Feature engineering
  30. How does a network learn?
  31. Enter CNNs (Convolutional Neural Networks)
  32. Convolution Operation
  33. Spatial Hierarchy
  34. How does convolution work?
  35. Sliding
  36. Padding
  37. Stride
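
     A naive NumPy sketch tying these three ideas together (illustrative only; real frameworks use optimized kernels):

         import numpy as np

         def conv2d(image, kernel, stride=1, padding=0):
             kh, kw = kernel.shape
             # Padding: surround the input with zeros so border pixels get covered
             padded = np.pad(image, padding)
             out_h = (padded.shape[0] - kh) // stride + 1
             out_w = (padded.shape[1] - kw) // stride + 1
             out = np.zeros((out_h, out_w))
             for i in range(out_h):         # sliding: move the window over rows...
                 for j in range(out_w):     # ...and columns, jumping by `stride`
                     patch = padded[i * stride:i * stride + kh,
                                    j * stride:j * stride + kw]
                     out[i, j] = np.sum(patch * kernel)
             return out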
  38. Max-Pooling • Extracting windows from the input feature maps and outputting the max value of each channel • Conceptually similar to convolution, except that instead of transforming local patches via a learned linear transformation, they are transformed via a hardcoded max operation
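
     The Keras layer for this; a 2x2 window (the most common choice) halves each spatial dimension:

         from tensorflow.keras import layers

         # Keeps only the maximum activation in each 2x2 window of the feature map
         pool = layers.MaxPooling2D(pool_size=(2, 2))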
  39. Demo • Digit Classification (using CNN) • Keras Library
  40. Demo • X-ray Image Classification • Dealing with overfitting • Visualizing the learning
  41. Visualizing the learning
  42. Overfitting
  43. With dropout regularization
  44. With data augmentation
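
     A hedged sketch of Keras-style augmentation (these particular transforms are common choices, not necessarily the demo's settings):

         from tensorflow.keras.preprocessing.image import ImageDataGenerator

         # Generates new plausible training images by randomly transforming
         # the existing ones, which fights overfitting on small datasets
         datagen = ImageDataGenerator(
             rotation_range=40,
             width_shift_range=0.2,
             height_shift_range=0.2,
             shear_range=0.2,
             zoom_range=0.2,
             horizontal_flip=True,
         )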
  45. Demo • Activation heatmap
