Introduction to Deep Learning with Python


A presentation by Alec Radford, Head of Research at indico Data Solutions, on deep learning with Python's Theano library.

The emphasis of the presentation is high performance computing, natural language processing (using recurrent neural nets), and large scale learning with GPUs.

Video of the talk available here: https://www.youtube.com/watch?v=S75EdAcXHKk

Introduction to Deep Learning with Python

  1. From multiplication to convolutional networks: how to do ML with Theano
  2. Today’s Talk ● A motivating problem ● Understanding a model-based framework ● Theano ○ Linear Regression ○ Logistic Regression ○ Net ○ Modern Net ○ Convolutional Net
  3. Follow along Tutorial code at: https://github.com/Newmu/Theano-Tutorials Data at: http://yann.lecun.com/exdb/mnist/ Slides at: http://goo.gl/vuBQfe
  4. A motivating problem How do we program a computer to recognize a picture of a handwritten digit as a 0-9? What could we do?
  5. A dataset - MNIST What if we have 60,000 of these images and their labels? X = images, Y = labels X = (60000 x 784) # matrix (list of lists) Y = (60000) # vector (list) Given X as input, predict Y
  6. An idea For each image, find the “most similar” image and guess that as the label.
  7. An idea For each image, find the “most similar” image and guess that as the label. KNearestNeighbors: ~95% accuracy
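The nearest-neighbour idea on slides 6-7 can be sketched in plain Python. The three-pixel "images" below are made-up stand-ins for 784-pixel MNIST rows; real KNN on MNIST compares full pixel vectors in the same way.

```python
def squared_distance(a, b):
    # Squared Euclidean distance between two pixel vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_nearest(query, images, labels):
    # Find the training image most similar to the query
    # and guess its label as the answer.
    best = min(range(len(images)), key=lambda i: squared_distance(query, images[i]))
    return labels[best]

images = [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.9, 0.8, 0.1]]
labels = [0, 1, 1]
print(predict_nearest([0.8, 0.9, 0.0], images, labels))  # → 1
```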
  8. Trying things Make some functions computing relevant information for solving the problem
  9. What we can code Make some functions computing relevant information for solving the problem: feature engineering
  10. What we can code Hard-coded rules are brittle and often aren’t obvious or apparent for many problems.
  11. Model A Machine Learning Framework: Inputs → Computation → Outputs
  12. A … model? - GoogLeNet (from arXiv:1409.4842v1 [cs.CV], 17 Sep 2014)
  13. A very simple model: Input (3) → Computation (mult by x) → Output (12)
  14. Theano intro
  15. Theano intro: imports
  16. Theano intro: imports; theano symbolic variable initialization
  17. Theano intro: imports; theano symbolic variable initialization; our model
  18. Theano intro: imports; theano symbolic variable initialization; our model; compiling to a python function
  19. Theano intro: imports; theano symbolic variable initialization; our model; compiling to a python function; usage
  20. Theano
  21. Theano: imports
  22. Theano: imports; training data generation
  23. Theano: imports; training data generation; symbolic variable initialization
  24. Theano: imports; training data generation; symbolic variable initialization; our model
  25. Theano: imports; training data generation; symbolic variable initialization; our model; model parameter initialization
  26. Theano: imports; training data generation; symbolic variable initialization; our model; model parameter initialization; metric to be optimized by model
  27. Theano: imports; training data generation; symbolic variable initialization; our model; model parameter initialization; metric to be optimized by model; learning signal for parameter(s)
  28. Theano: imports; training data generation; symbolic variable initialization; our model; model parameter initialization; metric to be optimized by model; learning signal for parameter(s); how to change parameter based on learning signal
  29. Theano: imports; training data generation; symbolic variable initialization; our model; model parameter initialization; metric to be optimized by model; learning signal for parameter(s); how to change parameter based on learning signal; compiling to a python function
  30. Theano: imports; training data generation; symbolic variable initialization; our model; model parameter initialization; metric to be optimized by model; learning signal for parameter(s); how to change parameter based on learning signal; compiling to a python function; iterate through data 100 times and train model on each example of input, output pairs
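The training loop the annotations describe (model, metric, learning signal, update rule, 100 passes over the data) can be sketched without Theano. A pure-Python version that fits y = w * x by gradient descent on squared error, standing in for the deck's linear regression example:

```python
import random

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [2 * x for x in xs]           # the "true" relationship to recover

w = 0.0                            # model parameter initialization
lr = 0.01                          # learning rate
for _ in range(100):               # iterate through data 100 times
    for x, y in zip(xs, ys):
        grad = 2 * (w * x - y) * x # learning signal: d/dw of (w*x - y)^2
        w -= lr * grad             # how to change parameter based on it
print(round(w, 3))  # → 2.0
```

Theano's contribution is automating exactly the `grad` line: `T.grad(cost, w)` derives the learning signal symbolically instead of by hand.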
  31. Theano doing its thing
  32. Logistic Regression: y = softmax(T.dot(X, w)); outputs a probability per digit class, e.g. Zero 0.1, Three 0.1, Eight 0.7, Nine 0.1 (others 0.)
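What softmax does to the scores from T.dot(X, w) can be shown in plain Python: exponentiate each score and normalize, so the outputs are positive and sum to 1, one probability per class.

```python
import math

def softmax(scores):
    # Exponentiate, then normalize to a probability distribution.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])  # → [0.09, 0.245, 0.665]
print(round(sum(probs), 3))          # → 1.0
```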
  33. Back to Theano
  34. Back to Theano: convert to correct dtype
  35. Back to Theano: convert to correct dtype; initialize model parameters
  36. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format
  37. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format; loading data matrices
  38. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format; loading data matrices; now matrix types
  39. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format; loading data matrices; now matrix types; probability outputs and maxima predictions
  40. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format; loading data matrices; now matrix types; probability outputs and maxima predictions; classification metric to optimize
  41. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format; loading data matrices; now matrix types; probability outputs and maxima predictions; classification metric to optimize; compile prediction function
  42. Back to Theano: convert to correct dtype; initialize model parameters; our model in matrix format; loading data matrices; now matrix types; probability outputs and maxima predictions; classification metric to optimize; compile prediction function; train on mini-batches of 128 examples
  43. What it learns (visualization of the learned weights for each digit 0-9)
  44. What it learns (visualization of the learned weights for each digit 0-9) Test Accuracy: 92.5%
  45. An “old” net (circa 2000): h = T.nnet.sigmoid(T.dot(X, wh)); y = softmax(T.dot(h, wo)); output probabilities per digit, e.g. Three 0.1, Eight 0.9 (others 0.)
  46. An “old” net in Theano
  47. An “old” net in Theano: generalize to compute gradient descent on all model parameters
  48. Understanding SGD (2D moons dataset courtesy of scikit-learn)
  49. An “old” net in Theano: generalize to compute gradient descent on all model parameters; 2 layers of computation: input -> hidden (sigmoid), hidden -> output (softmax)
  50. Understanding Sigmoid Units
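The sigmoid unit from the "old" net squashes any real input into (0, 1). Its flat tails are why it saturates: for large |x| the gradient is nearly zero, which slows learning, a point the rectifier slides pick up later. A minimal sketch:

```python
import math

def sigmoid(x):
    # Maps any real number into (0, 1); near-flat for large |x|.
    return 1.0 / (1.0 + math.exp(-x))

print(round(sigmoid(0.0), 3))   # → 0.5
print(round(sigmoid(6.0), 3))   # → 0.998  (saturated high)
print(round(sigmoid(-6.0), 3))  # → 0.002  (saturated low)
```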
  51. An “old” net in Theano: generalize to compute gradient descent on all model parameters; 2 layers of computation: input -> hidden (sigmoid), hidden -> output (softmax); initialize both weight matrices
  52. An “old” net in Theano: generalize to compute gradient descent on all model parameters; 2 layers of computation: input -> hidden (sigmoid), hidden -> output (softmax); initialize both weight matrices; updated version of updates
  53. What an “old” net learns Test Accuracy: 98.4%
  54. A “modern” net - 2012+: h = rectify(T.dot(X, wh)); h2 = rectify(T.dot(h, wh2)); y = softmax(T.dot(h2, wo)); noise (or augmentation) applied at the input and both hidden layers; output probabilities per digit, e.g. Three 0.1, Eight 0.9 (others 0.)
  55. A “modern” net in Theano
  56. A “modern” net in Theano: rectifier
  57. Understanding rectifier units
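The rectifier the slides introduce is just max(0, x): zero for negative inputs, identity for positive ones. Its gradient is 0 or 1, so it avoids the saturation of sigmoid on the positive side, which is the "faster, better learning" the takeaways mention.

```python
def rectify(x):
    # ReLU: pass positive values through, clamp negatives to zero.
    return max(0.0, x)

print([rectify(v) for v in [-2.0, -0.5, 0.0, 0.5, 2.0]])  # → [0.0, 0.0, 0.0, 0.5, 2.0]
```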
  58. A “modern” net in Theano: rectifier; numerically stable softmax
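Why the slide says "numerically stable" softmax: exponentiating large scores overflows. Subtracting the maximum score first yields the same probabilities (the shift cancels in the normalization) without overflow. A plain-Python sketch:

```python
import math

def stable_softmax(scores):
    m = max(scores)                        # shift so the largest exponent is 0
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# math.exp(1000.0) alone would raise OverflowError;
# the shifted version handles the same scores fine.
print([round(p, 3) for p in stable_softmax([1000.0, 999.0])])  # → [0.731, 0.269]
```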
  59. A “modern” net in Theano: rectifier; numerically stable softmax; a running average of the magnitude of the gradient
  60. A “modern” net in Theano: rectifier; numerically stable softmax; a running average of the magnitude of the gradient; scale the gradient based on running average
  61. Understanding RMSprop (2D moons dataset courtesy of scikit-learn)
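The two annotations above are the whole of RMSprop: keep a running average of the squared gradient, then divide each step by its square root so every parameter gets a step size scaled to its typical gradient magnitude. A sketch of a single-parameter update; the decay 0.9 and epsilon 1e-6 are common defaults assumed here, not values quoted from the slides:

```python
import math

def rmsprop_step(w, grad, acc, lr=0.001, rho=0.9, eps=1e-6):
    acc = rho * acc + (1 - rho) * grad ** 2     # running average of grad^2
    w = w - lr * grad / (math.sqrt(acc) + eps)  # gradient scaled by that average
    return w, acc

w, acc = 1.0, 0.0
for _ in range(5):
    w, acc = rmsprop_step(w, grad=2.0, acc=acc)
print(round(w, 4), round(acc, 4))
```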
  62. A “modern” net in Theano: rectifier; numerically stable softmax; a running average of the magnitude of the gradient; scale the gradient based on running average; randomly drop values and scale rest
  63. A “modern” net in Theano: rectifier; numerically stable softmax; a running average of the magnitude of the gradient; scale the gradient based on running average; randomly drop values and scale rest; noise injected into model; rectifiers now used; 2 hidden layers
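"Randomly drop values and scale rest" is dropout: zero each activation with probability p during training and divide the survivors by (1 - p), so the expected activation is unchanged. A plain-Python sketch:

```python
import random

def dropout(values, p, rng):
    # Keep each value with probability (1 - p), scaling survivors up
    # by 1/(1 - p) so the expected sum stays the same.
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)
out = dropout([1.0] * 10, p=0.5, rng=rng)
print(out)  # roughly half the entries are 0.0, the rest 2.0
```

At prediction time dropout is simply switched off, which is the "no noise for prediction" annotation on the convolutional-net slides.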
  64. What a “modern” net learns Test Accuracy: 99.0%
  65. Quantifying the difference
  66. What a “modern” net is doing
  67. Convolutional Networks (figure from deeplearning.net)
  68. A convolutional network in Theano
  69. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise)
  70. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix
  71. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix; reshape into conv 4-tensor (b, c, 0, 1) format
  72. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix; reshape into conv 4-tensor (b, c, 0, 1) format; now a 4-tensor for conv instead of a matrix
  73. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix; reshape into conv 4-tensor (b, c, 0, 1) format; now a 4-tensor for conv instead of a matrix; conv weights (n_kernels, n_channels, kernel_w, kernel_h)
  74. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix; reshape into conv 4-tensor (b, c, 0, 1) format; now a 4-tensor for conv instead of a matrix; conv weights (n_kernels, n_channels, kernel_w, kernel_h); highest conv layer has 128 filters and a 3x3 grid of responses
  75. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix; reshape into conv 4-tensor (b, c, 0, 1) format; now a 4-tensor for conv instead of a matrix; conv weights (n_kernels, n_channels, kernel_w, kernel_h); highest conv layer has 128 filters and a 3x3 grid of responses; noise during training
  76. A convolutional network in Theano: a “block” of computation (conv -> activate -> pool -> noise); convert from 4-tensor to normal matrix; reshape into conv 4-tensor (b, c, 0, 1) format; now a 4-tensor for conv instead of a matrix; conv weights (n_kernels, n_channels, kernel_w, kernel_h); highest conv layer has 128 filters and a 3x3 grid of responses; noise during training; no noise for prediction
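As a sanity check on the claim that the highest conv layer ends in a 3x3 grid, here is a shape walk-through. The settings are assumptions, not stated on the slides: a 28x28 MNIST input, 3x3 kernels, a 'full'-border first convolution followed by 'valid' ones, and 2x2 pooling that keeps the leftover border (rounds up). They are one configuration consistent with ending at 3x3:

```python
import math

def block(size, border_mode):
    # A 3x3 convolution grows the map by 2 in 'full' mode and
    # shrinks it by 2 in 'valid' mode; 2x2 pooling then halves it,
    # rounding up because the leftover border is kept.
    if border_mode == "full":
        size = size + 3 - 1
    else:
        size = size - 3 + 1
    return math.ceil(size / 2)

sizes = []
size = 28
for mode in ("full", "valid", "valid"):
    size = block(size, mode)
    sizes.append(size)
print(sizes)  # → [15, 7, 3]
```

Flattening the last layer's 128 filters over that 3x3 grid gives the 128 * 3 * 3 features handed to the fully connected part.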
  77. What a convolutional network learns Test Accuracy: 99.5%
  78. Takeaways ● A few tricks are needed to get good results ○ Noise is important for regularization ○ Rectifiers for faster, better learning ○ Don’t use plain SGD - lots of cheap, simple improvements ● Models need room to compute. ● If your data has structure, your model should respect it.
  79. Resources ● More in-depth theano tutorials ○ http://www.deeplearning.net/tutorial/ ● Theano docs ○ http://www.deeplearning.net/software/theano/library/ ● Community ○ http://www.reddit.com/r/machinelearning
  80. A plug Keep up to date with indico: https://indico1.typeform.com/to/DgN5SP
  81. Questions?