Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Deep Learning Representations for All (a.ka. the AI hype)

356 views

Published on

Deep neural networks have revolutionized the data analytics scene by improving results in several and diverse benchmarks with the same recipe: learning feature representations from data. These achievements have raised the interest across multiple scientific fields, especially in those where large amounts of data and computation are available. This change of paradigm in data analytics has several ethical and economic implications that are driving large investments, political debates and sounding press coverage under the generic label of artificial intelligence (AI). This talk will present the fundamentals of deep learning through the classic example of image classification, and point at how the same principal has been adopted for several tasks. Finally, some of the forthcoming potentials and risks for AI will be pointed.

Published in: Data & Analytics
  • Be the first to comment

Deep Learning Representations for All (a.ka. the AI hype)

  1. 1. bit.ly/commsenselab @DocXavi Deep Learning Representations for All (a.k.a. the AI hype) Xavier Giro-i-Nieto @DocXavi xavier.giro@upc.edu
  2. 2. bit.ly/commsenselab @DocXavi 2 Xavier Giro-i-Nieto Associate Professor at Universitat Politècnica de Catalunya (UPC) Hola IDEAI Center for Intelligent Data Science & Artificial Intelligence
  3. 3. bit.ly/commsenselab @DocXavi 3 ● 11 faculty members ● 12 Phd students Research Group & Centers https://imatge.upc.edu/ https://www.bsc.es/ ● National computation center #1 ● Supercomputer MareNostrum ● Emerging Technologies for Artificial Intelligence Group, directed by Prof. Jordi Torres. https://ideai.upc.edu/ ● Center funded in 2017 ● 60 researchers IDEAI (Intelligent Data Science and Artificial Intelligence)
  4. 4. bit.ly/commsenselab @DocXavi 4 Why am I here ? Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  5. 5. bit.ly/commsenselab @DocXavi 5 Why am I here ? Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  6. 6. bit.ly/commsenselab @DocXavi 6 Why am I here ? ● Old style machine learning: ○ Engineer features (by some unspecified method) ○ Create a representation (descriptor) ○ Train shallow classifier on representation ● Example: ○ SIFT features (engineered) ○ BoW representation (engineered + unsupervised learning) ○ SVM classifier (convex optimization) ● Deep learning ○ Learn layers of features, representation, and classifier in one go based on the data alone ○ Primary methodology: deep neural networks (non-convex) Data Pixels Deep learning Convolutional Neural Network Data Pixels Low-level features SIFT Representation BoW/VLAD/Fisher Classifier SVM EngineeredUnsupervisedSupervised Supervised
  7. 7. bit.ly/commsenselab @DocXavi 7
  8. 8. bit.ly/commsenselab @DocXavi 8 Source: NVIDIA
  9. 9. bit.ly/commsenselab @DocXavi 9
  10. 10. bit.ly/commsenselab @DocXavi 10 DL basic unit: The Perceptron The Perceptron is seen as an analogy to a biological neuron, because it fire an impulse once the sum of all inputs is over a threshold. Minsky, Marvin, and Seymour A. Papert. Perceptrons: An introduction to computational geometry. 1969
  11. 11. bit.ly/commsenselab @DocXavi 11 DL basic unit: The Perceptron A single perceptron can only define linear decision boundaries. x2 Class 0 Class 1 2D input space data x
  12. 12. bit.ly/commsenselab @DocXavi 12 Non-linear decision boundaries Real world data often needs a non-linear decision boundary ● Images ● Audio ● Text
  13. 13. bit.ly/commsenselab @DocXavi 13
  14. 14. bit.ly/commsenselab @DocXavi 14 ● Needs a “finite number of hidden neurons”: finite may be extremely large ● How to find the parameters (weights, biases) of these neurons ?
  15. 15. bit.ly/commsenselab @DocXavi 15 Multilayer Perceptron (MLP) In practice, deep neural networks nets can usually represent more complex functions with less total neurons (and therefore, less parameters)
  16. 16. bit.ly/commsenselab @DocXavi 16 How to find the parameters ? Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. "Learning representations by back-propagating errors." Cognitive modeling 5, no. 3 (1988). Training a neural network with the back-propagation algorithm.
  17. 17. bit.ly/commsenselab @DocXavi 17 How to learn a memory unit ? Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9, no. 8 (1997).
  18. 18. bit.ly/commsenselab @DocXavi 18 How to reuse neurons ? Fully Connected layer (FC) Convolutional layer (Conv) Figures: Ranzatto
  19. 19. bit.ly/commsenselab @DocXavi 19 Convolutional Neural Network (CNN) #LeNet-5 LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
  20. 20. bit.ly/commsenselab @DocXavi 20 Many other researchers have also contrinbuted to the field as, for example, those pointed out by LSTM author Jürgen Schmidhuber in “Deep Learning Conspirancy”.
  21. 21. bit.ly/commsenselab @DocXavi 21 Why am I here ? Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  22. 22. bit.ly/commsenselab @DocXavi 22 Big data for Vision: ImageNet ● 1,000 object classes (categories). ● Images: ○ 1.2 M train ○ 100k test. Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. "Imagenet: A large-scale hierarchical image database." CVPR 2009.
  23. 23. bit.ly/commsenselab @DocXavi 23 Data Challenge: Social Biases #Equalizer Burns, Kaylee, Lisa Anne Hendricks, Trevor Darrell, and Anna Rohrbach. "Women also Snowboard: Overcoming Bias in Captioning Models." ECCV 2018.
  24. 24. bit.ly/commsenselab @DocXavi 24 Data Challenge: Data access Personal data Internet of things - IoT Neil Lawrence, OpenAI won’t benefit humanity without open data sharing (The Guardian, 2015)
  25. 25. bit.ly/commsenselab @DocXavi 25 Why am I here ? Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  26. 26. bit.ly/commsenselab @DocXavi 26 Computation
  27. 27. bit.ly/commsenselab @DocXavi 27 Computation Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. "Imagenet: A large-scale hierarchical image database." CVPR 2009.
  28. 28. bit.ly/commsenselab @DocXavi 28 Computation challenge Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. "Imagenet: A large-scale hierarchical image database." CVPR 2009.
  29. 29. bit.ly/commsenselab @DocXavi 29 Why am I here ? Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  30. 30. bit.ly/commsenselab @DocXavi 30 Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." NIPS 2012 12,573 citations (June 2017) 28,366 (Sep 2018) 42,077 (Jun 2019)
  31. 31. bit.ly/commsenselab @DocXavi 31 ● 1,000 object classes (categories). ● Images: ○ 1.2 M train ○ 100k test. ImageNet Challenge Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang et al. "Imagenet large scale visual recognition challenge." International Journal of Computer Vision 115, no. 3 (2015): 211-252. [web]
  32. 32. bit.ly/commsenselab @DocXavi 32 ImageNet Challenge Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang et al. "Imagenet large scale visual recognition challenge." International Journal of Computer Vision 115, no. 3 (2015): 211-252. [web] Slide credit: Rob Fergus (NYU) -9.8% Based on SIFT + Fisher Vectors
  33. 33. bit.ly/commsenselab @DocXavi 33 Deeper Networks
  34. 34. bit.ly/commsenselab @DocXavi 34 ImageNet Image Recognition Electronic Frontier Foundation: “Measuring the Progress of AI Research” (2017)
  35. 35. bit.ly/commsenselab @DocXavi 35 Learning Representations Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  36. 36. bit.ly/commsenselab @DocXavi Text Audio 36 Speech Vision
  37. 37. bit.ly/commsenselab @DocXavi Text Audio 37 Speech Vision
  38. 38. bit.ly/commsenselab @DocXavi Text Audio 38 Speech Vision
  39. 39. bit.ly/commsenselab @DocXavi 39
  40. 40. bit.ly/commsenselab @DocXavi 40 Encoder 0 1 0 Cat A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” NIPS 2012
  41. 41. bit.ly/commsenselab @DocXavi 41Slide concept: Perronin, F., Tutorial on LSVR @ CVPR’14, Output embedding for LSVR One-hot Representation [1,0,0] [0,1,0] [0,0,1]
  42. 42. bit.ly/commsenselab @DocXavi 42 Encoder Representation
  43. 43. bit.ly/commsenselab @DocXavi 43 Encoder Representation
  44. 44. bit.ly/commsenselab @DocXavi 44 Decoder Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." ICLR 2016. #DCGAN 0 1 0 Cat Fig: Xudong Mao #DCGAN
  45. 45. bit.ly/commsenselab @DocXavi 45 Encoder Decoder Representation
  46. 46. bit.ly/commsenselab @DocXavi 46 Encoder Decoder Representation
  47. 47. bit.ly/commsenselab @DocXavi 47 Encoder Decoder Representation
  48. 48. bit.ly/commsenselab @DocXavi 48 Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-image translation with conditional adversarial networks." CVPR 2017.
  49. 49. bit.ly/mmm-docxavi @DocXavi 49 Encoder Decoder Representation
  50. 50. bit.ly/mmm-docxavi @DocXavi 50 Neural Machine Translation (NMT) Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
  51. 51. bit.ly/mmm-docxavi @DocXavi 51 Encoder Decoder Representation
  52. 52. bit.ly/mmm-docxavi @DocXavi 52 Automatic Speech Recognition (ASR) Slide: Hannun, Awni. "Sequence Modeling with CTC." Distill 2.11 (2017): e8.
  53. 53. bit.ly/mmm-docxavi @DocXavi 53 Encoder Decoder Representation
  54. 54. bit.ly/mmm-docxavi @DocXavi 54 Speech Synthesis Oord, Aaron van den, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. "Wavenet: A generative model for raw audio." arXiv preprint arXiv:1609.03499 (2016).
  55. 55. bit.ly/mmm-docxavi @DocXavi 55 Encoder Decoder Representation Salvador, Amaia, Michal Drozdzal, Xavier Giro-i-Nieto, and Adriana Romero. "Inverse Cooking: Recipe Generation from Food Images." CVPR 2019.
  56. 56. bit.ly/mmm-docxavi @DocXavi 56 Image-to-Text Salvador, Amaia, Michal Drozdzal, Xavier Giro-i-Nieto, and Adriana Romero. "Inverse Cooking: Recipe Generation from Food Images." CVPR 2019.
  57. 57. bit.ly/mmm-docxavi @DocXavi 57 Chung, Joon Son, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. "Lip reading sentences in the wild." CVPR 2017
  58. 58. bit.ly/mmm-docxavi @DocXavi 58 Encoder Decoder Representation
  59. 59. 59 #StackGAN Zhang, Han, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaolei Huang, Xiaogang Wang, and Dimitris Metaxas. "Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks." ICCV 2017. [code] Image Synthesis
  60. 60. bit.ly/mmm-docxavi @DocXavi 60 Encoder Encoder Representation
  61. 61. bit.ly/mmm-docxavi @DocXavi 61 Owens, Andrew, Phillip Isola, Josh McDermott, Antonio Torralba, Edward H. Adelson, and William T. Freeman. "Visually indicated sounds." CVPR 2016.
  62. 62. bit.ly/mmm-docxavi @DocXavi 62 DecoderEncoder Representation
  63. 63. bit.ly/DLCV2018 #DLUPC 63 Ephrat, Ariel, Tavi Halperin, and Shmuel Peleg. "Improved speech reconstruction from silent video." In ICCV Workshop on Computer Vision for Audio-Visual Media. 2017.
  64. 64. bit.ly/mmm-docxavi @DocXavi 64 Encoder Decoder Representation
  65. 65. bit.ly/mmm-docxavi @DocXavi 65 Speech to Pixels Amanda Duarte, Francisco Roldan, Miquel Tubau, Janna Escur, Santiago Pascual, Amaia Salvador, Eva Mohedano et al. “Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks”. ICASSP 2019.
  66. 66. bit.ly/mmm-docxavi @DocXavi 66 Encoder Encoder Representation
  67. 67. bit.ly/mmm-docxavi @DocXavi 67 Multimodal Retrieval Amanda Duarte, Dídac Surís, Amaia Salvador, Jordi Torres, and Xavier Giró-i-Nieto. "Cross-modal Embeddings for Video and Audio Retrieval." ECCV Women in Computer Vision Workshop 2018. Best match Audio feature
  68. 68. bit.ly/mmm-docxavi @DocXavi 68 Multimodal Retrieval Best match Visual feature Audio feature Amanda Duarte, Dídac Surís, Amaia Salvador, Jordi Torres, and Xavier Giró-i-Nieto. "Cross-modal Embeddings for Video and Audio Retrieval." ECCV Women in Computer Vision Workshop 2018.
  69. 69. bit.ly/mmm-docxavi @DocXavi 69 Encoder Encoder Representation
  70. 70. bit.ly/mmm-docxavi @DocXavi 70 Harwath, David, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba, and James Glass. "Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input." ECCV 2018
  71. 71. bit.ly/mmm-docxavi @DocXavi 71 Encoder Decoder Representation Encoder Representation
  72. 72. bit.ly/mmm-docxavi @DocXavi 72 Visual Question Answering Antol, Stanislaw, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. "VQA: Visual question answering." CVPR 2015.
  73. 73. bit.ly/mmm-docxavi @DocXavi 73 Encoder Decoder Representation Encoder Representation
  74. 74. bit.ly/DLCV2018 #DLUPC 74 Afouras, Triantafyllos, Joon Son Chung, and Andrew Zisserman. "The Conversation: Deep Audio-Visual Speech Enhancement." Interspeech 2018..
  75. 75. bit.ly/mmm-docxavi @DocXavi 75 Encoder Decoder Representation Encoder Representation
  76. 76. bit.ly/DLCV2018 #DLUPC 76 Karras, Tero, Timo Aila, Samuli Laine, Antti Herva, and Jaakko Lehtinen. "Audio-driven facial animation by joint end-to-end learning of pose and emotion." SIGGRAPH 2017
  77. 77. bit.ly/mmm-docxavi @DocXavi Text Audio 77 Speech Vision
  78. 78. bit.ly/commsenselab @DocXavi 78 Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. "Playing atari with deep reinforcement learning." NIPS Deep Learning Workshop (2013).
  79. 79. bit.ly/commsenselab @DocXavi 79 Beyond Multimedia #AlphaGo Silver, David, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser et al. "Mastering the game of Go with deep neural networks and tree search." Nature 2016.
  80. 80. Wayve, “Sim2Real: Learning to Drive from Simulation without Real World Labels” (2018)
  81. 81. bit.ly/commsenselab @DocXavi Sermanet, Pierre, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine, and Google Brain. "Time-contrastive networks: Self-supervised learning from video." ICRA 2018.
  82. 82. bit.ly/commsenselab @DocXavi 82 Beyond Multimedia Carl Ekin, Sims Whitherspoon, “Machine learning can boost the value of wind energy”. DeepMind (2019)
  83. 83. bit.ly/commsenselab @DocXavi 83 Beyond Multimedia Esteva, Andre, Brett Kuprel, Roberto A. Novoa, Justin Ko, Susan M. Swetter, Helen M. Blau, and Sebastian Thrun. "Dermatologist-level classification of skin cancer with deep neural networks." Nature 542, no. 7639 (2017): 115.
  84. 84. bit.ly/commsenselab @DocXavi 84 Beyond Multimedia #AlphaFold Evans, R., J. Jumper, J. Kirkpatrick, L. Sifre, T. F. G. Green, C. Qin, A. Zidek et al. "De novo structure prediction with deeplearning based scoring." Annu Rev Biochem 77 (2018): 363-382. [blog]
  85. 85. bit.ly/commsenselab @DocXavi 85 Beyond Multimedia ?
  86. 86. bit.ly/commsenselab @DocXavi 86 The AI Hype ? Jitendra Malik, “What lead computer vision to deep learning ?” ACM Communications 2017.
  87. 87. bit.ly/commsenselab @DocXavi 87 Open research
  88. 88. bit.ly/commsenselab @DocXavi 88 The AI Hype (?) Annual Conference on Neural Information Processing Systems (NIPS) @ Barcelona (2016)
  89. 89. bit.ly/commsenselab @DocXavi 89 The AI Hype (?) Annual Conference on Neural Information Processing Systems (NIPS) @ Long Beach (December 2017) - (Fig: Alex Lebrun)
  90. 90. bit.ly/commsenselab @DocXavi 90 The AI Hype (?) Annual Conference on Neural Information Processing Systems (NIPS) @ Montreal (December 2018)
  91. 91. bit.ly/commsenselab @DocXavi 91 The AI Hype (?)
  92. 92. bit.ly/commsenselab @DocXavi 92 The AI Hype (?) CVPR is the top conference in computer science (IN ALL COMPUTER SCIENCE !!) Again (IN ALL COMPUTER SCIENCE !!) ...and Top #2 & #4 are on machine learning ...as Top #3 & #5 are on computer vision. Source: David Forsyth @ Good Citizen CVPR 2018 [video] [slides]
  93. 93. bit.ly/commsenselab @DocXavi 93 The AI Hype (?)
  94. 94. bit.ly/commsenselab @DocXavi 94 The AI Hype (?)
  95. 95. bit.ly/commsenselab @DocXavi 95 The AI Hype (?) Nature: Junior AI researchers are in demand by universities and industry (April 2019)
  96. 96. bit.ly/commsenselab @DocXavi 96 The AI Hype (?) Xavier Sala-i-Martin (Columbia University), “Les conclusions del Fòrum de Davos” (TV3, 03/02/2016) - in Catalan Carles Boix (Princeton University), “La quarta revolució industrial” (Diari Ara, 08/02/2016) - in Catalan
  97. 97. bit.ly/commsenselab @DocXavi 97 The AI Hype (?) Barack Obama, Neural Nets, Self-driving cars, and the Future of the World (Wired, June 2016)
  98. 98. bit.ly/commsenselab @DocXavi 98 The AI Hype (?)
  99. 99. bit.ly/commsenselab @DocXavi 99 The AI Hype (?)
  100. 100. bit.ly/commsenselab @DocXavi 100 The AI Hype (?)
  101. 101. bit.ly/commsenselab @DocXavi 101 The AI Hype (?) LLegiu l’informe des d’aquí
  102. 102. bit.ly/commsenselab @DocXavi 102 The AI Hype (?)
  103. 103. bit.ly/commsenselab @DocXavi 103 The AI Hype (?)
  104. 104. bit.ly/commsenselab @DocXavi 104 The AI Hype (?)
  105. 105. bit.ly/commsenselab @DocXavi 105 The AI Hype (?)
  106. 106. bit.ly/commsenselab @DocXavi 106 Deep Learning courses @ UPC TelecomBCN: ● MSc course [2017] [2018] [2019] ● BSc course [2018] [2019] ● 1st edition (2016) ● 2nd edition (2017) ● 3rd edition (2018) ● 4th edition (2019) ● 1st edition (2017) ● 2nd edition (2018) ● 3rd edition - NLP (2019) Next edition: Autumn 2019 Included in ETSETB Master MATT Deep Learning Track + MET
  107. 107. bit.ly/commsenselab @DocXavi 107 Deep Learning for Professionals @ UPC School Next edition starts November 2019. Sign up here.
  108. 108. bit.ly/commsenselab @DocXavi 108 Community building bcn.ai deeplearning.barcelona
  109. 109. bit.ly/commsenselab @DocXavi 109 Our team from UPC & BSC Barcelona Victor Campos Amaia Salvador Amanda Duarte Dèlia Fernández Eduard Ramon Andreu Girbau Dani Fojo Oscar Mañas Santi Pascual Xavi Giró Miriam Bellver Janna Escur Carles Ventura Paula Gómez Benet Oriol Mariona Carós Jordi Torres Ferran Marqués

×