Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks that were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks, and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.

Published in:
Data & Analytics


- 1. Xavier Giro-i-Nieto xavier.giro@upc.edu Associate Professor Universitat Politecnica de Catalunya Technical University of Catalonia The Perceptron Day 1 Lecture 2 #DLUPC http://bit.ly/dlai2018
- 2. Acknowledgements Santiago Pascual Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University
- 3. Video lectures Santiago Pascual, DLSL 2017 Xavier Giró, DLAI 2017
- 4. Outline 1. Supervised learning: regression/classification 2. Single neuron models (perceptrons) a. Linear regression b. Logistic regression c. Multiple outputs and softmax regression 3. Limitations of the perceptron
- 6. Types of machine learning Yann LeCun’s Black Forest cake
- 7. Types of machine learning We can categorize three types of learning procedures: 1. Supervised Learning: y = ƒ(x) 2. Unsupervised Learning: ƒ(x) 3. Reinforcement Learning: y = ƒ(x) z
- 8. Supervised learning Fit a function: y = ƒ(x), x ∈ ℝ^m
- 9. Supervised learning Fit a function: y = ƒ(x), x ∈ ℝ^m Given paired training examples {(x_i, y_i)}
- 10. Supervised learning Fit a function: y = ƒ(x), x ∈ ℝ^m Given paired training examples {(x_i, y_i)} Key point: generalize well to unseen examples
- 11. Black box abstraction of supervised learning ŷ
- 12. Regression vs Classification Depending on the type of target y we get: ● Regression: y ∈ ℝ^N is continuous (e.g. temperatures y = {19°, 23°, 22°}) ● Classification: y is discrete (e.g. y = {“dog”, “cat”, “ostrich”}).
- 13. Regression vs Classification Depending on the type of target y we get: ● Regression: y ∈ ℝ^N is continuous (e.g. temperatures y = {19°, 23.2°, 22.8°}) ● Classification: y is discrete (e.g. y = {“dog”, “cat”, “ostrich”}).
- 14. Linear Regression (e.g. 1D input - 1D output)
- 15. Linear Regression (e.g. 1D input - 1D output) y = w · x + b Training a model means learning parameters w and b from data.
- 16. Linear Regression (M-D input) Input data can also be M-dimensional with vector x: y = w^T · x + b = w1·x1 + w2·x2 + w3·x3 + … + wM·xM + b. For example, we want to predict the price of a house (y) based on: x1 = square meters (sqm); x2, x3 = location (lat, lon); x4 = year of construction (yoc). Then y = price = w1·(sqm) + w2·(lat) + w3·(lon) + w4·(yoc) + b
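The house-price formula above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic: the function name `predict_price` and every numeric value (features, weights, bias) are hypothetical and were not learned from any data.

```python
def predict_price(features, weights, bias):
    """Linear model: y = w^T · x + b, written as an explicit weighted sum."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Hypothetical inputs, in the order used on the slide: sqm, lat, lon, yoc.
x = [80.0, 41.0, 2.0, 1990.0]
# Hypothetical parameters; in practice w and b are learned from training data.
w = [3000.0, 100.0, -50.0, 20.0]
b = -10000.0

price = predict_price(x, w, b)
print(price)
```

Training would consist of adjusting `w` and `b` so that the predicted prices match the prices in the training examples.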
- 17. Regression vs Classification Depending on the type of target y we get: ● Regression: y ∈ ℝ^N is continuous (e.g. temperatures y = {19°, 23°, 22°}) ● Classification: y is discrete (e.g. y = {“dog”, “cat”, “ostrich”}).
- 18. Binary Classification (e.g. 2D input, 1D output)
- 19. Multi-class Classification
- 20. Multi-class Classification ● Classification: y is discrete (e.g. y = {“dog”, “cat”, “ostrich”}). ○ Classes are often coded as one-hot vectors (each class corresponds to a different dimension of the output space). One-hot representations: [1,0,0] [0,1,0] [0,0,1] Perronin, F., CVPR Tutorial on LSVR @ CVPR’14, Output embedding for LSVR
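As a minimal sketch of the one-hot coding described above, each class name can be mapped to a vector with a single 1 in its own dimension. The helper name `one_hot` is an assumption made for this example.

```python
def one_hot(label, classes):
    """Return a vector with 1 at the position of `label` and 0 elsewhere."""
    return [1 if c == label else 0 for c in classes]

classes = ["dog", "cat", "ostrich"]
encoded = [one_hot(c, classes) for c in classes]
print(encoded)
```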
- 21. Discussion Should you treat these three problems as classification or as regression problems? ● Predicting whether the stock price of a company will increase tomorrow ● Predicting the number of copies of a music album that will be sold next month ● Predicting the gender of a person from their handwriting style
- 22. Outline 1. Supervised learning: regression/classification 2. Single neuron models (perceptrons) a. Linear regression b. Logistic regression c. Multiple outputs and softmax regression 3. Limitations of the perceptron
- 23. Single neuron model (perceptron) The perceptron is seen as an analogy to a biological neuron. Biological neurons fire an impulse once the sum of all inputs is over a threshold. The perceptron acts like a switch (learn how in the next slides...).
- 24. Single Neuron Model (Perceptron) The perceptron can address both regression and classification problems, depending on the chosen activation function.
- 25. Single neuron model (perceptron)
- 26. Single neuron model (perceptron) Weights and bias are the parameters that define the behavior (must be estimated).
- 27. Single neuron model (perceptron) The output y is derived from a sum of the weighted inputs plus a bias term.
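The single neuron described in the last few slides can be sketched as a weighted sum plus a bias, passed through an activation function. The names `neuron`, `identity` and `step`, and all numeric values, are assumptions chosen for illustration; the step function stands in for the "switch" behaviour mentioned earlier.

```python
def identity(a):
    """Activation that leaves the weighted sum unchanged."""
    return a

def step(a):
    """Switch-like activation: fires (1) once the sum is above zero."""
    return 1 if a > 0 else 0

def neuron(x, w, b, activation=identity):
    """Weighted sum of the inputs plus a bias, then the activation f(a)."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(a)

# Hypothetical input, weights and bias.
x = [1.0, 2.0]
w = [0.5, -0.25]
b = 0.25

linear_output = neuron(x, w, b)        # f(a) = a
switch_output = neuron(x, w, b, step)  # thresholded output
print(linear_output, switch_output)
```

Only the activation changes between the two calls; the weights and bias play the same role in both.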
- 28. Outline 1. Supervised learning: regression/classification 2. Single neuron models (perceptrons) a. Linear regression b. Logistic regression c. Multiple outputs and softmax regression 3. Limitations of the perceptron
- 29. Single neuron model: Linear Regression The perceptron can solve linear regression problems when f(a)=a. [identity]
- 30. Outline 1. Supervised learning: regression/classification 2. Single neuron models (perceptrons) a. Linear regression b. Logistic regression c. Multiple outputs and softmax regression 3. Limitations of the perceptron
- 31. Single neuron model: Regression More interesting when the activation function f(a) is not the identity, but:
- 32. Single neuron model: Logistic Regression More interesting when the activation function f(a) is not the identity, but:
- 33. Single neuron model: Logistic Regression The sigmoid function σ(x), or logistic curve, maps any real input x into the interval (0, 1): σ(x) = 1 / (1 + e^(-x))
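A minimal sketch of the sigmoid in plain Python follows. The branch on the sign of the input is a common numerical-stability trick, not something stated in the slides: it avoids calling `math.exp` on a large positive number, which would overflow.

```python
import math

def sigmoid(a):
    """Logistic function 1 / (1 + e^(-a)), evaluated without overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    # For negative a, use the equivalent form e^a / (1 + e^a).
    e = math.exp(a)
    return e / (1.0 + e)

for a in (-5.0, 0.0, 5.0):
    print(a, sigmoid(a))
```

In exact arithmetic the output never reaches 0 or 1; in floating point it saturates for inputs of very large magnitude.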
- 34. Single neuron model: Logistic Regression The perceptron is suitable for classification problems when f(a)=σ(a). [sigmoid] (Figure label: logits.)
- 35. Single neuron model: Binary Classification For classification, regressed values should be mapped into the range between 0 and 1 to quantify the confidence of the predictions (“probabilities”). (Figure label: threshold (thr).)
- 36. Single neuron model: Binary Classification Setting a threshold (thr) at the output of the perceptron allows solving classification problems between two classes (binary): y > thr → class 1 (e.g. green); y < thr → class 2 (e.g. red). (Figure label: logits.)
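The thresholding rule can be sketched as below. The default threshold of 0.5, the example outputs, and the choice to assign the tie case (an output exactly at the threshold) to class 2 are all assumptions made for this example.

```python
def classify(y, thr=0.5):
    """Map a sigmoid output y to one of two classes using a threshold."""
    # Assumption: an output exactly equal to thr is assigned to class 2.
    return "class 1 (green)" if y > thr else "class 2 (red)"

# Hypothetical perceptron outputs.
outputs = [0.92, 0.51, 0.49, 0.03]
predictions = [classify(y) for y in outputs]
print(predictions)
```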
- 37. Single neuron model: Binary Classification The classification threshold can be adjusted based on the desired precision-recall trade-off: either high precision and low recall for class green, or low precision and high recall for class green.
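The effect of the threshold on precision and recall can be illustrated with a toy example. The scores and labels below are made up for this sketch (label 1 stands for class green), and returning a precision of 1.0 when nothing is predicted as green is a convention assumed here.

```python
def precision_recall(scores, labels, thr):
    """Precision and recall for the positive class (label 1) at threshold thr."""
    tp = sum(1 for s, l in zip(scores, labels) if s > thr and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s > thr and l == 0)
    fn = sum(1 for s, l in zip(scores, labels) if s <= thr and l == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # assumed convention
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical perceptron outputs and ground-truth labels (1 = green).
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0]

high_thr = precision_recall(scores, labels, 0.9)
low_thr = precision_recall(scores, labels, 0.3)
print(high_thr, low_thr)
```

On this toy data the high threshold yields perfect precision but misses two of the three green examples, while the low threshold recovers all of them at the cost of one false positive.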
- 38. Outline 1. Supervised learning: regression/classification 2. Single neuron models (perceptrons) a. Linear regression b. Logistic regression c. Softmax regression 3. Limitations of the perceptron
- 39. Softmax regression: Binary case A normalization factor makes the probabilities sum up to 1. J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016)
- 40. Softmax regression: Multiclass (N classes) Multiple classes can be predicted by putting many neurons in parallel, each processing its binary output out of N possible classes. A normalization factor makes the probabilities sum up to 1. (Figure labels: raw pixels, unrolled img; outputs 0.3 “dog”, 0.08 “cat”, 0.6 “whatever”.)
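A minimal softmax sketch in plain Python is given below. The logit values are hypothetical, and subtracting the maximum logit before exponentiating is a standard stability trick that is assumed here rather than taken from the slides; it does not change the result.

```python
import math

def softmax(logits):
    """Exponentiate each logit and divide by the normalization factor."""
    m = max(logits)  # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)  # normalization factor
    return [e / total for e in exps]

# Hypothetical outputs of three parallel neurons, one per class.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

In the binary case, a softmax over the two logits [a, 0] gives the same probability for the first class as the sigmoid σ(a).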
- 41. Softmax Regression
- 42. Softmax regressor: Multiclass (3 classes) TensorFlow, “MNIST for ML beginners”
- 43. Softmax regressor: Multiclass (3 classes) TensorFlow, “MNIST for ML beginners”
- 44. Softmax regressor: Multiclass (3 classes) TensorFlow, “MNIST for ML beginners”
- 45. Exercise Consider a binary classifier implemented with a single neuron modelled by two weights w1 = 0.2 and w2 = 0.8 and a bias b = -1. Consider the activation function to be a sigmoid f(x) = 1 / (1 + e^(-x)). a) Draw a scheme of the model. b) Compute the output of the logistic regressor for a given input x = [1,1]. c) Considering a classification threshold of yth =0 (yth 0.9 for class A, and yth 0.9 for class B), which class would be predicted for the considered input x = [1,1]?
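Part b) of the exercise can be checked numerically with the sketch below, which uses only the weights, bias, input, and sigmoid given in the statement. The variable names are chosen for this example; part c) is left to the reader.

```python
import math

def sigmoid(a):
    """Sigmoid activation from the exercise: f(a) = 1 / (1 + e^(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

w = [0.2, 0.8]  # weights w1, w2 from the exercise
b = -1.0        # bias from the exercise
x = [1.0, 1.0]  # input from the exercise

# Weighted sum: 0.2*1 + 0.8*1 - 1 = 0
a = sum(wi * xi for wi, xi in zip(w, x)) + b
y = sigmoid(a)
print(a, y)
```

The weighted sum is zero, so the neuron outputs f(0) = 0.5.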
- 46. Outline 1. Supervised learning: regression/classification 2. Single neuron models (perceptrons) a. Linear regression b. Logistic regression c. Multiple outputs and softmax regression 3. Limitations of the perceptron
- 47. Limitations of the Perceptron Minsky, Marvin, and Seymour A. Papert. Perceptrons: An introduction to computational geometry. 1969
- 48. Limitations of the Perceptron 2D input space data (axes x1 and x2; Class 0 and Class 1). Parameters of the line: they are found from the training data (learning stage).
- 49. Limitations of the Perceptron XOR logic table:
  Input 1 | Input 2 | Desired Output
  0 | 0 | 0
  0 | 1 | 1
  1 | 0 | 1
  1 | 1 | 0
  Data might not be linearly separable → one single neuron is not enough.
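The XOR limitation can be illustrated with a small brute-force search over a grid of candidate weights and biases for a single step-activation neuron. The function names and the grid range are assumptions for this sketch. An empty search result is only an illustration, not a proof; the impossibility for any weights follows from the four inequalities that the XOR table imposes, which contradict each other.

```python
from itertools import product

def step_neuron(x1, x2, w1, w2, b):
    """Single neuron with a step activation on the weighted sum."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def find_parameters(truth_table, grid):
    """Return the first (w1, w2, b) on the grid that reproduces the table."""
    for w1, w2, b in product(grid, repeat=3):
        if all(step_neuron(x1, x2, w1, w2, b) == y
               for (x1, x2), y in truth_table.items()):
            return (w1, w2, b)
    return None

grid = [i / 2 for i in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

and_solution = find_parameters(AND, grid)
xor_solution = find_parameters(XOR, grid)
print(and_solution, xor_solution)
```

The search finds parameters for AND, which is linearly separable, and returns `None` for XOR.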
- 50. Limitations of the Perceptron Perceptrons can only produce linear decision boundaries. Real-world problems often need non-linear boundaries ● Images ● Audio ● Text
- 51. Limitations of the Perceptron What can we do?
  1. Use a non-linear classifier
     ○ Decision trees (and forests)
     ○ K nearest neighbours
  2. Engineer a suitable representation
     ○ One in which features are more linearly separable
     ○ Then use a linear model
  3. Engineer a kernel
     ○ Design a kernel K(x1, x2)
     ○ Use kernel methods (e.g. SVM)
  4. Learn a suitable representation space from the data
     ○ Deep learning, deep neural networks
     ○ Boosted cascade classifiers like Viola-Jones also take this approach
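Option 2 above (engineering a representation in which the classes become linearly separable) can be sketched on the XOR problem. Adding the product x1·x2 as a third feature is a choice made for this example, and the weights and bias are hand-picked rather than learned.

```python
def xor_with_product_feature(x1, x2):
    """Linear threshold model on the engineered features [x1, x2, x1*x2]."""
    features = [x1, x2, x1 * x2]
    w = [1.0, 1.0, -2.0]  # hand-picked weights (assumption)
    b = -0.5              # hand-picked bias (assumption)
    a = sum(wi * fi for wi, fi in zip(w, features)) + b
    return 1 if a > 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [xor_with_product_feature(x1, x2) for x1, x2 in inputs]
print(outputs)
```

The model is still a single linear neuron; only the input representation changed. Option 4 replaces the hand-made feature with one learned from data.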
- 52. Discussion One single perceptron can solve classification problems imposing non-linear boundaries... ● Always ● Only if the activation function is not linear. ● Only if the loss function is the binary cross-entropy. ● Never.
- 53. Prepare for the next lecture... How can the XOR problem be solved with two perceptrons? Rumelhart, D. E., & McClelland, J. L. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations.
- 54. Suggested readings for the next lecture Neural networks as universal approximators Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural networks 2, no. 5 (1989): 359-366.
- 55. Final Questions
