# Gentlest Introduction to Tensorflow - Part 3


Articles:
* https://medium.com/all-of-us-are-belong-to-machines/gentlest-intro-to-tensorflow-part-3-matrices-multi-feature-linear-regression-30a81ebaaa6c
* https://medium.com/all-of-us-are-belong-to-machines/gentlest-intro-to-tensorflow-4-logistic-regression-2afd0cabc54

Video: https://youtu.be/F8g_6TXKlxw

Code: https://github.com/nethsix/gentle_tensorflow

In this part, we:
* Use Tensorflow for linear regression models with multiple features
* Use Tensorflow for logistic regression models with multiple features. Specifically:
  * Predict a multi-class/discrete outcome
  * Explain why we use cross-entropy as the cost function
  * Explain why we use softmax
* Tensorflow Cheatsheet #1:
  * Single-feature linear regression
  * Multi-feature linear regression
  * Multi-feature logistic regression

Published in: Technology
### Gentlest Introduction to Tensorflow - Part 3

1. Gentlest Intro to Tensorflow (Part 3) Khor Soon Hin, @neth_6, re:Culture In collaboration with Sam & Edmund
2. Overview ● Multi-feature Linear Regression ● Logistic Regression ○ Multi-class prediction ○ Cross-entropy ○ Softmax ● Tensorflow Cheatsheet #1
3. Review: Predict from Single Feature with Linear Regression
4. Quick Review: Predict from Single Feature (House Size)
5. Quick Review: Use Linear Regression
6. Quick Review: Predict using Linear Regression
7. Linear Regression: Predict from Two (or More) Features
8. Two Features: House Size, Rooms (3-D chart: House Price, \$ vs. House Size, sqm vs. Rooms, unit; source: teraplot.com)
9. Same Issue: Predict for Values without a Datapoint (same chart; source: teraplot.com)
10. Same Solution: Find the Best Fit (same chart, now with the prediction; source: teraplot.com)
11. Review: Tensorflow Code
12. Tensorflow Code

    # Model linear regression y = Wx + b
    x = tf.placeholder(tf.float32, [None, 1])
    W = tf.Variable(tf.zeros([1, 1]))
    b = tf.Variable(tf.zeros([1]))
    y = tf.matmul(x, W) + b
    y_ = tf.placeholder(tf.float32, [None, 1])
    # Cost function: 1/n * sum((y_ - y)**2)
    cost = tf.reduce_mean(tf.square(y_ - y))
    # Training using gradient descent to minimize cost
    train_step = tf.train.GradientDescentOptimizer(0.0000001).minimize(cost)
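The snippet above uses the TF 0.x/1.x API (tf.placeholder, tf.train.GradientDescentOptimizer), which no longer runs unmodified on TensorFlow 2. As a runnable sanity check of the same model and cost, here is a hand-coded NumPy sketch; the house data and the learning rate are made up for illustration, not taken from the slides:

```python
import numpy as np

# Hypothetical dataset: house sizes (sqm) and prices; illustrative numbers only.
x = np.array([[50.0], [70.0], [90.0], [110.0]])  # shape [4, 1], like the x placeholder
y_ = 1000.0 * x + 5000.0                          # "actual" prices for this sketch

W = np.zeros((1, 1))   # mirrors tf.Variable(tf.zeros([1, 1]))
b = np.zeros(1)        # mirrors tf.Variable(tf.zeros([1]))
learning_rate = 1e-5   # chosen for this toy data, not the slide's 0.0000001

for _ in range(1000):
    y = x @ W + b                      # model: y = Wx + b
    # Gradient of the mean-squared cost w.r.t. the predictions
    grad_y = -2.0 * (y_ - y) / len(x)
    W -= learning_rate * (x.T @ grad_y)   # gradient-descent step on W
    b -= learning_rate * grad_y.sum()     # gradient-descent step on b

final_cost = np.mean((y_ - (x @ W + b)) ** 2)  # same cost as tf.reduce_mean(tf.square(y_ - y))
print(W, b, final_cost)
```

After 1000 steps W is close to the least-squares slope for this toy data and the cost has dropped by several orders of magnitude from its starting value.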
13. Multi-feature: Change in Model & Cost Function
14. Model (1 Feature): y = W.x + b, where y is the house price prediction and x is the house size. Goal: find scalars W, b.
15. Model (2 Features): y = W.x + W2.x2 + b, where x is the house size and x2 is the rooms. Goal: find scalars W, W2, b.
16. Tensorflow Graph (1 Feature):

    y = tf.matmul(x, W) + b
    W = tf.Variable(tf.zeros([1, 1]))
    b = tf.Variable(tf.zeros([1]))
    x = tf.placeholder(tf.float32, [None, 1])
    y_ = tf.placeholder(tf.float32, [None, 1])

17. Tensorflow Graph (2 Features):

    y = tf.matmul(x, W) + tf.matmul(x2, W2) + b
    W = tf.Variable(tf.zeros([1, 1]))
    W2 = tf.Variable(tf.zeros([1, 1]))
    b = tf.Variable(tf.zeros([1]))
    x = tf.placeholder(tf.float32, [None, 1])
    x2 = tf.placeholder(tf.float32, [None, 1])
    y_ = tf.placeholder(tf.float32, [None, 1])

18. Tensorflow Graph: Train. 1 Feature: feed = { x: …, y_: … }. 2 Features: feed = { x: …, x2: …, y_: … }
19. Tensorflow Graph: Scalability Issue. With 3 features: y = tf.matmul(x, W) + tf.matmul(x2, W2) + tf.matmul(x3, W3) + b, with separate W, W2, W3, b variables, separate x, x2, x3 placeholders, and feed = { x: …, x2: …, x3: …, y_: … }. The model gets messy!
20. Data Representation: House #1: Size_1, Rooms_1, Price_1; House #2: Size_2, Rooms_2, Price_2; …; House #m: Size_m, Rooms_m, Price_m. Size and Rooms are the feature values; Price is the actual outcome.
21. Lots of Data Manipulation: with per-feature placeholders, each feature column of that table must be sliced out and fed separately: feed = { x: …, x2: …, y_: … }
22. Lots of Data Manipulation 2
23. Lots of Data Manipulation 3
24. Lots of Data Manipulation 4: Data manipulation gets messy!
25. y = W.x + W2.x2 + b. Find a better way.
26. Matrix: Cleaning Up Representations
27. Matrix Representation: a table of House #0 … House #m with columns Size, Rooms (features) and Price (actual outcome)
28. Matrix Representation: split into a feature matrix [[Size_0, Rooms_0], [Size_1, Rooms_1], …, [Size_m, Rooms_m]] and an actual-outcome vector [Price_0, Price_1, …, Price_m]
29. Matrix Representation: align all data by row so there is NO need for labels
30. Matrix Representation: let's focus on the feature matrix
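The row-aligned representation above can be sketched in NumPy; the house numbers are made up for illustration:

```python
import numpy as np

# Hypothetical table rows: (size, rooms, price) per house; illustrative values only.
houses = [(50.0, 2.0, 60000.0), (70.0, 3.0, 85000.0), (90.0, 3.0, 100000.0)]

# Feature matrix: one row per datapoint, one column per feature -> shape [m, 2]
x = np.array([[size, rooms] for size, rooms, _ in houses])
# Actual outcomes: one price per datapoint -> shape [m, 1]
y_ = np.array([[price] for _, _, price in houses])

print(x.shape, y_.shape)
```

Because features and outcomes are aligned by row, row i of x and row i of y_ always describe the same house, so no per-house label column is needed.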
31. Matrix: Cleaning Up Models
32. Better Model Equation: y = W.x + W2.x2 + b. Find a better way.
33. Better Model Equation: feed each table row (Size_i, Rooms_i) as one multi-feature datapoint
34. Better Model Equation: Train: feed = { x: [size_i, rooms_i], …, y_: … }
35. Better Model Equation: x = tf.placeholder(tf.float32, [None, 2]), so each datapoint carries both features
36. Better Model Equation: W = tf.Variable(tf.zeros([2, 1])), so there is one coefficient per feature
39. Tensorflow Graph (Messy), 2 Features:

    y = tf.matmul(x, W) + tf.matmul(x2, W2) + b
    W = tf.Variable(tf.zeros([1, 1]))
    W2 = tf.Variable(tf.zeros([1, 1]))
    b = tf.Variable(tf.zeros([1]))
    x = tf.placeholder(tf.float32, [None, 1])
    x2 = tf.placeholder(tf.float32, [None, 1])
    y_ = tf.placeholder(tf.float32, [None, 1])

40. Tensorflow Graph (Clean), 2 Features:

    y = tf.matmul(x, W) + b
    W = tf.Variable(tf.zeros([2, 1]))
    b = tf.Variable(tf.zeros([1]))
    x = tf.placeholder(tf.float32, [None, 2])
    y_ = tf.placeholder(tf.float32, [None, 1])

41. Tensorflow Graph (Clean and Formatted): the same clean 2-feature code, formatted
42. Tensorflow Graph (Illustration): x is [None, 2] (2 features per datapoint), W is [2, 1] (one coefficient per feature), and b and each prediction y are scalars
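A quick NumPy check (with made-up weights) that the single matmul with a [2, 1] W really equals the messy per-feature version:

```python
import numpy as np

x = np.array([[50.0, 2.0], [70.0, 3.0]])  # [m, 2]: 2 features per datapoint
W = np.array([[1000.0], [5000.0]])         # [2, 1]: one coefficient per feature (made-up values)
b = np.array([300.0])

# Clean form: one matmul covers all the features at once
y_clean = x @ W + b

# Messy form: one matmul per feature, then summed
y_messy = x[:, :1] @ W[:1] + x[:, 1:] @ W[1:] + b

print(y_clean.ravel(), y_messy.ravel())  # identical predictions either way
```

Matrix multiplication computes exactly the per-feature products and their sum, which is why one W of shape [features, 1] replaces all the separate W, W2, … variables.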
43. Logistic Regression
44. Linear vs. Logistic Regression: linear regression predicts a price (scalar) from features such as Size and Rooms
45. Linear vs. Logistic Regression: logistic regression predicts a number (one of the discrete classes 0, 1, …, 9) from an image
46. Image Features
47. Image Features 1: the feature values are the grayscale value of each pixel, a 2-D grid such as [[23, 53, …, 33], [53, 20, …, 88], …, [62, 2, …, 193]]
48. Image Features 2: the grayscale value of each pixel
49. Image Features 3: the grayscale value of each pixel
50. Logistic Regression: Change in Models
51. Model. Linear Regression: y = W.x + b; y: house price (scalar) prediction; x: [house size, rooms]; goal: find W, b. Logistic Regression: y = W.x + b; y: discrete class (0, 1, …, 9) prediction; x: 2-D pixel grayscale values; goal: find W, b.
52. Data Representation Comparison: houses have feature values (Size, Rooms) and an actual outcome (Price); images have a grid of pixel grayscale values as features and a discrete actual outcome such as 5, 2, or 3
53. Data Representation Comparison: 2 features in a 1-D x vs. a 2-D grid of X-by-Y pixel features
55. Data Representation Comparison: flatten the 2-D grid into X.Y 1-D features to generalize into a multi-feature problem
56. Model. Logistic Regression: y = W.x + b; x: [Pixel 1, Pixel 2, …, Pixel X.Y]; goal: find W, b.
57. Model: the prediction y needs to change as well!
58. Why can't 'y' be left as a scalar of 0-9? HINT: besides the model, when doing ML we also need a cost function!
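The answer the next slides build toward: y becomes a 10-element probability vector, and the actual label is encoded one-hot. A minimal sketch, assuming the usual one-hot convention:

```python
import numpy as np

def one_hot(label, num_classes=10):
    """Turn a class label (0-9) into a 10-element vector with a 1 at that class."""
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v

print(one_hot(3))  # 1.0 in position 3, 0.0 everywhere else
```

Representing the actual outcome as a probability distribution (all mass on one class) is what lets the cost function compare it against the predicted distribution, class by class.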
59. Logistic Regression: Change in Cost Function
60. Linear Regression: Cost is a Scalar
61. Linear Regression Cost Does NOT Fit Discrete Classes: the outcome (pixel grayscale values mapped to a digit) is a class, not a number, so the magnitude of the difference (y_ - y) does NOT matter; wrongly predicting 4 as 3 is as bad as predicting 9 as 3
62. Logistic Regression: Prediction: an image goes into the model
63. Logistic Regression: Prediction: the model outputs a probability for each class 0-9
64. Logistic Regression: Prediction: compare the predicted probabilities against the actual ones
65. Cross-Entropy: Cost Function 1: cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
66. Cross-Entropy: Cost Function 2: compare the prediction probabilities (classes 0-9) against the actual probabilities
67. Cross-Entropy: Cost Function 3: plot -log(probability) for the prediction; it is almost the inverse of the probability, because probability < 1
68. Graph: -log(probability). Probability is always between 0 and 1, so -log(probability) is inversely related to probability.
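Numerically, the shape of that curve:

```python
import math

# -log(p) is huge when the predicted probability p is small,
# and close to 0 when p is close to 1.
for p in [0.01, 0.5, 0.99]:
    print(p, -math.log(p))
```

So a confident, correct prediction (p near 1) contributes almost nothing to the cost, while assigning a tiny probability to the true class is punished heavily.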
69. Cross-Entropy: Cost Function: pair each actual probability with -log of the corresponding predicted probability
72. Cross-Entropy: Cost Function: cross_entropy = -tf.reduce_sum(y_ * tf.log(y)). With one-hot actuals, only the class where y'_i = 1 matters; the smaller log(y_i) is in magnitude for that class (i.e., the higher the predicted y_i), the lower the cost.
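A worked NumPy example of the cross-entropy above, with a made-up prediction for an image whose actual class is 3:

```python
import numpy as np

# Actual: class 3, one-hot encoded
y_ = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], dtype=float)
# Made-up prediction probabilities over classes 0-9 (sum to 1)
y = np.array([0.05, 0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.02, 0.04, 0.04])

cross_entropy = -np.sum(y_ * np.log(y))  # mirrors -tf.reduce_sum(y_ * tf.log(y))
only_true_class = -np.log(y[3])          # only the y'_i = 1 term survives the sum

print(cross_entropy, only_true_class)
```

The zeros in y_ wipe out every other term, so the cost reduces to -log of the probability the model assigned to the true class.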
73. Logistic Regression: Prediction: linear regression maps an image to a single scalar; logistic regression must map it to a probability per class 0-9
74. Tensorflow Graph (Review): the 1-feature and the clean 2-feature linear regression code
75. Tensorflow Graph (Review 2): x holds the features per datapoint, W holds one coefficient per feature, and y and b are scalars per datapoint
76. Tensorflow Graph (Review 2): apply multi-feature linear regression
77. Logistic Regression: Prediction: y = tf.matmul(x, W) + b with n features and n coefficients gives one scalar per datapoint
78. Logistic Regression: Prediction: for k classes, we need k outputs per datapoint instead
79. Logistic Regression: Prediction: y = tf.matmul(x, W) + b, where W now holds n coefficients for each of the k classes, gives k outputs per datapoint
80. Tensorflow Graph: Basic, Multi-feature, Multi-class. 2 Features, 10 Classes:

    y = tf.matmul(x, W) + b
    W = tf.Variable(tf.zeros([2, 10]))
    b = tf.Variable(tf.zeros([10]))
    x = tf.placeholder(tf.float32, [None, 2])
    y_ = tf.placeholder(tf.float32, [None, 10])
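The shapes in the multi-class cheat sheet can be checked in NumPy (m = 4 datapoints is an arbitrary choice for the sketch):

```python
import numpy as np

m, n_features, k_classes = 4, 2, 10

x = np.zeros((m, n_features))          # like tf.placeholder(tf.float32, [None, 2])
W = np.zeros((n_features, k_classes))  # like tf.Variable(tf.zeros([2, 10]))
b = np.zeros(k_classes)                # one bias per class

y = x @ W + b   # k outputs per datapoint
print(y.shape)
```

Widening W from [n, 1] to [n, k] (and b from 1 to k values) is the only change needed to go from one scalar prediction to one score per class.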
81. Logistic Regression: Prediction: compare the prediction against the actual. Great, but… the sum of all prediction 'probabilities' is NOT 1.
82. The model outputs a raw value per class 0-9
83. Take exp(value) for each class
84. Sum exp(value) over all the classes
85. Divide each exp(value) by that sum
86. Now the sum of all prediction 'probabilities' is 1
87. Before softmax: y = tf.matmul(x, W) + b
88. With softmax: y = tf.nn.softmax(tf.matmul(x, W) + b)
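The exp-then-normalize recipe above is exactly softmax. A small NumPy sketch (the raw values are made up, and the max-subtraction is a standard numerical-stability trick not shown on the slides):

```python
import numpy as np

def softmax(values):
    """Exponentiate each value, then divide by the sum so the outputs total 1."""
    e = np.exp(values - values.max())  # subtract max for numerical stability
    return e / e.sum()

raw = np.array([2.0, 1.0, 0.1])  # made-up raw model outputs ("logits")
probs = softmax(raw)
print(probs, probs.sum())
```

The outputs sum to exactly 1 and preserve the ordering of the raw values, so the largest raw output still wins while the whole vector becomes a valid probability distribution.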
89. Summary ● Cheat sheet: single-feature, multi-feature, multi-class ● Logistic regression: ○ Multi-class prediction: ensure the prediction is one of a discrete set of values ○ Cross-entropy: measure the difference between the multi-class prediction and the actual ○ Softmax: ensure the multi-class prediction probabilities form a valid distribution (sum = 1)
90. Congrats! You can now understand Google's Tensorflow Beginner's Tutorial (https://www.tensorflow.org/versions/r0.7/tutorials/mnist/beginners/index.html)
91. References ● Perform ML with TF using multi-feature linear regression (the wrong way) ○ https://github.com/nethsix/gentle_tensorflow/blob/master/code/linear_regression_multi_feature_using_mini_batch_with_tensorboard.py ● Perform ML with TF using multi-feature linear regression ○ https://github.com/nethsix/gentle_tensorflow/blob/master/code/linear_regression_multi_feature_using_mini_batch_without_matrix_with_tensorboard.py ● Tensorflow official tutorial for character recognition ○ https://www.tensorflow.org/versions/r0.7/tutorials/mnist/beginners/index.html ● Colah's excellent explanation of cross-entropy ○ http://colah.github.io/posts/2015-09-Visual-Information/