# TensorFlow in Practice

TensorFlow is a wonderful tool for rapidly implementing neural networks. In this presentation, we will learn the basics of TensorFlow and show how neural networks can be built with just a few lines of code. We will highlight some of the confusing bits of TensorFlow as a way of developing the intuition necessary to avoid common pitfalls when developing your own models. Additionally, we will discuss how to roll our own recurrent neural networks. While many tutorials focus on using built-in modules, this presentation will focus on writing neural networks from scratch, enabling us to build flexible models when TensorFlow's high-level components can't quite fit our needs.

About Nathan Lintz:
Nathan Lintz is a research scientist at indico Data Solutions, where he is responsible for developing machine learning systems in the domains of language detection, text summarization, and emotion recognition. Outside of work, Nathan is currently writing a book on TensorFlow as an extension to his tutorial repository https://github.com/nlintz/TensorFlow-Tutorials

Link to video https://www.youtube.com/watch?v=op1QJbC2g0E&feature=youtu.be

Published in: Data & Analytics

### TensorFlow in Practice

1. 1. TensorFlow In Practice Nathan Lintz nathan@indico.io
2. 2. Inputs Parameters and Operations Outputs
3. 3. Inputs Parameters and Operations Outputs Cost
4. 4. Batter Cake Doneness Doneness Temperature Mush Perfect Burnt
5. 5. Batter Cake Doneness Doneness Temperature Mush Perfect Burnt y = mx + b ?
6. 6. Inputs (x) (placeholders) Parameters and Operations (m, b) Outputs (y_predict) Cost y_target (doneness)
7. 7. Placeholders Parameters + Operations Cost Optimization Train TensorFlow in 5 Easy Pieces
8. 8. Inputs (placeholders) import tensorflow as tf temp = tf.placeholder(tf.float32, [10, 1]) cake_doneness = tf.placeholder(tf.float32, [10, 1])
9. 9. import tensorflow as tf temp = tf.placeholder(tf.float32, [10, 1]) cake_doneness = tf.placeholder(tf.float32, [10, 1]) temp_m = tf.get_variable('temp_m', [1, 1]) temp_b = tf.get_variable('temp_b', [1]) predicted_output = tf.nn.xw_plus_b(temp, temp_m, temp_b) Parameters and Operations (m, b) Outputs (y)
10. 10. import tensorflow as tf temp = tf.placeholder(tf.float32, [10, 1]) cake_doneness = tf.placeholder(tf.float32, [10, 1]) temp_m = tf.get_variable('temp_m', [1, 1]) temp_b = tf.get_variable('temp_b', [1]) predicted_output = tf.nn.xw_plus_b(temp, temp_m, temp_b) cost = tf.reduce_mean((cake_doneness - predicted_output)**2) Cost
11. 11. import tensorflow as tf temp = tf.placeholder(tf.float32, [10, 1]) cake_doneness = tf.placeholder(tf.float32, [10, 1]) temp_m = tf.get_variable('temp_m', [1, 1]) temp_b = tf.get_variable('temp_b', [1]) predicted_output = tf.nn.xw_plus_b(temp, temp_m, temp_b) cost = tf.reduce_mean((cake_doneness - predicted_output)**2) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) Optimizer
12. 12. import tensorflow as tf import numpy as np temp = tf.placeholder(tf.float32, [10, 1]) cake_doneness = tf.placeholder(tf.float32, [10, 1]) temp_m = tf.get_variable('temp_m', [1, 1]) temp_b = tf.get_variable('temp_b', [1]) predicted_output = tf.nn.xw_plus_b(temp, temp_m, temp_b) cost = tf.reduce_mean((cake_doneness - predicted_output)**2) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) sess = tf.Session() sess.run(tf.initialize_all_variables()) temp_train = np.linspace(0, 10, 10).reshape(-1, 1) doneness_train = temp_train * 5. + 1. + np.random.randn(10, 1) for _ in range(100): sess.run(optimizer, feed_dict={temp: temp_train, cake_doneness: doneness_train}) predicted_doneness = sess.run(predicted_output, feed_dict={temp: temp_train}) Train Code
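Stripped of the session machinery, the loop above is plain gradient descent on the mean squared error. Below is a minimal NumPy-only sketch of the same fit, a sanity check rather than the slide's actual code; the generating slope 5 and intercept 1 match the slide, while the learning rate and iteration count are illustrative choices:

```python
import numpy as np

np.random.seed(0)
temp = np.linspace(0, 10, 10).reshape(-1, 1)        # inputs (x)
doneness = temp * 5. + 1. + np.random.randn(10, 1)  # targets: y = 5x + 1 + noise

m, b = 0.0, 0.0
lr = 0.01
for _ in range(1000):
    err = m * temp + b - doneness
    # gradients of mean((m*x + b - y)**2) with respect to m and b
    m -= lr * 2 * np.mean(err * temp)
    b -= lr * 2 * np.mean(err)
# m and b now approximate the least-squares line through the noisy data
```

This is exactly what `GradientDescentOptimizer(...).minimize(cost)` automates: TensorFlow derives the same two gradients symbolically and applies the same update.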
13. 13. Placeholders Parameters + Operations Cost Optimization Train TensorFlow in 5 Easy Pieces
14. 14. Batter Cake Doneness Doneness Temperature Mush Perfect Burnt y = mx + b m = 4.99 b = 1.21
15. 15. Batter Cake Doneness Doneness Temperature Mush Perfect Burnt
16. 16. No Yes Doneness Is Done?
17. 17. Handwritten Digit (28 x 28 pixels) -> 784 pixels Predicted Digit Value
18. 18. Diagram: X (pixels) → softmax(mx + b), compared against Y_true (parameters m, b)
19. 19. Placeholders Parameters + Operations Cost Optimization Train TensorFlow in 5 Easy Pieces
20. 20. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) Placeholders
21. 21. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m = tf.get_variable('m', [784, 10]) b = tf.get_variable('b', [10]) Y_pred = tf.nn.xw_plus_b(X, m, b) Parameters and Operations
22. 22. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m = tf.get_variable('m', [784, 10]) b = tf.get_variable('b', [10]) Y_pred = tf.nn.xw_plus_b(X, m, b) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) Cost
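softmax_cross_entropy_with_logits squashes the raw scores through a softmax and then measures cross-entropy against the one-hot labels. Here is a small NumPy sketch of that computation, a hypothetical re-implementation for intuition rather than TensorFlow's actual kernel:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift by the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def softmax_cross_entropy(logits, labels):
    # mean over the batch of -sum(labels * log softmax(logits))
    return -np.mean(np.sum(labels * np.log(softmax(logits)), axis=-1))

logits = np.array([[2.0, 1.0, 0.1]])
labels = np.array([[1.0, 0.0, 0.0]])  # one-hot: the true class is class 0
loss = softmax_cross_entropy(logits, labels)  # ≈ 0.417: mildly confident, mildly penalized
```

The loss is zero only when the softmax puts all its mass on the true class, which is what the optimizer pushes toward.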
23. 23. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m = tf.get_variable('m', [784, 10]) b = tf.get_variable('b', [10]) Y_pred = tf.nn.xw_plus_b(X, m, b) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(cost) Optimizer
24. 24. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m = tf.get_variable('m', [784, 10]) b = tf.get_variable('b', [10]) Y_pred = tf.nn.xw_plus_b(X, m, b) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(cost) sess = tf.Session() sess.run(tf.initialize_all_variables()) for i in range(2000): trX, trY = mnist.train.next_batch(128) sess.run(optimizer, feed_dict={X: trX, Y_true: trY}) Train Code
25. 25. 92% Accuracy!
26. 26. 7 2 1 0
27. 27. 6 6 2 2
28. 28. m b nonlinear( mx + b)
29. 29. Relu: Y = x if x > 0, else 0
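The ReLU above is just an elementwise max with zero; a one-line NumPy sketch makes the piecewise definition concrete:

```python
import numpy as np

def relu(x):
    # Y = x if x > 0 else 0, applied elementwise
    return np.maximum(x, 0)

out = relu(np.array([-2.0, -0.5, 0.0, 0.5, 2.0]))  # → [0., 0., 0., 0.5, 2.]
```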
30. 30. Diagram: X (pixels) → relu(m0x + b0) → h (hidden layer) → softmax(m1h + b1) (classifier layer), compared against Y_true (parameters m0, b0, m1, b1)
31. 31. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) Placeholders
32. 32. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 10]) b1 = tf.get_variable('b1', [10], initializer=tf.constant_initializer(0.)) h = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) Y_pred = tf.nn.xw_plus_b(h, m1, b1) Parameters and Operations
33. 33. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 10]) b1 = tf.get_variable('b1', [10], initializer=tf.constant_initializer(0.)) h = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) Y_pred = tf.nn.xw_plus_b(h, m1, b1) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) Cost
34. 34. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 10]) b1 = tf.get_variable('b1', [10], initializer=tf.constant_initializer(0.)) h = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) Y_pred = tf.nn.xw_plus_b(h, m1, b1) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(cost) Optimizer
35. 35. import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 10]) b1 = tf.get_variable('b1', [10], initializer=tf.constant_initializer(0.)) h = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) Y_pred = tf.nn.xw_plus_b(h, m1, b1) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(cost) sess = tf.Session() sess.run(tf.initialize_all_variables()) for i in range(2000): trX, trY = mnist.train.next_batch(128) sess.run(optimizer, feed_dict={X: trX, Y_true: trY}) Train Code
36. 36. 97% Test Accuracy! (97% train accuracy)
37. 37. Diagram: X (pixels) → relu(m0x + b0) → h1 (hidden layer 1) → relu(m1h1 + b1) → h2 (hidden layer 2) → softmax(m2h2 + b2) (classifier layer), compared against Y_true (parameters m0, b0, m1, b1, m2, b2)
38. 38. def model(X): m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 256]) b1 = tf.get_variable('b1', [256], initializer=tf.constant_initializer(0.)) m2 = tf.get_variable('m2', [256, 10]) b2 = tf.get_variable('b2', [10], initializer=tf.constant_initializer(0.)) h1 = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) h2 = tf.nn.relu(tf.nn.xw_plus_b(h1, m1, b1)) output = tf.nn.xw_plus_b(h2, m2, b2) return output Y_pred = model(X) Parameters and Operations (with 2 hidden layers)
39. 39. 97% Test Accuracy! (98% train accuracy)
40. 40. Overfitting: plot of train cost vs. test cost over iterations (the train cost keeps falling while the test cost eventually rises)
41. 41. Diagram: x → mx + b → y
42. 42. Diagram: x → dropout(mx + b) → y
43. 43. def model(X, p_keep): m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 256]) b1 = tf.get_variable('b1', [256], initializer=tf.constant_initializer(0.)) m2 = tf.get_variable('m2', [256, 10]) b2 = tf.get_variable('b2', [10], initializer=tf.constant_initializer(0.)) h1 = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) h1 = tf.nn.dropout(h1, p_keep) h2 = tf.nn.relu(tf.nn.xw_plus_b(h1, m1, b1)) h2 = tf.nn.dropout(h2, p_keep) output = tf.nn.xw_plus_b(h2, m2, b2) return output Y_pred = model(X, 0.8) Y_pred_test = model(X, 1.) Parameters and Operations (with 2 hidden layers and dropout)
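tf.nn.dropout keeps each activation with probability p_keep and scales the survivors by 1/p_keep, so the expected activation is unchanged between train (p_keep = 0.8) and test (p_keep = 1.0). Here is a hedged NumPy sketch of that behavior; the function name and rng are my own, not TensorFlow's:

```python
import numpy as np

def dropout(h, p_keep, rng):
    # keep each unit with probability p_keep; scale kept units by 1/p_keep
    # so the expected value of the output matches the input
    mask = rng.random(h.shape) < p_keep
    return h * mask / p_keep

rng = np.random.default_rng(0)
h = np.ones((1000, 256))
h_drop = dropout(h, 0.8, rng)
# roughly 80% of entries survive (scaled up to 1.25), the rest are zeroed,
# so the overall mean stays close to 1
```

The 1/p_keep rescaling is why no extra correction is needed at test time: passing p_keep = 1. simply turns the layer into the identity.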
44. 44. Diagram: X (pixels) → relu(m0x + b0) → Dropout(h1) → relu(m1h1 + b1) → Dropout(h2) → softmax(m2h2 + b2) (classifier layer), compared against Y_true (parameters m0, b0, m1, b1, m2, b2)
45. 45. 98% Test Accuracy! (98% train accuracy)
46. 46. TensorFlow Tips and Tricks
47. 47. Scaling Predictions Diagram: X (pixels) → softmax(mx + b), compared against Y_true X = tf.placeholder(tf.float32, [128, 784]) Y_true = tf.placeholder(tf.float32, [128, 10]) m = tf.get_variable('m', [784, 10]) b = tf.get_variable('b', [10]) Y_pred = tf.nn.xw_plus_b(X, m, b) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(Y_pred, Y_true)) VS. cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(tf.nn.softmax(Y_pred), Y_true))
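The point of this slide: softmax_cross_entropy_with_logits already applies a softmax internally, so feeding it tf.nn.softmax(Y_pred) squashes the scores twice. The NumPy sketch below (a simplified stand-in for the TensorFlow op) shows why that hurts: a confident, correct prediction should have near-zero loss, but after a double softmax the loss stays large, so gradients never shrink and training stalls:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def xent_with_logits(logits, labels):
    # what softmax_cross_entropy_with_logits computes: softmax, then cross-entropy
    return -np.mean(np.sum(labels * np.log(softmax(logits)), axis=-1))

logits = np.array([[10.0, 0.0, 0.0]])  # very confident in class 0
labels = np.array([[1.0, 0.0, 0.0]])

correct = xent_with_logits(logits, labels)          # raw logits: tiny loss
double = xent_with_logits(softmax(logits), labels)  # bug: softmax applied twice, large loss
```

After one softmax the values are confined to [0, 1], so a second softmax can never produce a confident distribution, no matter how sure the model is.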
48. 48. Parameter Sharing def model(X, p_keep): m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 256]) b1 = tf.get_variable('b1', [256], initializer=tf.constant_initializer(0.)) m2 = tf.get_variable('m2', [256, 10]) b2 = tf.get_variable('b2', [10], initializer=tf.constant_initializer(0.)) h1 = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) h1 = tf.nn.dropout(h1, p_keep) h2 = tf.nn.relu(tf.nn.xw_plus_b(h1, m1, b1)) h2 = tf.nn.dropout(h2, p_keep) output = tf.nn.xw_plus_b(h2, m2, b2) return output Y_pred = model(X, 0.8) Y_pred_test = model(X, 1.)
49. 49. Diagram: Y_pred = model(X, 0.8) and Y_pred = model(X, 1.) build two copies of the network, the second with its own parameters m0_test, b0_test, …, m2_test, b2_test
50. 50. Diagram: without sharing, the test copy (m0_test … b2_test) never receives the values learned by the train copy (m0 … b2)
51. 51. Diagram: with sharing, both model(X, 0.8) and model(X, 1.) use the same parameters m0, b0, …, m2, b2
52. 52. Parameter Sharing (correct) def model(X, p_keep): m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 256]) b1 = tf.get_variable('b1', [256], initializer=tf.constant_initializer(0.)) m2 = tf.get_variable('m2', [256, 10]) b2 = tf.get_variable('b2', [10], initializer=tf.constant_initializer(0.)) h1 = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) h1 = tf.nn.dropout(h1, p_keep) h2 = tf.nn.relu(tf.nn.xw_plus_b(h1, m1, b1)) h2 = tf.nn.dropout(h2, p_keep) output = tf.nn.xw_plus_b(h2, m2, b2) return output with tf.variable_scope('model') as scope: Y_pred = model(X, 0.8) scope.reuse_variables() Y_pred_test = model(X, 1.)
53. 53. Collections def model(X): m0 = tf.get_variable('m0', [784, 256]) b0 = tf.get_variable('b0', [256], initializer=tf.constant_initializer(0.)) m1 = tf.get_variable('m1', [256, 256]) b1 = tf.get_variable('b1', [256], initializer=tf.constant_initializer(0.)) m2 = tf.get_variable('m2', [256, 10]) b2 = tf.get_variable('b2', [10], initializer=tf.constant_initializer(0.)) h1 = tf.nn.relu(tf.nn.xw_plus_b(X, m0, b0)) h2 = tf.nn.relu(tf.nn.xw_plus_b(h1, m1, b1)) tf.add_to_collection('activations', h1) tf.add_to_collection('activations', h2) output = tf.nn.xw_plus_b(h2, m2, b2) return output Y_pred = model(X)
54. 54. Collections activations = tf.get_collection('activations') activations_values = session.run(activations) parameters = tf.get_collection('trainable_variables') parameter_values = session.run(parameters)
55. 55. X = tf.placeholder(tf.float32, [128, 784]) Placeholders
56. 56. X = tf.placeholder(tf.float32, [None, 784]) Placeholders
57. 57. Placeholders X = tf.placeholder(tf.float32, [None, 784]) model = … cost = … optimizer = … for i in range(1000): trX, trY = mnist.train.next_batch(128) sess.run(optimizer, feed_dict={X: trX, Y_true: trY})
58. 58. Placeholders X = tf.placeholder(tf.float32, [None, 784]) model = … cost = … optimizer = … for i in range(1000): trX, trY = mnist.train.next_batch(128) sess.run(optimizer, feed_dict={X: trX, Y_true: trY})
59. 59. Placeholders X = tf.placeholder(tf.float32, [None, 784]) model = … cost = … optimizer = … for i in range(1000): trX, trY = mnist.train.next_batch(512) sess.run(optimizer, feed_dict={X: trX, Y_true: trY})
60. 60. Advanced TensorFlow: Building RNNs Note: most of the code in the generation sections is pseudo-code, meant mainly to illustrate the point. If you wish to see the actual code, feel free to email me and I'll send you a copy.
61. 61. RNNs: "The food at the restaurant was very good"
62. 62. RNNs: [The, food, at, the, restaurant, was, very, good]
63. 63. RNNs: [The, food, at, the, restaurant, was, very, good], one word per time step: t = 0, t = 1, …, t = 7
64. 64. RNNs: y_t = tanh(m_x x_t + m_h h_{t-1} + b); the cell combines the input x_t with the previous state h_{t-1} to produce the next state h_t
65. 65. RNNs X = tf.placeholder(tf.float32, [28, 128, 28]) X_split = [tf.squeeze(x) for x in tf.split(0, 28, X)] rnn = tf.nn.rnn_cell.BasicRNNCell(256, 28) outputs, states = tf.nn.rnn(rnn, X_split, dtype=tf.float32)
66. 66. Scan elems = [1, 2, 3, 4, 5, 6] def step(a, x): return a + x sum = scan(step, elems) >>> sum = [1, 3, 6, 10, 15, 21]
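scan is the key primitive here: it folds a step function over a sequence while keeping every intermediate accumulator. The slide's running-sum example in plain Python (this scan is a sketch of the idea, not tf.scan's exact signature):

```python
def scan(step, elems, initializer=None):
    # apply step(accumulator, element) across elems,
    # recording the accumulator after every element
    outputs = []
    acc = initializer
    for x in elems:
        acc = x if acc is None else step(acc, x)
        outputs.append(acc)
    return outputs

def step(a, x):
    return a + x

running_sum = scan(step, [1, 2, 3, 4, 5, 6])  # → [1, 3, 6, 10, 15, 21]
```

An RNN is the same pattern with the hidden state as the accumulator and the input at each time step as the element.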
67. 67. RNNs with Scan X = tf.placeholder(tf.float32, [28, 128, 28]) m_x = tf.get_variable('m_x', [28, 256]) m_h = tf.get_variable('m_h', [256, 256]) b_x = tf.get_variable('b_x', [256]) b_h = tf.get_variable('b_h', [256]) def step(h_tm1, x): return tf.tanh(tf.nn.xw_plus_b(x, m_x, b_x) + tf.nn.xw_plus_b(h_tm1, m_h, b_h)) states = tf.scan(step, X, initializer=tf.zeros([128, 256])) y_t = tanh(m_x x_t + m_h h_{t-1} + b)
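The same recurrence is easy to check in NumPy: carry the hidden state forward one time step at a time and stack the results, which is exactly what tf.scan does over the leading (time) axis. The shapes here are small illustrative choices, not the slide's 28/128/256:

```python
import numpy as np

rng = np.random.default_rng(0)
T, B, D, H = 5, 2, 3, 4  # time steps, batch size, input dim, hidden dim
X = rng.standard_normal((T, B, D))
m_x = rng.standard_normal((D, H)) * 0.1
m_h = rng.standard_normal((H, H)) * 0.1
b = np.zeros(H)

def step(h_tm1, x_t):
    # y_t = tanh(m_x x_t + m_h h_{t-1} + b)
    return np.tanh(x_t @ m_x + h_tm1 @ m_h + b)

h = np.zeros((B, H))
states = []
for t in range(T):
    h = step(h, X[t])
    states.append(h)
states = np.stack(states)  # shape (T, B, H), like tf.scan's stacked output
```

Writing the loop by hand like this is what gives the flexibility the talk advertises: the step function can be anything, not just the built-in cells.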
68. 68. MNIST Generation
69. 69. MNIST Generation t = 28 t = 0 t = 28 t = 0
70. 70. X = tf.placeholder(tf.float32, [27, 128, 28]) # first 27 rows of image Y = tf.placeholder(tf.float32, [27, 128, 28]) # last 27 rows of image m_output = tf.get_variable('m_output', [256, 28]) b_output = tf.get_variable('b_output', [28]) states = rnn(X) output_img = tf.map_fn(lambda x: tf.nn.xw_plus_b(x, m_output, b_output), tf.pack(states)) cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output_img, Y)) Language Model
71. 71. def generate(num_steps): states = [tf.zeros([batch_size, hidden_dim])] outputs = [] for _ in range(num_steps): next_output = tf.sigmoid(tf.nn.xw_plus_b(states[-1], m_output, b_output)) outputs.append(next_output) state = gru.step_(states[-1], outputs[-1]) states.append(state) return tf.pack(outputs) Language Model (Generate)
72. 72. Language Model (Generations)
73. 73. Seq2Seq Diagram: an encoder RNN reads the input digit, its final state is taken as the initial state of a language-model RNN that produces the output digit
74. 74. X = tf.placeholder(tf.float32, [27, 128, 28]) # first 27 rows of image Y_in = tf.placeholder(tf.float32, [27, 128, 28]) # first 27 rows of target image Y_out = tf.placeholder(tf.float32, [27, 128, 28]) # last 27 rows of target image m_output = tf.get_variable('m_output', [256, 28]) b_output = tf.get_variable('b_output', [28]) with tf.variable_scope('encoder') as scope: encoded_states = rnn(X) final_state = tf.reverse(encoded_states, [True, False, False])[0, :, :] with tf.variable_scope('decoder') as scope: output_states = rnn(Y_in, initializer=final_state) output_img = tf.map_fn(lambda x: tf.nn.xw_plus_b(x, m_output, b_output), tf.pack(output_states)) cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output_img, Y_out)) Seq2Seq
75. 75. Seq2Seq (Generations)
76. 76. Q and A