- 1. B2B Travel Technology: Machine Learning Introduction, October 20th, 2016
- 3. Agenda: What's Machine Learning? / Usage examples / Complexity / Algorithm families / Let's go! / Troubleshoot / Tech insights / Next steps / Conclusion
- 5. What's Machine Learning? Software that does something without being explicitly programmed to, just by learning through examples. The same software can be used for various tasks. It learns from experience with respect to some task and some performance measure, and improves through experience.
- 6. Usage examples (1/2) Some typical usage examples
- 7. Use cases: MyLittleAdventure (2/2) MyLittleAdventure usage: language detection, clustering, anomaly detection, recommendation, choice of parameters.
- 8. Complexity [Slide shows several pages of TensorFlow internals as an illustration: convolution tests from tensorflow.ops.nn, the GradientDescentOptimizer source, and gradient tests for tensorflow.ops.linalg_grad.] A complex algorithm before… and machine learning now: train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
- 13. Recipe
  1. Collect training data (files, database, cache, data flow)
  2. Select a model and (hyper)parameters
  3. Train the algorithm
  4. Use or store your trained estimator
  5. Make predictions
  6. Measure (accuracy, precision), then iterate
- 14. Collect training data
  What about the data? Fruit identification example.
  Get qualitative data, get some samples. Don't collect data for months and then try: go fast and try things.
  Weight (g) | Width (cm) | Height (cm) | Label
  192 | 8.4 | 7.3 | Granny Smith apple
  86 | 6.2 | 4.7 | Mandarin
  178 | 7.1 | 7.8 | Braeburn apple
  162 | 7.4 | 7.2 | Cripps Pink apple
  118 | 6.1 | 8.1 | Unidentified lemon
  144 | 6.8 | 7.4 | Turkey orange
  362 | 9.6 | 9.2 | Spanish jumbo orange
  … | … | … | …
- 15. Prepare your data
  Numerize your features and labels, and put them on the same scale (normalization)?
  Weight (g) | Width (cm) | Height (cm) | Label
  192 | 8.4 | 7.3 | 1
  86 | 6.2 | 4.7 | 2
  178 | 7.1 | 7.8 | 3
  162 | 7.4 | 7.2 | 5
  118 | 6.1 | 8.1 | 10
  144 | 6.8 | 7.4 | 8
  362 | 9.6 | 9.2 | 9
  … | … | … | …
  We also need some tests: split into a training set for the learning phase (60%-80%) and a test set for the analytics phase (20%-40%).
- 16. Prepare your data (code)
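The code from this slide is not in the transcript; here is a minimal sketch of the preparation step, assuming Python and scikit-learn (values come from the fruit tables above; variable names are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Numerized features: weight (g), width (cm), height (cm)
X = np.array([[192, 8.4, 7.3],
              [ 86, 6.2, 4.7],
              [178, 7.1, 7.8],
              [162, 7.4, 7.2]])
y = np.array([1, 2, 3, 5])  # numeric labels, e.g. 1 = Granny Smith apple

# Keep 20%-40% aside for the analytics (test) phase
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Optional: put every feature on the same scale (normalization)
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
```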
- 17. Train algorithm We need to choose an estimator: choose a classifier, then fit it (here, a decision tree) on the numerized training table above.
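A minimal sketch of this step, again assuming scikit-learn (the deck names a decision tree; the hyperparameter is illustrative):

```python
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(max_depth=4)  # model + hyperparameter chosen up front
clf.fit(X_train, y_train)                  # learning phase on the training set
```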
- 18. Make predictions
  What do our predictions look like?
  Test set:
  Weight (g) | Width (cm) | Height (cm) | Label
  192 | 8.4 | 7.3 | 1
  86 | 6.2 | 4.7 | 2
  178 | 7.1 | 7.8 | 3
  Predictions:
  Weight (g) | Width (cm) | Height (cm) | Label
  192 | 8.4 | 7.3 | 1
  86 | 6.2 | 4.7 | 2
  178 | 7.1 | 7.8 | 1
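In code, the prediction step is just the fitted estimator applied to the held-out rows; the third row above (actual label 3 predicted as 1) is the kind of error the next step measures:

```python
y_pred = clf.predict(X_test)
print(y_pred)  # e.g. [1 2 1] against actual labels [1 2 3]
```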
- 19. Measure (1/2) Evaluate on data that has never been seen by your model. Accuracy = correct predictions / total predictions: gives a simple confidence score of our performance level.
- 20. Measure (2/2)
  Try to visualize and analyze your data, and know what you want.
  Confusion matrix:
  | Actual true | Actual false
  Predicted true | True positive | False positive
  Predicted false | False negative | True negative
  For skewed classes:
  Precision = true positives / predicted positives
  Recall = true positives / actual positives
  F1 score (trade-off) = 2 * (precision * recall) / (precision + recall)
- 21. Measure and prediction (code)
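This slide's code is also missing from the transcript; a minimal sketch of the metrics from the two previous slides, assuming scikit-learn:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

print(accuracy_score(y_test, y_pred))    # correct predictions / total predictions
print(confusion_matrix(y_test, y_pred))  # true/false positives and negatives
# For skewed classes, prefer precision, recall and their F1 trade-off;
# average='macro' handles multi-class labels:
print(precision_score(y_test, y_pred, average='macro'))
print(recall_score(y_test, y_pred, average='macro'))
print(f1_score(y_test, y_pred, average='macro'))
```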
- 24. Troubleshoot (2/4)
  What are the different options?
  Underfitting: add / create more features; use a more sophisticated model; use fewer samples; decrease regularization.
  Overfitting: use fewer features; use a simpler model; use more samples; increase regularization.
- 25. Troubleshoot (3/4) Using the learning curves… [Slide shows two learning-curve plots, one labeled Underfitting and one labeled Overfitting.]
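A minimal sketch of producing such curves, assuming scikit-learn and matplotlib and a dataset X, y larger than the toy table above: a persistent gap between the two curves suggests overfitting, while two low, converged curves suggest underfitting.

```python
import matplotlib.pyplot as plt
from sklearn.model_selection import learning_curve

# Scores on growing training subsets, cross-validated 5 ways
sizes, train_scores, valid_scores = learning_curve(clf, X, y, cv=5)
plt.plot(sizes, train_scores.mean(axis=1), label="training score")
plt.plot(sizes, valid_scores.mean(axis=1), label="validation score")
plt.xlabel("training set size")
plt.ylabel("score")
plt.legend()
plt.show()
```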
- 26. Troubleshoot: Model choice (4/4)
- 28. Platforms: easy peasy Very high-level solutions: you don't even have to code to build something (*wink wink*, business developers). Built-in models, data munging, model management by UI, PaaS.
- 29. Languages What language for what purpose? Matlab and Octave for understanding and prototyping implementations. Go, Python and Java as Most Valuable Languages: comfortable for prototyping, yet powerful for industrialisation. C++ for bigger companies and projects, and fine-tuned software.
- 30. Libraries You will have great power using a library: built-in models, data munging, fine-tuning, full integration into your product (e.g. GoLearn).
- 32. Next steps
  Try it! Find your best tools and have some fun.
  Split your data in 3: training / cross-validation / test set (see the sketch below).
  Know the top algorithms.
  Explore advanced techniques and optimizers (online learning, stacking).
  Deep and reinforcement learning; partial and semi-supervised learning; transfer learning.
  How to store and analyse big data? How do we scale?
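A minimal sketch of that 3-way split, assuming scikit-learn: two chained train_test_split calls giving roughly 60% / 20% / 20%.

```python
from sklearn.model_selection import train_test_split

X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5)
```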
- 33. Conclusion Try it and let's get in touch! Machine learning is not just a buzzword. The difficulties are not always where we expect! Machine learning is more about experimentation and testing than just algorithms. There is no single perfect solution, and there are plenty of easy-to-use solutions for beginners.
- 34. One more thing!
- 35. TensorFlow
- 37. Thank you! Machine Learning Introduction, October 20th, 2016. Questions?