Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016

DL4J and DataVec for Enterprise Deep Learning Workflows: NLP, sensor processing (IoT), image processing, and audio processing have all emerged as prime deep learning applications. In this session we will take a practical look at building secure deep learning workflows in the enterprise. We’ll see how DL4J’s DataVec tool lets you build scalable ETL and vectorization pipelines that run on a single machine or scale out to Spark on Hadoop. We’ll also see how deep networks such as recurrent neural networks can use DataVec to process data for modeling more quickly.

Published in: Technology

  1. 1. skymind.io | deeplearning.org | gitter.im/deeplearning4j DL4J and DataVec Building Production Class Deep Learning Workflows for the Enterprise Josh Patterson / Director Field Org MLConf 2016 / Atlanta, GA
  2. 2. Josh Patterson • Director Field Engineering / Skymind • Co-Author: O’Reilly’s “Deep Learning: A Practitioner’s Approach” • Past: Self-Organizing Mesh Networks / Meta-Heuristics Research; Smartgrid work / TVA + NERC; Principal Field Architect / Cloudera
  3. 3. Topics • Deep Learning in Production for the Enterprise • DL4J and DataVec • Example Workflow: Modeling Sensor Data with RNNs
  4. 4. Deep Learning in Production
  5. 5. Defining Deep Learning • Higher neuron counts than in previous generation neural networks • Different and evolved ways to connect layers inside neural networks • More computing power to train • Automated Feature Learning • “machines that learn to represent the world”
  6. 6. Quick Usage Guide • If I have Timeseries or Audio Input: Use a Recurrent Neural Network • If I have Image input: Use a Convolutional Neural Network • If I have Video input: Use a hybrid Convolutional + Recurrent Architecture!
  7. 7. The Challenge of the Fortune 500 Take business problem and translate it into a product-izable solution • Get data together • Understand modeling, pull together expertise Get the right data workflow / infra architecture to production-ize application • Security • Integration
  8. 8. “Google is living a few years in the future and sending the rest of us messages” -- Doug Cutting in 2013. However, most organizations are not built like Google (and Jeff Dean does not work at your company…). Anyone building next-gen infrastructure has to consider these things.
  9. 9. Production Considerations • Security – even though I can build a model, will IT let me run it? • Data Warehouse Integration – can I easily run this in the existing IT footprint? • Speedup – once I need to go faster, how hard is it to speed up modeling?
  10. 10. DL4J and DataVec
  11. 11. DL4J and DataVec • DL4J – Apache License 2.0 JVM platform for enterprise deep learning • DataVec – a tool for machine learning ETL (Extract, Transform, Load) operations • Both run natively on Spark, with CPU or GPU backends • DL4J suite certified on CDH5, HDP2.4, and the upcoming IBM IOP platform
  12. 12. ND4J: The Need for Speed. JavaCPP: • Auto-generates JNI bindings for C++ • Allows for easy maintenance and deployment of C++ binaries in Java. CPU Backends: • OpenMP (multithreading within native operations) • OpenBLAS or MKL (BLAS operations) • SIMD extensions. GPU Backends: • DL4J supports CUDA 7.5 (+cuBLAS) at the moment, and will support 8.0 as soon as it comes out • Leverages cuDNN as well. https://github.com/deeplearning4j/dl4j-benchmark
  13. 13. Prepping Data is Time Consuming http://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#633ea7f67f75
  14. 14. Preparing Data for Modeling is Hard
  15. 15. DL4J Workflow Toolchain • ETL (DataVec) • Vectorization (DataVec) • Modeling (DL4J) • Evaluation (Arbiter) • Execution Platforms: Spark/Hadoop, Single Machine • ND4J – Linear Algebra Runtime: CPU, GPU
  16. 16. Modeling Sensor Data with RNNs and DL4J
  17. 17. NERC Sensor Data Collection openPDC PMU Data Collection circa 2009 • 120 Sensors • 30 samples/second • 4.3B Samples/day • Housed in Hadoop
  18. 18. Classifying UCI Sensor Data: Trends • A – Downward Trend • B – Cyclic • C – Normal • D – Upward Shift • E – Upward Trend • F – Downward Shift
  19. 19. Loading and Transforming Timeseries Data with DataVec
          SequenceRecordReader trainFeatures = new CSVSequenceRecordReader();
          trainFeatures.initialize(new NumberedFileInputSplit(
              featuresDirTrain.getAbsolutePath() + "/%d.csv", 0, 449));
          SequenceRecordReader trainLabels = new CSVSequenceRecordReader();
          trainLabels.initialize(new NumberedFileInputSplit(
              labelsDirTrain.getAbsolutePath() + "/%d.csv", 0, 449));

          int minibatch = 10;
          int numLabelClasses = 6;
          DataSetIterator trainData = new SequenceRecordReaderDataSetIterator(
              trainFeatures, trainLabels, minibatch, numLabelClasses, false,
              SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

          //Normalize the training data
          DataNormalization normalizer = new NormalizerStandardize();
          normalizer.fit(trainData);           //Collect training data statistics
          trainData.reset();
          trainData.setPreProcessor(normalizer); //Use previously collected statistics to normalize on-the-fly
  20. 20. Configuring a Recurrent Neural Network with DL4J
          MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
              .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).iterations(1)
              .updater(Updater.NESTEROVS).momentum(0.9).learningRate(0.005)
              .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
              .gradientNormalizationThreshold(0.5)
              .list()
              .layer(0, new GravesLSTM.Builder().activation("tanh").nIn(1).nOut(10).build())
              .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                  .activation("softmax").nIn(10).nOut(numLabelClasses).build())
              .pretrain(false).backprop(true).build();

          MultiLayerNetwork net = new MultiLayerNetwork(conf);
          net.init();
  21. 21. Train the Network on Local Machine
          int nEpochs = 40;
          String str = "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f";
          for (int i = 0; i < nEpochs; i++) {
              net.fit(trainData);

              //Evaluate on the test set:
              Evaluation evaluation = net.evaluate(testData);
              System.out.println(String.format(str, i, evaluation.accuracy(), evaluation.f1()));

              testData.reset();
              trainData.reset();
          }
  22. 22. Train the Network on Spark
          TrainingMaster tm = new ParameterAveragingTrainingMaster(
              true, executors_count, 1, batchSizePerWorker, 1, 0);

          //Create Spark multi layer network from configuration
          SparkDl4jMultiLayer sparkNetwork = new SparkDl4jMultiLayer(sc, net, tm);

          int nEpochs = 40;
          String str = "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f";
          for (int i = 0; i < nEpochs; i++) {
              sparkNetwork.fit(trainDataRDD);

              //Evaluate on the test set:
              Evaluation evaluation = net.evaluate(testData);
              System.out.println(String.format(str, i, evaluation.accuracy(), evaluation.f1()));

              testData.reset();
              trainData.reset();
          }
  23. 23. Thank you! Please visit skymind.io/learn for more information OR Visit us at booth P33
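
Slide 19 fits a NormalizerStandardize on the training data and then applies the collected statistics on the fly. The fit-then-transform idea can be sketched in plain Java, assuming "standardize" means subtracting the mean and dividing by the standard deviation. The class StandardizeSketch and its methods are hypothetical names for illustration only, not DataVec or DL4J API, and the use of the population standard deviation is an assumption the slides do not confirm.

```java
import java.util.Arrays;

// Hypothetical illustration class; NOT part of DL4J or DataVec.
final class StandardizeSketch {
    double mean;
    double std;

    // Collect statistics from the training values only.
    // Assumption: population standard deviation (divide by n).
    void fit(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        mean = sum / values.length;

        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        std = Math.sqrt(squaredDeviations / values.length);
    }

    // Apply the previously collected statistics to any data (train or test).
    double[] transform(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (values[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        StandardizeSketch normalizer = new StandardizeSketch();
        normalizer.fit(new double[] {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0});
        double[] z = normalizer.transform(new double[] {2.0, 5.0, 9.0});
        System.out.println(Arrays.toString(z)); // prints [-1.5, 0.0, 2.0]
    }
}
```

The point of separating fit from transform is that the test set is scaled with statistics learned from the training set, so no information from the test data leaks into preprocessing. A constant input column would give a standard deviation of zero, which this sketch does not guard against.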

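Slide 20 configures ClipElementWiseAbsoluteValue with a threshold of 0.5. A rough plain-Java sketch of that kind of clipping follows, assuming each gradient element is limited independently to the range [-threshold, threshold]. The class ClipSketch and the method clipElementWise are hypothetical names for illustration, not DL4J API.

```java
import java.util.Arrays;

// Hypothetical illustration class; NOT part of DL4J.
final class ClipSketch {
    // Limit every element to [-threshold, threshold], leaving smaller values untouched.
    static double[] clipElementWise(double[] gradient, double threshold) {
        double[] out = new double[gradient.length];
        for (int i = 0; i < gradient.length; i++) {
            out[i] = Math.max(-threshold, Math.min(threshold, gradient[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        double[] gradient = {-2.0, -0.3, 0.0, 0.4, 1.7};
        double[] clipped = clipElementWise(gradient, 0.5);
        System.out.println(Arrays.toString(clipped)); // prints [-0.5, -0.3, 0.0, 0.4, 0.5]
    }
}
```

Clipping of this kind is commonly used with recurrent networks because gradients propagated through many time steps can become very large; capping each element keeps a single update from destabilizing training.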
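
Slide 22 trains on Spark through a ParameterAveragingTrainingMaster. Going by the name, the general idea is that each worker trains its own copy of the network on a slice of the data and the copies are then combined by averaging their parameters. The plain-Java sketch below shows only that averaging step under this assumption; the class ParameterAveragingSketch and the method average are hypothetical names, not the DL4J Spark API, and the real implementation handles scheduling, updater state, and averaging frequency, none of which is modeled here.

```java
import java.util.Arrays;

// Hypothetical illustration class; NOT part of DL4J's Spark module.
final class ParameterAveragingSketch {
    // Average flattened parameter vectors, one row per worker.
    static double[] average(double[][] workerParams) {
        int numParams = workerParams[0].length;
        double[] averaged = new double[numParams];
        for (double[] params : workerParams) {
            if (params.length != numParams) {
                throw new IllegalArgumentException("All workers must share one parameter layout");
            }
            for (int i = 0; i < numParams; i++) {
                averaged[i] += params[i];
            }
        }
        for (int i = 0; i < numParams; i++) {
            averaged[i] /= workerParams.length;
        }
        return averaged;
    }

    public static void main(String[] args) {
        // Three workers, each holding a two-parameter model after local training.
        double[][] workerParams = {
            {1.0, 2.0},
            {3.0, 4.0},
            {5.0, 6.0}
        };
        System.out.println(Arrays.toString(average(workerParams))); // prints [3.0, 4.0]
    }
}
```

In a scheme like this the averaged vector would be sent back to every worker before the next round, which is why all workers must use the same network configuration.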