AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017

Artificial Intelligence has entered a renaissance thanks to rapid progress in domains as diverse as self-driving cars, intelligent assistants, and game play. Underlying this progress is Deep Learning – driven by significant improvements in Graphics Processing Units and computational models inspired by the human brain that excel at capturing structures hidden in massive complex datasets. These techniques have been pioneered at research universities and digital giants, but mainstream enterprises are starting to apply them as open source tools and improved hardware become available. Learn how AI is impacting analytics today and in the future.
Learn how AI is affecting the enterprise, including applications like fraud detection, mobile personalization, failure prediction for IoT, and text analysis to improve call center interactions. We look at practical examples of assessing the opportunity for AI, phased adoption, and lessons learned in going from research to prototype to scaled production deployment.

Published in: Data & Analytics

  1. 1. AI in the Enterprise: Past, Present & Future Paul Huibers, Think Big Analytics
  2. 2. 2 The Resurgence of AI “By 2020, AI will be a top five investment priority for more than 30% of CIOs.” (Gartner BI Summit, February 2017) “By 2019, deep learning will provide best-in-class performance for demand, fraud, and failure prediction.” (Gartner)
  3. 3. 3 AI First
  4. 4. 4 • Introduction to AI/DL • AI in Industry • Case Study: Financial Fraud • Pilot to Production Agenda
  5. 5. 5 What is AI? Artificial Intelligence is usually defined as the science of making computers do things that require intelligence when done by humans.
  6. 6. 6 AI: A Brief History… …and now, Deep Learning! • 1940’s – early concepts developed • 1980’s – more concepts – copy the brain, neurons, perceptron – backpropagation for training • 1990’s – LeCun handwriting reader • AI winter • 2009 – Netflix Prize $1M • 2010 – first ImageNet competition • 2012 - AI/deep learning comes of age • ImageNet classification error: – 2011: 25% using traditional methods – 2012: 16% achieved by a ConvNet – 2013: 11% – 2014: 6.7% – 2015: 3.6% – 2016: < 3% • 3% ~ human error rate (expert group) • 0.3% mislabeling • (1000 categories of images) • What changed since the 1990s? – 10,000X computing power, GPUs – massive labeled datasets
  7. 7. 7 Deep Learning Innovation in Computer Vision Recent ImageNet Results
  8. 8. 8 ImageNet: 1.2 million images, 1000 categories, lots of animals… “…a jungle of viewpoints, lighting conditions, and variations of all imaginable types.” – Karpathy
  9. 9. 9 What is Deep Learning? • A machine learning method that involves learning data representations rather than task-specific algorithms • Deep Neural Networks – an artificial neural network with multiple hidden layers of “neurons” between the input and the output • Artificial Neural Networks – computing systems inspired by biological neural networks, involving a collection of connected units, with learned weights and activation functions between the units How is it achieved?
  10. 10. 10 Deep Neural Networks How are they different? • Multiple hidden layers in neural network with intermediate data representations to facilitate dimensional reduction • Interpret non-linear relationships in the data through activation functions • Derive patterns from data with very high dimensionality Why do we care? • Ability to create value with little or no domain knowledge required • Ability to incorporate data from across multiple, seemingly unrelated sources • Ability to tolerate very noisy data
  11. 11. 11 Data Quantity Drives Deep Learning Performance (Andrew Ng) [Chart: model performance vs. amount of labeled data, with curves for traditional ML and for small, medium, and large neural networks; the small-training-set region is labeled "1990’s"]
  12. 12. 12 Deep Learning Architectures Convolutional Neural Network (ConvNet or CNN) • CNN = Convolution + Pooling + ReLU + Fully Connected • Convolution layers are composable, so they can be chained • Primary use: any problem that has a high-dimensional input (e.g., image labeling)
  13. 13. 13 AI Framework Landscape • Specialized APIs (vision, language, speech): pretrained (fast); public; Google/Microsoft/Amazon • General Purpose Frameworks (such as Keras): need to be trained (expensive); private; fully customizable
  14. 14. 14 Touched by AI… • Cognitive successes • Siri, Alexa, OK Google! – Understanding words – Understanding context – Language translation • Face detection in images • Recommender systems • How about some practical examples from industry?
  15. 15. 15 • Introduction to AI/DL • AI in Industry • Case Study: Financial Fraud • Pilot to Production Agenda
  16. 16. 16 Proven applications of Deep Learning • ANOMALY DETECTION: Enables real-time detection of abnormal patterns of data, usually time-series events • PREDICTIVE MAINTENANCE: Improves preventative measures & performance with greater accuracy at the asset & component level • RECOMMENDER SYSTEMS: Enable more effective search rankings based on context, in accordance with a particular objective such as purchase or click-through • SPEECH RECOGNITION: Enables capture of voice to text with higher fidelity of speech transcription and improved precision of speaker identification • COMPUTER VISION: Enables dramatically more accurate visual recognition tasks that include image classification, detection and localization • DOCUMENT AUTOMATION: Enables automation of manual, paper-based processes that are human-intensive with higher speed, accuracy and fidelity
  17. 17. 17 Industry Specific Use Cases High-Dimensional Data Image Video Audio Time Series Text • Many already have working solutions using non-DL Machine Learning Techniques • Deep Learning is delivering improvement in performance on complex problems Automotive Retail • Navigation, Guidance, Assistance • Predictive Maintenance • Visual Search • Recommendation • Text Analytics • Assistants • Brand Analytics Manufacturing & High-Tech Health Care • Image/Audio/Video • Reinforcement Learning – Systems Optimization • Plant Operations Optimization • Image-based Analysis • Drug Discovery Financial Services & Insurance Cross-Industry • Anti-Fraud • Portfolio Optimization • Damage Assessment • Cyber Security • Call Center Audio
  18. 18. 18 Large European Logistics Provider • Increasing use of plastic bags in shipping • Challenges with existing package sorting and identification system • Use Deep Learning Image Analytics to improve identification and sorting • Tools: TensorFlow, Hadoop • Techniques: – Deep Learning: Convolutional Neural Network
  19. 19. 19 Large Auto Parts Manufacturer • Road objects, traffic, and accident events are reported manually or not at all • Automated object detection and scene labeling system from car camera feed to improve navigation and traffic • Tools: Darknet, Caffe, TensorFlow • Techniques: – Object Detection: Single Shot MultiBox Detector (SSD), You Only Look Once (YOLO) – Scene Labeling: Convolutional Neural Network [Architecture diagram: real-time streaming use case; Darknet/Darkflow for object detection and TensorFlow for scene labeling; cloud GPU-based training; TF Serving for cloud GPU-based inference, with model updates; streaming results feed a traffic data service and navigation updates]
  20. 20. 20 Large US Multinational Bank • Handwritten check volume is decreasing, but processing checks still carries many fixed costs • Handwriting recognition reduces manual processing and fraud examination, resulting in cost savings • Tools: Spark, Hadoop, TensorFlow • Techniques: – Convolutional Neural Network – Image Processing [Pipeline: check images to Hadoop → ImageMagick processing → handwriting recognition → fraud detection]
  21. 21. 21 Large Container Shipping Company • Predict failure of pistons on large container ships to reduce unplanned and costly maintenance • Utilized sensor data to predict piston wear between 70-80% • Failures are extremely infrequent so there is a risk of overfitting • Tools: R, Hadoop, Spark, AWS • Techniques: – ROC curve – Internet of Things data – Methods to prevent overfitting [Pipeline: container ship sensors → predict failures] [Chart: abnormal cylinder behaviour (PROB1, 0.0 to 0.8, low to high) per cylinder over one month (December 2015), with port stays, lead time, and the piston ring change marked. Each point is a combination of selected sensor data for a specific cylinder. Everything above the threshold triggers an alarm; the other cylinders stay below the threshold. The worn piston ring was changed.]
  22. 22. 22 Large European Railway • Detecting rail switch failures • Allows for switches to be fixed ahead of time thus not delaying trains • Tools: R (Shiny and Studio), Hadoop, Spark • Techniques: – Survival Analysis – Machine Learning – Internet of Things data Railway Switch Sensor Data Visualize Failures and Act
  23. 23. 23 • Fraud detection across products • Trends – Mobile payments exploding – Fraud evolving rapidly, increased sophistication • Significant improvements over traditional rules-based techniques • Tools: Spark, Hadoop, TensorFlow • Techniques: – Boosted Decision Trees – Convolutional Neural Network Large European Bank
  24. 24. 24 State of AI in Industry Successes • Computer vision (e.g., ImageNet) • Speech & NLP • Simplification of general-purpose ML (e.g., recommendation) • Rapid advance of state of art, growth of expertise & applications • Major investment programs in industry Challenges • Research-driven, fundamentals change • Mostly empirical, little theory • Complexity in solution design • Limited access to talent • AI/DL still requires governed data, and Analytics Ops integration • Gaps in enterprise deployment beyond lock-in clouds
  25. 25. 25 • Introduction to AI/DL • AI in Industry • Case Study: Financial Fraud • Pilot to Production Agenda
  26. 26. 26 Fighting Financial Fraud with Artificial Intelligence at Danske Bank
  27. 27. 27 Data Driven Approach to Fight Fraud • Challenges for Fraud Detection: – Low detection rate: only ~40% of fraud cases are detected using rules – Many false positives: 99.5% of cases are not fraud related – High fraud loss: tens of millions € lost each month – Fast evolving fraud sophistication • Ambitions for Fraud Project: – Danske Bank advanced analytics blueprint – Data driven approach to real time scoring of transactions – Reduce false positives & enhance fraud detection rate © 2017 Teradata
  28. 28. 28 Modeling Challenges • Class imbalance (100,000:1 non-fraud vs. fraud) • Assigning fraud labels from historic data • Fraud is ambiguous • Not all features available in real-time (balance, etc.) • Most machine learning sees transactions atomically
  29. 29. 29 Deep Learning Opportunity • Current models can only catch ~70% of all fraud cases • Traditional ML models view transactions atomically • Often, missed fraud transactions are part of a series • Capturing correlation across many features
  30. 30. 30 Machine Learning Results (Live System: 60 transactions/sec.) Ensemble of boosted decision trees and logistic regression. From online validation of the model: ● 25-30% false positive reduction, with over 35% increase in detection rate ● Opportunity to expand model with additional features, retrain on recent data and add additional models to the ensemble. ● Models can be expanded to additional channels [Chart label: Rule Engine on validation set]
  31. 31. 31 Three Deep Learning Architectures to Deliver Value • ConvNet: – Designed for spatially correlated features, but by transforming transactions into a 2D image, we can learn temporally correlated features. – A deeper ConvNet allows learning more complex & general features. – Goal: Learn kernels from temporal & static features to gain insight into the characteristics of fraud. • LSTM: – Learns temporal information and classifies whether the sequence of transactions contains fraud. – Shares knowledge across learning time. – Goal: Learn transaction patterns within a window. Two solutions can be tested: flag fraud, or predict the next transaction and define an error. • Auto-Encoders (AE): – Learn how to generate normal transactions, potentially from large volumes of non-fraud data. – AEs provide a low-level representation of the data. – Goal: Build a model that learns how to generate non-fraud data. To detect fraud, define a reconstruction error rate for the fraud cases.
  32. 32. 32 How Can We Create an Image From Bank Transactions? [Diagram: each transaction at times t0, t1, t2, …, ts (spacing dt) is a row of raw features X_0, X_1, ... X_n. Input: raw features. Output: an image built from the top-k feature correlations, with correlated features added in a clockwise manner.] Image size is: [10 x 3, 50 x 3, 1]; time strides of 3
  33. 33. 33 2D Transaction Image Example [Figures: a non-fraud transaction image and a fraud transaction image; x-axis: features, y-axis: time]
  34. 34. 34 Network Architecture for CNNs [Diagrams: two CNN architectures, each classifying an input as Fraud or Normal; numeric labels on the first: 50, 30, 25, 15, 13, 8; on the second: 50, 30, then 25, 15 repeated]
  35. 35. 35 Inside the ResNet model [Figure: 64 filters; activations after the CNN residual blocks, shown for fraud and non-fraud examples]
  36. 36. 36 Dramatically Improve Fraud Detection over Traditional Rule Engine [Chart: true positive rate vs. false positive rate for the rule engine, classic machine learning, and deep learning] • Deep learning vs. rule engine at 9% FP: 40% gain • Deep learning vs. classic machine learning at 1% FP: 44% gain
  37. 37. 37 Lessons Learned: Take-Aways From Danske Bank • Deep learning adoption from pictures to financial transactions • Enhancement of data quality & cluster capabilities with data ingestion • Building Analytics Ops capabilities to support business units • Leveraging experience from Fraud advanced analytics to deliver extra use cases
  38. 38. 38 • Introduction to AI/DL • AI in Industry • Case Study: Financial Fraud • Pilot to Production Agenda
  39. 39. 39 Operationalization is Hard “We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment.” - Netflix, 2012
  40. 40. 40 Focus First on Pilot into Production (Sets up Phase Two: Scale COE, Standardize Capabilities) • Assessment: – Activities: Define business opportunity, understand data available, test model approaches, potentially generate data – Outcome: Proposed solution approach • Discovery/Insights: – Activities: Architecture selection, software engineering of model and simulation – Outcome: Predicted impact of model • Live Test: – Activities: Integration into live business process (Champion/Challenger), analysis, iteration – Outcome: Benefit measurement, live learnings, improvement • Production: – Activities: Go Live, Analytics Ops integration, Hand Over – Outcome: System scaled, application teams and ops trained and operating [Diagram: cross-functional teams work through the phases; step labels: Investigate, Test, Engineer, Simulate, Integration, Analyze Data, Go Live, Handover, Validate]
  41. 41. 41 Thank You StampedeCon! For more information, please contact: Paul.Huibers@ThinkBigAnalytics.com 603-395-6567 stampedecon.com/ai-summit-2017-st-louis/
  42. 42. 42 The Future… • Architectural innovations: RNN, LSTM, GAN and more • Better training through new optimization, new activation functions and more • Transfer learning, pre-training and more • Theory catching up with practice (Tishby) – relevant information, bottlenecks • 10,000X speed improvement would make many things possible – Moore’s Law • Unsupervised learning • Learning with few samples • AGI – artificial general intelligence • Singularity
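
Slides 9 and 10 describe a deep neural network as multiple hidden layers of connected units, with learned weights and activation functions between the input and the output. A minimal sketch of that forward pass in plain Python is given here. The layer sizes, the weights, and the choice of ReLU and sigmoid are illustrative assumptions, not values from the talk; in practice the weights would be learned by backpropagation.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the input through each layer in turn."""
    out = inputs
    for weights, biases, activation in layers:
        out = dense(out, weights, biases, activation)
    return out


# Hypothetical hand-picked weights (assumption): 2 inputs -> 3 -> 2 -> 1 output.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], relu),  # hidden layer 1
    ([[0.2, 0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, 0.0], relu),          # hidden layer 2
    ([[1.0, -1.0]], [0.0], sigmoid),                                   # output layer
]

score = forward([1.0, 2.0], LAYERS)[0]
print(f"network output: {score:.3f}")
```

Each hidden layer produces an intermediate representation of the input, which is the "learning data representations" idea from slide 9 in its smallest form.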
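
Slide 12 summarizes a ConvNet as convolution + pooling + ReLU + fully connected layers. The sketch below chains one of each on a tiny 5 x 5 input. The hand-written vertical-edge kernel and the fully connected weights are assumptions for illustration (a trained network learns them), and the functions are hypothetical helpers, not any framework's API.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D convolution, stride 1, as used in CNNs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU over a 2D feature map."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def fully_connected(feature_map, weights, bias):
    """Flatten the feature map and take a weighted sum."""
    flat = [v for row in feature_map for v in row]
    return sum(w * v for w, v in zip(weights, flat)) + bias


# Toy image (assumption): dark left side, bright right side.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]  # responds to dark-to-bright edges

pooled = max_pool(relu(conv2d(image, edge_kernel)))
score = fully_connected(pooled, [0.25, 0.25, 0.25, 0.25], bias=0.0)
print(pooled, score)
```

Because each stage maps a 2D grid to another 2D grid, convolution stages can be stacked, which is the composability point made on the slide.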
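
Slide 28 lists a class imbalance of 100,000:1 (non-fraud vs. fraud) as a modeling challenge. The arithmetic below shows one reason this matters: a hypothetical model that never flags fraud scores almost perfect accuracy while detecting nothing. The function name and the counts are illustrative assumptions that simply apply the slide's ratio.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, recall (detection rate), and precision from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


# The slide's ratio: 100,000 non-fraud transactions for every fraud case.
non_fraud, fraud = 100_000, 1

# A "model" that labels everything as non-fraud (assumption for illustration).
accuracy, recall, precision = confusion_metrics(tp=0, fp=0, tn=non_fraud, fn=fraud)
print(f"accuracy={accuracy:.5%} recall={recall:.0%}")
```

This is why the results slides report detection rate and false positive rate rather than plain accuracy.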
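
Slide 32 turns a sequence of transactions into a 2D image with features on one axis and time on the other, placing correlated features near each other. The sketch below is a simplified reading of that idea, not the authors' method: it orders feature columns by absolute correlation with a reference feature and cuts sliding time windows. It does not reproduce the clockwise layout or the image size on the slide; the window, stride, reference feature, and data are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def order_by_correlation(transactions, anchor=0):
    """Feature indices sorted by |correlation| with the anchor feature, strongest first."""
    columns = list(zip(*transactions))
    return sorted(
        range(len(columns)),
        key=lambda j: -abs(pearson(columns[anchor], columns[j])),
    )


def transaction_images(transactions, window, stride):
    """Sliding windows of shape [window x features]; rows are time, columns are features."""
    order = order_by_correlation(transactions)
    reordered = [[row[j] for j in order] for row in transactions]
    return [
        reordered[start:start + window]
        for start in range(0, len(reordered) - window + 1, stride)
    ]


# Made-up data: 7 transactions in time order, 3 features each.
transactions = [
    [1, 5, 2], [2, 1, 4], [3, 4, 6], [4, 2, 8],
    [5, 6, 10], [6, 3, 12], [7, 7, 15],
]

order = order_by_correlation(transactions)
images = transaction_images(transactions, window=3, stride=2)
print(order, len(images))
```

Each resulting window can then be fed to a ConvNet in the same way as a small grayscale image.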
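
Slide 36 compares models by their true positive rate at a fixed false positive rate (9% FP and 1% FP). A minimal sketch of that measurement follows, assuming each model outputs a fraud score per transaction and that a threshold on the score decides what gets flagged. The scores and labels are made up for illustration and are not Danske Bank results.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Best true positive rate reachable by a score threshold with FPR <= max_fpr."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = 0.0
    for threshold in sorted(set(scores), reverse=True):
        # Flag every transaction scoring at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives <= max_fpr:
            best = max(best, tp / positives)
    return best


# Made-up example: 4 fraud cases (label 1) and 10 non-fraud cases (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.35, 0.2, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]

print(tpr_at_fpr(scores, labels, max_fpr=0.10))
print(tpr_at_fpr(scores, labels, max_fpr=0.0))
```

Running the same function on two models' scores at the same `max_fpr` gives the kind of like-for-like comparison shown on the slide.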
