Practical Artificial Intelligence: Deep Learning Beyond Cats and Cars

Developing a Real-life DNN-based Embedded Vision Product
for Agriculture, Construction, Medical, or Retail.

What does it take to succeed in real-life development of a DNN-based embedded vision product? You have your hardware and software building blocks; what's next? Learn how to plan and design for deep learning, how to select and cascade algorithms, where to get training data and how much is enough, and how to optimize and troubleshoot your product.

By now we know very well how to design and train a neural network to recognize cats, dogs, and cars. But what about real projects in agriculture, construction, medical, or retail? This how-to talk gives an overview of what it takes to design, train, and fine-tune a real-life DNN-based embedded vision solution. The presentation explores the algorithmic, data set, training, and optimization decisions that take you from proofs of concept to solid, reliable, and highly optimized systems. The material is based on our own successes, failures, and other lessons we learned while implementing embedded vision solutions over the past few years.

Alexey Rybakov is a Senior Director at Luxoft and manages software R&D, consulting, and optimization services in artificial intelligence, deep learning, computer vision, and video processing.

Published in: Technology

  1. Copyright © 2017 LUXOFT. Alexey Rybakov, LUXOFT, May 2017
  2. What This Talk is About: Key Decisions in a Deep Learning Computer Vision Project
     1. Data
        • Data acquisition, and how much is enough
     2. Compute Pipeline
        • Data processing pipeline
        • Compute infrastructure
        • AI platform and model
     3. Training and Tuning
        • Data preparation
        • Network training
        • Fine tuning
  3. Why We Are Giving This Talk
     Our AI Practice:
     - Computer vision
     - ...and non-vision AI
     Pharma, Agriculture, Retail, Industrial, Automotive
  4. Introduction: Paradigm Shift in Software R&D (section 0)
  5. Paradigm Shift in Software R&D [diagram; labels as extracted]
     • Stages: Knowledge (algorithm development); System design; Coding / testing; Integration, roll-out and maintenance
     • Roles: SME; System Architects; Developers; IT Support
     • Stages: Data selection; Learning pipeline design; Data preparation, training, tuning; Integration, roll-out and maintenance
     • Roles: SME; AI Architects; Data specialists; Data and AI Support
     • Callouts: Access to Data; Focus of this talk; New Professions!
  6. Data is the King (section 1)
  7. Paradigm Shift: Another Way to Look At It
     Was → Becoming
     • Subject Matter Experts → Data
     • Algorithm development → No algorithm development
     • Domain (vertical) knowledge → AI (system) knowledge
     • Decision quality is art → Decision quality is predictable
     The single most important investment you can make in an AI project is data strategy.
  8. Data Sourcing: It Starts with Data!
     Six data guidelines that we consider important to achieve the best results (accuracy, false positives, false negatives).
     Acquire data:
     1. Data coverage: needs to be representative and cover all cases (more than you think)
     2. Data balance (normalization): about an equal amount of data for each class/case/scenario
     3. Amount of data
     Prepare data:
     4. Data formatting: make it work with existing/selected DNN architectures, e.g. ROI selection, breaking a big picture into smaller ones, video into frames, etc.
     5. Data synthesis and augmentation: “a cat and the mirror reflection of a cat are two different cats”. It is often easier to transform existing data than to obtain new data
     6. Data annotation
  9. Example: Pharma Data [image captions]
     • Crystal clear media before dissolution
     • Just after the sample is dropped, when it remains a single solid piece
     • When the sample has started to swell, still a single piece
     • When the sample continues to swell and produces many small particles: low-contrast media
  10. Pharma Improper Data Sampling: Imbalanced Set
      Accuracy paradox
      Problem: dataset imbalance
      • Tricky to identify
      • What to do: change the performance metric for the trained network
      What to do:
      • Use penalized models: adjust the cost function for imbalance
      • Decompose large classes into smaller ones
      • Resample: over- and under-sample to balance
      [Figure: labeled video sample with classes clear media, sample dropped, sample dissolving 1, sample dissolving 2, precipitate presented, cloudy media]
  11. Pharma Improper Data Sampling (contd.)
      • Different optimization approaches to find separate thresholds for each class (1, 2, 3)
      • 3 options of uniform class distribution (balanced datasets)
      • Proper data shuffling is important
        - Common approach: the one used by NVIDIA DIGITS (conventional 2-stage shuffling)
        - Our custom data shuffling gave us up to a 1.3% increase in accuracy compared with the conventional scheme
      [Chart: top-1 accuracy (%) for Balanced dataset 1, Balanced dataset 2, Balanced dataset 3]
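The imbalance remedies named on the two pharma sampling slides (penalized models and resampling) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the class names come from the slides, but the frame counts and the helpers `inverse_frequency_weights` and `oversample` are hypothetical.

```python
import random
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class loss weights proportional to 1 / class frequency.

    Scaled so that the mean weight over all samples is 1.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(items) for items in by_class.values())
    out = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, y) for s in items + extra)
    rng.shuffle(out)  # interleave classes so batches are not single-class
    return out

# Hypothetical frame counts: a long "clear" phase and short transition phases.
labels = ["clear media"] * 900 + ["sample dropped"] * 20 + ["sample dissolving"] * 80
samples = list(range(len(labels)))  # stand-ins for frame indices

# Accuracy paradox: always predicting the majority class already looks "accurate".
majority = Counter(labels).most_common(1)[0][0]
baseline_accuracy = sum(y == majority for y in labels) / len(labels)

weights = inverse_frequency_weights(labels)   # for a penalized (weighted) loss
balanced = oversample(samples, labels)        # for a resampled training set
balanced_counts = Counter(y for _, y in balanced)
```

With these made-up counts, a classifier that always answers "clear media" reaches 90% accuracy while never detecting the other two classes. That is why a per-class metric, a weighted loss, or a resampled set is needed before accuracy means anything.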
  12. Data Processing Pipeline (section 2)
  13. Machine Learning vs. Deep Learning [Ian Goodfellow et al., ISBN 978-0262035613]
      [Diagram: Classic ML compared with Deep Learning]
  14. Machine Learning vs. Deep Learning [Ian Goodfellow et al., ISBN 978-0262035613]
      100% deep learning: possible but not practical
      • In this diagram, deep learning appears to work directly from raw data. In reality that is the ideal case.
        - It needs infinite data and infinite compute.
      • Practical implementations still sit between classic ML and DL, with a lot of upfront non-neural effort in data selection and preparation. That is why we need to build a data processing pipeline.
  15. Visual Data Processing Pipeline
      Feeding raw data straight into a deep network is still a dream: asymptotically possible, but inefficient in practice.
      • Therefore we pipeline processing blocks.
      [Diagram: example pipeline of processing blocks]
  16. Example: Pharma Processing Pipeline (One of the AI Parts)
      • Purpose:
        - Region of interest (ROI) detection
        - Camera calibration
        - Detect shakes and shifts
      • Solution:
        - Based on YOLO (You Only Look Once, http://pjreddie.com/darknet/yolo/)
        - Processing in real time
  17. Example: Pharma Processing Pipeline (One of the Non-AI Parts)
      • Wobbling detection
        - Represent video in temporal space
        - Search for sinusoid amplitude
      • RPM detection
        - Treat paddle as a signal
        - Fourier transform for frequency detection
      • Implemented for 20 FPS
      • Calibration required
      [Figures: spatial (left) and temporal (right) representation of video data; ROI (left) and signal (right) of paddle appearance]
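The RPM-detection idea on this slide (treat the paddle's appearance in the ROI as a signal, then find its frequency with a Fourier transform) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: only the 20 FPS frame rate comes from the slide. The synthetic signal, the 75 RPM value, and the helper `dominant_frequency_hz` are hypothetical, and a naive DFT is used so the sketch needs only the standard library.

```python
import cmath
import math

def dominant_frequency_hz(signal, fps):
    """Return the strongest non-DC frequency of a real signal using a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC component
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):  # positive frequencies up to Nyquist
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * fps / n  # convert bin index to Hz

FPS = 20        # frame rate stated on the slide
TRUE_RPM = 75   # hypothetical paddle speed used to synthesize the signal
N_FRAMES = 160  # 8 seconds of video

# Synthetic "paddle visible in ROI" signal: one short pulse per revolution.
signal = [
    1.0 if math.cos(2 * math.pi * (TRUE_RPM / 60) * t / FPS) > 0.8 else 0.0
    for t in range(N_FRAMES)
]

rpm = dominant_frequency_hz(signal, FPS) * 60
```

The frequency resolution is FPS / N_FRAMES (0.125 Hz here, or 7.5 RPM), so a longer window gives a finer RPM estimate. A production version would more likely use an FFT routine from a numerical library.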
  18. Platform Selection Guidelines
      Platform selection (Caffe / Torch / TensorFlow / etc.) is like selecting a Java app server or a web server: the options have many similarities and it is hard to choose. These are the criteria we use:
      • Availability of models for your task
      • Deployment compatibility, both enterprise and embedded
        - Works with your cloud, such as Amazon AWS or MS Azure?
        - Works on your device, such as ARM or NVIDIA GPUs?
      • Distributed processing choices: cloud / edge
      • Embedded optimization opportunities
      • Maintenance considerations
      • Support for training models, tools, and scenarios (important to think ahead)
  19. Model Selection Guidelines
      We often hear: “use the latest, greatest model”. We would rather use a “simple model for simple data”, the same idea as “good enough”.
      Important selection criteria we encounter:
      • Production accuracy
      • “Cascadability”
      • Production run-time speed
      • Training and fine-tuning scenarios: amount and kind of required training data, effort to train
      Pharma example: the winner is AlexNet (yes!): quick learner, fast to run, good accuracy on our data.
      [Chart: accuracy vs. training epoch]
  20. Retail Example: “Cascadability” of a Search Network
      • Total number of different classes: ~100,000
      • We use a cascade of networks
        - Level 1: type-search network
        - Level 2: a different classification network for each type
      • Conventional accuracy metrics for type-search networks: mAP, precision, recall
        - These metrics are conventionally computed at a fixed IoU threshold of 0.5
        - IoU = Intersection over Union
      • Classification accuracy strongly depends on IoU
        - IoU = 0.5 → classification probability ≈ 0.68
        - IoU > 0.7 → classification probability > 0.90
      [Figures: illustration of IoU = 1 and IoU = 0.5; chart of classification probability vs. IoU]
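For reference, IoU as used on the retail slide can be computed as below. This is a minimal sketch assuming axis-aligned boxes in `(x1, y1, x2, y2)` form; the function name and the example boxes are illustrative and not taken from the talk.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width/height are clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

ground_truth = (0, 0, 100, 100)
shifted = (25, 0, 125, 100)      # same size, shifted by a quarter of the width
far_away = (200, 200, 300, 300)  # no overlap at all

same = iou(ground_truth, ground_truth)     # 1.0
quarter_shift = iou(ground_truth, shifted) # 7500 / 12500 = 0.6
disjoint = iou(ground_truth, far_away)     # 0.0
```

A quarter-width shift already drops IoU to 0.6. Going by the numbers on the slide, a Level 1 detection that just passes the usual 0.5 threshold may still be too loose a crop for the Level 2 classifier, which suggests evaluating the search network at a stricter threshold such as 0.7.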
  21. Model Training and Tuning (Data Again!) (section 3)
  22. AI System Development Is More Iterative
      [Diagram: Traditional SW Engineering compared with Deep Learning SW Engineering]
  23. Example: Pharma Iterative Training Workflow
      • Database: ~5,000 images for each class
      • Uniform distribution of classes required
      [Chart: training progress by epoch]
  24. Example: Pharma Iterative Training Workflow
      • Iterative DNN training
      • Use the trained DNN to fix data labelling, following the "Pseudo-Label" technique
      • Visualize your data
      [Chart legend: DNN, Human. Classes: crystal clear media, sample dropped, sample swelling, sample dissolving, precipitate presented, media foggy]
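One way to read "use the trained DNN for fixing data labelling" is to flag frames where the current network confidently disagrees with the human label and queue them for review. The sketch below shows that reading only. The 0.9 threshold, the probabilities, and the helper `propose_label_fixes` are hypothetical, and the slide does not say how the authors actually applied the technique.

```python
def propose_label_fixes(labels, probabilities, class_names, threshold=0.9):
    """Return (index, current_label, suggested_label, confidence) for confident disagreements."""
    fixes = []
    for i, (label, probs) in enumerate(zip(labels, probabilities)):
        best = max(range(len(probs)), key=probs.__getitem__)  # argmax over classes
        if class_names[best] != label and probs[best] >= threshold:
            fixes.append((i, label, class_names[best], probs[best]))
    return fixes

CLASSES = ["clear media", "sample dropped", "sample swelling"]
human_labels = ["clear media", "clear media", "sample dropped", "sample swelling"]

# Hypothetical softmax outputs of the current network for the four frames above.
dnn_probs = [
    [0.97, 0.02, 0.01],  # agrees with the human label
    [0.03, 0.95, 0.02],  # confident disagreement: candidate fix
    [0.40, 0.55, 0.05],  # agrees (low confidence)
    [0.10, 0.50, 0.40],  # disagrees, but not confidently: left alone
]

fixes = propose_label_fixes(human_labels, dnn_probs, CLASSES)
```

Each retraining round would then use the corrected labels, matching the iterative DNN/human loop shown in the slide's legend.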
  25. Model Training: Time to Use Your Data!
      Six data guidelines that we consider important to achieve the best results (accuracy, false positives, false negatives).
      Acquire data:
      1. Data coverage: needs to be representative and cover all cases (more than you think)
      2. Data balance (normalization): about an equal amount of data for each class/case/scenario
      3. Amount of data
      Prepare data:
      4. Data formatting: make it work with existing/selected DNN architectures, e.g. ROI selection, breaking a big picture into smaller ones, video into frames, etc.
      5. Data synthesis and augmentation: “a cat and the mirror reflection of a cat are two different cats”. It is often easier to transform existing data than to obtain new data
      6. Data annotation
  26. Data Preparation and Annotation Scenarios
      Columns: data; annotation (labeling); use cases
      • Real / raw data, manual annotation: sometimes a good choice for a total greenfield. Needs human resources. Tools dramatically increase the efficiency.
      • Real / raw data, computer then human: a rudimentary AI provides a draft annotation, then humans confirm or correct it. Example: house number captcha.
      • Real / raw data, human then computer: humans annotate the first frame, then existing CV methods track the objects on subsequent frames. Very efficient for dynamic scenes.
      • Augmented data, automated annotation: use a high-quality labeled dataset and augment it to simulate real-life conditions. Example: the “German Traffic Sign” dataset could have been almost entirely synthesized through augmentation.
      • Synthetic data, automated annotation: use 3D rendering software and scripts to generate scenes. Our construction example.
  27. Retail Example: Data Augmentation and Synthesis
      [Image captions: Normal Light Conditions; WB Deviations; ISO Simulation; Year Synthesis]
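Two of the augmentations mentioned in the talk, the mirror reflection from the data guidelines and the white-balance (WB) deviation from the retail slide, can be sketched on a tiny RGB image. The image is held as nested lists so the example needs only the standard library. The gain values and the helpers `mirror` and `white_balance` are illustrative assumptions, not the authors' tooling.

```python
def mirror(image):
    """Horizontal flip: reverse the pixel order in every row."""
    return [list(reversed(row)) for row in image]

def white_balance(image, gains):
    """Scale each RGB channel by its gain and clamp the result to 0..255."""
    return [
        [tuple(min(255, max(0, round(c * g))) for c, g in zip(px, gains)) for px in row]
        for row in image
    ]

# A 2x2 RGB image: each pixel is an (R, G, B) tuple.
image = [
    [(200, 100, 50), (10, 20, 30)],
    [(0, 0, 0), (255, 255, 255)],
]

flipped = mirror(image)
warm = white_balance(image, gains=(1.2, 1.0, 0.8))  # hypothetical "warm" WB shift

# Each labeled source image yields several training samples with the same label.
augmented = [image, flipped, warm, mirror(warm)]
```

A real pipeline would apply the same operations to full-size arrays with an image library. The point here is only that one labeled image cheaply becomes several.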
  28. Railway Safety Example: Synthetic Data for Training
  29. Lessons Learned and Resources
  30. Lessons Learned
      [Chart labels: model effort, data effort]
  31. Lessons Learned
      • The most important decision is data strategy.
      • Data acquisition: you need lots of data, full coverage, well balanced.
      • Model decisions: the best overall may not be the best for you.
      • Pipeline decisions: cascade and combine classic and AI algorithms.
      • Training: a lot can be achieved by data preparation and synthesis. Annotation tools save millions.
      • Be prepared to stop. AI development is a very iterative process.
  32. Resources
      • Our website: www.luxoft.com
      • Our talk at last year's Summit (2016) about computer vision pipeline optimization, available in full on the Embedded Vision Alliance website: https://www.embedded-vision.com/platinum-members/luxoft/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit
  33. Check out the demos at LUXOFT booth!
      Extremely optimized artificial intelligence and computer vision pipelines running on low-power embedded platforms and GPU architectures: stereo vision, video and image processing, DNNs; as well as our Hybrid AI Platform that distributes and manages both deep learning and classic computation across cloud and edge devices.
      [Photo: our booth last year]
  34. Thank you! :)
      Alexey Rybakov, ARybakov@luxoft.com
      LUXOFT, 4400 Bohannon Dr Ste 235, Menlo Park, CA 94025
