Copyright © 2018 Intel 1
Enabling Cross-platform Deep Learning Applications with Intel OpenVINO™
Yury Gorbachev
May 22, 2018
Customer demands and industry trends
• Faster time-to-market (TTM): lower the porting burden from algorithm prototype to deployment
• Cross-platform portability and future-proofing of applications: deploy the same algorithm on platforms with different DL accelerators and different network nodes (e.g. smart cameras, NVRs, cloud)
• Smarten up already-installed products
• Add functionality incrementally (increase complexity)
• Update existing algorithms (improve quality)
• Scale processing with increasing demand or complexity
• Differentiate with algorithms and intelligence
Hurdles
• Deployment in different environments causes portability issues
• HW/SW architectures differ, so application redesign may be required
• A rapidly moving target as the number of platforms grows
• DL model retraining or redesign may be required
• The CV/DL domain is complex and has a high entry threshold
• Low-performing academic results are frequently used in production
• Existing deep learning solutions are mostly training-oriented
Intel HW portfolio
Smart Cameras · Video Gateways · Datacenter / Cloud · Clients
Intel delivers:
• End-to-end architecture
• x86 software portability & ecosystem
• Suite of SDKs and tools
OpenVINO™ Toolkit for best CV/DL applications
• Development toolkit for high-performance CV and DL inference
• A solution for application designers
• No training overhead or specifics; minimal footprint; highly portable code
• Set of libraries that solve CV/DL deployment problems
• Fastest OpenCV build
• Certified OpenVX implementation
• Deep Learning Inference Engine
• Provides access to all accelerators and a heterogeneous execution model
• Intel CPU, CPU with integrated graphics
• Vision Processing Unit (VPU) and FPGA
Computer Vision using Intel OpenVINO™ Toolkit
[Diagram: a computer vision pipeline takes input and produces object, person, face, emotion, gesture, text, and other results. Custom components use direct coding (custom code); CV/non-DL components use an API solution; DL components use the DL Inference Engine API. All three component types target VPU, GPU, CPU, and FPGA.]
Deep Learning Inference Engine (IE)
[Diagram: at design time, the Model Optimizer converts a trained model into the IR (internal representation). A deep learning application calls the DL Inference Engine API, which drives a Heterogeneous Execution Engine over per-device plugins: CPU plugin (MKL-DNN + custom C++ layers), GPU plugin (clDNN + custom OpenCL layers), Movidius plugin (MVNC), and FPGA plugin (DLA).]
1. Single API solution across accelerators
2. Heterogeneous network execution across accelerators
3. Framework-independent, lightweight internal representation
4. Customizations in the C++ and OpenCL languages
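The "single API, per-device plugin" design above can be sketched in plain Python. This is an illustrative sketch only; names such as `Backend`, `PLUGINS`, and `infer` are made up for the example and are not the real Inference Engine API.

```python
# Illustrative sketch of a single-API / per-device-plugin design.
# None of these names come from the real Inference Engine API.

class Backend:
    """One 'plugin' per device; all expose the same infer() contract."""
    def __init__(self, device):
        self.device = device

    def infer(self, network, inputs):
        # A real plugin would dispatch to MKL-DNN, clDNN, MVNC, or DLA here.
        return {"device": self.device, "outputs": [x * 2 for x in inputs]}

PLUGINS = {dev: Backend(dev) for dev in ("CPU", "GPU", "MYRIAD", "FPGA")}

def infer(network, inputs, device="CPU"):
    """Single entry point: application code stays the same per target."""
    return PLUGINS[device].infer(network, inputs)

net = "model.xml"  # stand-in for an IR produced by the Model Optimizer
print(infer(net, [1, 2, 3], device="CPU")["device"])   # CPU
print(infer(net, [1, 2, 3], device="FPGA")["device"])  # FPGA
```

Because every plugin satisfies the same contract, switching targets is a one-argument change rather than an application redesign, which is the portability point the following slide makes.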
Cross-platform portability
• The original training framework is not required for execution
• Same API across all Intel accelerator offerings
• No application redesign needed when deploying on a different target
• Consistent DL model accuracy and functionality across targets
• No model retraining is required
• Comprehensive validation suite
Additional portability benefits
Create the application using the Inference Engine API → design and validate on CPU/GPU clusters → deploy on Movidius/FPGA targets
• Design future-proof products that use evolving Intel DL accelerators, including those still under development
• Convenient, readily available targets for design and validation
• Very easy to troubleshoot on traditional desktops
Customization and source availability
• Custom-layer extension mechanism supported via the API
• New topologies and features are easy to support
• Keeps your intellectual property protected
• MKL-DNN and clDNN back-ends available in source form
• Most layers are distributed in source form
• Easy to modify for your needs or to strip unnecessary features
• Use as a sample for faster time-to-market
• C++ for CPU, OpenCL for other IPs
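A custom-layer extension mechanism of the kind described above can be sketched as a registry that the runtime consults when it meets a layer type it does not know. This is a hypothetical illustration; `register_layer`, `run_layer`, and the layer names are invented for the example, not the real extension API.

```python
# Illustrative custom-layer registry (not the real IE extension API).
CUSTOM_LAYERS = {}

def register_layer(name):
    """Decorator that registers a user-supplied layer implementation."""
    def wrap(fn):
        CUSTOM_LAYERS[name] = fn
        return fn
    return wrap

@register_layer("MyActivation")
def my_activation(tensor):
    # Proprietary layer logic stays in your own code base.
    return [max(0.0, x) for x in tensor]

BUILTIN_LAYERS = {"ReLU": lambda t: [max(0.0, x) for x in t]}

def run_layer(name, tensor):
    # Try built-in layers first, then fall through to user extensions.
    impl = BUILTIN_LAYERS.get(name) or CUSTOM_LAYERS.get(name)
    if impl is None:
        raise KeyError(f"unsupported layer: {name}")
    return impl(tensor)

print(run_layer("MyActivation", [-1.0, 2.0]))  # [0.0, 2.0]
```

The point of the pattern is the one the slide makes: new topologies only need a new entry in the registry, and the proprietary implementation never has to be shared.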
Heterogeneous execution
• Split graph execution between multiple targets
• Custom layers can fall back to CPU code if needed
• Better performance for certain functions on certain devices (e.g. GPU vs. CPU)
[Diagram: layers 1–7 of a network run on the GPU/FPGA, while a custom layer in the middle of the graph falls back to the CPU; the surrounding standard layers stay on the accelerator.]
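The split shown in the diagram amounts to assigning each layer a device affinity, with unsupported (e.g. custom) layers falling back to the CPU. A minimal sketch, assuming an invented `ACCEL_SUPPORTED` set rather than any real plugin capability query:

```python
# Illustrative sketch: assign each layer a device affinity; layers the
# accelerator cannot run (e.g. custom layers) fall back to the CPU.

ACCEL_SUPPORTED = {"Conv", "ReLU", "Pool", "FC"}  # assumed capability set

def assign_affinities(layers, accelerator="FPGA"):
    """Map layer name -> device for a list of (name, kind) pairs."""
    return {
        name: (accelerator if kind in ACCEL_SUPPORTED else "CPU")
        for name, kind in layers
    }

net = [("l1", "Conv"), ("l2", "ReLU"), ("l3", "MyCustomOp"), ("l4", "FC")]
print(assign_affinities(net))
# {'l1': 'FPGA', 'l2': 'FPGA', 'l3': 'CPU', 'l4': 'FPGA'}
```

A real heterogeneous runtime would additionally group consecutive same-device layers into subgraphs to minimize transfers between devices; this sketch only shows the affinity step.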
Asynchronous execution
• Hides transfer latency (e.g. FPGA offload)
• Performs parallel compute on multiple targets
• Efficient SoC utilization and lower pipeline latency
[Diagram: a pipeline of pre-processing (GPU) → object-detection CNN → object-classification CNN (FPGA/Movidius). With synchronous execution the Preprocess, Detect, and Classify stages of each frame run back-to-back; with asynchronous execution the stages of successive frames overlap across devices.]
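The overlapping of stages in the diagram can be sketched with standard-library thread pools: each stage becomes a task, so Preprocess of frame n+1 can run while Detect of frame n is still in flight. The stage functions below are placeholders, not real CV code.

```python
# Illustrative frame pipeline: stages run as pool tasks so that work on
# successive frames can overlap, as in the asynchronous-execution diagram.
from concurrent.futures import ThreadPoolExecutor

def preprocess(frame):   # stand-in for GPU pre-processing
    return frame + "-p"

def detect(frame):       # stand-in for the object-detection CNN
    return frame + "-d"

def classify(frame):     # stand-in for the object-classification CNN
    return frame + "-c"

def run_pipeline(frames):
    with ThreadPoolExecutor(max_workers=3) as pool:
        pre = [pool.submit(preprocess, f) for f in frames]
        det = [pool.submit(detect, p.result()) for p in pre]
        return [pool.submit(classify, d.result()).result() for d in det]

print(run_pipeline(["f1", "f2", "f3"]))
# ['f1-p-d-c', 'f2-p-d-c', 'f3-p-d-c']
```

On real hardware each stage would be bound to a different device (GPU, FPGA/Movidius), so the overlap hides transfer latency rather than just sharing CPU threads.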
High-quality DL models (IoT Model Zoo)
• Free reference models for the Deep Learning Inference Engine
• Object detection (faces, people, vehicles, etc.)
• Object analysis (facial attributes, head pose, vehicle attributes)
• Superior performance on Intel hardware
• Core™ i5 CPU: SSD 300 (6 fps) vs. the People Detection model (60 fps)
• Significant reduction in development effort; no dataset or training needed
• Squeeze in more functions thanks to efficient models!
Samples
• Basic samples to facilitate API understanding
• Classification, object detection, segmentation
• Target selection via the command line
• Extended samples using the Model Zoo
• Face analysis, security-camera sample
• Interworking between Media SDK, OpenCV, and the DL Inference Engine
• Automated downloader script for public models
Resources and useful links
• OpenVINO™: https://software.intel.com/en-us/openvino-toolkit
• Support forum: https://software.intel.com/en-us/forums/computer-vision
• Visit our demo booth and ask questions
• E-mail me: yury.gorbachev@intel.com
Thank you
