SlideShare a Scribd company logo
Globalcode – Open4education
Trilha – Machine Learning
Tendências da junção entre Big Data Analytics, Machine
Learning e Supercomputação
Igor Freitas
Intel – Software and Services Group
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
2
Legal Notices and Disclaimers
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL
PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY
WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO
FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT
INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
Intel may make changes to specifications and product descriptions at any time, without notice.
All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current
characterized errata are available on request.
Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties
are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.
Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product
roadmaps.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests are measured using specific computer
systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to
assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to
http://www.intel.com/performance
Intel, the Intel logo, Xeon and Xeon logo , Xeon Phi and Xeon Phi logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries
Other names and brands may be claimed as the property of others.
All products, dates, and figures are preliminary and are subject to change without any notice. Copyright © 2016, Intel Corporation.
This document contains information on products in the design phase of development.
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
3
Optimization Notice
Optimization Notice
Intel® compilers, associated libraries and associated development tools may include or utilize options that optimize for instruction sets that are available in
both Intel® and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition,
certain compiler options for Intel compilers, including some that are not specific to Intel micro-architecture, are reserved for Intel microprocessors. For a
detailed description of Intel compiler options, including the instruction sets and specific microprocessors they implicate, please refer to the “Intel® Compiler
User and Reference Guides” under “Compiler Options." Many library routines that are part of Intel® compiler products are more highly optimized for Intel
microprocessors than for other microprocessors. While the compilers and libraries in Intel® compiler products offer optimizations for both Intel and Intel-
compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors.
Intel® compilers, associated libraries and associated development tools may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming
SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel® SSSE3) instruction sets and other optimizations. Intel does not
guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent
optimizations in this product are intended for use with Intel microprocessors.
While Intel believes our compilers and libraries are excellent choices to assist in obtaining the best performance on Intel® and non-Intel microprocessors,
Intel recommends that you evaluate other compilers and libraries to determine which best meet your requirements. We hope to win your business by
striving to offer the best performance of any compiler or library; please let us know if you find we do not.
Notice revision #20101101
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/ 4
Big Data Analytics
HPC != Big Data Analytics and Artificial Intelligence ?
*Other brands and names are the property of their respective owners.
FORTRAN / C++ Applications
MPI
High Performance
Java, Python, Go, etc.*
Applications
Hadoop*
Simple to Use
SLURM
Supports large scale startup
YARN*
More resilient of hardware failures
Lustre*
Remote Storage
HDFS*, SPARK*
Local Storage
Compute & Memory Focused
High Performance Components
Storage Focused
Standard Server Components
Server Storage
SSDs
Switch
Fabric
Infrastructure
Modelo de
Programação
Resource
Manager
Sistema de
arquivos
Hardware
Server Storage
HDDs
Switch
Ethernet
Infrastructure
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/ 5
Trends in HPC + Big Data Analytics
Standards
Business viability
Performance
Code Modernization
(Vector instructions)
Many-core
FPGA, ASICs
Usability
Faster time-to-market
Lower costs (HPC at Cloud ? )
Better products
Easy to mantain HW & SW
Portability
Open
Commom
Environments
Integrated solutions:
Storage + Network +
Processing + Memory
Public investments
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/
Varied Resource Needs
Typical HPC
Workloads
Typical
Big Data
Workloads
6
Big Data Analytics
HPC in real time
Small Data + Small
Compute
e.g. Data analysis
Big Data +
Small Compute
e.g. Search, Streaming,
Data Preconditioning
Small Data +
Big Compute
e.g. Mechanical Design, Multi-physics
Data
Compute
High
Frequency
Trading
Numeric
Weather
Simulation
Oil & Gas
Seismic
Systemcostbalance
Video Survey Traffic
Monitor
Personal
Digital Health
Systemcostbalance
Processor Memory Interconnect Storage
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
7#IntelAI
WhatisMachineLearning?
 Random Forest
 Support Vector Machines
 Regression
 Naïve Bayes
 Hidden Markov
 K-Means Clustering
 Ensemble Methods
 More…
ClassicML
Using optimized functions or algorithms
to extract insights from data
Training
Data*
Inference,
Clustering, or
Classification
New Data*
Deeplearning
Using massive labeled data sets to train deep (neural)
graphs that can make inferences about new data
Step 1: Training
Use massive labeled dataset (e.g. 10M
tagged images) to iteratively adjust
weighting of neural network connections
Step 2: Inference
Form inference about new
input data (e.g. a photo) using
trained neural network
Hours to Days
in Cloud
Real-Time
at Edge/Cloud
New
Data
Untrained Trained
Algorithms
CNN,
RNN,
RBM..
*Note: not all classic machine learning functions require training
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
8#IntelAI
ArtificialIntelligenceOpportunity
Intel Confidential
AI
40%
Classic Machine Learning Deep Learning
60%
2016 Servers
7%
97%
Intel Architecture (IA)
1%
IA+
GPU
2%
Other
91% 7%
Intel Architecture (IA)
2%
IA+
GPU
Other
AI is the fastest growing data center workload
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
9#IntelAI
ArtificialIntelligencePlan
Intel Confidential
Bringing the HPC Strategy to AI
Most widely
deployed machine
learning solution
High performance,
classic machine
learning
LAKE
CREST
Best in class
neural network
performance
Programmable,
low-latency
inference
Top 500 % FLOPs
Jun'10 Nov '12 Nov '16
35%
5%
60%
93%
7% 16%
20%
64%
Nvidia intro Xeon Phi intro November ‘16
Nvidia* Intel® Xeon Phi™ Xeon
%
0
COMING 2017
KNIGHTSMILL
NOW
SKYLAKE
COMING 2017
LAKECREST
SDVS SHIPPING TODAY
BROADWELL+ARRIA10
Intel® Nervana™ Portfolio
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
libraries Intel® Math Kernel Library
(MKL, MKL-DNN)
platforms
Frameworks
Intel® Data Analytics
Acceleration Library
(DAAL)
hardware Memory & Storage NetworkingCompute
Intel Python
Distribution
Mllib BigDL
Intel® Nervana™ Graph*
experiences
Intel® Nervana™ Cloud & Appliance
Intel®Nervana™portfolio
Intel® Nervana™ DL Studio
Intel® Computer
Vision SDK
Movidius
Fathom
More
*Future
Other names and brands may be claimed as the property of others.
*
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
11#IntelAI
BIGDL
Bringing Deep Learning to Big Data
github.com/intel-analytics/BigDL
 Open Sourced Deep Learning Library for
Apache Spark*
 Make Deep learning more Accessible to Big
data users and data scientists.
 Feature Parity with popular DL frameworks like
Caffe, Torch, Tensorflow etc.
 Easy Customer and Developer Experience
 Run Deep learning Applications as Standard
Spark programs;
 Run on top of existing Spark/Hadoop clusters
(No Cluster change)
 High Performance powered by Intel MKL and
Multi-threaded programming.
 Efficient Scale out leveraging Spark
architecture.
Spark Core
SQL SparkR
Stream-
ing
MLlib GraphX
ML Pipeline
DataFrame
BigDL
For developers looking to run deep learning on Hadoop/Spark due to familiarity or analytics use
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
12#IntelAI
Intel®Nervana™porftolio(Detail)
Batch Manybatchmodels
Train machine learning models across a
diverse set of dense and sparse data
Train large deep neural networks
Train large models as fast as possible
LAKE
CREST
Stream Edge
Infer billions of data samples at a time
and feed applications within ~1 day
Infer deep data streams with low latency
in order to take action within milliseconds
Power-constrained
environments
…
Training
inference
Batch
or other Intel®
edge processor
OR OR
OR
Option for higher
throughput/watt
*Future*
Required for
low latency
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
0
5
10
15
20
25
30
Intel Xeon E5-2699v4 Intel Xeon E5-2699v4 Intel Xeon E5-2699v4 Intel Xeon Phi 7250
Out-of-the-box +Intel MKL 11.3.3 +Intel MKL 2017 +Intel MKL 2017
Performancespeedup
Caffe/AlexNet single node inference performance
2.2x
1.9x
13
Better performance in Deep Neural Network workloads with
Intel® Math Kernel Library (Intel® MKL)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance . *Other names and brands may be property of others
Configurations:
• 2 socket system with Intel® Xeon® Processor E5-2699 v4 (22 Cores, 2.2 GHz,), 128 GB memory, Red Hat* Enterprise Linux 6.7, BVLC Caffe, Intel Optimized Caffe framework, Intel® MKL 11.3.3, Intel® MKL 2017
• Intel® Xeon Phi™ Processor 7250 (68 Cores, 1.4 GHz, 16GB MCDRAM), 128 GB memory, Red Hat* Enterprise Linux 6.7, Intel® Optimized Caffe framework, Intel® MKL 2017
All numbers measured without taking data manipulation into account.
7.5x
31x
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
Skt-Learn* Optimizations With Intel® MKL
0x
1x
2x
3x
4x
5x
6x
7x
8x
9x
Approximate
neighbors
Fast K-means GLM GLM net LASSO Lasso path Least angle
regression, OpenMP
Non-negative matrix
factorization
Regression by SGD Sampling without
replacement
SVD
Speedups of Scikit-Learn Benchmarks
Intel® Distribution for Python* 2017 Update 1 vs. system Python & NumPy/Scikit-Learn
System info: 32x Intel® Xeon® CPU E5-2698 v3 @ 2.30GHz, disabled HT, 64GB RAM; Intel® Distribution for Python* 2017 Gold; Intel® MKL 2017.0.0; Ubuntu 14.04.4 LTS; Numpy 1.11.1; scikit-learn 0.17.1.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software,
operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that
product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel
microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by
Effect of Intel MKL
optimizations for
NumPy* and SciPy*
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
15#IntelAI
Intel®Arria®10FPGAperformance
Deep Learning Image Classification SINGLE-NODE Inference Performance Efficiency
* Xeon with Integrated FPGA refers to Broadwell Platform Proof of Concept
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software,
operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product
when combined with other products. For more information go to http://www.intel.com/performance . *Other names and brands may be property of others.
Configurations:
- AlexNet1 Classification Implementation as specified by http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf, Training Parameters taken from Caffe open-source Framework are 224x224x3 Input, 1000x1 Output, FP16 with Shared Block Exponents, All
compute layers (incl. Fully Connected) done on the FPGA except for Softmax
- Arria 10-1150 FPGA, -1 Speed Grade on Altera PCIe DevKit with x72 DDR4 @ 1333 MHz, Largest mid-range part from newest family (1518 DSPs, 2800 M20Ks), Power measured through on-board power monitor (FPGA POWER ONLY) ACDS 16.1 Internal Builds
+ OpenCL SDK 16.1 Internal Build Host system: HP Z620 Workstation, Xeon E5-1660 at 3.3 GHz with 32GB RAM. The Xeon is not used for compute.
- AlexNet: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf , Batch Size:256 2.From GTC Keynote
System Throughput Power Throughput / Watt
Arria® 10-115 (FP32, Full Size Image, Speed @306Mhz)1 575 img/s ~31W 18.5 img/s/W
Arria® 10-115 (FP16, Full Size Image, Speed @297Mhz)1
1020 img/s ~40W 25.5 img/s/W
Nvidia* M42 20 img/s/w
Topology Datatype Throughput*
AlexNet* FloatP32 ~360 Image/s
FixedP16 ~600 Image/s
FixedP8 ~750 Images/s
FixedP8 Winograd ~1200 Images/s
+
Integrated*
discrete
Note: Intel® Xeon®
processor performance is
not included in tables
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
16
 LeTV Cloud (www.lecloud.com) is a leading video cloud
provider in China
 LeTV Cloud provides illegal video detection service to 3rd
party video cloud customers to help them detect illegal videos
 Originally, LeTV adopted open source BVLC Caffe plus
OpenBlas as CNN framework, but the performance was poor
 By using Caffe + Intel MKL, they gained up to 30x performance
improvement on training in production environment
Case Study:
LeTV Cloud Illegal Video Detection
* The test data is based on Intel Xeon E5 2680 V3 processor
1
30
0
5
10
15
20
25
30
35
BVLC Caffe + OpenBlas(Baseline) IntelCaffe + MKL
LeTV Cloud Illegal Video Detection Process Flow
Video Pictures
Extract
Keyframe Input Classification
CNN Network based on Caffe Video Classification
…
LeTV Cloud Caffe Optimization - higher is better
Other names and brands may be claimed as property of others. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance
tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary.
You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
For more complete information visit http://www.intel.com/performance
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/
Identificando oportunidades de otimização
Recapitulando o que é vetorização / SIMD
18
for (i=0;i<=MAX;i++)
c[i]=a[i]+b[i];
+
c[i+7] c[i+6] c[i+5] c[i+4] c[i+3] c[i+2] c[i+1] c[i]
b[i+7] b[i+6] b[i+5] b[i+4] b[i+3] b[i+2] b[i+1] b[i]
a[i+7] a[i+6] a[i+5] a[i+4] a[i+3] a[i+2] a[i+1] a[i]
Vector
- Uma instrução
- Oito operações
+
C
B
A
Scalar
- Uma instrução
- Uma operação
• O que é e ?
• Capacidade de realizar uma operação matemática
em dois ou mais elementos ao mesmo tempo.
• Por que Vetorizar ?
• Ganho substancial em performance !
https://www.intelnervana.com/framework-optimizations/
https://colfaxresearch.com/knl-avx512/
Obrigado!IGOR.FREITAS@ INTEL.COM
Backup
Getstartedtoday
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/ 22
Get started today
Frameworks optimized for Intel
Build Faster Deep Learning Applications with Caffe*
The popular open-source development framework for
image recognition is now optimized for Intel® Architecture.
Get the framework
Learn how to install Caffe*
Delving Into Deep Learning
The Python library designed to help write deep learning
models is now optimized for Intel® Architecture.
Visit the library
Getting started with Theano*
Speed Up Your Spark Analytics
Apache Spark* MLlib, the open-source data processing
framework’s machine learning library, now includes
Intel® Architecture support.
Get the library
Building faster applications on Spark clusters
BigDL on GitHub
BigDL on Spark video
Running on EC2 Page
BigDL: Distributed Deep learning on
Apache Spark
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/ 23
Get started today
Get Intel® Libraries (Community License) for Free
Intel® Data Analytics Acceleration Library
Intel® DAAL
Highly optimized library that helps speed big data
analytics by providing algorithmic building blocks for all
data analysis stages and for offline, streaming, and
distributed analytics usages.
Open-source options for Intel® DAAL
Learn more about Intel® DAAL
Intel® Math Kernel Library
Intel® MKL
A high-performance library with assets to help accelerate math
processing routines and increase application performance.
Deep neural network technical preview for Intel® MKL
Get the library
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/ 24
Training
General Purpose Infrastructure Workload optimized
• Accelerating Deep Learning and Machine Learning
This talk focuses on two Intel performance libraries, MKL and DAAL, which offer optimized building blocks for data analytics and machine
learning.
• Remove Python Performance Barriers for Machine Learning
This webinar highlights significant performance speed-ups achieved by implementing multiple Intel tools and techniques for high
performance Python.
• Analyze Python* App Performance with Intel® VTune™ Amplifier
Efficient profiling techniques can help dramatically improve the performance of your Python* code. Learn how Intel® VTune Amplifier can
help.
Boost Python* Performance with Intel® Math Kernel Library
Meet Intel® Distribution for Python*, an easy-to-install, optimized Python distribution that can help you optimize your app’s performance.
Building Faster Data Applications on Spark* Clusters
Apache Spark* is big for big data processing apps. Intel® Data Analytics Acceleration Library (Intel® DAAL) can help optimize performance.
Learn how.
Faster Big Data Analytics Using New Intel® Data Analytics Acceleration Library
Big data is BIG. And you need information faster. New Intel® Data Analytics Acceleration Library (Intel® DAAL) speeds data processing for
data mining, statistical analysis, and machine learning.
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
25
Resources
Intel® Machine Learning
• http://www.intel.com/content/www/us/en/analytics/machine-learning/overview.html
• Intel Caffe* fork, https://github.com/intelcaffe/caffe
• Intel Theano* fork, https://github.com/intel/theano
• Intel® Deep Learning SDK: https://software.intel.com/deep-learning-SDK
Intel® DAAL
• https://software.intel.com/en-us/intel-daal
Intel® Omni-Path Architecture
• http://www.intel.com/content/www/us/en/high-performance-computing-fabrics/omni-path-
architecture-fabric-overview.html
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
26
Intel(R) MKL Resources
Intel® MKL website
• https://software.intel.com/en-us/intel-mkl
Intel® MKL forum
• https://software.intel.com/en-us/forums/intel-math-kernel-library
Intel® MKL benchmarks
• https://software.intel.com/en-us/intel-mkl/benchmarks#
Intel® MKL link line advisor
• http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
27#IntelAI
Customerexamples
health Finance industrial Health Finance industrial
Early Tumor Detection
Leading medical imaging
company

Early detection of malignant
tumors in mammograms

Millions of “Diagnosed”
Mammograms

Deep Learning (CNN) tumor
image recognition

Higher accuracy and earlier
breast cancer detection
Data Synthesis
Financial services institution
with >$750B assets

Parse info to reduce portfolio
manager time to insight

Vast stores of documents
(news, emails, research,
social)

Deep Learning (RNN w/
encoder/decoder)

Faster and more informed
invest-ment decisions
Smart Agriculture
World leader in agricultural
biotech

Accelerate hybrid plant
development

Large dataset of hybrid plant
performance based on
genotype markers

Deep Learning (CNN) to
detect favorable interactions
between genotype markers

More accurate selection
leading to cost reduction
Personalized
Care
Renowned US Hospital
system

Accurately diagnose fatal
heart conditions

10,000 health attributes
used

Saffron memory-based
reasoning

Increased accuracy to 94%
compared with 54% for
average cardiologist
Customer
Personalization
Leading Insurance Group

Increase product
recommendation accuracy

5 Product Levels
1,353 Products
12M Members

Saffron memory-based
reasoning

50% increase in product
recommendation accuracy
Supply Chain Logistics
Multinational Aerospace
Corp

Reduce time to select aircraft
replacement part

15,000 Aircraft
$1M/day idle

Saffron memory-based
reasoning

Reduced part selection from
4 hours to 5 mins
Deep Learning Memory-based Reasoning
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
28#IntelAI
Intel®Nervana™Platform
For Deep Learning
Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance
COMPARED TO TODAY’S FASTEST SOLUTION1
100XReductionDelivering Intimetotrain
by2020
KnightsCrest
Bootable Intel Xeon Processor
with integrated acceleration
LakeCrest
Discrete accelerator
First silicon 1H’2017
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
29#IntelAI
Hardware for DL Workloads
 Custom-designed for deep
learning
 Unprecedented compute density
 More raw computing power than
today’s state-of-the-art GPUs
Blazingly Fast Data Access
 32 GB of in package memory via
HBM2 technology
 8 Tera-bits/s of memory access
speed
High Speed Scalability
 12 bi-directional high-bandwidth
links
 Seamless data transfer via
interconnects
Lakecrest
Deep Learning by Design
Add-in card for unprecedented compute density in deep learning centric environments
Lake
Crest
Everything needed for deep learning and nothing more!
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
30#IntelAI
Removing IO and
Memory Barriers
 Integrated Intel® Omni-Path fabric
increases price-performance and
reduces communication latency
 Direct access of up to 400 GB of
memory with no PCIe performance lag
(vs. GPU:16GB)
Breakthrough Highly
Parallel Performance
 Up to 400X deep learning
performance on existing hardware via
Intel software optimizations
 Up to 4Xdeep learning performance
increase estimated (Knights Mill, 2017)
Easier Programmability
 Binary-compatible with Intel® Xeon®
processors
 Open standards, libraries and
frameworks
Intel®XeonPhi™ProcessorFamily
Enables Shorter Time to Train Using General Purpose Infrastructure
Processor for HPC & enterprise customers running scale-out, highly-parallel, memory intensive apps
Configuration details on slide: 30
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations
and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when
combined with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice Revision #20110804
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
31#IntelAI
Energy Efficient Inference with Infrastructure Flexibility
 Excellent energy efficiency up to 25images/sec/watt inference on Caffe/Alexnet
 Reconfigurable accelerator can be used for variety of data center workloads
 Integrated FPGA with Intel® Xeon® processor fits in standard server infrastructure
-OR- Discrete FPGA fits in PCIe card and embedded applications*
Intel®Arria®10FPGA
Superior Inference Capabilities
Add-in card for higher performance/watt inference with low latency and flexible precision
*Xeon with Integrated FPGA refers to Broadwell Proof of Concept
Configuration details on slide: 44
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations
and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined
with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice Revision #20110804
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
32#IntelAI
Energy Efficient Inference
 Accelerates image recognition using
convolutional neural networks (CNN)
 Excellent energy efficient inference up to 25
images/s/w on Caffe/Alexnet
 Fits in standard server infrastructure*
Accelerate Time to Market
 Simplify deployment with preloaded optimized
CNN algorithms
 Integrated software ecosystem: optimized
libraries, frameworks and APIs
CanyonVista
Turnkey Deep Learning Inference Solution
Pre-configured add-in card for higher performance/watt inference for image recognition
*Standard server infrastructure – verified with Broadwell platform
Configuration details on slide: 44
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations
and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined
with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice Revision #20110804
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
33#IntelAI
Lowest TCO With Superior
Infrastructure Flexibility
 Standard server infrastructure
 Open standards, libraries &
frameworks
 Optimized to run wide variety of
data center workloads
Server Class Reliability
 Industry standard server features:
high reliability, hardware enhanced
security
Leadership Throughput
 Industry leading inference
performance
 Up to 18Xperformance on existing
hardware via Intel software
optimizations
Intel®Xeon®ProcessorFamily
Most Widely Deployed Machine Learning Platform (97% share)*
Processor optimized for a wide variety of datacenter workloads enabling flexible infrastructure
Configuration details on slide: 30
*Intel® Xeon® processors are used in 97% of servers that are running machine learning workloads today (Source: Intel)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations
and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when
combined with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice Revision #20110804
Intel® Math Kernel Library Intel® Data
Analytics
Acceleration
Library (DAAL)
Intel®
Distribution
for
Open Source
Frameworks
Intel Deep
Learning Tools
Intel® MKL MKL-DNN
High
Level
Overview
High
performance
math primitives
granting low level
of control
Free open source
DNN functions for
high-velocity
integration with
deep learning
frameworks
Broad data
analytics
acceleration object
oriented library
supporting
distributed ML at
the algorithm level
Most popular
and fastest
growing
language for
machine
learning
Toolkits driven by
academia and
industry for
training machine
learning
algorithms
Accelerate deep
learning model
design, training
and deployment
Primary
Audience
Consumed by
developers of
higher level
libraries and
Applications
Consumed by
developers of the
next generation
of deep learning
frameworks
Wider Data
Analytics and ML
audience,
Algorithm level
development for
all stages of data
analytics
Application
Developers and
Data Scientists
Machine Learning
App Developers,
Researchers and
Data Scientists.
Application
Developers and
Data Scientists
Example
Usage
Framework
developers call
matrix
multiplication,
convolution
functions
New framework
with functions
developers call
for max CPU
performance
Call distributed
alternating least
squares algorithm
for a
recommendation
system
Call scikit-learn
k-means
function for
credit card fraud
detection
Script and train a
convolution
neural network
for image
recognition
Deep Learning
training and model
creation, with
optimization for
deployment on
constrained end
device
Mor
e…
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
Intel®DeepLearningSDK
Accelerate Deep Learning Development
 FREE for data scientists and
software developers to develop,
train & deploy deep learning
 Simplify installation of Intel
optimized frameworks and libraries
 Increase productivity through
simple and highly-visual interface
 Enhance deployment through
model compression and
normalization
 Facilitate integration with full
software stack via inference engine
For developers looking to accelerate deep learning model design, training & deployment
software.intel.com/deep-learning-sdk
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
BIGDL
Bringing Deep Learning to Big Data
github.com/intel-analytics/BigDL
 Open Sourced Deep Learning Library for
Apache Spark*
 Make Deep learning more Accessible to Big
data users and data scientists.
 Feature Parity with popular DL frameworks like
Caffe, Torch, Tensorflow etc.
 Easy Customer and Developer Experience
 Run Deep learning Applications as Standard
Spark programs;
 Run on top of existing Spark/Hadoop clusters
(No Cluster change)
 High Performance powered by Intel MKL and
Multi-threaded programming.
 Efficient Scale out leveraging Spark
architecture.
Spark Core
SQL SparkR
Stream-
ing
MLlib GraphX
ML Pipeline
DataFrame
BigDL
For developers looking to run deep learning on Hadoop/Spark due to familiarity or analytics use
Copyright © 2016, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
https://www.intelnervana.com/framework-optimizations/
Inteldistributionforpython
Advancing Python Performance Closer to Native Speeds
software.intel.com/intel-distribution-for-python
Easy, Out-of-the-box Access
to High Performance Python
 Prebuilt, optimized for numerical
computing, data analytics, HPC
 Drop in replacement for your
existing Python (no code changes
required)
Drive Performance with
Multiple Optimization
Techniques
 Accelerated NumPy/SciPy/Scikit-
Learn with Intel® MKL
 Data analytics with pyDAAL,
enhanced thread scheduling with
TBB, Jupyter* Notebook interface,
Numba, Cython
 Scale easily with optimized MPI4Py
and Jupyter notebooks
Faster Access to Latest
Optimizations for Intel
Architecture
 Distribution and individual
optimized packages available
through conda and Anaconda
Cloud
 Optimizations upstreamed back to
main Python trunk
For developers using the most popular and fastest growing programming language for AI
Easy to install with Anaconda*
https://anaconda.org/intel/
Hardware-Specific Transformers
Hardware
Agnostic
Intel® Nervana™ Graph Compiler
Intel® Nervana™ Graph Compiler
enables optimizations that are applicable
across multiple HW targets.
• Efficient buffer allocation
• Training vs inference optimizations
• Efficient scaling across multiple nodes
• Efficient partitioning of subgraphs
• Compounding of ops
MKL-DNN
Customer
Solution Neon Solutions
Neon ModelsCustomer Models
Neon Deep Learning Functions
Customer
Algorithms
High-levelexecutiongraphforneuralnetworks
Intel®Nervana™
AIAcademy SoftwareinnovatorsandBlackBelts
software.intel.com/ai
Frameworks,ToolsandLibraries
Workshops,webinars,meetups
in partnership with
honeyourskillsandbuildthefutureofai
*Other names and brands may be claimed as the property of others.
Intel Confidential | NDA Required
DataCenterGroup 40
Optimized Mathematical Building Blocks
Intel® MKL
Linear Algebra
• BLAS
• LAPACK
• ScaLAPACK
• Sparse BLAS
• PARDISO* SMP & Cluster
• Iterative sparse solvers
Vector RNGs
• Congruential
• Wichmann-Hill
• Mersenne Twister
• Sobol
• Neiderreiter
• Non-deterministic
Fast Fourier Transforms
• Multidimensional
• FFTW interfaces
• Cluster FFT
Summary Statistics
• Kurtosis
• Variation coefficient
• Order statistics
• Min/max
• Variance-covariance
Vector Math
• Trigonometric
• Hyperbolic
• Exponential
• Log
• Power
• Root
And More
• Splines
• Interpolation
• Trust Region
• Fast Poisson Solver
*Other names and brands may be claimed as property of others.
Intel Confidential | NDA Required
DataCenterGroup 41@IntelAI
Intel®MKL-DNN
Math Kernel Library for Deep Neural Networks
Distribution Details
 Open Source
 Apache 2.0 License
 Common DNN APIs across all Intel hardware.
 Rapid release cycles, iterated with the DL community, to
best support industry framework integration.
 Highly vectorized & threaded for maximal performance,
based on the popular Intel® MKL library.
For developers of deep learning frameworks featuring optimized performance on Intel hardware
github.com/01org/mkl-dnn
Direct 2D
Convolution
Rectified linear unit
neuron activation
(ReLU)
Maximum
pooling
Inner product
Local response
normalization
(LRN)
Intel Confidential | NDA Required
DataCenterGroup 42@IntelAI
Intel®Machinelearningscalinglibrary(MLSL)
Scaling Deep Learning to 32 Nodes and Beyond
Deep learning abstraction of message-
passing implementation
 Built on top of MPI; allows other communication
libraries to be used as well
 Optimized to drive scalability of
communication patterns
 Works across various interconnects: Intel®
Omni-Path Architecture, InfiniBand, and
Ethernet
 Common API to support Deep Learning
frameworks (Caffe, Theano, Torch etc.)
For maximum deep learning scale-out performance on Intel® architecture
FORWARD PROP
BACK PROP
L
A
Y
E
R
1
L
A
Y
E
R
2
L
A
Y
E
R
N
Allreduce
Alltoall
Allreduce
Allreduce
Reduce
Scatter
AllgatherAlltoall Allreduce
github.com/01org/MLSL/releases
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE team at - Tendências da junção entre Big Data Analytics, Machine Learning e Supercomputação

More Related Content

What's hot

AWS Summit Singapore - Make Business Intelligence Scalable and Adaptable
AWS Summit Singapore - Make Business Intelligence Scalable and AdaptableAWS Summit Singapore - Make Business Intelligence Scalable and Adaptable
AWS Summit Singapore - Make Business Intelligence Scalable and Adaptable
Amazon Web Services
 
8 intel network builders overview
8 intel network builders overview8 intel network builders overview
8 intel network builders overview
videos
 
Intel IT Experts Tour Cyber Security - Matthew Rosenquist 2013
Intel IT Experts Tour   Cyber Security - Matthew Rosenquist 2013Intel IT Experts Tour   Cyber Security - Matthew Rosenquist 2013
Intel IT Experts Tour Cyber Security - Matthew Rosenquist 2013
Matthew Rosenquist
 
Explore, design and implement threading parallelism with Intel® Advisor XE
Explore, design and implement threading parallelism with Intel® Advisor XEExplore, design and implement threading parallelism with Intel® Advisor XE
Explore, design and implement threading parallelism with Intel® Advisor XE
Intel IT Center
 
Unlock Hidden Potential through Big Data and Analytics
Unlock Hidden Potential through Big Data and AnalyticsUnlock Hidden Potential through Big Data and Analytics
Unlock Hidden Potential through Big Data and Analytics
IT@Intel
 
4 dpdk roadmap(1)
4 dpdk roadmap(1)4 dpdk roadmap(1)
4 dpdk roadmap(1)
videos
 
Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop
Intel® Software
 
Cognitive Computing in IBM Spectrum LSF
Cognitive Computing in IBM Spectrum LSFCognitive Computing in IBM Spectrum LSF
Cognitive Computing in IBM Spectrum LSF
Gabor Samu
 
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: PerformanceIntel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
DESMOND YUEN
 
Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...
Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...
Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...
DESMOND YUEN
 
LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...
LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...
LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...
LF_DPDK
 
LF_DPDK17_The Path to Data Plane Microservices
LF_DPDK17_The Path to Data Plane MicroservicesLF_DPDK17_The Path to Data Plane Microservices
LF_DPDK17_The Path to Data Plane Microservices
LF_DPDK
 
Intel Knights Landing Slides
Intel Knights Landing SlidesIntel Knights Landing Slides
Intel Knights Landing Slides
Ronen Mendezitsky
 
Utilisation des capteurs dans les applications windows 8
Utilisation des capteurs dans les applications windows 8Utilisation des capteurs dans les applications windows 8
Utilisation des capteurs dans les applications windows 8
Intel Developer Zone Community
 
elcomsoft_catalog_en_2015
elcomsoft_catalog_en_2015elcomsoft_catalog_en_2015
elcomsoft_catalog_en_2015
Mounes Kayyali
 
LF_DPDK17_DPDK's best kept secret – Micro-benchmark performance tests
LF_DPDK17_DPDK's best kept secret – Micro-benchmark performance testsLF_DPDK17_DPDK's best kept secret – Micro-benchmark performance tests
LF_DPDK17_DPDK's best kept secret – Micro-benchmark performance tests
LF_DPDK
 
Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...
Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...
Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...
Intel IT Center
 

What's hot (17)

AWS Summit Singapore - Make Business Intelligence Scalable and Adaptable
AWS Summit Singapore - Make Business Intelligence Scalable and AdaptableAWS Summit Singapore - Make Business Intelligence Scalable and Adaptable
AWS Summit Singapore - Make Business Intelligence Scalable and Adaptable
 
8 intel network builders overview
8 intel network builders overview8 intel network builders overview
8 intel network builders overview
 
Intel IT Experts Tour Cyber Security - Matthew Rosenquist 2013
Intel IT Experts Tour   Cyber Security - Matthew Rosenquist 2013Intel IT Experts Tour   Cyber Security - Matthew Rosenquist 2013
Intel IT Experts Tour Cyber Security - Matthew Rosenquist 2013
 
Explore, design and implement threading parallelism with Intel® Advisor XE
Explore, design and implement threading parallelism with Intel® Advisor XEExplore, design and implement threading parallelism with Intel® Advisor XE
Explore, design and implement threading parallelism with Intel® Advisor XE
 
Unlock Hidden Potential through Big Data and Analytics
Unlock Hidden Potential through Big Data and AnalyticsUnlock Hidden Potential through Big Data and Analytics
Unlock Hidden Potential through Big Data and Analytics
 
4 dpdk roadmap(1)
4 dpdk roadmap(1)4 dpdk roadmap(1)
4 dpdk roadmap(1)
 
Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop
 
Cognitive Computing in IBM Spectrum LSF
Cognitive Computing in IBM Spectrum LSFCognitive Computing in IBM Spectrum LSF
Cognitive Computing in IBM Spectrum LSF
 
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: PerformanceIntel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
 
Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...
Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...
Programming Intel® QuickAssist Technology Hardware Accelerators for Optimal P...
 
LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...
LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...
LF_DPDK17_Reducing Barriers to Adoption - Making DPDK Easier to Integrate int...
 
LF_DPDK17_The Path to Data Plane Microservices
LF_DPDK17_The Path to Data Plane MicroservicesLF_DPDK17_The Path to Data Plane Microservices
LF_DPDK17_The Path to Data Plane Microservices
 
Intel Knights Landing Slides
Intel Knights Landing SlidesIntel Knights Landing Slides
Intel Knights Landing Slides
 
Utilisation des capteurs dans les applications windows 8
Utilisation des capteurs dans les applications windows 8Utilisation des capteurs dans les applications windows 8
Utilisation des capteurs dans les applications windows 8
 
elcomsoft_catalog_en_2015
elcomsoft_catalog_en_2015elcomsoft_catalog_en_2015
elcomsoft_catalog_en_2015
 
LF_DPDK17_DPDK's best kept secret – Micro-benchmark performance tests
LF_DPDK17_DPDK's best kept secret – Micro-benchmark performance testsLF_DPDK17_DPDK's best kept secret – Micro-benchmark performance tests
LF_DPDK17_DPDK's best kept secret – Micro-benchmark performance tests
 
Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...
Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...
Unveiling the Early Universe with Intel Xeon Processors and Intel Xeon Phi at...
 

Similar to TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE team at - Tendências da junção entre Big Data Analytics, Machine Learning e Supercomputação

Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRaySoftware-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Intel® Software
 
Embree Ray Tracing Kernels
Embree Ray Tracing KernelsEmbree Ray Tracing Kernels
Embree Ray Tracing Kernels
Intel® Software
 
Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018
AWS User Group Bengaluru
 
Make your unity game faster, faster
Make your unity game faster, fasterMake your unity game faster, faster
Make your unity game faster, faster
Intel® Software
 
AIDC Summit LA- Hands-on Training
AIDC Summit LA- Hands-on Training AIDC Summit LA- Hands-on Training
AIDC Summit LA- Hands-on Training
Intel® Software
 
In The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for IntelIn The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for Intel
Intel® Software
 
MeeGo Overview DeveloperDay Munich
MeeGo Overview DeveloperDay MunichMeeGo Overview DeveloperDay Munich
MeeGo Overview DeveloperDay Munich
Intel Developer Zone Community
 
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel Software Brasil
 
High Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyHigh Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge Economy
Intel IT Center
 
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year HorizonGary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
AugmentedWorldExpo
 
FPGAs and Machine Learning
FPGAs and Machine LearningFPGAs and Machine Learning
FPGAs and Machine Learning
inside-BigData.com
 
Driving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to ExascaleDriving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to Exascale
Intel IT Center
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY
 
Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques
Ceph Community
 
Overcoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSDOvercoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSD
MongoDB
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?
Michelle Holley
 
Intel Technologies for High Performance Computing
Intel Technologies for High Performance ComputingIntel Technologies for High Performance Computing
Intel Technologies for High Performance Computing
Intel Software Brasil
 
Intel® Open Image Denoise in Unity*
Intel® Open Image Denoise in Unity*Intel® Open Image Denoise in Unity*
Intel® Open Image Denoise in Unity*
Intel® Software
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
tdc-globalcode
 
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
MAKERPRO.cc
 

Similar to TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE team at - Tendências da junção entre Big Data Analytics, Machine Learning e Supercomputação (20)

Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRaySoftware-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
 
Embree Ray Tracing Kernels
Embree Ray Tracing KernelsEmbree Ray Tracing Kernels
Embree Ray Tracing Kernels
 
Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018
 
Make your unity game faster, faster
Make your unity game faster, fasterMake your unity game faster, faster
Make your unity game faster, faster
 
AIDC Summit LA- Hands-on Training
AIDC Summit LA- Hands-on Training AIDC Summit LA- Hands-on Training
AIDC Summit LA- Hands-on Training
 
In The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for IntelIn The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for Intel
 
MeeGo Overview DeveloperDay Munich
MeeGo Overview DeveloperDay MunichMeeGo Overview DeveloperDay Munich
MeeGo Overview DeveloperDay Munich
 
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
 
High Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyHigh Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge Economy
 
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year HorizonGary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
 
FPGAs and Machine Learning
FPGAs and Machine LearningFPGAs and Machine Learning
FPGAs and Machine Learning
 
Driving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to ExascaleDriving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to Exascale
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
 
Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques
 
Overcoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSDOvercoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSD
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?
 
Intel Technologies for High Performance Computing
Intel Technologies for High Performance ComputingIntel Technologies for High Performance Computing
Intel Technologies for High Performance Computing
 
Intel® Open Image Denoise in Unity*
Intel® Open Image Denoise in Unity*Intel® Open Image Denoise in Unity*
Intel® Open Image Denoise in Unity*
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
 
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
 

More from tdc-globalcode

TDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidade
TDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidadeTDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidade
TDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidade
tdc-globalcode
 
TDC2019 Intel Software Day - ACATE - Cases de Sucesso
TDC2019 Intel Software Day - ACATE - Cases de SucessoTDC2019 Intel Software Day - ACATE - Cases de Sucesso
TDC2019 Intel Software Day - ACATE - Cases de Sucesso
tdc-globalcode
 
TDC2019 Intel Software Day - Otimizacao grafica com o Intel GPA
TDC2019 Intel Software Day - Otimizacao grafica com o Intel GPATDC2019 Intel Software Day - Otimizacao grafica com o Intel GPA
TDC2019 Intel Software Day - Otimizacao grafica com o Intel GPA
tdc-globalcode
 
TDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVino
TDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVinoTDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVino
TDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVino
tdc-globalcode
 
TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...
TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...
TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...
tdc-globalcode
 
TDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devicesTDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devices
tdc-globalcode
 
Trilha BigData - Banco de Dados Orientado a Grafos na Seguranca Publica
Trilha BigData - Banco de Dados Orientado a Grafos na Seguranca PublicaTrilha BigData - Banco de Dados Orientado a Grafos na Seguranca Publica
Trilha BigData - Banco de Dados Orientado a Grafos na Seguranca Publica
tdc-globalcode
 
Trilha .Net - Programacao funcional usando f#
Trilha .Net - Programacao funcional usando f#Trilha .Net - Programacao funcional usando f#
Trilha .Net - Programacao funcional usando f#
tdc-globalcode
 
TDC2018SP | Trilha Go - Case Easylocus
TDC2018SP | Trilha Go - Case EasylocusTDC2018SP | Trilha Go - Case Easylocus
TDC2018SP | Trilha Go - Case Easylocus
tdc-globalcode
 
TDC2018SP | Trilha Modern Web - Para onde caminha a Web?
TDC2018SP | Trilha Modern Web - Para onde caminha a Web?TDC2018SP | Trilha Modern Web - Para onde caminha a Web?
TDC2018SP | Trilha Modern Web - Para onde caminha a Web?
tdc-globalcode
 
TDC2018SP | Trilha Go - Clean architecture em Golang
TDC2018SP | Trilha Go - Clean architecture em GolangTDC2018SP | Trilha Go - Clean architecture em Golang
TDC2018SP | Trilha Go - Clean architecture em Golang
tdc-globalcode
 
TDC2018SP | Trilha Go - "Go" tambem e linguagem de QA
TDC2018SP | Trilha Go - "Go" tambem e linguagem de QATDC2018SP | Trilha Go - "Go" tambem e linguagem de QA
TDC2018SP | Trilha Go - "Go" tambem e linguagem de QA
tdc-globalcode
 
TDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendencia
TDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendenciaTDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendencia
TDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendencia
tdc-globalcode
 
TDC2018SP | Trilha .Net - Real Time apps com Azure SignalR Service
TDC2018SP | Trilha .Net - Real Time apps com Azure SignalR ServiceTDC2018SP | Trilha .Net - Real Time apps com Azure SignalR Service
TDC2018SP | Trilha .Net - Real Time apps com Azure SignalR Service
tdc-globalcode
 
TDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NET
TDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NETTDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NET
TDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NET
tdc-globalcode
 
TDC2018SP | Trilha .Net - Novidades do C# 7 e 8
TDC2018SP | Trilha .Net - Novidades do C# 7 e 8TDC2018SP | Trilha .Net - Novidades do C# 7 e 8
TDC2018SP | Trilha .Net - Novidades do C# 7 e 8
tdc-globalcode
 
TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...
TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...
TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...
tdc-globalcode
 
TDC2018SP | Trilha .Net - .NET funcional com F#
TDC2018SP | Trilha .Net - .NET funcional com F#TDC2018SP | Trilha .Net - .NET funcional com F#
TDC2018SP | Trilha .Net - .NET funcional com F#
tdc-globalcode
 
TDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor em .Net Core
TDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor  em .Net CoreTDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor  em .Net Core
TDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor em .Net Core
tdc-globalcode
 
TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1
TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1
TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1
tdc-globalcode
 

More from tdc-globalcode (20)

TDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidade
TDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidadeTDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidade
TDC2019 Intel Software Day - Visao Computacional e IA a servico da humanidade
 
TDC2019 Intel Software Day - ACATE - Cases de Sucesso
TDC2019 Intel Software Day - ACATE - Cases de SucessoTDC2019 Intel Software Day - ACATE - Cases de Sucesso
TDC2019 Intel Software Day - ACATE - Cases de Sucesso
 
TDC2019 Intel Software Day - Otimizacao grafica com o Intel GPA
TDC2019 Intel Software Day - Otimizacao grafica com o Intel GPATDC2019 Intel Software Day - Otimizacao grafica com o Intel GPA
TDC2019 Intel Software Day - Otimizacao grafica com o Intel GPA
 
TDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVino
TDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVinoTDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVino
TDC2019 Intel Software Day - Deteccao de objetos em tempo real com OpenVino
 
TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...
TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...
TDC2019 Intel Software Day - OpenCV: Inteligencia artificial e Visao Computac...
 
TDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devicesTDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devices
 
Trilha BigData - Banco de Dados Orientado a Grafos na Seguranca Publica
Trilha BigData - Banco de Dados Orientado a Grafos na Seguranca PublicaTrilha BigData - Banco de Dados Orientado a Grafos na Seguranca Publica
Trilha BigData - Banco de Dados Orientado a Grafos na Seguranca Publica
 
Trilha .Net - Programacao funcional usando f#
Trilha .Net - Programacao funcional usando f#Trilha .Net - Programacao funcional usando f#
Trilha .Net - Programacao funcional usando f#
 
TDC2018SP | Trilha Go - Case Easylocus
TDC2018SP | Trilha Go - Case EasylocusTDC2018SP | Trilha Go - Case Easylocus
TDC2018SP | Trilha Go - Case Easylocus
 
TDC2018SP | Trilha Modern Web - Para onde caminha a Web?
TDC2018SP | Trilha Modern Web - Para onde caminha a Web?TDC2018SP | Trilha Modern Web - Para onde caminha a Web?
TDC2018SP | Trilha Modern Web - Para onde caminha a Web?
 
TDC2018SP | Trilha Go - Clean architecture em Golang
TDC2018SP | Trilha Go - Clean architecture em GolangTDC2018SP | Trilha Go - Clean architecture em Golang
TDC2018SP | Trilha Go - Clean architecture em Golang
 
TDC2018SP | Trilha Go - "Go" tambem e linguagem de QA
TDC2018SP | Trilha Go - "Go" tambem e linguagem de QATDC2018SP | Trilha Go - "Go" tambem e linguagem de QA
TDC2018SP | Trilha Go - "Go" tambem e linguagem de QA
 
TDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendencia
TDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendenciaTDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendencia
TDC2018SP | Trilha Mobile - Digital Wallets - Seguranca, inovacao e tendencia
 
TDC2018SP | Trilha .Net - Real Time apps com Azure SignalR Service
TDC2018SP | Trilha .Net - Real Time apps com Azure SignalR ServiceTDC2018SP | Trilha .Net - Real Time apps com Azure SignalR Service
TDC2018SP | Trilha .Net - Real Time apps com Azure SignalR Service
 
TDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NET
TDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NETTDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NET
TDC2018SP | Trilha .Net - Passado, Presente e Futuro do .NET
 
TDC2018SP | Trilha .Net - Novidades do C# 7 e 8
TDC2018SP | Trilha .Net - Novidades do C# 7 e 8TDC2018SP | Trilha .Net - Novidades do C# 7 e 8
TDC2018SP | Trilha .Net - Novidades do C# 7 e 8
 
TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...
TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...
TDC2018SP | Trilha .Net - Obtendo metricas com TDD utilizando build automatiz...
 
TDC2018SP | Trilha .Net - .NET funcional com F#
TDC2018SP | Trilha .Net - .NET funcional com F#TDC2018SP | Trilha .Net - .NET funcional com F#
TDC2018SP | Trilha .Net - .NET funcional com F#
 
TDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor em .Net Core
TDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor  em .Net CoreTDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor  em .Net Core
TDC2018SP | Trilha .Net - Crie SPAs com Razor e C# usando Blazor em .Net Core
 
TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1
TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1
TDC2018SP | Trilha .Net - Novidades do ASP.NET Core 2.1
 

Recently uploaded

Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
National Information Standards Organization (NISO)
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
Nicholas Montgomery
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Excellence Foundation for South Sudan
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
Nicholas Montgomery
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
Celine George
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
Himanshu Rai
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Fajar Baskoro
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
imrankhan141184
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
AyyanKhan40
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
Wahiba Chair Training & Consulting
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
HajraNaeem15
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
amberjdewit93
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 

Recently uploaded (20)

Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 

TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE team at - Tendências da junção entre Big Data Analytics, Machine Learning e Supercomputação

  • 1. Globalcode – Open4education Trilha – Machine Learning Tendências da junção entre Big Data Analytics, Machine Learning e Supercomputação Igor Freitas Intel – Software and Services Group
  • 2. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 2 Legal Notices and Disclaimers INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS. Intel may make changes to specifications and product descriptions at any time, without notice. All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance Intel, the Intel logo, Xeon and Xeon logo , Xeon Phi and Xeon Phi logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries Other names and brands may be claimed as the property of others. All products, dates, and figures are preliminary and are subject to change without any notice. Copyright © 2016, Intel Corporation. This document contains information on products in the design phase of development.
  • 3. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 3 Optimization Notice Optimization Notice Intel® compilers, associated libraries and associated development tools may include or utilize options that optimize for instruction sets that are available in both Intel® and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition, certain compiler options for Intel compilers, including some that are not specific to Intel micro-architecture, are reserved for Intel microprocessors. For a detailed description of Intel compiler options, including the instruction sets and specific microprocessors they implicate, please refer to the “Intel® Compiler User and Reference Guides” under “Compiler Options." Many library routines that are part of Intel® compiler products are more highly optimized for Intel microprocessors than for other microprocessors. While the compilers and libraries in Intel® compiler products offer optimizations for both Intel and Intel- compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors. Intel® compilers, associated libraries and associated development tools may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel® SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. While Intel believes our compilers and libraries are excellent choices to assist in obtaining the best performance on Intel® and non-Intel microprocessors, Intel recommends that you evaluate other compilers and libraries to determine which best meet your requirements. We hope to win your business by striving to offer the best performance of any compiler or library; please let us know if you find we do not. Notice revision #20101101
  • 4. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ 4 Big Data Analytics HPC != Big Data Analytics and Artificial Intelligence ? *Other brands and names are the property of their respective owners. FORTRAN / C++ Applications MPI High Performance Java, Python, Go, etc.* Applications Hadoop* Simple to Use SLURM Supports large scale startup YARN* More resilient of hardware failures Lustre* Remote Storage HDFS*, SPARK* Local Storage Compute & Memory Focused High Performance Components Storage Focused Standard Server Components Server Storage SSDs Switch Fabric Infrastructure Modelo de Programação Resource Manager Sistema de arquivos Hardware Server Storage HDDs Switch Ethernet Infrastructure
  • 5. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ 5 Trends in HPC + Big Data Analytics Standards Business viability Performance Code Modernization (Vector instructions) Many-core FPGA, ASICs Usability Faster time-to-market Lower costs (HPC at Cloud ? ) Better products Easy to mantain HW & SW Portability Open Commom Environments Integrated solutions: Storage + Network + Processing + Memory Public investments
  • 6. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ Varied Resource Needs Typical HPC Workloads Typical Big Data Workloads 6 Big Data Analytics HPC in real time Small Data + Small Compute e.g. Data analysis Big Data + Small Compute e.g. Search, Streaming, Data Preconditioning Small Data + Big Compute e.g. Mechanical Design, Multi-physics Data Compute High Frequency Trading Numeric Weather Simulation Oil & Gas Seismic Systemcostbalance Video Survey Traffic Monitor Personal Digital Health Systemcostbalance Processor Memory Interconnect Storage
  • 7. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 7#IntelAI WhatisMachineLearning?  Random Forest  Support Vector Machines  Regression  Naïve Bayes  Hidden Markov  K-Means Clustering  Ensemble Methods  More… ClassicML Using optimized functions or algorithms to extract insights from data Training Data* Inference, Clustering, or Classification New Data* Deeplearning Using massive labeled data sets to train deep (neural) graphs that can make inferences about new data Step 1: Training Use massive labeled dataset (e.g. 10M tagged images) to iteratively adjust weighting of neural network connections Step 2: Inference Form inference about new input data (e.g. a photo) using trained neural network Hours to Days in Cloud Real-Time at Edge/Cloud New Data Untrained Trained Algorithms CNN, RNN, RBM.. *Note: not all classic machine learning functions require training
  • 8. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 8#IntelAI ArtificialIntelligenceOpportunity Intel Confidential AI 40% Classic Machine Learning Deep Learning 60% 2016 Servers 7% 97% Intel Architecture (IA) 1% IA+ GPU 2% Other 91% 7% Intel Architecture (IA) 2% IA+ GPU Other AI is the fastest growing data center workload
  • 9. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 9#IntelAI ArtificialIntelligencePlan Intel Confidential Bringing the HPC Strategy to AI Most widely deployed machine learning solution High performance, classic machine learning LAKE CREST Best in class neural network performance Programmable, low-latency inference Top 500 % FLOPs Jun'10 Nov '12 Nov '16 35% 5% 60% 93% 7% 16% 20% 64% Nvidia intro Xeon Phi intro November ‘16 Nvidia* Intel® Xeon Phi™ Xeon % 0 COMING 2017 KNIGHTSMILL NOW SKYLAKE COMING 2017 LAKECREST SDVS SHIPPING TODAY BROADWELL+ARRIA10 Intel® Nervana™ Portfolio
  • 10. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice libraries Intel® Math Kernel Library (MKL, MKL-DNN) platforms Frameworks Intel® Data Analytics Acceleration Library (DAAL) hardware Memory & Storage NetworkingCompute Intel Python Distribution Mllib BigDL Intel® Nervana™ Graph* experiences Intel® Nervana™ Cloud & Appliance Intel®Nervana™portfolio Intel® Nervana™ DL Studio Intel® Computer Vision SDK Movidius Fathom More *Future Other names and brands may be claimed as the property of others. *
  • 11. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 11#IntelAI BIGDL Bringing Deep Learning to Big Data github.com/intel-analytics/BigDL  Open Sourced Deep Learning Library for Apache Spark*  Make Deep learning more Accessible to Big data users and data scientists.  Feature Parity with popular DL frameworks like Caffe, Torch, Tensorflow etc.  Easy Customer and Developer Experience  Run Deep learning Applications as Standard Spark programs;  Run on top of existing Spark/Hadoop clusters (No Cluster change)  High Performance powered by Intel MKL and Multi-threaded programming.  Efficient Scale out leveraging Spark architecture. Spark Core SQL SparkR Stream- ing MLlib GraphX ML Pipeline DataFrame BigDL For developers looking to run deep learning on Hadoop/Spark due to familiarity or analytics use
  • 12. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 12#IntelAI Intel®Nervana™porftolio(Detail) Batch Manybatchmodels Train machine learning models across a diverse set of dense and sparse data Train large deep neural networks Train large models as fast as possible LAKE CREST Stream Edge Infer billions of data samples at a time and feed applications within ~1 day Infer deep data streams with low latency in order to take action within milliseconds Power-constrained environments … Training inference Batch or other Intel® edge processor OR OR OR Option for higher throughput/watt *Future* Required for low latency
  • 13. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 0 5 10 15 20 25 30 Intel Xeon E5-2699v4 Intel Xeon E5-2699v4 Intel Xeon E5-2699v4 Intel Xeon Phi 7250 Out-of-the-box +Intel MKL 11.3.3 +Intel MKL 2017 +Intel MKL 2017 Performancespeedup Caffe/AlexNet single node inference performance 2.2x 1.9x 13 Better performance in Deep Neural Network workloads with Intel® Math Kernel Library (Intel® MKL) Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance . *Other names and brands may be property of others Configurations: • 2 socket system with Intel® Xeon® Processor E5-2699 v4 (22 Cores, 2.2 GHz,), 128 GB memory, Red Hat* Enterprise Linux 6.7, BVLC Caffe, Intel Optimized Caffe framework, Intel® MKL 11.3.3, Intel® MKL 2017 • Intel® Xeon Phi™ Processor 7250 (68 Cores, 1.4 GHz, 16GB MCDRAM), 128 GB memory, Red Hat* Enterprise Linux 6.7, Intel® Optimized Caffe framework, Intel® MKL 2017 All numbers measured without taking data manipulation into account. 7.5x 31x
  • 14. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice Skt-Learn* Optimizations With Intel® MKL 0x 1x 2x 3x 4x 5x 6x 7x 8x 9x Approximate neighbors Fast K-means GLM GLM net LASSO Lasso path Least angle regression, OpenMP Non-negative matrix factorization Regression by SGD Sampling without replacement SVD Speedups of Scikit-Learn Benchmarks Intel® Distribution for Python* 2017 Update 1 vs. system Python & NumPy/Scikit-Learn System info: 32x Intel® Xeon® CPU E5-2698 v3 @ 2.30GHz, disabled HT, 64GB RAM; Intel® Distribution for Python* 2017 Gold; Intel® MKL 2017.0.0; Ubuntu 14.04.4 LTS; Numpy 1.11.1; scikit-learn 0.17.1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by Effect of Intel MKL optimizations for NumPy* and SciPy*
  • 15. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 15#IntelAI Intel®Arria®10FPGAperformance Deep Learning Image Classification SINGLE-NODE Inference Performance Efficiency * Xeon with Integrated FPGA refers to Broadwell Platform Proof of Concept Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance . *Other names and brands may be property of others. Configurations: - AlexNet1 Classification Implementation as specified by http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf, Training Parameters taken from Caffe open-source Framework are 224x224x3 Input, 1000x1 Output, FP16 with Shared Block Exponents, All compute layers (incl. Fully Connected) done on the FPGA except for Softmax - Arria 10-1150 FPGA, -1 Speed Grade on Altera PCIe DevKit with x72 DDR4 @ 1333 MHz, Largest mid-range part from newest family (1518 DSPs, 2800 M20Ks), Power measured through on-board power monitor (FPGA POWER ONLY) ACDS 16.1 Internal Builds + OpenCL SDK 16.1 Internal Build Host system: HP Z620 Workstation, Xeon E5-1660 at 3.3 GHz with 32GB RAM. The Xeon is not used for compute. - AlexNet: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf , Batch Size:256 2.From GTC Keynote System Throughput Power Throughput / Watt Arria® 10-115 (FP32, Full Size Image, Speed @306Mhz)1 575 img/s ~31W 18.5 img/s/W Arria® 10-115 (FP16, Full Size Image, Speed @297Mhz)1 1020 img/s ~40W 25.5 img/s/W Nvidia* M42 20 img/s/w Topology Datatype Throughput* AlexNet* FloatP32 ~360 Image/s FixedP16 ~600 Image/s FixedP8 ~750 Images/s FixedP8 Winograd ~1200 Images/s + Integrated* discrete Note: Intel® Xeon® processor performance is not included in tables
  • 16. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 16  LeTV Cloud (www.lecloud.com) is a leading video cloud provider in China  LeTV Cloud provides illegal video detection service to 3rd party video cloud customers to help them detect illegal videos  Originally, LeTV adopted open source BVLC Caffe plus OpenBlas as CNN framework, but the performance was poor  By using Caffe + Intel MKL, they gained up to 30x performance improvement on training in production environment Case Study: LeTV Cloud Illegal Video Detection * The test data is based on Intel Xeon E5 2680 V3 processor 1 30 0 5 10 15 20 25 30 35 BVLC Caffe + OpenBlas(Baseline) IntelCaffe + MKL LeTV Cloud Illegal Video Detection Process Flow Video Pictures Extract Keyframe Input Classification CNN Network based on Caffe Video Classification … LeTV Cloud Caffe Optimization - higher is better Other names and brands may be claimed as property of others. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance
  • 17. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ Identificando oportunidades de otimização Recapitulando o que é vetorização / SIMD 18 for (i=0;i<=MAX;i++) c[i]=a[i]+b[i]; + c[i+7] c[i+6] c[i+5] c[i+4] c[i+3] c[i+2] c[i+1] c[i] b[i+7] b[i+6] b[i+5] b[i+4] b[i+3] b[i+2] b[i+1] b[i] a[i+7] a[i+6] a[i+5] a[i+4] a[i+3] a[i+2] a[i+1] a[i] Vector - Uma instrução - Oito operações + C B A Scalar - Uma instrução - Uma operação • O que é e ? • Capacidade de realizar uma operação matemática em dois ou mais elementos ao mesmo tempo. • Por que Vetorizar ? • Ganho substancial em performance ! https://www.intelnervana.com/framework-optimizations/ https://colfaxresearch.com/knl-avx512/
  • 21. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ 22 Get started today Frameworks optimized for Intel Build Faster Deep Learning Applications with Caffe* The popular open-source development framework for image recognition is now optimized for Intel® Architecture. Get the framework Learn how to install Caffe* Delving Into Deep Learning The Python library designed to help write deep learning models is now optimized for Intel® Architecture. Visit the library Getting started with Theano* Speed Up Your Spark Analytics Apache Spark* MLlib, the open-source data processing framework’s machine learning library, now includes Intel® Architecture support. Get the library Building faster applications on Spark clusters BigDL on GitHub BigDL on Spark video Running on EC2 Page BigDL: Distributed Deep learning on Apache Spark
  • 22. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ 23 Get started today Get Intel® Libraries (Community License) for Free Intel® Data Analytics Acceleration Library Intel® DAAL Highly optimized library that helps speed big data analytics by providing algorithmic building blocks for all data analysis stages and for offline, streaming, and distributed analytics usages. Open-source options for Intel® DAAL Learn more about Intel® DAAL Intel® Math Kernel Library Intel® MKL A high-performance library with assets to help accelerate math processing routines and increase application performance. Deep neural network technical preview for Intel® MKL Get the library
  • 23. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ 24 Training General Purpose Infrastructure Workload optimized • Accelerating Deep Learning and Machine Learning This talk focuses on two Intel performance libraries, MKL and DAAL, which offer optimized building blocks for data analytics and machine learning. • Remove Python Performance Barriers for Machine Learning This webinar highlights significant performance speed-ups achieved by implementing multiple Intel tools and techniques for high performance Python. • Analyze Python* App Performance with Intel® VTune™ Amplifier Efficient profiling techniques can help dramatically improve the performance of your Python* code. Learn how Intel® VTune Amplifier can help. Boost Python* Performance with Intel® Math Kernel Library Meet Intel® Distribution for Python*, an easy-to-install, optimized Python distribution that can help you optimize your app’s performance. Building Faster Data Applications on Spark* Clusters Apache Spark* is big for big data processing apps. Intel® Data Analytics Acceleration Library (Intel® DAAL) can help optimize performance. Learn how. Faster Big Data Analytics Using New Intel® Data Analytics Acceleration Library Big data is BIG. And you need information faster. New Intel® Data Analytics Acceleration Library (Intel® DAAL) speeds data processing for data mining, statistical analysis, and machine learning.
  • 24. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 25 Resources Intel® Machine Learning • http://www.intel.com/content/www/us/en/analytics/machine-learning/overview.html • Intel Caffe* fork, https://github.com/intelcaffe/caffe • Intel Theano* fork, https://github.com/intel/theano • Intel® Deep Learning SDK: https://software.intel.com/deep-learning-SDK Intel® DAAL • https://software.intel.com/en-us/intel-daal Intel® Omni-Path Architecture • http://www.intel.com/content/www/us/en/high-performance-computing-fabrics/omni-path- architecture-fabric-overview.html
  • 25. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 26 Intel(R) MKL Resources Intel® MKL website • https://software.intel.com/en-us/intel-mkl Intel® MKL forum • https://software.intel.com/en-us/forums/intel-math-kernel-library Intel® MKL benchmarks • https://software.intel.com/en-us/intel-mkl/benchmarks# Intel® MKL link line advisor • http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/
  • 26. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 27#IntelAI Customerexamples health Finance industrial Health Finance industrial Early Tumor Detection Leading medical imaging company  Early detection of malignant tumors in mammograms  Millions of “Diagnosed” Mammograms  Deep Learning (CNN) tumor image recognition  Higher accuracy and earlier breast cancer detection Data Synthesis Financial services institution with >$750B assets  Parse info to reduce portfolio manager time to insight  Vast stores of documents (news, emails, research, social)  Deep Learning (RNN w/ encoder/decoder)  Faster and more informed invest-ment decisions Smart Agriculture World leader in agricultural biotech  Accelerate hybrid plant development  Large dataset of hybrid plant performance based on genotype markers  Deep Learning (CNN) to detect favorable interactions between genotype markers  More accurate selection leading to cost reduction Personalized Care Renowned US Hospital system  Accurately diagnose fatal heart conditions  10,000 health attributes used  Saffron memory-based reasoning  Increased accuracy to 94% compared with 54% for average cardiologist Customer Personalization Leading Insurance Group  Increase product recommendation accuracy  5 Product Levels 1,353 Products 12M Members  Saffron memory-based reasoning  50% increase in product recommendation accuracy Supply Chain Logistics Multinational Aerospace Corp  Reduce time to select aircraft replacement part  15,000 Aircraft $1M/day idle  Saffron memory-based reasoning  Reduced part selection from 4 hours to 5 mins Deep Learning Memory-based Reasoning
  • 27. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 28#IntelAI Intel®Nervana™Platform For Deep Learning Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance COMPARED TO TODAY’S FASTEST SOLUTION1 100XReductionDelivering Intimetotrain by2020 KnightsCrest Bootable Intel Xeon Processor with integrated acceleration LakeCrest Discrete accelerator First silicon 1H’2017
  • 28. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 29#IntelAI Hardware for DL Workloads  Custom-designed for deep learning  Unprecedented compute density  More raw computing power than today’s state-of-the-art GPUs Blazingly Fast Data Access  32 GB of in package memory via HBM2 technology  8 Tera-bits/s of memory access speed High Speed Scalability  12 bi-directional high-bandwidth links  Seamless data transfer via interconnects Lakecrest Deep Learning by Design Add-in card for unprecedented compute density in deep learning centric environments Lake Crest Everything needed for deep learning and nothing more!
  • 29. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 30#IntelAI Removing IO and Memory Barriers  Integrated Intel® Omni-Path fabric increases price-performance and reduces communication latency  Direct access of up to 400 GB of memory with no PCIe performance lag (vs. GPU:16GB) Breakthrough Highly Parallel Performance  Up to 400X deep learning performance on existing hardware via Intel software optimizations  Up to 4Xdeep learning performance increase estimated (Knights Mill, 2017) Easier Programmability  Binary-compatible with Intel® Xeon® processors  Open standards, libraries and frameworks Intel®XeonPhi™ProcessorFamily Enables Shorter Time to Train Using General Purpose Infrastructure Processor for HPC & enterprise customers running scale-out, highly-parallel, memory intensive apps Configuration details on slide: 30 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016 Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804
  • 30. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 31#IntelAI Energy Efficient Inference with Infrastructure Flexibility  Excellent energy efficiency up to 25images/sec/watt inference on Caffe/Alexnet  Reconfigurable accelerator can be used for variety of data center workloads  Integrated FPGA with Intel® Xeon® processor fits in standard server infrastructure -OR- Discrete FPGA fits in PCIe card and embedded applications* Intel®Arria®10FPGA Superior Inference Capabilities Add-in card for higher performance/watt inference with low latency and flexible precision *Xeon with Integrated FPGA refers to Broadwell Proof of Concept Configuration details on slide: 44 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016 Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804
  • 31. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 32#IntelAI Energy Efficient Inference  Accelerates image recognition using convolutional neural networks (CNN)  Excellent energy efficient inference up to 25 images/s/w on Caffe/Alexnet  Fits in standard server infrastructure* Accelerate Time to Market  Simplify deployment with preloaded optimized CNN algorithms  Integrated software ecosystem: optimized libraries, frameworks and APIs CanyonVista Turnkey Deep Learning Inference Solution Pre-configured add-in card for higher performance/watt inference for image recognition *Standard server infrastructure – verified with Broadwell platform Configuration details on slide: 44 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016 Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804
  • 32. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 33#IntelAI Lowest TCO With Superior Infrastructure Flexibility  Standard server infrastructure  Open standards, libraries & frameworks  Optimized to run wide variety of data center workloads Server Class Reliability  Industry standard server features: high reliability, hardware enhanced security Leadership Throughput  Industry leading inference performance  Up to 18Xperformance on existing hardware via Intel software optimizations Intel®Xeon®ProcessorFamily Most Widely Deployed Machine Learning Platform (97% share)* Processor optimized for a wide variety of datacenter workloads enabling flexible infrastructure Configuration details on slide: 30 *Intel® Xeon® processors are used in 97% of servers that are running machine learning workloads today (Source: Intel) Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit: http://www.intel.com/performance Source: Intel measured as of November 2016 Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804
  • 33. Intel® Math Kernel Library Intel® Data Analytics Acceleration Library (DAAL) Intel® Distribution for Open Source Frameworks Intel Deep Learning Tools Intel® MKL MKL-DNN High Level Overview High performance math primitives granting low level of control Free open source DNN functions for high-velocity integration with deep learning frameworks Broad data analytics acceleration object oriented library supporting distributed ML at the algorithm level Most popular and fastest growing language for machine learning Toolkits driven by academia and industry for training machine learning algorithms Accelerate deep learning model design, training and deployment Primary Audience Consumed by developers of higher level libraries and Applications Consumed by developers of the next generation of deep learning frameworks Wider Data Analytics and ML audience, Algorithm level development for all stages of data analytics Application Developers and Data Scientists Machine Learning App Developers, Researchers and Data Scientists. Application Developers and Data Scientists Example Usage Framework developers call matrix multiplication, convolution functions New framework with functions developers call for max CPU performance Call distributed alternating least squares algorithm for a recommendation system Call scikit-learn k-means function for credit card fraud detection Script and train a convolution neural network for image recognition Deep Learning training and model creation, with optimization for deployment on constrained end device Mor e…
  • 34. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice Intel®DeepLearningSDK Accelerate Deep Learning Development  FREE for data scientists and software developers to develop, train & deploy deep learning  Simplify installation of Intel optimized frameworks and libraries  Increase productivity through simple and highly-visual interface  Enhance deployment through model compression and normalization  Facilitate integration with full software stack via inference engine For developers looking to accelerate deep learning model design, training & deployment software.intel.com/deep-learning-sdk
  • 35. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice BIGDL Bringing Deep Learning to Big Data github.com/intel-analytics/BigDL  Open Sourced Deep Learning Library for Apache Spark*  Make Deep learning more Accessible to Big data users and data scientists.  Feature Parity with popular DL frameworks like Caffe, Torch, Tensorflow etc.  Easy Customer and Developer Experience  Run Deep learning Applications as Standard Spark programs;  Run on top of existing Spark/Hadoop clusters (No Cluster change)  High Performance powered by Intel MKL and Multi-threaded programming.  Efficient Scale out leveraging Spark architecture. Spark Core SQL SparkR Stream- ing MLlib GraphX ML Pipeline DataFrame BigDL For developers looking to run deep learning on Hadoop/Spark due to familiarity or analytics use
  • 36. Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice https://www.intelnervana.com/framework-optimizations/ Inteldistributionforpython Advancing Python Performance Closer to Native Speeds software.intel.com/intel-distribution-for-python Easy, Out-of-the-box Access to High Performance Python  Prebuilt, optimized for numerical computing, data analytics, HPC  Drop in replacement for your existing Python (no code changes required) Drive Performance with Multiple Optimization Techniques  Accelerated NumPy/SciPy/Scikit- Learn with Intel® MKL  Data analytics with pyDAAL, enhanced thread scheduling with TBB, Jupyter* Notebook interface, Numba, Cython  Scale easily with optimized MPI4Py and Jupyter notebooks Faster Access to Latest Optimizations for Intel Architecture  Distribution and individual optimized packages available through conda and Anaconda Cloud  Optimizations upstreamed back to main Python trunk For developers using the most popular and fastest growing programming language for AI Easy to install with Anaconda* https://anaconda.org/intel/
  • 37. Hardware-Specific Transformers Hardware Agnostic Intel® Nervana™ Graph Compiler Intel® Nervana™ Graph Compiler enables optimizations that are applicable across multiple HW targets. • Efficient buffer allocation • Training vs inference optimizations • Efficient scaling across multiple nodes • Efficient partitioning of subgraphs • Compounding of ops MKL-DNN Customer Solution Neon Solutions Neon ModelsCustomer Models Neon Deep Learning Functions Customer Algorithms High-levelexecutiongraphforneuralnetworks
  • 38. Intel®Nervana™ AIAcademy SoftwareinnovatorsandBlackBelts software.intel.com/ai Frameworks,ToolsandLibraries Workshops,webinars,meetups in partnership with honeyourskillsandbuildthefutureofai *Other names and brands may be claimed as the property of others.
  • 39. Intel Confidential | NDA Required DataCenterGroup 40 Optimized Mathematical Building Blocks Intel® MKL Linear Algebra • BLAS • LAPACK • ScaLAPACK • Sparse BLAS • PARDISO* SMP & Cluster • Iterative sparse solvers Vector RNGs • Congruential • Wichmann-Hill • Mersenne Twister • Sobol • Neiderreiter • Non-deterministic Fast Fourier Transforms • Multidimensional • FFTW interfaces • Cluster FFT Summary Statistics • Kurtosis • Variation coefficient • Order statistics • Min/max • Variance-covariance Vector Math • Trigonometric • Hyperbolic • Exponential • Log • Power • Root And More • Splines • Interpolation • Trust Region • Fast Poisson Solver *Other names and brands may be claimed as property of others.
  • 40. Intel Confidential | NDA Required DataCenterGroup 41@IntelAI Intel®MKL-DNN Math Kernel Library for Deep Neural Networks Distribution Details  Open Source  Apache 2.0 License  Common DNN APIs across all Intel hardware.  Rapid release cycles, iterated with the DL community, to best support industry framework integration.  Highly vectorized & threaded for maximal performance, based on the popular Intel® MKL library. For developers of deep learning frameworks featuring optimized performance on Intel hardware github.com/01org/mkl-dnn Direct 2D Convolution Rectified linear unit neuron activation (ReLU) Maximum pooling Inner product Local response normalization (LRN)
  • 41. Intel Confidential | NDA Required DataCenterGroup 42@IntelAI Intel®Machinelearningscalinglibrary(MLSL) Scaling Deep Learning to 32 Nodes and Beyond Deep learning abstraction of message- passing implementation  Built on top of MPI; allows other communication libraries to be used as well  Optimized to drive scalability of communication patterns  Works across various interconnects: Intel® Omni-Path Architecture, InfiniBand, and Ethernet  Common API to support Deep Learning frameworks (Caffe, Theano, Torch etc.) For maximum deep learning scale-out performance on Intel® architecture FORWARD PROP BACK PROP L A Y E R 1 L A Y E R 2 L A Y E R N Allreduce Alltoall Allreduce Allreduce Reduce Scatter AllgatherAlltoall Allreduce github.com/01org/MLSL/releases