This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/cnU6sqd31JU
Developing meaningful AI applications requires complete data lifecycle management. Sourcing, harvesting and labelling data, and ensuring the conduit to the data structures and repositories that feed models, are critical for model accuracy, yet this is one of the least talked-about subjects. Intel's optimized technologies enable the efficient delivery of complete data samples to develop (and deploy) meaningful outcomes. During this session, we'll review the considerations and criticality of data lifecycle management for the AI production pipeline.
Bio: Meg brings more than 17 years of global product, engineering and solutions experience. She is presently a Solutions Architect with Intel Corporation specializing in Visual Compute and AAI (Analytics and AI) Architecture. She is passionate about the potential for technology to improve the quality of people's lives and humanity as a whole.
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor... - Amazon Web Services Korea
This document discusses the democratization of data science and machine learning using automated machine learning tools. It provides examples of how DataRobot has helped customers in various industries build predictive models faster and with less coding than traditional approaches. Specifically, it summarizes how DataRobot has helped customers in banking, insurance, retail, and other industries with use cases like predictive maintenance, sales forecasting, fraud detection, customer churn prediction, and insurance underwriting.
Top 10 ways BigInsights BigIntegrate and BigQuality will improve your life - IBM Analytics
BigIntegrate and BigQuality offer 10 ways to improve an organization's ability to leverage Hadoop by providing cost-effective data integration and quality capabilities that eliminate hand coding, improve performance, ensure scalability and reliability, and increase productivity when working with Hadoop data.
Big Data & Analytics continues to redefine business. Data has transitioned from an underused asset to the lifeblood of the organisation, and a critical component of business intelligence, insight and strategy.
Big Data Scotland is the largest annual data analytics conference held in Scotland: it is supported by ScotlandIS and The Data Lab and free for delegates to attend. The conference is geared towards senior technologists and business leaders and aims to provide a unique forum for knowledge exchange, discussion and cross-pollination.
The programme will explore the evolution of data analytics; looking at key tools and techniques and how these can be applied to deliver practical insight and value. Presentations will span a wide array of topics from Data Wrangling and Visualisation to AI, Chatbots and Industry 4.0.
Key Topics
• Tools and techniques
• Corporate data culture, business processes, digital transformation
• Business intelligence, trends, decision making
• AI, Real-time Analytics, IoT, Industry 4.0, Robotics
• Security, regulation, privacy, consent, anonymization
• Data visualisation, interpretation and communication
• CRM and Personalisation
SAP provides analytics solutions to help customers run their businesses better by accessing all relevant information, defining plans to align performance goals, and responding instantly to changing conditions. Their solutions provide capabilities for data analysis, business intelligence applications, and collaboration. Customers in various industries and regions have been able to increase financial and operational performance through SAP's analytics offerings.
Pivotal Digital Transformation Forum: Becoming a Data Driven Enterprise - VMware Tanzu
Next Steps in Your Digital Transformation
This session brings together all the lessons learnt throughout the day and shares with you practical advice on how to get started with, or accelerate, your journey to become a digital business.
DataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
Title
DataOps, the secret weapon for delivering AI, data science, and business intelligence value at speed.
Synopsis
● According to recent research, just 7.3% of organisations say the state of their data and analytics is excellent, and only 22% of companies are currently seeing a significant return from data science expenditure.
● Poor returns on data & analytics investment are often the result of applying 20th-century thinking to 21st-century challenges and opportunities.
● Modern data science and analytics require secure, efficient processes to turn raw data from multiple sources and in numerous formats into useful inputs to a data product.
● Developing, orchestrating and iterating modern data pipelines is an extremely complex process requiring multiple technologies and skills.
● Other domains have successfully overcome the challenge of delivering high-quality products at speed in complex environments. DataOps applies proven agile principles, lean thinking and DevOps practices to the development of data products.
● A DataOps approach aligns data producers, analytical data consumers, processes and technology with the rest of the organisation and its goals.
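The "secure, efficient processes" the synopsis calls for usually start with automated quality gates between pipeline stages. As an illustration only (not from the talk), here is a minimal sketch of a DataOps-style validation gate that a pipeline might run before raw records reach a data product; the schema and rules are hypothetical:

```python
# Hypothetical DataOps validation gate: raw records from multiple sources
# are checked against a declared schema before entering the pipeline.
EXPECTED_SCHEMA = {"user_id": int, "amount": float, "country": str}

def validate(record):
    """Return a list of quality issues for one raw record (empty = clean)."""
    issues = []
    for field, ftype in EXPECTED_SCHEMA.items():
        if field not in record:
            issues.append(f"missing field: {field}")
        elif not isinstance(record[field], ftype):
            issues.append(f"bad type for {field}: {type(record[field]).__name__}")
    return issues

def gate(records):
    """Split a batch into clean rows and quarantined rows with reasons."""
    clean, quarantined = [], []
    for r in records:
        issues = validate(r)
        (clean if not issues else quarantined).append((r, issues))
    return clean, quarantined

batch = [
    {"user_id": 1, "amount": 9.99, "country": "GB"},
    {"user_id": "2", "amount": 5.0, "country": "PL"},  # wrong type for user_id
    {"user_id": 3, "country": "DE"},                   # missing amount
]
clean, bad = gate(batch)
print(len(clean), len(bad))  # 1 clean record, 2 quarantined
```

The point of the sketch is the alignment idea in the last bullet: producers and consumers agree on one declared contract, and violations are quarantined with reasons rather than silently propagated downstream.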
H2O.ai provides open source machine learning platforms and enterprise AI solutions that help companies implement artificial intelligence. It offers tools for data scientists to build models using Python and R and also provides support services to help customers successfully deploy models in production. H2O.ai aims to democratize AI and help companies become AI-driven by leveraging its experts, community knowledge, and world-class technology.
Open source Apache Hadoop is a great framework for distributed processing of large data sets. But there's a difference between "playing" with big data and solving real problems. The reality is that Hadoop alone is not enough. In fact, almost every organization that plans to use Hadoop in production quickly discovers that it lacks the features required for enterprise use, and few have the Hadoop specialists on hand to navigate the complexity and build reliable, robust applications. As a result, many Hadoop projects never make it to production as executives say, "we just don't have the skills." In this session, we will discuss these enterprise capabilities and why they're important: analytics, visualization, security, enterprise integration, developer/admin tools, and more. Additionally, we will share several real-world client examples that found it necessary to use an enterprise-grade Hadoop platform to tackle some of the most interesting and challenging business problems.
Keynote by Mike Gualtieri, Forrester Research - Making AI Happen Without Gett... - Sri Ambati
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/4a_Y0L7suBc
AI is real. Enterprises use it to automate decisions, hyper-personalize customer experiences, streamline operational processes, and much more. However, for most enterprise technology leaders, AI technologies and use cases are still far too mysterious. The field is moving fast. Enterprise leaders must forge a coherent, pragmatic AI strategy that is tied to business outcomes. In this session, guest speaker Forrester Research Vice President & Principal Analyst Mike Gualtieri will demystify enterprise AI, identify use cases most likely to succeed, and, most importantly, provide key advice to enterprise leaders that are charged with moving AI forward in their organization.
Bio: Mike's research focuses on software technologies, platforms, and practices that enable technology professionals to deliver digital transformations that lead to prescient digital experiences and breakthrough operational efficiency. His key technology coverage areas are AI, machine learning, deep learning, AI chips and systems, digital decisions, streaming analytics, prescriptive analytics, big data analytical platforms and tools (Hadoop/Spark/Flink; translytical databases), optimization, and emerging technologies that make software faster and smarter. Mike is also a leading expert on the intersection of business strategy, artificial intelligence, and innovation. Mike provides technology vendors with actionable, fine-tuned advisory sessions on strategy, messaging, competitive analysis, buyer-persona analysis, market trends, and product road maps for the areas he directly covers and adjacent areas that wish to launch into new markets or use new technologies. Mike is a recipient of the Forrester Courage Award for making bold calls that inspire leaders and guide great business and technology decisions.
Understanding Big Data Analytics - solutions for growing businesses - Rafał M... - GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
Data Analytics became a central point in many Digital Transformation programs. Building a data-driven organisation requires a common understanding of the foundations of data analytics at every level. This presentation will help you and your colleagues understand Big Data, Data Science, Machine Learning and Artificial Intelligence.
Watch our webinar about Big Data Analytics: https://youtu.be/jdfKHVWov6A
Speaker: Rafał Małanij
---
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of the best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including, among others, Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many more from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
This presentation was made on May 13, 2020 and the video recording of it can be viewed here: https://youtu.be/QAgYASr1SHA
Description:
Are AI and AutoML overhyped or the answer to our problems?
Beyond the hyperbole, what are AutoML and AI?
How are they helpful, and when are they not?
Why are they more relevant and valuable than ever?
Our world is changing rapidly, and that implies many organizations will need to adapt quickly. AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. The results can enable a competitive edge for the business. AI empowers data teams to scale and deliver trusted, production-ready models in an easier, faster, more cost-effective way than traditional machine learning approaches.
AI and AutoML are not magic, but they can be transformative; find out how at this virtual meetup. Get practical tips and see AutoML in action with a real-world example. We'll demonstrate how AutoML can augment your data scientists, supercharging your team and giving your organization the AI edge in record time.
Speakers' Bio:
James Orton: He has over a decade of experience in analytics and data science across a number of industries. He has managed data science teams and large-scale projects, and more recently launched his own startup. His vision for AI and that of H2O.ai were so closely aligned that joining H2O.ai in the Australia and New Zealand region was a natural fit for James.
The Importance of DataOps in a Multi-Cloud World - DATAVERSITY
There’s no denying that Cloud has evolved from being an outlying market disruptor to a mainstream method for delivering IT applications and services. In fact, it’s not uncommon to find that Enterprises use the services of more than one cloud at the same time. However, while a multi-cloud strategy offers many benefits, it also increases data management complexity and consequently reduces data availability. This webinar defines the meaning of DataOps and why it’s a crucial component for every multi-cloud approach.
This presentation was made on June 16, 2020.
A recording of the presentation can be viewed here: https://youtu.be/khjW1t0gtSA
AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. The results can enable a competitive edge for the business.
H2O.ai is a visionary leader in AI and machine learning and is on a mission to democratize AI for everyone. We believe that every company can become an AI company, not just the AI Superpowers. We are empowering companies with our leading AI and Machine Learning platforms, our expertise, experience and training to embark on their own AI journey to become AI companies themselves. All companies in all industries can participate in this AI Transformation.
Tune into this virtual meetup to learn how companies are transforming their business with the power of AI and where to start.
About Parul Pandey:
Parul is a Data Science Evangelist here at H2O.ai. She combines data science, evangelism and community in her work. Her emphasis is on spreading information about H2O and Driverless AI to as many people as possible. She is also an active writer and has contributed to various national and international publications.
This document discusses semantic data management. It describes the goals of reducing the time data scientists spend collecting, cleaning and organizing data so they can focus more on analysis. It also aims to make data more accessible, understandable and usable for different stakeholders. Key challenges include heterogeneous data formats, models, semantics and quality. The document outlines research into semantic querying, processing knowledge graphs and mapping to help integrate, understand and apply enterprise data.
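To make the idea of semantic querying over a knowledge graph concrete, here is a minimal illustrative sketch (not from the document summarized above): facts stored as subject-predicate-object triples and queried with a simple pattern where `None` acts as a wildcard. The entity and predicate names are invented for the example:

```python
# Tiny in-memory knowledge graph: each fact is a (subject, predicate, object)
# triple, the basic unit of RDF-style semantic data management.
triples = {
    ("h2o", "type", "ml_platform"),
    ("driverless_ai", "type", "ml_platform"),
    ("h2o", "written_in", "java"),
    ("driverless_ai", "made_by", "h2o_ai"),
}

def query(s=None, p=None, o=None):
    """Return triples matching the pattern; None matches anything."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Which entities are ML platforms? (pattern: ?s type ml_platform)
print(query(p="type", o="ml_platform"))
```

Because every fact shares one uniform shape, heterogeneous sources can be merged by simply unioning their triple sets, which is exactly the integration property that makes knowledge graphs attractive for enterprise data.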
The AI Mindset: Bridging Industry and Academic Perspectives - SnapLogic
In this presentation, find out how Dr. Greg Benson brought ML into the SnapLogic platform and how to combine the strengths of industry practices and academic methodologies to achieve success with ML.
Introducción al Machine Learning Automático (Introduction to Automated Machine Learning) - Sri Ambati
How can you bring machine learning to the masses? Machine learning projects struggle with finding talent, with the time it takes to build and deploy models, and with trusting the models that get built.
How can you have multiple teams in your organization create accurate ML models without being experts in data science or machine learning?
Wondering about the different flavors of AutoML?
H2O Driverless AI packages the techniques of expert data scientists into an easy-to-use application that helps scale your data science efforts. Driverless AI lets data scientists work through projects faster, using automation and the state-of-the-art computing power of GPUs to accomplish in minutes tasks that used to take months.
With H2O Driverless AI everyone, including expert and junior data scientists, domain scientists and data engineers, can develop trusted machine learning models. This next-generation machine learning platform delivers unique and advanced functionality for data visualization, feature engineering, model interpretability and low-latency deployment.
H2O Driverless AI provides:
* Automatic data visualization
* Automatic feature engineering at Grandmaster level
* Automatic model selection
* Automatic model tuning and training
* Automatic parallelization across multiple CPUs or GPUs
* Automatic model ensembling
* Automatic machine learning interpretability (MLI)
* Automatic scoring-code generation
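At its core, the automatic model selection and tuning listed above come down to searching over candidate models and keeping the one with the best validation score. A deliberately tiny, framework-free sketch of that idea follows; it is illustrative only, not how Driverless AI is implemented:

```python
# Minimal AutoML-style selection: fit each candidate on training data,
# score on held-out data, keep the model with the lowest validation error.
def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def automl(candidates, train, valid):
    """Fit every candidate, return (name, model) with best validation MSE."""
    fitted = [(name, fit(*train)) for name, fit in candidates]
    return min(fitted, key=lambda nf: mse(nf[1], *valid))

train = ([1, 2, 3, 4], [2.1, 3.9, 6.2, 8.0])   # roughly y = 2x
valid = ([5, 6], [10.1, 11.8])
name, model = automl([("mean", fit_mean), ("linear", fit_linear)], train, valid)
print(name)  # the linear model wins on this near-linear data
```

Real AutoML systems extend this same loop with feature engineering, hyperparameter search and ensembling of the fitted candidates, which is where the feature list above comes in.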
Want to try it yourself? You can get a free trial here: H2O Driverless AI trial.
Come to this session and learn how to get started with automatic machine learning using H2O Driverless AI, and build powerful models with just a few clicks.
See you soon!
About H2O.ai
H2O.ai is a visionary Silicon Valley open source software company that created and re-imagined what is possible. We are a company of makers who brought new platforms and technologies to market to drive the artificial intelligence movement. We are the makers of H2O, the leading open source data science and machine learning platform, used by nearly half of the Fortune 500 and trusted by more than 14,000 organizations and hundreds of thousands of data scientists around the world.
This document discusses enterprise data science and machine learning. It begins by noting that data is now more plentiful and machine learning opportunities are everywhere. However, challenges remain around scaling data science work, making models production-ready, and meeting different team needs. The document then introduces Cloudera's Data Science Workbench for addressing these challenges. It claims the Workbench provides a secure, self-service environment allowing data scientists direct access to enterprise data and tools while meeting IT requirements. Examples are given of how it supports the full data science pipeline from exploration to production. In demos, it highlights features like connecting to Hadoop clusters securely and enabling collaboration. Overall, the document pitches Cloudera's Workbench as a solution to these challenges.
Near realtime AI deployment with huge data and super low latency - Levi Brack... - Sri Ambati
Published on Nov 2, 2018
This talk was recorded in London on October 30th, 2018 and can be viewed here: https://youtu.be/erHt-1yBuUw
Session: Travelport is a leading travel commerce platform that has truly huge data and many complex needs in terms of processing, performance and latency. This talk will demonstrate how we were able to harness big data technologies, H2O and cloud integration to deploy AI at scale and at low latency. The talk covers practical advice taken from our AI journey; you will learn the successful strategies, and the pitfalls, of retraining ML models in near real time on streaming data using all open-source technologies.
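One way to picture the near-real-time retraining the abstract describes is with a toy online learner; this is a sketch under assumed conditions, not Travelport's actual system. The model's state is updated per streaming event instead of being refit from scratch, which is what keeps latency low:

```python
# Toy online learner: an exponentially weighted mean that tracks a
# streaming signal, updating in O(1) per event (no batch refits).
class OnlineMean:
    def __init__(self, alpha=0.2):
        self.alpha = alpha      # forgetting rate: higher = adapts faster
        self.estimate = None

    def update(self, x):
        """Fold one streaming observation into the current estimate."""
        if self.estimate is None:
            self.estimate = float(x)
        else:
            self.estimate += self.alpha * (x - self.estimate)
        return self.estimate

model = OnlineMean(alpha=0.5)
for event in [10, 10, 10, 20, 20, 20]:  # the signal drifts upward mid-stream
    model.update(event)
print(round(model.estimate, 2))  # 18.75: tracking the new level
```

The trade-off the talk warns about shows up even here: a large `alpha` adapts quickly to drift but is noisy, while a small one is stable but slow to notice the stream has changed, which is one of the classic pitfalls of retraining on streaming data.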
Bio: As principal data scientist at Travelport, Levi Brackman leads a team of data scientists who are putting ML models into production. Prior to Travelport, Levi spent most of his career in the start-up world. He founded and led an organization that created innovative educational software applications and solutions used by high schools and youth organizations in the USA and Australia. Levi earned a PhD in the quantitative social sciences under the supervision of one of the world's leading educational psychologists. He earned a master's degree from University College London and is the author of a business book published in eight languages that was a bestseller in multiple countries. A native of North London (UK), Levi is married, has five children and now lives in Broomfield, Colorado.
TechKnow Fiesta 2021 - Powered by Amazon Web Services in September 2021 - TechknowFiesta
The Data Tech Labs understands that the key to any organization's success story is employee engagement. Presenting TechKnow Fiesta 2021 - Skill up to Scale Up with AWS.
In this talk we will share the idea of developing a self-guiding application that provides the most engaging user experience possible using crowd-sourced knowledge on a mobile interface. We will discuss how historical usage data can be mined with machine learning to identify application usage patterns and generate probable next actions. #h2ony
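The "probable next actions" idea above can be sketched as a first-order model over historical click sequences; this is an illustrative toy (with invented action names), not the #h2ony implementation. We count which action most often follows each action in the logs and suggest that:

```python
from collections import Counter, defaultdict

def train_next_action(sessions):
    """Mine usage sequences: count transitions action -> next action."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions

def suggest(transitions, action):
    """Most frequent follow-up to `action` seen in the logs, if any."""
    if action not in transitions:
        return None
    return transitions[action].most_common(1)[0][0]

# Hypothetical crowd-sourced session logs from the mobile app.
logs = [
    ["open", "search", "view", "book"],
    ["open", "search", "view", "share"],
    ["open", "search", "filter", "view", "book"],
]
model = train_next_action(logs)
print(suggest(model, "search"))  # "view" follows "search" most often
```

A production system would condition on longer histories and user context, but the core of a self-guiding interface is the same: learned transition frequencies ranked into suggested next steps.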
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Usually, DataOps means applying DevOps principles to existing data analytics projects. We accidentally reversed it, taking a DevOps initiative and catalyzing adoption of data-driven practices across our company.
What started as a practical initiative to bring better reliability and visibility to our software product had the unexpected effect of catalyzing a transformation that helped our organization become more data-driven across the company. What we learned in the process was how and why DevOps principles can naturally expand the role of a traditional operations team and bring wider culture change to the organization.
AI Foundations Course Module 1 - An AI Transformation Journey - Sri Ambati
The chances of successfully implementing AI strategies within an organization significantly improve when you can recognize where your organization is on the maturity scale. Over this course, you will learn the keys to unlocking value with AI which include asking the right questions about the problems you are solving and ensuring you have the right cross-section of talent, tools, and resources. By the end of this module, you should be able to recognize where your organization is on the AI transformation spectrum and identify some strategies that can get you to the next stage in your journey.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the YouTube video about this presentation: https://youtu.be/PJgr2epM6qs
Speakers:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Ingrid Burton (H2O.ai - CMO)
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat... - Sri Ambati
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/VAW2eDht7JA
Bio: Krish Swamy is an experienced professional with deep skills in applying analytics and Big Data capabilities to challenging business problems and driving customer insights. Krish's analytic experience includes marketing and pricing, credit risk, digital analytics and, most recently, big data analytics and data transformation. His key experience lies in banking and financial services and the digital customer experience domain, with a background in management consulting. Other key skills include influencing organizational change towards a data- and analytics-driven culture, and building teams of analysts, statisticians and data scientists.
Bio: Balaji Gopalakrishnan has over 15 years of experience in the machine learning and data science space. Balaji has led cross-functional data science and engineering teams developing cutting-edge machine learning and cognitive computing capabilities for insurance fraud and underwriting, telematics, multi-asset-class risk, scheduling under uncertainty, and more. He is passionate about driving AI adoption in organizations and strongly believes in the power of cross-functional collaboration for this purpose.
Introduction & Hands-on with H2O Driverless AI - Sri Ambati
These slides were presented by Marios Michailidis and John Spooner at Dive into H2O: London on June 17, 2019.
Marios's session can be found here: https://youtu.be/GMtgT-3hENY
John's session can be found here: https://youtu.be/5t2zw4bVfsw
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo... - Sri Ambati
In this session, you will learn about what you should do after you’ve taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the YouTube video about this presentation: https://youtu.be/K1Cl3x3rd8g
Speaker:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Defining a Practical Path to Artificial Intelligence - Roman Chanclor
With the evolution of purpose-built AI infrastructures and the advancement of Graphics Processing Units (GPUs) that enable massively parallel, deep analysis in real time, cognitive computing may become the norm in data centers in record time. But how?
Accelerating Real-Time Analytics Insights Through Hadoop Open Source Ecosystem - DataWorks Summit
This document discusses accelerating real-time analytics through the Hadoop open source ecosystem. It highlights Intel's contributions to open source projects like Apache Hadoop and Apache Spark to drive mainstream adoption of advanced analytics. Real-time analytics can provide insights using data as it arrives rather than after it is stored. The document explores use cases for real-time analytics in healthcare, social media, and security and how Intel is working to accelerate solutions in these domains using its data platform and open source technologies.
Open source Apache Hadoop is a great framework for distributed processing of large data sets. But there’s a difference between “playing” with big data versus solving real problems. The reality is that Hadoop alone is not enough. In fact, almost every organization that plans to use Hadoop for production use quickly discovers that it lacks the required features for enterprise use. And, fewer still have the Hadoop specialists on hand to navigate through the complexity to build reliable, robust applications. As a result, many Hadoop projects never make it to production as executives say, “we just don’t have the skills.” In this session, we will discuss these enterprise capabilities and why they’re important: analytics, visualization, security, enterprise integration, developer/admin tools, and more. Additionally, we will share several real-world client examples who have found it necessary to use an enterprise-grade Hadoop platform to tackle some of the most interesting and challenging business problems.
Keynote by Mike Gualtieri, Forrester Research - Making AI Happen Without Gett...Sri Ambati
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/4a_Y0L7suBc
AI is real. Enterprises use it to automate decisions, hyper-personalize customer experiences, streamline operational processes, and much more. However, for most enterprise technology leaders, AI technologies and use cases are still far too mysterious. The field is moving fast. Enterprise leaders must forge a coherent, pragmatic AI strategy that is tied to business outcomes. In this session, guest speaker Forrester Research Vice President & Principal Analyst Mike Gualtieri will demystify enterprise AI, identify use cases most likely to succeed, and, most importantly, provide key advice to enterprise leaders that are charged with moving AI forward in their organization.
Bio: Mike's research focuses on software technologies, platforms, and practices that enable technology professionals to deliver digital transformations that lead to prescient digital experiences and breakthrough operational efficiency. His key technology coverage areas are AI, machine learning, deep learning, AI chips and systems, digital decisions, streaming analytics, prescriptive analytics, big data analytical platforms and tools (Hadoop/Spark/Flink; translytical databases), optimization, and emerging technologies that make software faster and smarter. Mike is also a leading expert on the intersection of business strategy, artificial intelligence, and innovation. Mike provides technology vendors with actionable, fine-tuned advisory sessions on strategy, messaging, competitive analysis, buyer-persona analysis, market trends, and product road maps for the areas he directly covers and adjacent areas that wish to launch into new markets or use new technologies. Mike is a recipient of the Forrester Courage Award for making bold calls that inspire leaders and guide great business and technology decisions.
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
Data Analytics has become a central point in many Digital Transformation programs. Building a data-driven organisation requires a common understanding of the foundations of data analytics on every level. This presentation will help you and your colleagues understand Big Data, Data Science, Machine Learning and Artificial Intelligence.
Watch our webinar about Big Data Analytics: https://youtu.be/jdfKHVWov6A
Speaker: Rafał Małanij
---
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of the best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies, including, among others, Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
This presentation was made on May 13, 2020 and the video recording of it can be viewed here: https://youtu.be/QAgYASr1SHA
Description:
Are AI and AutoML overhyped or the answer to our problems?
Beyond the hyperbole, what are AutoML and AI?
How are they helpful, and when are they not?
Why are they more relevant and valuable than ever?
Our world is changing rapidly, and that implies many organizations will need to adapt quickly. AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. The results can enable a competitive edge for the business. AI empowers data teams to scale and deliver trusted, production-ready models in an easier, faster, more cost-effective way than traditional machine learning approaches.
AI and AutoML are not magic, but they can be transformative; find out how at this virtual meetup. Get practical tips and see AutoML in action with a real-world example. We’ll demonstrate how AutoML can augment your Data Scientists, supercharging your team and giving your organization the AI edge in record time.
Speakers' Bio:
James Orton: He has over a decade of experience in analytics and data science across a number of industries. He managed data science teams and large-scale projects before more recently launching his own startup. His vision for AI was so closely aligned with that of H2O.ai that joining H2O.ai in the Australia and New Zealand region was a fortuitous opportunity.
The Importance of DataOps in a Multi-Cloud WorldDATAVERSITY
There’s no denying that Cloud has evolved from being an outlying market disruptor to a mainstream method for delivering IT applications and services. In fact, it’s not uncommon to find that Enterprises use the services of more than one cloud at the same time. However, while a multi-cloud strategy offers many benefits, it also increases data management complexity and consequently reduces data availability. This webinar defines the meaning of DataOps and why it’s a crucial component for every multi-cloud approach.
This presentation was made on June 16, 2020.
A recording of the presentation can be viewed here: https://youtu.be/khjW1t0gtSA
AI is unlocking new potential for every enterprise. Organizations are using AI and machine learning technology to inform business decisions, predict potential issues, and provide more efficient, customized customer experiences. The results can enable a competitive edge for the business.
H2O.ai is a visionary leader in AI and machine learning and is on a mission to democratize AI for everyone. We believe that every company can become an AI company, not just the AI Superpowers. We are empowering companies with our leading AI and Machine Learning platforms, our expertise, experience and training to embark on their own AI journey to become AI companies themselves. All companies in all industries can participate in this AI Transformation.
Tune into this virtual meetup to learn how companies are transforming their business with the power of AI and where to start.
About Parul Pandey:
Parul is a Data Science Evangelist here at H2O.ai. She combines Data Science, evangelism and community in her work. Her emphasis is on spreading information about H2O and Driverless AI to as many people as possible. She is also an active writer and has contributed to various national and international publications.
This document discusses semantic data management. It describes the goals of reducing the time data scientists spend collecting, cleaning and organizing data so they can focus more on analysis. It also aims to make data more accessible, understandable and usable for different stakeholders. Key challenges include heterogeneous data formats, models, semantics and quality. The document outlines research into semantic querying, processing knowledge graphs and mapping to help integrate, understand and apply enterprise data.
The AI Mindset: Bridging Industry and Academic PerspectivesSnapLogic
In this presentation, find out how Dr. Greg Benson brought ML into the SnapLogic platform and how to combine the strengths of industry practices and academic methodologies to achieve success with ML.
Introduction to Automatic Machine Learning - Sri Ambati
How can you bring machine learning to the masses? Machine learning projects struggle with finding talent, the time it takes to build and deploy models, and trusting the models that are built.
How can multiple teams in your organization create accurate ML models without being experts in data science or machine learning?
Wondering about the different flavors of AutoML?
H2O Driverless AI packages the techniques of expert data scientists into an easy-to-use application that helps scale your data science efforts. Driverless AI lets data scientists work on projects faster, using automation and the state-of-the-art computing power of GPUs to accomplish in minutes tasks that used to take months.
With H2O Driverless AI, everyone, including expert and junior data scientists, domain scientists and data engineers, can develop trusted machine learning models. This next-generation machine learning platform delivers unique and advanced functionality for data visualization, feature engineering, model interpretability and low-latency deployment.
H2O Driverless AI provides:
* Automatic data visualization
* Automatic feature engineering at Grandmaster level
* Automatic model selection
* Automatic model tuning and training
* Automatic parallelization across multiple CPUs or GPUs
* Automatic model ensembling
* Automatic machine learning interpretability (MLI)
* Automatic scoring-code generation
Want to try it yourself? You can get a free trial here: H2O Driverless AI trial.
Come to this session and find out how to get started with automatic machine learning using H2O Driverless AI, and build powerful models in just a few clicks.
See you soon!
About H2O.ai
H2O.ai is a visionary Silicon Valley open source software company that has created and reimagined what is possible. We are a company of makers who brought new platforms and technologies to market to drive the AI movement. We are the makers of H2O, the leading open source data science and machine learning platform, used by nearly half of the Fortune 500 and trusted by more than 14,000 organizations and hundreds of thousands of data scientists around the world.
This document discusses enterprise data science and machine learning. It begins by noting that data is now more plentiful and machine learning opportunities are everywhere. However, challenges remain around scaling data science work, making models production-ready, and meeting different team needs. The document then introduces Cloudera's Data Science Workbench for addressing these challenges. It claims the Workbench provides a secure, self-service environment allowing data scientists direct access to enterprise data and tools while meeting IT requirements. Examples are given of how it supports the full data science pipeline from exploration to production. The demos highlight features like connecting to Hadoop clusters securely and enabling collaboration. Overall, the document pitches Cloudera's Workbench as a solution to these enterprise data science challenges.
Near realtime AI deployment with huge data and super low latency - Levi Brack...Sri Ambati
Published on Nov 2, 2018
This talk was recorded in London on October 30th, 2018 and can be viewed here: https://youtu.be/erHt-1yBuUw
Session: Travelport is a leading travel commerce platform with truly huge data and many complex needs in terms of processing, performance and latency. This talk will demonstrate how we were able to harness big data technologies, H2O and cloud integration to deploy AI at scale and at low latency. The talk covers practical advice from our AI journey; you will learn the successful strategies and the pitfalls of retraining ML models in near real time on streaming data, using all open-source technologies.
Bio: As principal data scientist at Travelport, Levi Brackman leads a team of data scientists putting ML models into production. Prior to Travelport, Levi spent most of his career in the start-up world. He founded and led an organization that created innovative educational software applications and solutions used by high schools and youth organizations in the USA and Australia. Levi earned a PhD in the quantitative social sciences under the supervision of one of the world's leading educational psychologists. He earned a master's degree from University College London and is the author of a business book published in eight languages that was a bestseller in multiple countries. A native of North London (UK), Levi is married, has five children and now lives in Broomfield, Colorado.
TechKnow Fiesta 2021 - Powered by Amazon Web Service in September 2021TechknowFiesta
The Data Tech Labs understand the key to any organization's success story is employee engagement. Presenting TechKnow Fiesta 2021– Skill up to Scale Up with AWS
In this talk we will share the idea of developing a self-guiding application that provides the most engaging user experience possible using crowd-sourced knowledge on a mobile interface. We will discuss and share how historical usage data could be mined using machine learning to identify application usage patterns and generate probable next actions. #h2ony
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Usually, DataOps means applying DevOps principles to existing data analytics projects. We accidentally reversed it, taking a DevOps initiative and catalyzing adoption of data-driven practices across our company.
What started as a practical initiative to bring better reliability and visibility to our software product had the unexpected effect of catalyzing a transformation that helped our organization become more data-driven across the company. What we learned in the process was how and why DevOps principles can naturally expand the role of a traditional operations team and bring wider culture change to the organization.
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
The chances of successfully implementing AI strategies within an organization significantly improve when you can recognize where your organization is on the maturity scale. Over this course, you will learn the keys to unlocking value with AI which include asking the right questions about the problems you are solving and ensuring you have the right cross-section of talent, tools, and resources. By the end of this module, you should be able to recognize where your organization is on the AI transformation spectrum and identify some strategies that can get you to the next stage in your journey.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the Youtube video about this presentation: https://youtu.be/PJgr2epM6qs
Speakers:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Ingrid Burton (H2O.ai - CMO)
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat...Sri Ambati
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/VAW2eDht7JA
Bio: Krish Swamy is an experienced professional with deep skills in applying analytics and Big Data capabilities to challenging business problems and driving customer insights. Krish's analytic experience includes marketing and pricing, credit risk, digital analytics and, most recently, big data analytics and data transformation. His key experiences lie in banking and financial services and the digital customer experience domain, with a background in management consulting. Other key skills include influencing organizational change towards a data- and analytics-driven culture, and building teams of analysts, statisticians and data scientists.
Bio: Balaji Gopalakrishnan has over 15 years of experience in the Machine Learning and Data Science space. Balaji has led cross-functional data science and engineering teams developing cutting-edge machine learning and cognitive computing capabilities for insurance fraud and underwriting, telematics, multi-asset class risk, scheduling under uncertainty, and more. He is passionate about driving AI adoption in organizations and strongly believes in the power of cross-functional collaboration for this purpose.
Introduction & Hands-on with H2O Driverless AISri Ambati
These slides were presented by Marios Michailids and John Spooner at Dive into H2O: London on June 17, 2019.
Marios's session can be found here: https://youtu.be/GMtgT-3hENY
John's session can be found here: https://youtu.be/5t2zw4bVfsw
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
In this session, you will learn about what you should do after you’ve taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the Youtube video about this presentation: https://youtu.be/K1Cl3x3rd8g
Speaker:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Defining a Practical Path to Artificial Intelligence Roman Chanclor
With the evolution of purpose-built AI infrastructures and the advancement of Graphics Processing Units (GPUs) that enable massively parallel, deep analysis in real time, cognitive computing may become the norm in data centers in record time. But how?
Accelerating Real-Time Analytics Insights Through Hadoop Open Source EcosystemDataWorks Summit
This document discusses accelerating real-time analytics through the Hadoop open source ecosystem. It highlights Intel's contributions to open source projects like Apache Hadoop and Apache Spark to drive mainstream adoption of advanced analytics. Real-time analytics can provide insights using data as it arrives rather than after it is stored. The document explores use cases for real-time analytics in healthcare, social media, and security and how Intel is working to accelerate solutions in these domains using its data platform and open source technologies.
This document describes building data science pipelines in Python using Luigi. It discusses the typical data science workflow, challenges with the current workflow approach, and how data science pipelines with Luigi can help address these challenges. Luigi is presented as a Python tool that allows defining data processing tasks, dependencies between tasks, scheduling, monitoring, and failure recovery for building reproducible and production-ready data science pipelines. An example problem of building a pipeline to predict player performance in a mobile game using Luigi is provided.
Data Science Pipelines in Python using LuigiShivam Bansal
This document describes building data science pipelines in Python using Luigi. It discusses the typical data science workflow, challenges with the current workflow approach, and how data science pipelines with Luigi can help address these challenges. Key features of Luigi that make it useful for data science pipelines are presented, including task templating, scheduling, monitoring, failure recovery, and enabling batch and parallel processing. The document concludes with a demonstration Luigi pipeline example to predict the performance score of mobile game users.
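The Luigi task model described above (tasks that declare their dependencies, with scheduling and failure recovery) can be sketched without the library itself. The following toy stand-in in plain Python uses hypothetical task names and only mimics Luigi's requires/complete/run pattern; real Luigi tasks additionally declare output targets and run under a scheduler:

```python
# Toy stand-in for a Luigi-style pipeline (no Luigi dependency).
# Each task declares requires(), complete(), and run(); the runner
# executes dependencies first and skips work already done.
results = {}

class Task:
    def requires(self):
        return []
    def complete(self):
        return type(self).__name__ in results
    def run(self):
        raise NotImplementedError

class ExtractData(Task):
    def run(self):
        # Pretend to load raw game-play rows for two players.
        results["ExtractData"] = [{"user": 1, "score": 40},
                                  {"user": 2, "score": 50}]

class PredictScore(Task):
    def requires(self):
        return [ExtractData()]
    def run(self):
        rows = results["ExtractData"]
        results["PredictScore"] = sum(r["score"] for r in rows) / len(rows)

def build(task):
    """Depth-first: run dependencies before the task, skip completed ones."""
    for dep in task.requires():
        build(dep)
    if not task.complete():
        task.run()

build(PredictScore())
print(results["PredictScore"])
```

Because `complete()` is checked before `run()`, re-running `build()` is idempotent, which is the same property that lets Luigi resume a failed pipeline from where it stopped.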
The document discusses Dell Technologies' artificial intelligence (AI) and data analytics solutions portfolio. It provides an overview of Dell's solutions for AI/machine learning, IoT/streaming data, augmented analytics/data warehousing, data lakes, and high-performance computing (HPC). The solutions leverage Dell infrastructure along with partner technologies and are designed to address various analytical use cases such as digital manufacturing, life sciences research, and retail loss prevention.
byteLAKE pioneers the use of AI and HPC to provide automated and data-driven solutions for its customers. Case study: https://www.intel.com/content/www/us/en/data-center/idc-bytelake-case-study.html
byteLAKE
Artificial Intelligence for Chemical Industry, Paper Industry and Manufacturing.
We build AI products and help design custom AI software.
About byteLAKE
byteLAKE is a software company that builds Artificial Intelligence products for the chemical industry, paper industry and manufacturing. byteLAKE's CFD Suite leverages AI to reduce CFD (Computational Fluid Dynamics) chemical mixing simulations’ time from hours to minutes. byteLAKE's Cognitive Services offer AI-assisted Visual Inspection and Big Data analytics for the paper industry, detecting and visually inspecting the so-called Water Line and complex tasks automation for manufacturing. The company also offers custom AI software development for real-time data analytics (image / video / sound / time-series). To learn more about byteLAKE’s innovations, go to www.byteLAKE.com.
2 pc enterprise summit cronin newfinal aug 18IntelAPAC
Intel discusses how the evolution of IoT and big data is driving business transformation. Intel provides leading technology from devices to the cloud to deliver end-to-end IoT and big data solutions. Intel is uniquely positioned through its technology, partnerships, and ecosystem to integrate physical systems with data and analytics from the edge to the cloud.
How Can AI and IoT Power the Chemical Industry?Xiaonan Wang
AI, IoT and Blockchain tech briefing to the industry to showcase our research at NUS.
by Dr. Xiaonan Wang
Assistant Professor
NUS Department of Chemical & Biomolecular Engineering
This document discusses future trends in big data. It notes that the amount of data produced grows enormously every year due to new technologies and devices. Big data provides businesses with better sources of analysis and insights. Key trends discussed include the growth of open source tools like Hadoop and Spark, increased use of machine learning and predictive analytics, edge computing and analytics to process IoT data more efficiently, integration of big data and cloud computing, use of big data for cybersecurity, and growing demand for data science jobs. The conclusion states that big data will significantly impact businesses and 15% of IT organizations will move services to the cloud by 2021.
Dell NVIDIA AI Roadshow - South Western OntarioBill Wong
- Artificial intelligence (AI) is mimicking human intelligence through machine algorithms like those used for chess and facial recognition. Machine learning (ML) is a subset of AI that uses algorithms to parse data, learn from data, and make predictions. Deep learning (DL) uses artificial neural networks to develop relationships in data and is used for applications like driverless cars and cybersecurity.
- AI technologies are enabling digital transformation and require infrastructure like edge computing, GPUs, FPGAs, deep learning accelerators, and specialized hardware to power applications of AI, ML, and DL. Dell Technologies provides platforms and solutions to accelerate AI workloads and support digital transformation.
This session was held by Vladimir Brenner, Partner Account Manager, Disruptors & AI, Intel AI at the Dive into H2O: London training on June 17, 2019.
Please find the recording here: https://youtu.be/60o3eyG5OLM
How Data Virtualization Puts Machine Learning into Production (APAC)Denodo
Watch full webinar here: https://bit.ly/3mJJ4w9
Advanced data science techniques, like machine learning, have proven an extremely useful tool to derive valuable insights from existing data. Platforms like Spark, and complex libraries for R, Python and Scala put advanced techniques at the fingertips of the data scientists. However, these data scientists spend most of their time looking for the right data and massaging it into a usable format. Data virtualization offers a new alternative to address these issues in a more efficient and agile way.
Attend this session to learn how companies can use data virtualization to:
- Create a logical architecture to make all enterprise data available for advanced analytics exercise
- Accelerate data acquisition and massaging, providing the data scientist with a powerful tool to complement their practice
- Integrate popular tools from the data science ecosystem: Spark, Python, Zeppelin, Jupyter, etc
In this talk, Tong will start with the current landscape and typical use cases of Artificial Intelligence applications in the Telco domain. Then, she will introduce Intel’s strategy and products for Network AI, including our focus areas, our hardware portfolio, software stacks, roadmaps and some case studies.
Speaker: Tong Zhang, Principal Engineer and Chief Architect for AI and Analytics of the Network Platforms Group, Intel
Dell NVIDIA AI Powered Transformation in Financial Services WebinarBill Wong
Digital transformation through data analytics and AI can help financial services firms address business, technology, and labor challenges caused by COVID-19. Key trends include increased reliance on remote work and digital platforms, and the importance of data analytics for decision making. By 2025, 90% of new apps will use AI. The document discusses NVIDIA and Dell Technologies' partnership and strategies for providing infrastructure to support AI workloads through solutions like the DGX A100 system, which can support training, inference, and analytics on one platform through technologies like GPUs and MIG. This helps provide a more flexible and efficient infrastructure compared to traditional siloed approaches.
Talk and presentation: Andreas Tsagkaris (Ανδρέας Τσαγκάρης), VP & Chief Technology Officer, Performance Technologies
Presentation title: “Big Data on Linux on Power Systems”
Top 5 AI and Deep Learning Stories - August 3, 2018NVIDIA
The document discusses the top 5 deep learning stories from August 3, 2018. It summarizes each story in 1-2 paragraphs. Story 1 is about Google making NVIDIA GPUs available on their cloud to accelerate AI projects. Story 2 describes NetApp's new AI data platform called Ontap AI that helps organizations manage their AI data. Story 3 discusses how machine learning is being used in healthcare to better monitor patients. Story 4 talks about how the Swiss Federal Railway is using deep learning with cameras and sensors to improve passenger safety. Story 5 is about an AI system that taught itself to solve a Rubik's Cube in 44 hours without human help.
IRJET- Search Improvement using Digital Thread in Data AnalyticsIRJET Journal
This document discusses the use of digital thread in data analytics to improve search and provide end-to-end visibility across product lifecycles. Digital thread is a communication system that connects manufacturing process elements and provides a complete view of each element throughout the lifecycle. It allows sharing of information across organizations and suppliers. Digital thread brings quality gains by managing large amounts of data and complex supply chains. It helps enterprises quickly redesign products and meet timelines while maintaining visibility of each component's journey. The document proposes using a Neo4j graph database hosted on AWS cloud to implement a digital thread that links product data. This would provide security, performance, and analytics benefits across the overall manufacturing process.
Miguel Angel Perdiguero - Head of BIG data & analytics Atos Iberia - semanain...COIICV
This document discusses Industry 4.0 and the digital transformation of industry. It describes key technological pillars like the Internet of Things, additive manufacturing, and data analytics. It provides examples of how these technologies can be applied through predictive maintenance and customized products. The document also introduces Atos Codex, an open industrial analytics platform that uses big data, high performance computing, and machine learning to deliver business insights and solutions.
AI for good: Scaling AI in science, healthcare, and more.Intel® Software
How do we scale AI to its full potential to enrich the lives of everyone on earth? Learn about AI hardware and software acceleration and how Intel AI technologies are being used to solve critical problems in high energy physics, cancer research, financial inclusion, and more. Get started on your AI Developer Journey @ software.intel.com/ai
Structuring Big Data results to create new information: Smart Data. These Smart Data can be used to advance knowledge and support decision-making processes.
A close cooperation between industry and science creates better conditions for cutting-edge research in Data Engineering/Smart Data.
Similar to Meg Mude, Intel - Data Engineering Lifecycle Optimized on Intel - H2O World San Francisco
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
This document provides an overview of H2O.ai, an AI company that offers products and services to democratize AI. It mentions that H2O products are backed by 10% of the world's top data scientists from Kaggle and that H2O has customers in 7 of the top 10 banks, 4 of the top 10 insurance companies, and top manufacturing companies. It also provides details on H2O's founders, funding, customers, products, and vision to make AI accessible to more organizations.
Generative AI Masterclass - Model Risk Management.pptxSri Ambati
Here are some key points about benchmarking and evaluating generative AI models like large language models:
- Foundation models require large, diverse datasets to be trained on in order to learn broad language skills and knowledge. Fine-tuning can then improve performance on specific tasks.
- Popular benchmarks evaluate models on tasks involving things like commonsense reasoning, mathematics, science questions, generating truthful vs false responses, and more. This helps identify model capabilities and limitations.
- Custom benchmarks can also be designed using tools like Eval Studio to systematically test models on specific applications or scenarios. Both automated and human evaluations are important.
- Leaderboards like HELM aggregate benchmark results to compare how different models perform across a wide range of tests and metrics.
AI and the Future of Software Development: A Sneak Peek Sri Ambati
The document discusses developers, both in-house and outsourced, as well as dev shops and low/no-code platforms.
LLMOps: Match report from the top of the 5thSri Ambati
The document discusses LLMOps (Large Language Model Operations) compared to traditional MLOps. Some key points:
- LLMOps and MLOps face similar challenges across the development lifecycle, but LLMOps requires more GPU resources and integration is faster due to more models in each application. Evaluation is also less clear.
- The LLMOps field is around the 5th generation of models, with debates around proprietary vs open source models, and balancing privacy, cost and control.
- LLMOps platforms are emerging to provide solutions for tasks like prompting, embedding databases, evaluation, and governance, similar to how MLOps platforms have evolved.
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
The document discusses optimizing question answering systems built as RAG (Retrieval-Augmented Generation) stacks. It outlines challenges with naive RAG approaches and proposes solutions like improved data representations, advanced retrieval techniques, and fine-tuning large language models. Table-stakes optimizations include tuning chunk sizes, prompt engineering, and customizing LLMs. More advanced techniques involve small-to-big retrieval, multi-document agents, embedding fine-tuning, and LLM fine-tuning.
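As a rough illustration of the "tuning chunk sizes" knob mentioned above, here is a minimal fixed-size chunker with overlap (a hypothetical helper, not code from the talk). Smaller chunks give more precise retrieval hits; the overlap keeps sentences that straddle a boundary retrievable from either neighboring chunk:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping, fixed-size character chunks.

    chunk_size and overlap are the tunable knobs: small chunks retrieve
    precisely but lose context, large chunks keep context but dilute
    the embedding. Production chunkers usually split on tokens or
    sentences rather than raw characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("the quick brown fox jumps", chunk_size=10, overlap=4)
print(chunks)
```

Small-to-big retrieval builds on the same idea: the index stores the small chunks for matching, but at query time each hit is swapped for its larger parent window before being handed to the LLM.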
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
Sandeep Singh, Head of Applied AI Computer Vision, Beans.ai
H2O Open Source GenAI World SF 2023
In the modern era of machine learning, leveraging both open-source and closed-source solutions has become paramount for achieving cutting-edge results. This talk delves into the intricacies of seamlessly integrating open-source Large Language Model (LLM) solutions like Vicuna, Falcon, and Llama with industry giants such as ChatGPT and Google's Palm. As the demand for fine-tuned and specialized datasets grows, it is imperative to understand the synergy between these tools. Attendees will gain insights into best practices for building and enriching datasets tailored for fine-tuning tasks, ensuring that their LLM projects are both robust and efficient. Through real-world examples and hands-on demonstrations, this talk will equip attendees with the knowledge to harness the power of both open and closed-source tools in a coherent and effective manner.
Patrick Hall, Professor, AI Risk Management, The George Washington University
H2O Open Source GenAI World SF 2023
Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you!
Dr. Alexy Khrabrov, Open Source Science Community Director, IBM
H2O Open Source GenAI World SF 2023
In this talk, Dr. Alexy Khrabrov, recently elected Chair of the new Generative AI Commons at Linux Foundation for AI & Data, outlines the OSS AI landscape, challenges, and opportunities. With new models and frameworks being unveiled weekly, one thing remains constant: community building and validation of all aspects of AI is key to reliable and responsible AI we can use for business and society needs. Industrial AI is one key area where such community validation can prove invaluable.
The document announces the launch of the H2O GenAI App Store, which provides a collection of applications that make it easier for average users to leverage large language models through custom interfaces for specific tasks like getting gardening advice or feedback on code. The app store is designed to accelerate the development of these GenAI apps using the H2O Wave platform and provides access to H2OGPTE for retrieval augmented generation and language model calls. Developers can also contribute their own apps through the GitHub repository listed.
Applied Gen AI for the Finance Vertical Sri Ambati
Megan Kurka, Vice President, Customer Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
Discover the transformative power of Applied Gen AI. Learn how the H2O team builds customized applications and workflows that integrate capabilities of Gen AI and AutoML specifically designed to address and enhance financial use cases. Explore real world examples, learn best practices, and witness firsthand how our innovative solutions are reshaping the landscape of finance technology.
This document discusses techniques for improving language models (LLMs) discussed in recent papers. It describes building blocks of LLMs like fine-tuning, foundation training, memory, and databases. Specific techniques covered include LIMA which uses 1,000 carefully curated examples, instruction backtranslation to generate question-answer pairs, fine-tuning models on API examples like Gorilla, and reducing false answers through techniques like not agreeing with incorrect user opinions. The goal is to discuss cutting edge tricks to build better LLMs.
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
Pascal Pfeiffer, Principal Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
This talk dives into the expansive ecosystem of Large Language Models (LLMs), offering practitioners an insightful guide to various relevant applications, from natural language understanding to creative content generation. While exploring use cases across different industries, it also honestly addresses the current limitations of LLMs and anticipates future advancements.
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...Sri Ambati
This document discusses using large language models (LLMs) for text classification tasks. It begins by describing how LLMs are commonly used for text generation and question answering. For classification, models are usually trained supervised on labeled data. The document then explores using LLMs for zero-shot classification without training, and techniques like fine-tuning LLMs on tasks to improve performance. It provides an example of fine-tuning an LLM on a financial sentiment dataset. The document concludes by describing H2O.ai's LLM Studio tool for fine-tuning and a few Kaggle competitions where LLMs achieved success in text classification.
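The zero-shot approach described above frames classification as constrained text generation: the model is prompted with the candidate labels and its free-text completion is parsed back into one of them. A minimal sketch of that pattern, with a trivial keyword heuristic standing in for a real LLM call (the `call_llm` stub and label set are illustrative assumptions, not part of the talk):

```python
# Hypothetical sketch of zero-shot text classification via LLM prompting.
# `call_llm` is a stand-in for a real completion API; in practice it would
# query a hosted or locally fine-tuned model.

LABELS = ["positive", "negative", "neutral"]

def build_prompt(text: str) -> str:
    """Frame classification as constrained text generation."""
    return (
        "Classify the financial sentiment of the sentence as one of "
        f"{', '.join(LABELS)}.\n"
        f"Sentence: {text}\n"
        "Label:"
    )

def call_llm(prompt: str) -> str:
    # Stub: a keyword heuristic standing in for a real model call.
    lowered = prompt.lower()
    if "beat" in lowered or "growth" in lowered:
        return " positive"
    if "miss" in lowered or "loss" in lowered:
        return " negative"
    return " neutral"

def classify(text: str) -> str:
    """Parse the model's free-text completion back into a known label."""
    completion = call_llm(build_prompt(text)).strip().lower()
    return completion if completion in LABELS else "neutral"

print(classify("Quarterly revenue beat expectations."))  # positive
print(classify("The company reported a wider loss."))    # negative
```

Fine-tuning, as in the financial-sentiment example the document mentions, replaces the heuristic with a model whose weights have been updated on labelled pairs, but the prompt-and-parse loop stays the same.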
1) Generative AI (GenAI) enables the creation of novel content by learning patterns in unstructured data rather than labeling outputs like traditional AI.
2) Both traditional and generative AI models lack transparency and may contain biases, but generative models can additionally hallucinate or leak private information.
3) To interpret generative models, researchers evaluate accuracy globally by checking for hallucinations or undesirable content, and locally by confirming the quality of individual responses.
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
In this virtual meetup, we give an introduction to the #1 open-source machine learning platform, H2O-3, and show how you can use it to develop models for different use cases.
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
Numerai is an open, crowd-sourced hedge fund powered by predictions from data scientists around the world. In return, participants are rewarded with weekly payouts in crypto.
In this talk, Joe will give an overview of the Numerai tournament based on his own experience. He will then explain how he automates the time-consuming tasks such as testing different modelling strategies, scoring new datasets, submitting predictions to Numerai as well as monitoring model performance with H2O Driverless AI and R.
ML Model Deployment and Scoring on the Edge with Automatic ML & DFSri Ambati
Machine Learning Model Deployment and Scoring on the Edge with Automatic Machine Learning and Data Flow
YouTube Video URL: https://youtu.be/gB0bTH-L6DE
Deploying Machine Learning models to the edge can present significant ML/IoT challenges centered around the need for low latency and accurate scoring on minimal resource environments. H2O.ai's Driverless AI AutoML and Cloudera Data Flow work nicely together to solve this challenge. Driverless AI automates the building of accurate Machine Learning models, which are deployed as light footprint and low latency Java or C++ artifacts, also known as a MOJO (Model Optimized). And Cloudera Data Flow leverage Apache NiFi that offers an innovative data flow framework to host MOJOs to make predictions on data moving on the edge.
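The edge-scoring pattern described above boils down to invoking a compact model artifact per record while tracking latency against the SLA. A minimal sketch of that loop, where `score_record` is a hypothetical stand-in for a real MOJO invocation (a MOJO is a Java or C++ artifact and would be called through a wrapper, e.g. from a NiFi processor):

```python
# Sketch of an edge scoring loop with per-record latency tracking.
# `score_record` stands in for a real MOJO call; the measurement
# pattern, not the model, is the point here.
import time

def score_record(record: dict) -> float:
    # Placeholder "model": weighted sum of two sensor features.
    return 0.7 * record["temp"] + 0.3 * record["vibration"]

def score_stream(records):
    results = []
    for record in records:
        start = time.perf_counter()
        prediction = score_record(record)
        latency_ms = (time.perf_counter() - start) * 1000.0
        results.append({"prediction": prediction, "latency_ms": latency_ms})
    return results

stream = [{"temp": 70.0, "vibration": 0.2}, {"temp": 92.0, "vibration": 0.9}]
for row in score_stream(stream):
    print(row)
```

On constrained edge hardware the per-record latency figure is what determines whether the deployment meets its real-time budget.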
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many of those features trade security for convenience and capability. This best practices guide outlines steps users can take to better protect personal devices and information.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Project Management Semester Long Project - Acuityjpupo2018
Acuity is an innovative learning app designed to transform the way you engage with knowledge. Powered by AI technology, Acuity takes complex topics and distills them into concise, interactive summaries that are easy to read & understand. Whether you're exploring the depths of quantum mechanics or seeking insight into historical events, Acuity provides the key information you need without the burden of lengthy texts.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series, part 6. In this session, we will cover test automation with generative AI and OpenAI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
OpenID AuthZEN Interop Read Out - AuthorizationDavid Brossard
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
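The pipeline described above — extract vector representations from unstructured data, push them to a vector database, then serve nearest-neighbour search — can be sketched with a toy in-memory index. The hash-based `embed` function and the list-backed index are illustrative stand-ins for a real embedding model and a Milvus collection:

```python
# Toy sketch of the embed -> index -> search pipeline. The hash-based
# pseudo-embedding and in-memory index stand in for a real embedding
# model and a Milvus collection.
import hashlib
import math

def embed(text: str, dim: int = 8) -> list:
    """Deterministic pseudo-embedding derived from a hash (illustrative only)."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [digest[i] / 255.0 for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))  # unit vectors: dot == cosine

index = []  # list of (document, vector) pairs

def insert(doc: str):
    index.append((doc, embed(doc)))

def search(query: str, top_k: int = 1):
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

for doc in ["spark processes unstructured data", "milvus serves vector search"]:
    insert(doc)
print(search("spark processes unstructured data"))
```

In production, `insert` would be a Spark job writing batches of vectors to Milvus, and `search` would hit the Milvus query endpoint rather than a Python list.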
Digital Marketing Trends in 2024 | Guide for Staying AheadWask
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on integration of Salesforce with Bonterra Impact Management.
Interested in deploying an integration with Salesforce for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Meg Mude, Intel - Data Engineering Lifecycle Optimized on Intel - H2O World San Francisco
1. Data Engineering Lifecycle on IA
Meg Mude
Solutions Architecture, Intel
@megmyname
www.Linkedin.com/in/megmude
Software.intel.com
#H2OWORLD
3. Today is a big day... Announcing Blue Danube
https://www.prnewswire.com/news-releases/h2oai-teams-up-with-intel-to-drive-an-ai-transformation-in-the-enterprise-300789659.html
4. AI adoption is nascent
According to a recent Gartner survey, 46% of Chief Information Officers (CIOs) have developed plans to implement AI, but only 4% have implemented AI so far.
Source: Gartner Says Nearly Half of CIOs Are Planning to Deploy Artificial Intelligence. February 2018 (https://www.gartner.com/newsroom/id/3856163)
5. Consider how the brain processes data: endless, enormous quantities of data… and delivers useful insights.
6. The deluge of data
Daily data generated, by 2020:
- Autonomous vehicle: 4 TB
- Connected airplane: 5 TB
- Smart factory: 1 PB
- Average internet user: 1.5 GB
- Cloud video provider: 750 PB
Unlocking business insights, operational insights, and security insights.
Source: Amalgamation of analyst data and Intel analysis.
7. Artificial intelligence is the ability of machines to learn from experience, without explicit programming, in order to perform cognitive functions associated with the human mind.
Within AI and analytics:
- Machine learning: algorithms whose performance improves as they are exposed to more data over time.
- Deep learning: a subset of machine learning in which multi-layered neural networks learn from vast amounts of data.
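The defining property of machine learning on this slide — performance that improves with exposure to more data — can be shown with the simplest possible learner. A 1-nearest-neighbour classifier (my toy example, not from the deck) misclassifies near the decision boundary when training data is sparse, and becomes exact once the boundary region is covered:

```python
# A minimal illustration of the machine learning definition above: a
# 1-nearest-neighbour classifier whose accuracy improves as more
# labelled examples are added, with no change to the algorithm itself.
def predict(train, x):
    """Label of the closest training point (first match wins on ties)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def accuracy(train, test):
    correct = sum(1 for x, label in test if predict(train, x) == label)
    return correct / len(test)

# True concept: class 1 iff x >= 5.
test_set = [(x, int(x >= 5)) for x in range(10)]

small_train = [(0, 0), (2, 0), (9, 1)]           # sparse labels
large_train = small_train + [(4, 0), (5, 1)]     # boundary now covered

print(accuracy(small_train, test_set))  # 0.9
print(accuracy(large_train, test_set))  # 1.0
```

The code never changes between the two runs; only the data does, which is exactly the "learn from experience, without explicit programming" distinction the slide draws.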
8. AI will transform…everything
- Consumer: smart assistants, chatbots, search, personalization, augmented reality, robots
- Health: enhanced diagnostics, drug discovery, patient care, research, sensory aids
- Finance: algorithmic trading, fraud detection, research, personal finance, risk mitigation
- Retail: support, experience, marketing, merchandising, loyalty, supply chain, security
- Government: defense, data insights, safety & security, resident engagement, smarter cities
- Energy: oil & gas exploration, smart grid, operational improvement, conservation
- Transport: autonomous cars, automated trucking, aerospace, shipping, search & rescue
- Industrial: factory automation, predictive maintenance, precision agriculture, field automation
- Other: advertising, education, gaming, professional & IT services, telco/media, sports
Source: Intel forecast
9. The AI lifecycle
1. Define the Challenge
2. Approach: the team breaks down the defined business problem into workable steps to translate the right data to achieve results.
3. Expertise: a team of management sponsors, data scientists, data engineers, solution architects, and domain experts identifies the right data and works to translate the data to achieve results.
4. Philosophy: the team embraces fail-fast continuous improvement practices to evaluate their success in translating data to achieve results.
5. Source Data: the team understands and obtains the right data that explains the business problem to achieve results.
6. Infrastructure: the organization secures hardware and software infrastructure that supports data processing in a timely manner.
7. Organization: the organization embraces data insights, sponsors properly resourced teams, and prioritizes analytic development work.
10. Bring Your AI Vision to Life Using Our Extensive Portfolio
- Hardware: multi-purpose to purpose-built AI compute from cloud to device
- Data: Intel analytics ecosystem to get your data ready
- Tools: software to accelerate development and deployment of real solutions
- Solutions: partner ecosystem to facilitate AI in finance, health, retail, industrial & more
- Future: driving AI forward through R&D, investments and policy
11. AI Foundation: Artificial Intelligence on Intel, from edge and device to data center
Solutions (for solution architects): platforms for finance, healthcare, energy, industrial, transport, retail, home and more, plus an AI Solutions Catalog (public & internal).
Toolkits (for app developers):
- Deep learning deployment: the OpenVINO™† toolkit (Open Visual Inference & Neural Network Optimization) for inference deployment on CPU, processor graphics, FPGA & VPU using TensorFlow*, Caffe* & MXNet*; and the Intel® Movidius™ SDK for optimized inference deployment on all Intel® Movidius™ VPUs using TensorFlow* & Caffe*.
- Deep learning: Intel® Deep Learning Studio‡, an open-source tool to compress the deep learning development cycle.
Libraries (for data scientists):
- Deep learning frameworks: TensorFlow*, MXNet*, Caffe* and BigDL/Spark* (now optimized for CPU); Caffe2*, PyTorch* and PaddlePaddle* (optimizations in progress).
- Machine learning libraries: Python (scikit-learn, pandas, NumPy); R (Cart, RandomForest, e1071); distributed (MLlib on Spark, Mahout).
Foundation (for library developers):
- Analytics, machine & deep learning primitives: the Intel distribution of Python optimized for machine learning; the Intel® Data Analytics Acceleration Library (DAAL) for machine learning; and MKL-DNN, open-source deep neural network functions for CPU and processor graphics.
- Deep learning graph compiler: Intel® nGraph™ Compiler (alpha), an open-sourced compiler for deep learning model computations optimized for multiple devices (CPU, GPU, NNP) using multiple frameworks (TF, MXNet, ONNX).
Hardware (for IT system architects): data center to edge and device, including deep learning accelerators for inference (NNP L-1000).
Ai.intel.com
† Formerly the Intel® Computer Vision SDK
*Other names and brands may be claimed as the property of others.
All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.
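The primitives layer at the bottom of this stack exists because most machine learning time is spent in dense linear algebra. A small example of why that layering matters: NumPy hands a matrix multiply to an optimized BLAS in a single call (MKL, in builds such as the Intel Distribution for Python), instead of looping in the interpreter. This is my illustration, not a slide from the deck:

```python
# NumPy delegates this multiply to whatever optimized BLAS it was built
# against (MKL in the Intel Distribution for Python) as one gemm call,
# rather than executing nested Python loops.
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

c = a @ b  # single BLAS gemm call under the hood
print(c)
```

Higher layers (scikit-learn, the deep learning frameworks) inherit this speedup for free, which is the point of optimizing the primitives once.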
12. One example: Intel® Xeon® based clusters, in a typical HPC cluster and a typical cloud setup.
13. Customer Testimonials
“Thanks to the Intel OpenVINO toolkit, the Learning Factory can now deliver the expected SLA of less than 1 second for inferencing all 3 X-ray models that we were targeting! I can’t believe all of the above happened in almost a month since this effort got started!” – Aruna Narayanan, GE Healthcare AI DL Platform, Aug 2018 (1)

Taboola ended up sticking with Intel for reasons of speed and cost, said Ariel Pisetzky, the company's vice president of information technology. Nvidia's chip was far faster, but time spent shuffling data back and forth to the chip negated the gains, Pisetzky said. Second, Intel dispatched engineers to help Taboola tweak its computer code so that the same servers could handle more than twice as many requests. – Ariel Pisetzky, VP of IT, Taboola, July 2018 (3)

Intel and Philips achieved a speed improvement of 188 times for the bone-age-prediction model, and a 38 times speed improvement for the lung-segmentation model, over the baseline measurements. – Vijayananda J., chief architect and fellow, Data Science and AI at Philips HealthSuite Insights, July 2018 (4)

“With PaddlePaddle now optimized for Intel Xeon Scalable processors, developers and data scientists can now use the same hardware that powers the world’s data centers and clouds to advance their AI algorithms.” – Jul 2018 (5)

The CERN team demonstrated that AI-based models have the potential to act as orders-of-magnitude-faster replacements for computationally expensive tasks in simulation, while maintaining a remarkable level of accuracy. – Dr. Federico Carminati, Gul Rukh Khattak, and Dr. Sofia Vallecorsa at CERN, and Jean-Roch Vlimant at Caltech. The work is part of a CERN openlab project in collaboration with Intel Corporation, which partially funded the endeavor through the Intel Parallel Computing Center (IPCC) program. – Aug 2018 (2)

“The collaboration team with representatives from Novartis and Intel has shown more than 6X improvement in the time to process a dataset of 10K images for training. Using the Broad Bio-image Benchmark Collection* 021 (BBBC-021) dataset, the team has achieved a total processing time of 31 minutes with over 99 percent accuracy.” – May 2018 (6)

Sources:
(1) https://newsroom.intel.com/articles/solve-healthcare-intel-partners-demonstrate-real-uses-artificial-intelligence-healthcare/
(2) https://www.hpcwire.com/2018/08/14/cern-incorporates-ai-into-physics-based-simulations/
(3) https://www.reuters.com/article/us-nvidia-intel/as-nvidia-expands-in-artificial-intelligence-intel-defends-turf-idUSKBN1L2051
(4) https://venturebeat-com.cdn.ampproject.org/c/s/venturebeat.com/2018/08/14/intel-and-philips-use-xeon-chips-to-speed-up-ai-medical-scan-analysis/amp/
(5) https://newsroom.intel.com/news/intel-ai-baidu-create-ai-camera-fpga-based-acceleration-xeon-scalable-optimizations-deep-learning/
(6) https://newsroom.intel.com/news/using-deep-neural-network-acceleration-image-analysis-drug-discovery/
(7) For more information, see http://aidc.gallery.video/detail/videos/china:-keynotes/video/5977039606001/large-scale-deep-learning-applications-at-baidu-and-open-source-ai-framework-paddlepaddle?autoStart=true
“Machine learning is a big part of our heritage. It works on GPUs today, but it also works on instances powered by highly customized Intel Xeon processors.” – Bratin Saha, VP & GM, Machine Learning Platforms, Amazon AI, Amazon

“Inference is one thing we do, but we do lots more. That’s why flexibility is really essential.” – Kim Hazelwood, Head of AI Infrastructure Foundation, Facebook

“We rely heavily on Intel Xeon processors for deep learning training and inference workloads at Baidu.” – Dianhai Yu, tech lead of Baidu PaddlePaddle (7)
14. Intel works with customers across the entire AI lifecycle
The complete analytics pipeline (AI customer example)
Time-to-solution phases and typical share of effort:
- Opportunity: 15%
- Hypotheses: 15%
- Data: 23%
- Modeling: 15%
- Deployment: 15%
- Iteration: 8%
- Evaluation: 8%
Dev cycle: source data → proof of concept → training → scale & deploy inference within the broader application → build, deploy & scale.
- Labor-intensive: label data, load data, augment data
- Compute-intensive: experiment with topologies, tune hyper-parameters
- Labor-intensive: share results, support inference
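Tallying the effort shares as given on the slide makes the talk's central point concrete: the data phase is the single largest slice of time-to-solution, ahead of modeling itself. A quick sketch:

```python
# Effort shares per phase, as listed on the slide above.
shares = {
    "Opportunity": 15, "Hypotheses": 15, "Data": 23, "Modeling": 15,
    "Deployment": 15, "Iteration": 8, "Evaluation": 8,
}

# The data phase is the largest single slice of time-to-solution.
largest = max(shares, key=shares.get)
print(largest, shares[largest])  # Data 23
```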
15. Results: 188X and 38X increase in inference performance over baseline (images per second) for a 2S Intel® Xeon® Scalable 8168 processor, on the bone-age-prediction model and the lung-segmentation model respectively, using the Intel® Distribution of OpenVINO™.
Client: Philips, a worldwide leader in healthcare products for consumers, patients, providers and caregivers across the health continuum.
Challenge: AI for medical imaging is challenging because the information is often high-resolution and multi-dimensional. Down-sampling images to lower resolutions due to memory constraints can cause misdiagnoses. Philips’ goal is to offer AI to its end customers without significantly increasing the cost of the customers’ systems, and without requiring modifications to the hardware deployed in the field.
Solution: Philips and Intel tested two healthcare use cases for deep learning inference models: one on X-rays of bones for bone-age-prediction modeling, and the other on CT scans of lungs for lung segmentation. The solution took advantage of efficient multi-core processing on Intel® Xeon® Scalable processors, along with the OpenVINO™ toolkit.
*Other names and brands may be claimed as the property of others.
Configuration: 2-socket Intel® Xeon® Platinum 8168 processor, 2.70GHz, HT OFF, total memory 192 GB (2666 MHz), Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-29-generic x86_64*), BIOS: SE5C620.86B.0D.01.0010.072020182008, Intel Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) v0.14. Source: https://ai.intel.com/ai/wp-content/uploads/sites/69/Intel-PhilipsAIHealthcare-CaseStudy-FinalV2-withquote.pdf
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. Performance results are based on testing as of August 2018 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.
White paper: https://ai.intel.com/ai/wp-content/uploads/sites/69/Intel-PhilipsAIHealthcare-CaseStudy-FinalV2-withquote.pdf
Video: https://ai.intel.com/videos/philips/
16. Client: Montefiore, a premier academic health system in the Bronx, NY, which has implemented a Patient-centered Analytical Machine Learning (PALM) platform.
Challenge: Risk stratification across a patient population; for example, determining which patients are at risk of respiratory failure and subsequent intubation (which significantly diminishes the odds of a positive outcome). A robust and scalable intelligent healthcare system is needed, in which models are built on data coming from a variety of sources (traditional databases or newer unstructured data stores) while still complying with privacy regulations.
Solution: The PALM platform, which can tap into a myriad of data stores, regardless of where the information is located or how it is structured. PALM is powered by Intel® Xeon® Scalable Gold processors and Intel® Optane™ SSDs, and was first deployed to help identify patients at risk for respiratory failure. This improved patient outcomes and lowered costs, and Montefiore is already starting to apply PALM to a variety of other projects.
Result: “With Intel’s solutions for AI, all of [these AI capabilities] can occur on the same architecture already in use for so many other traditional enterprise activities, increasing efficiency and improving time to value.”
Links:
Blog: Montefiore Health System Improves Patient Outcomes and Healthcare Efficiency with Semantic Data Lake and Artificial Intelligence Powered by Intel Technologies
Research brief: https://ai.intel.com/ai/wp-content/uploads/sites/69/montefiore-in-ai-case-study.pdf
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
17. Sample End-to-End Solution
Complementary public cloud / private cloud.
- Source data (examples): sensors, logs, messages, smart machines, transaction logs
- Sourcing, collection and ingest: Kafka, Sqoop, Spark Streaming, Storm/Heron, Informatica, DataStage
- Storage: object storage, RDS, in-memory cache, MPP DB, key/value storage, SAP HANA, Elasticsearch/Solr
- Processing + analysis: Spark/BigDL, Impala/Presto, machine learning / deep learning
- Post-processing into consumable, visualized and syndicated data/information: Arcadia Data, Apache Spark, and native visualization stacks
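The staged flow above — ingest raw events, process and analyze them, then reduce to consumable information — can be sketched with Python generators standing in for Kafka, Spark, and the visualization layer. The stage names and the anomaly rule are my illustrative assumptions:

```python
# A minimal sketch of the ingest -> process -> consume flow, using
# generators in place of Kafka, Spark, and a visualization stack.
def ingest(raw_events):
    """Ingest stage: parse raw sensor lines into records."""
    for line in raw_events:
        device, value = line.split(",")
        yield {"device": device, "value": float(value)}

def process(records, threshold):
    """Processing/analysis stage: flag anomalous readings."""
    for rec in records:
        rec["anomaly"] = rec["value"] > threshold
        yield rec

def consume(records):
    """Consumption stage: reduce to a syndicated summary."""
    records = list(records)
    return {
        "count": len(records),
        "anomalies": sum(r["anomaly"] for r in records),
    }

raw = ["sensor-1,0.4", "sensor-2,0.9", "sensor-3,0.2"]
summary = consume(process(ingest(raw), threshold=0.5))
print(summary)  # {'count': 3, 'anomalies': 1}
```

Because each stage is lazy, records stream through one at a time, which mirrors how the real pipeline keeps data moving rather than landing it at every hop.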
19. AI is the driving force on the path to deeper insight:
- Descriptive analytics (hindsight): What happened?
- Diagnostic analytics (insight): What happened and why?
- Predictive analytics (foresight): What will happen, when, and why?
- Prescriptive analytics (forecast): How should I proceed?
- Cognitive analytics (self-learning): How do I proceed?
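The first three rungs of this ladder can be made concrete with a few lines of standard-library Python. Using monthly sales and ad spend (my invented numbers): descriptive answers "what happened", diagnostic answers "why" via correlation, and predictive makes a naive forecast:

```python
# Descriptive, diagnostic and predictive analytics in miniature,
# on invented monthly sales / ad-spend figures.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

sales = [100.0, 110.0, 120.0, 130.0]
ad_spend = [10.0, 11.0, 12.0, 13.0]

descriptive = mean(sales)                         # what happened: average sales
diagnostic = pearson(sales, ad_spend)             # why: sales track ad spend
predictive = sales[-1] + (sales[-1] - sales[-2])  # naive next-month forecast

print(descriptive, diagnostic, predictive)
```

Prescriptive and cognitive analytics go beyond this: they would choose an action (e.g. how much to spend on ads next month) and then learn from the outcome.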
AI adoption is nascent – just coming into existence, but demonstrating enormous potential moving forward.
Forrester surveyed business and technology professionals and found that 58% of them were researching AI, but only 12% were using AI systems. This gap reflects growing interest in AI, but little actual use in practice. We expect enterprise interest in, and use of, AI to increase as software vendors roll out AI platforms and build AI capabilities into applications. Enterprises that plan to invest in AI expect to improve customer experiences, improve products and services, and disrupt their industry with new business models. AI technologies will increasingly be rapidly assimilated into analytics practices, giving business users unprecedented access to powerful insights that drive action. In 2018 and beyond, expect the flood gates to open even further, driven by the business’ voracious appetite for deeper contextual insights.
The business imperative for AI is firmly rooted in data. Data is the currency of the future. By 2020, we expect over 50 billion devices and 200 billion sensors to join the internet, and this huge explosion of smart and connected devices will lead to a ton of data being generated. Even just by 2020, as you can see in this slide, we’re talking about huge volumes of data. This data contains extremely valuable insights in business, operations and security that we really want to extract. In order to extract those insights, we need help, and analytics and AI are tools in our toolbag that will help us extract value from this treasure trove of data.
------ BACKUP INFO BELOW ------
The “people devices” we’re all familiar with (PCs, tablets, phones) will remain an important part of Intel, and we will continue to invest and maximize returns in these businesses.
Moving forward, these “people devices” are welcoming billions and billions of things to the internet.
By 2020, 50B devices and 212B sensors will join the internet.
At that point, 47% of total devices and connections will be machine to machine.
Truly the rise of the machines…
These “things” will generate tremendous amounts of data
Consider this…
In 2020, it is expected that the average internet user will generate ~1.5 GB of traffic per day (Up from ~650MB in 2015)
Certainly a huge amount of data… until you consider the machines…
A Smart Hospital will generate 3,000 GB/day
Self-driving cars are generating over 4,000 GB/day… each
A connected plane will generate 5,000 gigabytes per day
A connected factory will generate 1 million gigabytes per day
This data will need to be analyzed and interpreted in real time
Intel’s technology makes this possible, in order to unlock:
Operational insight – optimized efficiencies can lead to lower operational costs and higher quality
Business insight – understanding market needs/drivers can lead to more predictable outcomes and new opportunities
Security insight – recognizing behaviors and predicting vulnerabilities can lead to better protected IP and security planning
http://www.cisco.com/c/en/us/solutions/service-provider/vni-network-traffic-forecast/infographic.html
http://www.cisco.com/c/en/us/solutions/collateral/service-provider/global-cloud-index-gci/Cloud_Index_White_Paper.html
https://datafloq.com/read/self-driving-cars-create-2-petabytes-data-annually/172
The definition of “Artificial Intelligence” is continually evolving, but at its core, AI is about machines mimicking (and/or exceeding) cognitive functions associated with the human mind. In the universe of AI, which includes many different approaches, data-centric machine learning has emerged as a leader due to its increasing ability to tackle the three main AI sub-tasks: perception, planning/reasoning and control. Ultimately, AI is achieved through the fusion of multiple approaches to deliver ever more intelligent machines, and the nexus of AI developments in the near-future is centered on deep learning, with other approaches all playing important roles – depending on the dataset, problem, and unique requirements.
Which industries are the earliest adopters of AI? Generally, those segments with clear use cases, high purchasing power, and high rewards for making decisions quickly and/or more accurately will adopt AI fastest. Here are the segments that we believe will lead AI through 2020, ordered roughly by market opportunity (earliest at left).
Consumer
Smart Assistants – personal assistant that anticipates, optimizes, automates daily life (e.g. Amazon Alexa, Apple Siri, Google Assistant, Microsoft Cortana, Facebook Jarvis home automation, X.ai virtual assistant Amy)
Chatbots – 24/7/365 no waiting access to an informative or helpful agent (e.g. WeChat, Bank of America, Uber, Pizza Hut, Alaska Airlines, Amtrak, etc.)
Search – ability to more intelligently search more data types including image, video, context, etc (e.g. Improved Google search, Google Photos, ReSnap)
Personalization – ability to automatically adjust content/recommendations to suit individuals (e.g. Entefy, Netflix recommendation engine, Amazon personalized shopping recommendations)
Augmented Reality – overlay information on our field of view in real-time to identify interesting or undesirable things (e.g. Intel Project Alloy, Google Translate using smartphone camera)
Robots – personal robots that are able to perform household, yard, or other chores (e.g. Jibo robot for day-to-day functions, Roomba follow-ons)
Health (SME: Kristina Kermanshahche, Ketan Paranjape)
Enhanced Diagnosis – a tool for doctors to augment their own diagnosis with more data, experience, precision and accuracy (e.g. radiology image analysis, Journal of American Medicine Association paper on retina scan for diabetic retinopathy, skin lesion classification to recognize melanoma with 98% accuracy, medical history scraping, treatment outcome prediction)
Drug Discovery – computational drug discovery that intelligently hones in on the most promising treatments (e.g. speeding pharma drug development)
Patient Care – machines that aid with monitoring, treatment, and/or recovery of patients (e.g. visual patient monitoring, autonomous robotic surgery, friendly medication and/or physical therapy robots)
Research – instantly sifting through hundreds of new research papers and clinical trials that are published each day to make new connections (e.g. AI at University of North Carolina’s Lineberger Comprehensive Cancer Center)
Sensory Aids – filling in for various senses that are absent or challenged (e.g. visual aid, audio aid)
Finance (SME: Robert Geva)
Algorithmic Trading – augment rule-based algorithmic trading models and data sources using AI (e.g. Kensho analysis of myriad data to predict stock movement)
Fraud Detection – ability to identify fraudulent transactions and/or claims (e.g. USAA identifies insurance fraud)
Research – ability to intelligently assemble, parse, and extract meaning from troves of data that influence asset prices (e.g. Quid, FSI firm reducing time to insight for portfolio managers through smart knowledge management system)
Personal Finance – smarter recommendations, lower risk lending, greater efficiency (e.g. active portfolio recommendations, quickly parsing more data before issuing loan, automatic reading of check scans, etc.)
Risk Mitigation – detect risk factors and/or reduce the burden of regulation and minimize errors through automated compliance (e.g. IBM+Promontory Financial Group using natural language processing to detect excursions)
Retail (SME: Janet Kerby, Chris Hunt)
Support – bots providing shopping, ordering and support in lifelike interaction (e.g. My Starbucks Barista, KLM Dutch Airline customer support via social media, Nieman Marcus visual search, Pizza Hut order pizza via bot, Adobe Digital’s digital mirror that recommends clothes, intelligent phone menu routing based on NLP, ViSenze recommending similar items based on image)
Experience – deliver winning consumer experiences in-store (e.g. Amazon Go checkout-free grocery store, Macy’s mobile shopping assistant, Lowes Lowebots that roam stores answering simple questions and tracking inventory)
Marketing – precision marketing to consumers, promoting products and services how and where they want to hear (e.g. North Face “Expert Personal Shopper” on website)
Merchandising – better planning through accelerated and expanded insight into consumer buying patterns (e.g. Stitch Fix virtual styling, Skechers.com analyzing clicks in real-time to bring similar catalog items forward, Wal-mart pairing products that sell together, Cosabella evolutionary website tweaks)
Loyalty – transform the consumer experience through segmentation (e.g. Under Armour health app that constantly collects user data to deliver personalized fitness recommendations)
Supply Chain – optimize the supply chain and inventory management for efficiency and innovate new business models (e.g. OnProcess technology’s use of predictive analytics for inventory management)
Security – improve security of all consumer and business digital assets, such as real-time shoplifting/lifter detection, multi-factor identity verification, data breach detection (e.g. Mastercard pay with your face, Walmart facial recognition to catch shoplifters)
Government (SME: Harris Joyce)
Defense – drones, connected soldiers, defense strategy (e.g. military/surveillance drones, autonomous rescue vehicles, augmented connected soldier, real-time threat assessment and strategy recommendation)
Data Insights – analyze massive amounts of data to identify opportunities/inefficiencies in bureaucracy, cybersecurity threats and more, to ultimately implement better systems and policies (e.g. MIT AI that detects cyber security threats)
Crime Prevention – using AI to predict and help recover from disasters thanks to the ability to quickly process large amounts of unstructured data and optimize limited resources (e.g. 1Concern, BlueLineGrid)
Safety & Security – crowd analytics, behavioral/sentiment analytics, social media analytics, face/vehicle recognition, online identity recognition, real-time video analytics (e.g. police analyzing social media to adjust police presence, license plate readers in police cars)
Resident Engagement – new tools to facilitate citizen engagement like chatbots, at-risk citizen identification, (e.g. Amelia chatbot in North London Enfield council, North Carolina chatbot to help state employees with IT inquiries)
Smarter Cities – traffic/pedestrian management, lighting management, weather management, energy conservation, services analytics (e.g. San Francisco and Pittsburgh using sensors and AI to optimize traffic flow)
Energy (SME: Noe Garcia, Tonya Cosby)
Oil & Gas Exploration – automated geophysical feature detection (e.g. oil & gas producers using AI to augment traditional modeling & simulation)
Smart Grid – predictive and real-time intelligent generation, allocation, and storage of power to meet variable demand (e.g. GridSense, SoloGrid)
Operational Improvement – safety and efficiency improvements through predictive and/or insightful AI (e.g. GE Oil and Gas using predictive analytics and AI to predict and preempt potential operational problems)
Conservation – intelligent buildings, computing and appliances that reduce power consumption and are more efficient than producing another kWh of electricity (e.g. Google DeepMind datacenter energy reductions)
Transport (SME: Len Klebba)
Automated Cars – autonomous cars driving on the roadways (e.g. BMW, Google, Uber, many others)
Automated Trucking – autonomous trucks driving on the roadways (e.g. Daimler)
Aerospace – autonomous planes and other aerial vehicles (e.g. Boeing’s evolution of autopilot and drones)
Shipping – autonomous package delivery via drone or other vehicle (e.g. Amazon package delivery drone)
Search & Rescue – ability to deploy autonomous robot to search and rescue victims in potentially hazardous environments (e.g. war casualty extraction, miner rescue, firefighting, avalanche rescue)
Industrial (SME: Mary Bunzel, Esther Baldwin)
Factory Automation – highly-productive, efficient and safe factories with robots that can see, hear and adapt to their environment to produce goods with incredible quality and speed (e.g. assembly line)
Predictive Maintenance – ability to detect patterns that indicate the likelihood of an upcoming fault that would require maintenance (e.g. airline being able to adjust schedule to perform preventive maintenance before a failure)
Precision Agriculture – ability to deliver the precise amount of water, nutrients, sunlight, weed killer, etc to a particular crop or individual plant (e.g. farmer using visual weed search to zap only weeds with RoundUp, automated sorting of produce for market)
Field Automation – ability to automate heavy equipment beyond the factory walls (e.g. mining, excavation, construction, road repair)
Other
Advertising – interactive ads, adaptive ads, personalized ads, real-time ads (e.g. AdBrain, MetaMarkets, Proximic, RocketFuel)
Education – virtual mentors, foreign language instruction, automated study sheets, personalized assignments, cheating detection, deliberate practice, machine-to-machine instruction (e.g. Intelligent Tutor Systems, Content Technologies Inc, PR2 robot from Cornell)
Gaming – dynamic and interactive video game experiences (e.g. Xbox Kinect, Playstation Eye, Wii)
Professional & IT Services – sales, marketing, legal research, accounting/tax, assisted counseling, customized IT recommendations (e.g. Pinsent Masons law firm that emulates human decision-making, Salesforce use of AI)
Telco/Media – customized content/ads, network optimization, quality of service, mobile/home security (e.g. media company customizing tv show recommendations and ads, network operator ensuring efficient and high-quality delivery/repair, wireless company using multi-factor security)
Sports – intelligent analytics for injury prevention and betting (e.g. Kinduct injury prevention, Microsoft Cortana predicting football games)
Here is an even broader list of industries that will be impacted by AI: Advertising, Aerospace, Agriculture, Automotive, Building Automation, Business, Education, Fashion, Finance, Gaming, Government, Healthcare, IT, Investment, Legal, Life Sciences, Logistics, Manufacturing, Media & Entertainment, Oil/Gas/Mining, Real Estate, Retail, Sports & Fitness, Telecommunications, Transportation
Sources: Intel forecast (IDC, GII Research, Tractica, Technavio, Market Research Store, Allied Market Research, BCC Research)
Now, before we explore “what is AI”, it’s important to understand that implementing AI in your organization will be a journey. As we saw on the last slide, most businesses are at step 1 or 2 in this lifecycle, while the minority have gone full circle. Let’s step through it starting at the top and going clockwise…
The first step in any analytics or AI journey is to define the challenge you want to solve, by brainstorming the challenges you’re facing across your organization and prioritizing them based on business value and how much it will cost to solve them. If you think of a 2x2 chart with increasing business value on the y-axis and decreasing cost to solve on the x-axis, naturally the most impactful challenges to tackle first are those in the upper righthand quadrant.

Once you’ve identified some high-potential opportunities to investigate, the next step is to figure out which AI (or other) approach is best-suited to each problem, which we’ll explore in the next slide and next section. The next step is to assess whether or not you have the expertise required to implement the solution, and whether those people embrace a fail-fast, continuous-improvement philosophy, since AI projects typically involve a lot more uncertainty, trial & error, and exploration than more traditional and deterministic software development projects.

Once the human element is in place, the next step is to source data and prepare it for analysis, as well as stand up whatever technology infrastructure is required to tackle the problem. Last, but certainly not least, you can do all the heavy lifting to use data to solve business challenges, but if your organization isn’t ready to accept data-driven insights, then all that work may have been for naught. A classic example is the initial resistance to data analytics in sports, where general managers and scouts scoffed at the idea of computer algorithms outsmarting their years of experience and tribal knowledge.

Bottom line: if you think about all these steps in the AI lifecycle, you’ll stand a much better chance of realizing the business value that you set out to deliver in the first place through AI.
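The value-vs-cost prioritization described above can be sketched in a few lines of Python. The project names, scores, and cut-off thresholds below are entirely hypothetical, purely to illustrate the 2x2 idea:

```python
# Hypothetical candidate AI projects, each scored on business value
# (higher is better) and cost-to-solve (lower is better), on a 1-10 scale.
projects = [
    {"name": "churn prediction",   "value": 8, "cost": 3},
    {"name": "defect detection",   "value": 9, "cost": 6},
    {"name": "chatbot support",    "value": 5, "cost": 2},
    {"name": "demand forecasting", "value": 7, "cost": 8},
]

def quadrant(p, value_cut=6, cost_cut=5):
    """Place a project in the 2x2: the upper-right quadrant is
    high business value AND low cost to solve."""
    high_value = p["value"] >= value_cut
    low_cost = p["cost"] <= cost_cut
    if high_value and low_cost:
        return "tackle first"
    if high_value:
        return "high value, expensive"
    if low_cost:
        return "cheap, low value"
    return "deprioritize"

# Rank by a simple value/cost ratio and show each project's quadrant.
for p in sorted(projects, key=lambda p: p["value"] / p["cost"], reverse=True):
    print(f'{p["name"]:>20}: {quadrant(p)}')
```

In practice the scoring would come from stakeholder interviews rather than invented numbers, but the exercise of forcing every candidate into one of the four quadrants is the useful part.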
In the next few sections, we’ll unpack much of this AI lifecycle.
Intel’s commitment to AI is simple: help our customers bring their AI visions to life using our Extensive Portfolio.
The first step on your AI journey is getting your data ready. Intel and our partner ecosystem are ready to help you with solutions for integrating, storing, processing and managing your data.
The complexity of bringing AI from model to reality takes a mix of hardware solutions, and our multi-architecture approach optimizes a variety of computing for different purposes, enabling application designers to choose what works best. They can use their existing multi-purpose CPU resources to begin their AI journey – including breakthrough deep learning through scaling on Xeon – and if/when it makes sense, choose from the broadest (and best) deep learning acceleration portfolio to maximize ROI for their unique requirements.
One level up this stack, software is of course a critical requirement for AI, and we continue optimizing key open-source software like popular deep learning frameworks, and working with our partners to bring tools to bear that reduce development time and overall time-to-solution. In addition, with the necessity of a multi-architecture approach to satisfy the demands of a wide variety of use cases, Intel is also developing tools to drive increasing harmony, reducing development and deployment complexity each step of the way.
Beyond technology and tools, we’re also taking a solutions-driven approach, building a strong partner ecosystem in order to scale AI and enrich the lives of every person on the planet. This will include ready-built solutions through Intel and our partner ecosystem for many segments and verticals, including healthcare, finance, retail, government, energy, transport, industrial & more.
Finally, Intel continues our push into the future by deepening our investments to advance the forefront of AI computing into the next decade, including funding cutting-edge academic research, internal R&D, investments in leading innovators, and policy/ethics leadership.
In the next few slides we’ll unpack these pillars in more detail, or feel free to skip ahead to whichever section you’re most interested in.
Now that we’ve unpacked the Intel AI hardware portfolio, let’s build on top of that by looking at the important software and solutions stacks.
Software: Intel is investing in AI tools that get the most out of, and streamline development across, each hardware option in our portfolio – in order to ultimately accelerate total time-to-solution.
For application developers – those who deploy solutions using AI-based algorithms – Intel develops several tools to optimize performance and accelerate time-to-solution. For deep learning, the open-source OpenVINO™ (formerly the Intel® CV SDK and Deployment Toolkit) facilitates model deployment for inference, by converting & optimizing trained models for whichever hardware target is downstream, with support for TensorFlow, Caffe & MXNet on CPU, integrated GPU, VPU (Movidius Myriad 2 / Neural Compute Stick NCS) and FPGA. Similarly, the Intel® Movidius™ SDK supports inference deployment on TensorFlow & Caffe across the full range of VPUs. Intel is also in the process of developing the Intel® Deep Learning Studio (coming soon!) to help compress the end-to-end deep learning development cycle (including training).
For data scientists – those who create AI-based algorithms – Intel contributes to and optimizes a set of open-source libraries that are widely used for machine and deep learning. There are a number of such machine learning libraries that get the most out of Intel hardware today, spanning Python, R and Distributed. For deep learning, Intel aims to ensure that all the major DL frameworks and topologies run well on Intel hardware, and customers are of course free to choose whichever framework(s) best suit their needs. We’ve been directly optimizing the most popular AI frameworks first, based on market demand, and producing huge speedups (>100x!!!). Today, we have many optimized topologies available for TensorFlow, MXNet, Caffe and BigDL on Spark, and you can download & install the optimized version of these frameworks by clicking on the links in this slide. Going forward, we intend to enable even more frameworks in the future through the Intel® nGraph™ Compiler.
For library developers – those who develop and optimize API’s/libraries/frameworks to support new algorithms/topologies on the underlying hardware – Intel offers a host of foundational building blocks to get the most out of our hardware. Beginning on the left with the primitives category, the Data Analytics & Acceleration library (DAAL) and Intel Python distribution are important building blocks for machine learning. The ‘DNN’ (deep neural network) open source libraries contain CPU-optimized functions that are most relevant for, you guessed it, deep learning model development. On the right side of this row is a description of the Intel® nGraph™ Library (formerly the Nervana Graph), which takes the computational graph from each deep learning framework and creates an intermediate representation, which is executed by calling the math accelerator software libraries of each Intel hardware target. This compiler reduces the need for framework & model direct optimization for each hardware target using low-level software & math accelerator libraries. Today, it supports Xeon, GPU (CUDA) and the Crest family, with more hardware targets planned going forward.
Solutions: Many businesses don’t want to start their AI journey from scratch and/or don’t have AI expertise or desire to build a core competency in it, but would still like to harness its benefits as quickly and efficiently as possible. Enter the Intel AI Builders program, which is a one-stop-shop to find Intel AI technology-based solutions, be it ready-to-develop platforms or customized solutions that address particular problems. For the more do-it-yourself (DIY) crowd, Intel also publishes case studies, reference solutions and reference blueprints through the Builders program, which you can leverage to scope and implement your own AI solutions. For more information about both technical services and reference solutions, visit our builders site at builders.intel.com/ai
On the left is a typical HPC cluster, showing the stack from underlying hardware up to storage. On top are the OS, runtime libraries, and cluster and workload managers. The dark blue stack is the AI workflow and the brown stack is the HPC workflow. If the end-user application is climate modeling, the HPC code will follow the brown stack; however, if there is AI code as part of climate modeling, for image recognition say, it will follow the dark blue stack.
On the right is a cloud cluster showing an abstracted view of running virtual machines and containers. The difference when you are running containers is that you abstract away the OS: the container holds the application code and all its dependencies, enabling much better portability.
Containers share the host OS kernel, while the libraries and runtime environment (e.g. GCC/glibc) are unique to each container. Virtual machines, by contrast, each carry their own guest OS and kernel.
We have been working with several customers to implement AI. Here are some examples of customers we’ve worked with and their testimonials.
Discussion on what compute means
Let’s use an example to highlight the value that Intel brings to AI. The breakdown on this slide, including the proof-of-concept (POC) percentages, is from a real AI customer project focused on industrial defect detection. While other projects will differ in time breakdown, the steps are typically the same.
At the bottom, overall time-to-solution is the complete AI journey, including steps from opportunity assessment, to development & deployment, and ultimately evaluation of the end result.
In the middle, we zoom in on the develop & deploy portion of the overall solution, which includes sourcing data, proof-of-concept development, inference deployment, as well as integration into a broader application.
At the top, we zoom in further on the proof-of-concept development itself, where you can see that data preparation took a majority of the development time, followed by model training, and testing plus documentation.
The bottom line here is that while compute-intensive training (the slivers in yellow) dominates a lot of the discourse around deep learning, it’s a relatively small part of your overall time-to-solution, and Intel works with customers across the entire AI lifecycle to speed up overall time-to-solution. Thus, it’s important to think about how to spend your IT budget in a way that gets you to deployment fastest, rather than paying a premium to accelerate, now only marginally, one portion of your solution, when that acceleration may add data management and other headaches, then potentially sit idle collecting dust once it’s served its purpose.
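The nested breakdown above is easy to illustrate numerically. The percentages below are hypothetical placeholders, not the actual figures from the customer project, but they show how a training slice that looks big inside the POC shrinks once you zoom out to the whole journey:

```python
# Hypothetical time breakdown, nested the same way as the slide:
# overall journey -> develop & deploy -> proof-of-concept.
overall = {"assess opportunity": 0.20, "develop & deploy": 0.60, "evaluate": 0.20}
develop = {"source data": 0.25, "poc": 0.40, "deploy inference": 0.20, "integrate": 0.15}
poc = {"data preparation": 0.55, "model training": 0.25, "test & document": 0.20}

# Share of the complete journey spent on compute-intensive training:
training_share = overall["develop & deploy"] * develop["poc"] * poc["model training"]
print(f"Training is ~{training_share:.0%} of total time-to-solution")  # ~6%
```

Even with training at a quarter of the POC, it works out to only a few percent of the end-to-end journey, which is the argument for budgeting against total time-to-solution rather than one accelerated slice.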
- Demarcation of the Visualize and Consume layers
The end-to-end big data solution includes data collection, data storage, data processing & analysis, and data visualization. Each step includes many different products and solutions; please select the proper solution according to your business requirements. We will further discuss the technical details of each solution in our 201 version.
Analytics is a constantly evolving science that companies can leverage for insight, innovation, and competitive advantage. Analytics has changed over the years and continues to advance through five stages of increasing scale & maturity: descriptive, diagnostic, predictive, prescriptive, and cognitive. AI is its own category, applied to all phases of the analytics pipeline (especially more advanced analytics), and a vital tool for reaching higher maturity & scale data analytics. AI is now a reality because of three key factors:
Data deluge: Our world of smart and connected devices has unleashed a data deluge, as the Internet of Things (IoT) joins apps in generating continuous streams of structured and unstructured data. The IoT will include a projected 200 billion smart-and-connected devices by 2020, and the data produced is expected to double every two years to total 40 zettabytes (40 trillion gigabytes) by 2020. These vast data stores are required to train many AI algorithms and are ripe to be mined for fresh insights.
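The doubling claim is easy to check with compound-growth arithmetic. The 2012 baseline below is back-calculated from the slide's 40 ZB figure, purely as an illustration:

```python
# If data doubles every two years and reaches 40 ZB in 2020, working
# backwards gives the implied starting volume in 2012 (four doublings).
target_zb, doublings = 40, (2020 - 2012) // 2
start_zb = target_zb / 2**doublings
print(f"Implied 2012 volume: {start_zb} ZB")  # 2.5 ZB

# Forwards: growth from the implied baseline back up to the 2020 figure.
for year in range(2012, 2021, 2):
    print(year, start_zb * 2 ** ((year - 2012) // 2), "ZB")
```

Four doublings is a 16x increase, which is why "doubling every two years" compounds so quickly over a single decade.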
Compute breakthrough: Paved by Moore’s Law, compute capability and architectural innovation have progressed to the point where we’ve crossed the threshold required to support the intense demands of machine intelligence. For example, the concept of deep learning through artificial neural networks has existed for at least 20 years, but not until the past few years have computing advancements enabled the practical application of these intensive algorithms, thanks to greater accuracy and speed.
Innovation surge: Of course, compute power and data are not enough on their own. The road to AI is also being driven by a surge of innovation that has pushed us over the tipping point from research to mainstream use. Each new AI algorithmic innovation and use case opens more eyes to the power of AI, leading more innovators to join the community and stimulating an ever-increasing demand for the technology. Neural network innovations in the 1990s renewed research into AI, but it was accuracy breakthroughs in both speech recognition and image recognition, in 2009 and 2012 respectively, that proved to be catalysts for today’s surge of innovation. In last year’s ImageNet Computer Vision contest, a neural network–based application even outperformed a human. As we progress, a plethora of unsolved AI challenges will continue to attract researchers and innovators around the world.
------------------ BACKUP ------------------
Descriptive identifies what happened in the past. It helped us understand, but it focuses on hindsight. In today’s competitive environment, hindsight is not a competitive position.
Diagnostic offers more insight into what happened, by describing why it happened. That’s more insight, but not helpful to identify what we should do as a company going forward in a fast-paced competitive world.
Predictive and Prescriptive analytics provides foresight, identifying potentials out of the many possibilities of forward pathways the business can go. It then points the business in the best directions to achieve desired outcomes. Use of many data sources and simulations help prescribe the best path forward. 40% of enterprises net-new investments in analytics will be in predictive + prescriptive analytics by 2020. (IDC, Big Data Forecast, November 2015)
Cognitive leverages what we’re learning and developing in machine and deep learning, artificial intelligence, and high performance data analytics to automate decisions using a human-like analysis.
“Traditional analytics” consists of Descriptive and Diagnostic analytics, while Artificial Intelligence plays an increasingly important role at the upper echelon of Diagnostic analytics and beyond, for Predictive, Prescriptive, and Cognitive analytics.
While there is much interest in and focus on AI, especially deep learning, it is worth noting that both advanced analytics, such as predictive, and the field of AI have been around for decades. The availability of open source frameworks and platforms (such as Hadoop) and advancements in analytics technologies, along with a downward push on compute and storage prices, have opened up the field of AI and advanced analytics to the mainstream.
Moreover, no single analytics tool or AI approach is in itself a silver bullet. For instance, not all workloads are well suited to deep learning. Successful advanced analytics and AI deployments are about layering the right analytic tools and approaches, matched to the right workloads, with proper scoping of projects.