TOP 5 LESSONS LEARNED
IN DEPLOYING AI IN THE REAL WORLD
© 2018 PURE STORAGE INC.2
QUESTION ON EVERYONE’S MIND:
WHY IS A STORAGE COMPANY TALKING
ABOUT AI?
NEW ALGORITHMS
Massively Parallel Delivering
Superhuman Accuracy
MODERN COMPUTE
Massively Parallel Architecture
Driving Performance
GPU: THOUSANDS OF CORES
BIG DATA
Data is the New Oil
50 Zettabytes Created in 2020
EXPLOSION IN ARTIFICIAL INTELLIGENCE
FUELED BY PARALLEL COMPUTE, NEW ALGORITHMS, AND BIG DATA
FRAMEWORKS GPU SERVER STORAGE
TECHNOLOGIES OF THE BIG BANG
WHAT CUSTOMERS DEPLOY
DATA IS VITAL TO MACHINE LEARNING
OBSERVATION BY PROF. ANDREW NG, AI LUMINARY
“We don’t have better algorithms,
we just have more data”
PETER NORVIG
Engineering Director, Google
The AI “hierarchy of needs”
credit: Monica Rogati
ML algorithms: linear & logistic
regression, k-means clustering, decision
trees, etc.
Validation: A/B testing, detecting model
drift over time
Data preparation: cleaning, feature
identification, exploration, etc.
Data acquisition: ingest, transformation,
and representation of data for analysis
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
WHAT MOST THINK IS AI
NEW POSSIBILITIES
For Nearly Every Industry
FRAMEWORKS
To Get Started
GPU
The Engine
AI IS SO MUCH MORE
“Hidden Technical Debt in Machine Learning Systems”, Google NIPS 2015
COMPLEXITIES OF AI IN PRODUCTION
INGEST
From sensors, machines,
& user generated
CLEAN &
TRANSFORM
Label, anomaly detection, ETL,
prep, stage
EXPLORE
Quickly iterate to
converge on models
TRAIN
Run for hours to days in
production cluster
CPU Servers GPU Server GPU Production Cluster
COPY & TRANSFORM (between each pair of stages)
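The four stages on this slide can be sketched as a chain of functions, where each hand-off stands in for a COPY & TRANSFORM step. This is only a minimal illustration: the Record type, the function names, and the sample readings are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Record:
    sensor_id: str
    value: Optional[float]  # None models a dropped or corrupt reading


def ingest() -> list:
    """INGEST: collect raw readings (hard-coded here in place of real sensors)."""
    return [
        Record("a", 1.0),
        Record("a", None),
        Record("b", 3.0),
        Record("b", 5.0),
    ]


def clean_and_transform(raw: list) -> list:
    """CLEAN & TRANSFORM: drop unusable readings before staging."""
    return [r for r in raw if r.value is not None]


def explore(clean: list) -> dict:
    """EXPLORE: quick per-sensor summary used to iterate on model ideas."""
    sensors = sorted({r.sensor_id for r in clean})
    return {s: mean(r.value for r in clean if r.sensor_id == s) for s in sensors}


def train(clean: list) -> float:
    """TRAIN: stand-in 'model' that only learns the global mean."""
    return mean(r.value for r in clean)


raw = ingest()
clean = clean_and_transform(raw)   # first hand-off between stages
summary = explore(clean)           # second hand-off
model = train(clean)               # third hand-off
```

In a real deployment each hand-off may mean copying the dataset to different servers, which is the overhead the slide highlights.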
WIDE RANGE OF NEEDS IN AI PIPELINE
SIGNIFICANT CHALLENGE TO LEGACY STORAGE

                   INGEST                    CLEAN & TRANSFORM      EXPLORE          TRAIN
Source / Compute   From sensors & machines   CPU Servers            GPU Server       GPU Production Cluster
Access Pattern     sequential                sequential or random   random           random
Access Type        write                     read & write           read             read
File Size          mostly large              small to large         small to large   mostly small
Concurrency        high                      high                   low              high
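To make the "sequential" versus "random" access patterns concrete, here is a small sketch that reads the same file both ways. It is an illustration only: the 4096-byte block size, the temporary file, and the function names are assumptions, and it checks only that both patterns cover the same bytes rather than measuring any storage system.

```python
import os
import random
import tempfile

BLOCK = 4096  # illustrative block size in bytes (an assumption)


def read_sequential(path: str, block: int = BLOCK) -> int:
    """Read the file front to back, as an ingest or staging job might."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(block):
            total += len(chunk)
    return total


def read_random(path: str, block: int = BLOCK, seed: int = 0) -> int:
    """Read the same blocks in shuffled order, as a training loader might."""
    size = os.path.getsize(path)
    offsets = list(range(0, size, block))
    random.Random(seed).shuffle(offsets)
    total = 0
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(block))
    return total


# Write a small scratch file, read it both ways, then clean up.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(BLOCK * 64))
    path = tmp.name

seq_bytes = read_sequential(path)
rand_bytes = read_random(path)
os.remove(path)
```

Both calls return the same byte count; what differs on real storage is how well the device or file system handles the shuffled offsets.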
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into Data Lake
DATA LAKE
OR DATA GRAVEYARD?
“We see customers creating big data
graveyards, dumping everything into
HDFS [Hadoop Distributed File
System] and hoping to do something
with it down the road. But then they
just lose track of what’s there.
The main challenge is not creating a
data lake, but taking advantage of the
opportunities it presents.”
PricewaterhouseCoopers
Technology Forecast, Issue 1, 2014
MODERN ANALYTICS WITH OLD DATA LAKE
SPRAWLING, COMPLEX SILOS & SLOW PERFORMANCE
Each App Locked into Physical Silos
Redundant Data Copies in Silos
Fixed Compute to Storage in Silo
Built for Large, Sequential Data
Optimized for Batch, Not Real-Time
STATIC DATA LAKE
NO LONGER VIABLE
[Diagram: HDFS DATA LAKE divided into five SILOs]
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into Data Lake
3. Cloud or Not to Cloud?
IT DEPENDS
WHERE YOU ARE ON YOUR AI JOURNEY

                 EXPLORATION                       PRODUCTION
NEED             Start Immediately                 Get New Products & Features to Market Faster than Competition
DON’T NEED       Bogged Down with Infrastructure   Bogged Down by Performance & Cost Inefficiencies
RECOMMENDATION   Cloud                             On-Premises
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into Data Lake
3. Cloud or Not to Cloud?
4. Lies, Damn Lies, and Benchmarks
BENCHMARKS DO NOT REFLECT REALITY

                  IMAGENET                   REAL-WORLD AUTONOMOUS CAR COMPANY
IMAGE SIZE        100-200KB                  2-5MB
FILE SIZE         150MB (Packed TFRecords)   2-5MB
MODE OF TESTING   Synthetic (No I/O)         Read from Storage
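The gap between the two columns is easier to see with a little arithmetic. The sketch below uses only the sizes quoted on the slide, with two assumptions that are not from the talk: decimal units (1 MB = 1,000 KB) and a hypothetical ingest rate of 1,000 images per second.

```python
KB = 1_000
MB = 1_000 * KB  # decimal units, an assumption

imagenet_image = (100 * KB, 200 * KB)   # 100-200KB per image
real_image = (2 * MB, 5 * MB)           # 2-5MB per image
packed_file = 150 * MB                  # one packed TFRecords file

# How many ImageNet-sized images fit in one packed file (min, max).
images_per_file = (
    packed_file // imagenet_image[1],
    packed_file // imagenet_image[0],
)

# How much larger a real-world image is than an ImageNet image (min, max).
size_ratio = (
    real_image[0] / imagenet_image[1],
    real_image[1] / imagenet_image[0],
)

ASSUMED_IMAGES_PER_SEC = 1_000  # hypothetical rate, for illustration only


def read_mb_per_sec(image_bytes_range, rate):
    """Storage read throughput (MB/s) needed to feed `rate` images per second."""
    low, high = image_bytes_range
    return (low * rate / MB, high * rate / MB)


imagenet_mbps = read_mb_per_sec(imagenet_image, ASSUMED_IMAGES_PER_SEC)
real_mbps = read_mb_per_sec(real_image, ASSUMED_IMAGES_PER_SEC)
```

Under these assumptions a packed file holds 750 to 1,500 ImageNet images, and the real-world images are 10 to 50 times larger. At the same image rate, the read throughput the storage must sustain rises by the same factor.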
AI TRAINING SYSTEM
GOAL IS TO KEEP THE GPUs 100% BUSY
Pipeline stages: I/O, then CPU (decode, scale), then GPU (forward-propagation, evaluate, back-propagation, update)
BENCHMARK SETUP
Setup #1 (GPU ONLY): Synthetic Data from System RAM into GPUs
Setup #2 (CPU + GPU): Real Image Data from System RAM Through CPU + GPU
FULL TRAINING WORKFLOW
Setup #3 (I/O + CPU + GPU): Real Image Data from FlashBlade into DGX-1
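The three setups differ only in how much of the pipeline sits in front of the GPU step. The harness below is a rough sketch of that idea using stand-ins: zlib decompression in place of image decode and scale, a checksum in place of the GPU work, and a temporary local file in place of shared storage. All names are hypothetical, and on a file this small the read is served from the page cache, so the timings show the structure of the comparison rather than real numbers.

```python
import os
import tempfile
import time
import zlib


def gpu_step(batch: bytes) -> int:
    """Stand-in for forward/backward propagation: a cheap checksum."""
    return sum(batch) % 65_521


def decode(encoded: bytes) -> bytes:
    """Stand-in for CPU-side image decode and scale."""
    return zlib.decompress(encoded)


def read_file(path: str) -> bytes:
    """Stand-in for the I/O stage: read encoded data from storage."""
    with open(path, "rb") as f:
        return f.read()


def timed(fn):
    """Run fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


raw = bytes(range(256)) * 1024      # already-decoded "synthetic" batch
encoded = zlib.compress(raw)        # the same batch in encoded form

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(encoded)
    path = tmp.name

setup1 = timed(lambda: gpu_step(raw))                         # GPU only
setup2 = timed(lambda: gpu_step(decode(encoded)))             # CPU + GPU
setup3 = timed(lambda: gpu_step(decode(read_file(path))))     # I/O + CPU + GPU
os.remove(path)
```

All three setups produce the same result; only the elapsed time changes as stages are added, which is why a synthetic-only benchmark can overstate what a full training workflow achieves.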
NEAR-LINEAR SCALE DELIVERED
AIRI ENGINEERED FOR MAXIMUM PRODUCTIVITY AND OUT-OF-THE-BOX SCALE
DEEP LEARNING TRAINING: MULTI-NODE USING GPUDIRECT RDMA OVER ETHERNET
Comparing Synthetic Mode, Entire Data in DRAM, Entire Data in FlashBlade
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into Data Lake
3. Cloud or Not to Cloud?
4. Lies, Damn Lies, and Benchmarks
5. Ideal Data Platform is a Data Hub
IDEAL PLATFORM FOR MODERN ERA
DYNAMIC DATA HUB ARCHITECTED FOR REAL-TIME & ELASTIC DATA
DATA PIPELINE
DATA HUB
“TUNED FOR EVERYTHING”
Small, Random to Large, Seq.
Architected for the Unknown
REAL-TIME
Low Latency Performance for
Instant Response
ALL-FLASH
Modern, Ultra-Fast
Technology
PARALLEL
No Serial Bottlenecks
for Max Throughput
ELASTIC
Grow Non-Disruptively
with More App Clusters
SIMPLE
Focus More on Insights,
Not Infrastructure
THE INDUSTRY’S FIRST
COMPLETE AI-READY INFRASTRUCTURE
HARDWARE
NVIDIA® DGX-1™ | 4x DGX-1 Systems | 4 PFLOPS of DL Performance
PURE FLASHBLADE™ | 15x 17TB Blades | 1.5M IOPS
ARISTA | 2x 100Gb Ethernet Switches with RDMA
SOFTWARE
NVIDIA GPU CLOUD DEEP LEARNING STACK | NVIDIA Optimized Frameworks
AIRI SCALING TOOLKIT | Multi-node Training Made Simple
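The hardware figures on this slide can be broken down per component with simple arithmetic. The sketch below uses only the quoted numbers. The per-blade IOPS value assumes load is spread evenly across blades, which the slide does not state, and the capacity is raw rather than usable.

```python
dgx_systems = 4
total_pflops = 4            # "4 PFLOPS of DL Performance"
blades = 15
tb_per_blade = 17
total_iops = 1_500_000      # "1.5M IOPS"

pflops_per_system = total_pflops / dgx_systems   # deep-learning PFLOPS per DGX-1
raw_capacity_tb = blades * tb_per_blade          # raw flash capacity in TB
iops_per_blade = total_iops / blades             # assumes an even spread across blades
```

This works out to 1 PFLOPS per DGX-1, 255 TB of raw flash, and 100,000 IOPS per blade under the even-spread assumption.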
AI & MODERN ANALYTICS
POWERING ANALYTICS FOR WORLD’S LARGEST PUBLIC HEDGE FUND
AI CLEAN & LABEL: CPU Servers
AI EXPLORE: GPU Server
AI TRAIN: GPU Servers
SPARK: CPU Servers
MONGO: CPU Servers
“Our quants want to test a model,
get the results, and then test
another one - all day long. So a
10-20X improvement in
performance is a game-changer
when it comes to creating a
time-to-market advantage for us.”
Gary Collier, co-CTO, Man AHL
ORCHESTRATION WITH OPENSHIFT
(KUBERNETES)
Monitoring
Load balancing
Scheduling
Resource allocation
OPENSHIFT + PURE PROVIDE RECIPE
FOR OPERATIONS AT SCALE
TOP 5 LESSONS LEARNED
1. AI is a Data Pipeline
2. Don’t Throw Your Data into Data Lake
3. Cloud or Not to Cloud?
4. Lies, Damn Lies, and Benchmarks
5. Ideal Data Platform is a Data Hub
Big Data LDN 2018: LESSONS LEARNED FROM DEPLOYING REAL-WORLD AI SYSTEMS

 
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNINGBig Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
 
Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...
Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...
Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...
 
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...
 
Big Data LDN 2018: DATA APIS DON’T DISCRIMINATE
Big Data LDN 2018: DATA APIS DON’T DISCRIMINATEBig Data LDN 2018: DATA APIS DON’T DISCRIMINATE
Big Data LDN 2018: DATA APIS DON’T DISCRIMINATE
 

Big Data LDN 2018: LESSONS LEARNED FROM DEPLOYING REAL-WORLD AI SYSTEMS

  • 1. TOP 5 LESSONS LEARNED IN DEPLOYING AI IN THE REAL WORLD
  • 2. © 2018 PURE STORAGE INC. QUESTION ON EVERYONE’S MIND: WHY IS A STORAGE COMPANY TALKING ABOUT AI?
  • 3. EXPLOSION IN ARTIFICIAL INTELLIGENCE: FUELED BY PARALLEL COMPUTE, NEW ALGORITHMS, AND BIG DATA. NEW ALGORITHMS: Massively Parallel, Delivering Superhuman Accuracy. MODERN COMPUTE: Massively Parallel Architecture Driving Performance (GPU: THOUSANDS OF CORES). BIG DATA: Data is the New Oil; 50 Zettabytes Created in 2020.
  • 4. TECHNOLOGIES OF THE BIG BANG: WHAT CUSTOMERS DEPLOY. FRAMEWORKS, GPU SERVER, STORAGE.
  • 5. DATA IS VITAL TO MACHINE LEARNING: OBSERVATION BY PROF. ANDREW NG, AI LUMINARY
  • 6. “We don’t have better algorithms, we just have more data” PETER NORVIG, Engineering Director, Google
  • 7. The AI “hierarchy of needs” (credit: Monica Rogati)
      ML algorithms: linear & logistic regression, k-means clustering, decision trees, etc.
      Validation: A/B testing, detecting model drift over time
      Data preparation: cleaning, feature identification, exploration, etc.
      Data acquisition: ingest, transformation, and representation of data for analysis
  • 8. TOP 5 LESSONS LEARNED 1. AI is a Data Pipeline
  • 9. WHAT MOST THINK IS AI. NEW POSSIBILITIES: For Nearly Every Industry. FRAMEWORKS: To Get Started. GPU: The Engine.
  • 10. AI IS SO MUCH MORE. “Hidden Technical Debt in Machine Learning Systems”, Google NIPS 2015
  • 11. COMPLEXITIES OF AI IN PRODUCTION. INGEST: From sensors, machines, & user generated. CLEAN & TRANSFORM: Label, anomaly detection, ETL, prep, stage. EXPLORE: Quickly iterate to converge on models. TRAIN: Run for hours to days in production cluster. CPU Servers, GPU Server, GPU Production Cluster. COPY & TRANSFORM (x3).
  • 12. WIDE RANGE OF NEEDS IN AI PIPELINE: SIGNIFICANT CHALLENGE TO LEGACY STORAGE
      INGEST (from sensors & machines): access pattern sequential; access type write; file size mostly large; concurrency high
      CLEAN & TRANSFORM (CPU Servers): access pattern sequential or random; access type read & write; file size small to large; concurrency high
      EXPLORE (GPU Server): access pattern random; access type read; file size small to large; concurrency low
      TRAIN (GPU Production Cluster): access pattern random; access type read; file size mostly small; concurrency high
  • 13. TOP 5 LESSONS LEARNED 1. AI is a Data Pipeline 2. Don’t Throw Your Data into Data Lake
  • 14. DATA LAKE OR DATA GRAVEYARD? “We see customers creating big data graveyards, dumping everything into HDFS [Hadoop Distributed File System] and hoping to do something with it down the road. But then they just lose track of what’s there. The main challenge is not creating a data lake, but taking advantage of the opportunities it presents.” PricewaterhouseCoopers Technology Forecast, Issue 1, 2014
  • 15. MODERN ANALYTICS WITH OLD DATA LAKE: SPRAWLING, COMPLEX SILOS & SLOW PERFORMANCE. Each App Locked into Physical Silos; Redundant Data Copies in Silos; Fixed Compute to Storage in Silo; Built for Large, Sequential Data; Optimized for Batch, Not Real-Time. STATIC DATA LAKE NO LONGER VIABLE. Diagram labels: HDFS DATA LAKE, SILO (x5).
  • 16. TOP 5 LESSONS LEARNED 1. AI is a Data Pipeline 2. Don’t Throw Your Data into Data Lake 3. Cloud or Not to Cloud?
  • 17. IT DEPENDS WHERE YOU ARE ON YOUR AI JOURNEY
      EXPLORATION: NEED: Start Immediately
      PRODUCTION: NEED: Get New Products & Features to Market Faster than Competition
  • 18. IT DEPENDS WHERE YOU ARE ON YOUR AI JOURNEY
      EXPLORATION: NEED: Start Immediately; DON’T NEED: Bogged Down with Infrastructure
      PRODUCTION: NEED: Get New Products & Features to Market Faster than Competition; DON’T NEED: Bogged Down by Performance & Cost Inefficiencies
  • 19. IT DEPENDS WHERE YOU ARE ON YOUR AI JOURNEY
      EXPLORATION: NEED: Start Immediately; DON’T NEED: Bogged Down with Infrastructure; RECOMMENDATION: Cloud
      PRODUCTION: NEED: Get New Products & Features to Market Faster than Competition; DON’T NEED: Bogged Down by Performance & Cost Inefficiencies; RECOMMENDATION: On-Premises
  • 20. TOP 5 LESSONS LEARNED 1. AI is a Data Pipeline 2. Don’t Throw Your Data into Data Lake 3. Cloud or Not to Cloud? 4. Lies, Damn Lies, and Benchmarks
  • 21. BENCHMARKS DO NOT REFLECT REALITY
      IMAGENET: image size 100-200KB; file size 150MB (Packed TFRecords); mode of testing Synthetic (No I/O)
      REAL-WORLD AUTONOMOUS CAR COMPANY: image size 2-5MB; file size 2-5MB; mode of testing Read from Storage
  • 22. AI TRAINING SYSTEM: GOAL IS TO KEEP THE GPUs 100% BUSY
      FULL TRAINING WORKFLOW: stages I/O, CPU, GPU; steps shown: decode, scale, forward-propagation, back-propagation, update, evaluate
      BENCHMARK SETUP
      Setup #1: Synthetic Data from System RAM into GPUs (GPU ONLY)
      Setup #2: Real Image Data from System RAM Through CPU + GPU (CPU + GPU)
      Setup #3: Real Image Data from FlashBlade into DGX-1 (I/O + CPU + GPU)
  • 23. NEAR-LINEAR SCALE DELIVERED: AIRI ENGINEERED FOR MAXIMUM PRODUCTIVITY AND OUT-OF-THE-BOX SCALE. DEEP LEARNING TRAINING: MULTI-NODE USING GPUDIRECT RDMA OVER ETHERNET. Comparing Synthetic Mode, Entire Data in DRAM, Entire Data in FlashBlade.
  • 24. TOP 5 LESSONS LEARNED 1. AI is a Data Pipeline 2. Don’t Throw Your Data into Data Lake 3. Cloud or Not to Cloud? 4. Lies, Damn Lies, and Benchmarks 5. Ideal Data Platform is a Data Hub
  • 25. IDEAL PLATFORM FOR MODERN ERA: DYNAMIC DATA HUB ARCHITECTED FOR REAL-TIME & ELASTIC DATA. DATA PIPELINE, DATA HUB. “TUNED FOR EVERYTHING”: Small, Random to Large, Seq.; Architected for the Unknown. REAL-TIME: Low Latency Performance for Instant Response. ALL-FLASH: Modern, Ultra-Fast Technology. PARALLEL: No Serial Bottlenecks for Max Throughput. ELASTIC: Grow Non-Disruptively with More App Clusters. SIMPLE: Focus More on Insights, Not Infrastructure.
  • 26. THE INDUSTRY’S FIRST COMPLETE AI-READY INFRASTRUCTURE
      HARDWARE: NVIDIA® DGX-1™ | 4x DGX-1 Systems | 4 PFLOPS of DL Performance; PURE FLASHBLADE™ | 15x 17TB Blades | 1.5M IOPS; ARISTA | 2x 100Gb Ethernet Switches with RDMA
      SOFTWARE: NVIDIA GPU CLOUD DEEP LEARNING STACK | NVIDIA Optimized Frameworks; AIRI SCALING TOOLKIT | Multi-node Training Made Simple
  • 27. AI & MODERN ANALYTICS: POWERING ANALYTICS FOR WORLD’S LARGEST PUBLIC HEDGE FUND. AI CLEAN & LABEL (CPU Servers), AI EXPLORE (GPU Server), AI TRAIN (GPU Servers), SPARK (CPU Servers), MONGO (CPU Servers). “Our quants want to test a model, get the results, and then test another one, all day long. So a 10-20X improvement in performance is a game-changer when it comes to creating a time-to-market advantage for us.” Gary Collier, co-CTO, Man AHL
  • 28. ORCHESTRATION WITH OPENSHIFT (KUBERNETES): Monitoring, Load balancing, Scheduling, Resource allocation. OPENSHIFT + PURE PROVIDE RECIPE FOR OPERATIONS AT SCALE
  • 29. TOP 5 LESSONS LEARNED 1. AI is a Data Pipeline 2. Don’t Throw Your Data into Data Lake 3. Cloud or Not to Cloud? 4. Lies, Damn Lies, and Benchmarks 5. Ideal Data Platform is a Data Hub
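
The benchmark comparison on slide 21 and the "keep the GPUs 100% busy" goal on slide 22 can be made concrete with a rough calculation. The sketch below is not from the deck: it takes the per-image sizes quoted on slide 21 and multiplies them by a hypothetical aggregate training rate (ASSUMED_IMAGES_PER_SECOND is an invented illustrative number) to estimate the read bandwidth storage would need to sustain. It assumes decimal units (1 MB = 1,000,000 bytes) and ignores caching, compression, and decode cost.

```python
"""Rough storage bandwidth needed to keep an input pipeline from stalling.

Image sizes are the ranges quoted on slide 21. The training rate is an
assumption made up for illustration; it is not a figure from the deck.
"""

KB = 1000
MB = 1000 * KB

# Hypothetical aggregate training rate across all GPUs (assumption).
ASSUMED_IMAGES_PER_SECOND = 5_000

# (low, high) per-image sizes in bytes, as quoted on slide 21.
PROFILES = {
    "ImageNet": (100 * KB, 200 * KB),
    "Real-world autonomous car company": (2 * MB, 5 * MB),
}


def required_read_bandwidth(images_per_second, image_bytes):
    """Bytes per second that storage must deliver at the given image rate."""
    return images_per_second * image_bytes


def bandwidth_range_gb_per_s(profile, images_per_second=ASSUMED_IMAGES_PER_SECOND):
    """Return the (low, high) required read bandwidth in GB/s for a profile."""
    low, high = PROFILES[profile]
    return tuple(
        required_read_bandwidth(images_per_second, size) / 1e9
        for size in (low, high)
    )


if __name__ == "__main__":
    for name in PROFILES:
        low, high = bandwidth_range_gb_per_s(name)
        print(f"{name}: {low:.1f} to {high:.1f} GB/s "
              f"at {ASSUMED_IMAGES_PER_SECOND} images/s")
```

At the assumed 5,000 images per second, the ImageNet-sized images work out to 0.5 to 1.0 GB/s, while the 2-5MB real-world images work out to 10 to 25 GB/s. This illustrates why a synthetic, no-I/O benchmark can say little about a training run that reads from storage.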