1
Becoming Information-Driven
Introduction to the Enterprise Data Hub
Mike Olson
Cloudera, Inc.
Co-Founder & Chief Strategy Officer
2
Expanding Data Requires A New Approach
©2014 Cloudera, Inc. All rights reserved.

1980s: Bring Data to Compute
Now: Bring Compute to Data

[Figure: relative size & complexity of Data and Compute]

Information-centric businesses use all data: multi-structured, internal & external data of all types

Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
3
The Old Way: Bringing Data to Compute

Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views

Visibility
• Leaving data behind
• Risk and compliance
• High cost of storage

Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data

Cost of Analytics
• Existing systems strained
• No agility
• BI backlog

[Diagram labels: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE; sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES]
4
The New Way: Bringing Compute to Data

1. Active archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage

2. Data management, transforms
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper

3. Self-service exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests

4. Multi-workload analytic platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True BI agility

[Diagram labels: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE; sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES]
5
Better, faster, cheaper and multi-framework

• BATCH PROCESSING: MR / Pig / Hive / Cascading
• SQL: Impala
• SEARCH: Solr
• MACHINE LEARNING: SAS, R, H2O, MLlib
• STREAM PROCESSING: Spark Streaming
• NOSQL: HBase
• IN-MEMORY: Spark

Process Data
Train & Test Models
Respond to Events in RT
Explore & Analyze Data

• Highly mature
• Wide range of clients
• Significant advances in speed & usability

• Integration with the SAS & Revolution product portfolio
• Python / 0xdata / MLlib for advanced users

• Very low (~10ms) latency
• High volumes of single events

• High speed
• High concurrency
• Workload mgt
• Broad BI support

• For unstructured & semi-structured data
• For business users

• Low (1 second) latency
• Windows (collections) of events
6
Rationalizing existing infrastructure
Migrating data sets, workloads or entire systems from more expensive or less flexible systems

Operational Data Store
• Consolidate, cleanse & stage data
• Promote to other operational systems or EDWs

Data Warehouse
• ELT
• Archive
7
What do we mean by data discovery?
Providing a flexible analytic sandbox where users can apply multiple tools & techniques to derive insights from new & traditional data

Combine & explore new data sets
• Scripting
• Data blending
• Traditional ETL

Support ad-hoc marts and self-serve BI users
• Tableau, Qlik et al.

Enable data scientists to train & test models
• ML libraries
• SAS, Revolution
8
What do we mean by pervasive analytics?
Using predictive analytics to improve business processes or augment professional judgment in an automated way across the organization

Analyze patterns over deep histories
• Recommendations
• Outliers

Automate responses to new data / observations
• Classifying or scoring new data

User exploration / judgment application
• Reviewing outliers
• Overriding suggestions
9
Big Data in Credit Card Processing

Fraud Detection (CFO & CRO)
“Modern credit card fraud rings operate globally over long time scales – how can we collect, store & analyze the petabytes of data it takes to detect them?”

Regulatory Compliance (CIO & CRO)
“Customer privacy is paramount, but we need to keep vast amounts of information online to run our business. Can we achieve both goals?”

Product & Service Innovation (R&D, CMO)
“We obviously have vast and detailed information about customer purchases. Can we combine it with GPS & mobile data, combined with browsing behavior to offer new products?”

Operational Efficiency (CIO)
“How can we deliver what the business team wants, and faster, without spending tens of millions of dollars to expand our data warehouse?”
10
Big Data in Retail

360° Customer View (CMO)
“We want to know what our customers do online and in our stores. How can we combine data from separate analytics silos to understand & serve them better?”

Fraud Prevention (CMO & Customer Service)
“Theft, or ‘shrinkage’, in our stores is on the increase – can we combine POS data with video surveillance to reduce it without impacting customer service negatively?”

Logistics & Supply Chain (CEO, VP Operations)
“How can we reduce stock-outs & ensure products are in the right stores at the right time? Can we combine data from our carriers with in-store historical data from thousands of stores?”

Operational Efficiency (CIO)
“Our EDW infrastructure is being overwhelmed with data and workloads; we are running into capacity limits, and the annual costs of expansion are in the tens of millions. What can we do?”
11
Big Data in Health Care

360° Patient View
“Patient data ends up scattered across many different systems – is there a way to get a complete picture by combining it while ensuring HIPAA compliance?”

Regulatory Compliance
“The move to EMR combined with the strict regulations means we need to keep at least 7 years of data online – how can we afford to do that and make it searchable and available for analysis?”

Maximize Medical Efficacy
“We invest hundreds of millions in new equipment every year. How can we judge the long term efficacy for patient outcomes, and make smarter investment decisions?”

Operational Efficiency
“Our EDW infrastructure is being overwhelmed with data and workloads; we are running into capacity limits, and the annual costs of expansion are in the tens of millions. What can we do?”

Roles listed on the slide: VP Operations, Chief of Compliance; VP Operations; Chief Medical Officer; CFO; Chief Medical Officer; CIO
12
13
14
Mike Olson
@mikeolson
mike.olson@cloudera.com

More Related Content

What's hot

Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Hortonworks
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and ManufacturingCloudera, Inc.
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 
Breakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopBreakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopCloudera, Inc.
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Technologies
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...StampedeCon
 
Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Hortonworks
 
Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...
Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...
Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...Pentaho
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseDataWorks Summit
 
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...In-Memory Computing Summit
 
The path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial ServicesThe path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial ServicesHortonworks
 
Data Governance for Data Lakes
Data Governance for Data LakesData Governance for Data Lakes
Data Governance for Data LakesKiran Kamreddy
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHortonworks
 
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...StampedeCon
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Cloudera, Inc.
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataPentaho
 

What's hot (20)

Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
Breakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopBreakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with Hadoop
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
 
Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25Dataguise hortonworks insurance_feb25
Dataguise hortonworks insurance_feb25
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 
Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...
Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...
Big Data Integration Webinar: Reducing Implementation Efforts of Hadoop, NoSQ...
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
 
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...
 
The path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial ServicesThe path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial Services
 
Oracle's BigData solutions
Oracle's BigData solutionsOracle's BigData solutions
Oracle's BigData solutions
 
Data Governance for Data Lakes
Data Governance for Data LakesData Governance for Data Lakes
Data Governance for Data Lakes
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
 
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
 

Similar to Ask bigger questions

The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTKiththi Perera
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardKiththi Perera
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...DataScienceConferenc1
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game ChangerCaserta
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amirydatastack
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
Big Data Case study - caixa bank
Big Data Case study - caixa bankBig Data Case study - caixa bank
Big Data Case study - caixa bankChungsik Yun
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Hadoop and Your Data Warehouse
Hadoop and Your Data WarehouseHadoop and Your Data Warehouse
Hadoop and Your Data WarehouseCaserta
 
The Value of Customer Insights & Analytics in a Modern Retail Environment
The Value of Customer Insights & Analytics in a Modern Retail EnvironmentThe Value of Customer Insights & Analytics in a Modern Retail Environment
The Value of Customer Insights & Analytics in a Modern Retail EnvironmentDenodo
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantagePrecisely
 
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...BigDataEverywhere
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 

Similar to Ask bigger questions (20)

The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forward
 
Big data rmoug
Big data rmougBig data rmoug
Big data rmoug
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiry
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Big Data Case study - caixa bank
Big Data Case study - caixa bankBig Data Case study - caixa bank
Big Data Case study - caixa bank
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinar
 
Hadoop and Your Data Warehouse
Hadoop and Your Data WarehouseHadoop and Your Data Warehouse
Hadoop and Your Data Warehouse
 
The Value of Customer Insights & Analytics in a Modern Retail Environment
The Value of Customer Insights & Analytics in a Modern Retail EnvironmentThe Value of Customer Insights & Analytics in a Modern Retail Environment
The Value of Customer Insights & Analytics in a Modern Retail Environment
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
 
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 

More from South West Data Meetup

Leveraging open source for large scale analytics
Leveraging open source for large scale analyticsLeveraging open source for large scale analytics
Leveraging open source for large scale analyticsSouth West Data Meetup
 
Time Series Analytics for Big Fast Data
Time Series Analytics for Big Fast DataTime Series Analytics for Big Fast Data
Time Series Analytics for Big Fast DataSouth West Data Meetup
 
@Bristol Data Dome Workshop (ISO/Urban Tide)
@Bristol Data Dome Workshop (ISO/Urban Tide)@Bristol Data Dome Workshop (ISO/Urban Tide)
@Bristol Data Dome Workshop (ISO/Urban Tide)South West Data Meetup
 
Assurance Scoring: using machine learning and analytics to reduce risk in the...
Assurance Scoring: using machine learning and analytics to reduce risk in the...Assurance Scoring: using machine learning and analytics to reduce risk in the...
Assurance Scoring: using machine learning and analytics to reduce risk in the...South West Data Meetup
 
Imagine Bristol - interactive workshop day
Imagine Bristol - interactive workshop dayImagine Bristol - interactive workshop day
Imagine Bristol - interactive workshop daySouth West Data Meetup
 
@Bristol Data Dome workshop - NSC Creative
@Bristol Data Dome workshop - NSC Creative@Bristol Data Dome workshop - NSC Creative
@Bristol Data Dome workshop - NSC CreativeSouth West Data Meetup
 
Bristol is Open: Exploring Open Data in the City
Bristol is Open: Exploring Open Data in the CityBristol is Open: Exploring Open Data in the City
Bristol is Open: Exploring Open Data in the CitySouth West Data Meetup
 

More from South West Data Meetup (11)

Leveraging open source for large scale analytics
Leveraging open source for large scale analyticsLeveraging open source for large scale analytics
Leveraging open source for large scale analytics
 
Met Office Informatics Lab
Met Office Informatics LabMet Office Informatics Lab
Met Office Informatics Lab
 
Time Series Analytics for Big Fast Data
Time Series Analytics for Big Fast DataTime Series Analytics for Big Fast Data
Time Series Analytics for Big Fast Data
 
@Bristol Data Dome Workshop (ISO/Urban Tide)
@Bristol Data Dome Workshop (ISO/Urban Tide)@Bristol Data Dome Workshop (ISO/Urban Tide)
@Bristol Data Dome Workshop (ISO/Urban Tide)
 
Assurance Scoring: using machine learning and analytics to reduce risk in the...
Assurance Scoring: using machine learning and analytics to reduce risk in the...Assurance Scoring: using machine learning and analytics to reduce risk in the...
Assurance Scoring: using machine learning and analytics to reduce risk in the...
 
Imagine Bristol - interactive workshop day
Imagine Bristol - interactive workshop dayImagine Bristol - interactive workshop day
Imagine Bristol - interactive workshop day
 
Open Data Institute (ODI) Node
Open Data Institute (ODI) NodeOpen Data Institute (ODI) Node
Open Data Institute (ODI) Node
 
Bristol's Open Data Journey
Bristol's Open Data JourneyBristol's Open Data Journey
Bristol's Open Data Journey
 
@Bristol Data Dome workshop - NSC Creative
@Bristol Data Dome workshop - NSC Creative@Bristol Data Dome workshop - NSC Creative
@Bristol Data Dome workshop - NSC Creative
 
Declarative data analysis
Declarative data analysisDeclarative data analysis
Declarative data analysis
 
Bristol is Open: Exploring Open Data in the City
Bristol is Open: Exploring Open Data in the CityBristol is Open: Exploring Open Data in the City
Bristol is Open: Exploring Open Data in the City
 

Ask bigger questions

  • 1. Becoming Information-Driven: Introduction to the Enterprise Data Hub
      Mike Olson, Co-Founder & Chief Strategy Officer, Cloudera, Inc.
  • 2. Expanding Data Requires a New Approach
      1980s: Bring Data to Compute. Process-centric businesses use:
        • Structured data mainly
        • Internal data only
        • “Important” data only
      Now: Bring Compute to Data. Information-centric businesses use all data: multi-structured, internal & external data of all types.
      [Diagram: relative size & complexity of data and compute, 1980s vs. now]
      ©2014 Cloudera, Inc. All rights reserved.
  • 3. The Old Way: Bringing Data to Compute
      Complex Architecture:
        • Many special-purpose systems
        • Moving data around
        • No complete views
      Visibility:
        • Leaving data behind
        • Risk and compliance
        • High cost of storage
      Time to Data:
        • Up-front modeling
        • Transforms slow
        • Transforms lose data
      Cost of Analytics:
        • Existing systems strained
        • No agility
        • BI backlog
      [Diagram labels: servers, marts, EDWs, documents, storage, search, archive; sources: ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources]
  • 4. The New Way: Bringing Compute to Data
      1. Active archive
        • Full fidelity original data
        • Indefinite time, any source
        • Lowest cost storage
      2. Data management, transforms
        • One source of data for all analytics
        • Persist state of transformed data
        • Significantly faster & cheaper
      3. Self-service exploratory BI
        • Simple search + BI tools
        • “Schema on read” agility
        • Reduce BI user backlog requests
      4. Multi-workload analytic platform
        • Bring applications to data
        • Combine different workloads on common data (e.g. SQL + Search)
        • True BI agility
      [Diagram labels: servers, marts, EDWs, documents, storage, search, archive; sources: ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources]
  • 5. Better, faster, cheaper and multi-framework
      Frameworks:
        • Batch processing: MR / Pig / Hive / Cascading
        • SQL: Impala
        • Search: Solr
        • Machine learning: SAS, R, H2O, MLlib
        • Stream processing: Spark Streaming
        • NoSQL: HBase
        • In-memory: Spark
      Uses: Process Data; Train & Test Models; Respond to Events in Real Time; Explore & Analyze Data
      Characteristics, grouped as on the slide:
        • Highly mature; wide range of clients; significant advances in speed & usability
        • Integration with the SAS & Revolution product portfolio; Python / 0xdata / MLlib for advanced users
        • Very low (~10 ms) latency; high volumes of single events
        • High speed; high concurrency; workload management; broad BI support
        • For unstructured & semi-structured data; for business users
        • Low (1 second) latency; windows (collections) of events
  • 6. Rationalizing existing infrastructure
      Migrating data sets, workloads or entire systems from more expensive or less flexible systems.
      Operational Data Store:
        • Consolidate, cleanse & stage data
        • Promote to other operational systems or EDWs
      Data Warehouse:
        • ELT
        • Archive
  • 7. What do we mean by data discovery?
      Providing a flexible analytic sandbox where users can apply multiple tools & techniques to derive insights from new & traditional data.
      Combine & explore new data sets:
        • Scripting
        • Data blending
        • Traditional ETL
      Support ad-hoc marts and self-serve BI users:
        • Tableau, Qlik et al.
      Enable data scientists to train & test models:
        • ML libraries
        • SAS, Revolution
  • 8. What do we mean by pervasive analytics?
      Using predictive analytics to improve business processes or augment professional judgment in an automated way across the organization.
      Analyze patterns over deep histories:
        • Recommendations
        • Outliers
      Automate responses to new data / observations:
        • Classifying or scoring new data
      User exploration / judgment application:
        • Reviewing outliers
        • Overriding suggestions
  • 9. Big Data in Credit Card Processing
      Fraud Detection: “Modern credit card fraud rings operate globally over long time scales – how can we collect, store & analyze the petabytes of data it takes to detect them?”
      Regulatory Compliance: “Customer privacy is paramount, but we need to keep vast amounts of information online to run our business. Can we achieve both goals?”
      Product & Service Innovation: “We obviously have vast and detailed information about customer purchases. Can we combine it with GPS & mobile data, combined with browsing behavior to offer new products?”
      Operational Efficiency: “How can we deliver what the business team wants, and faster, without spending tens of millions of dollars to expand our data warehouse?”
      Stakeholder labels, in slide order: CFO & CRO; CIO & CRO; R&D, CMO; CIO
  • 10. Big Data in Retail
      360° Customer View: “We want to know what our customers do online and in our stores. How can we combine data from separate analytics silos to understand & serve them better?”
      Fraud Prevention: “Theft, or ‘shrinkage’, in our stores is on the increase – can we combine POS data with video surveillance to reduce it without impacting customer service negatively?”
      Logistics & Supply Chain: “How can we reduce stock-outs & ensure products are in the right stores at the right time? Can we combine data from our carriers with in-store historical data from thousands of stores?”
      Operational Efficiency: “Our EDW infrastructure is being overwhelmed with data and workloads; we are running into capacity limits, and the annual costs of expansion are in the tens of millions. What can we do?”
      Stakeholder labels, in slide order: CMO; CMO & Customer Service; CEO, VP Operations; CIO
  • 11. Big Data in Health Care
      360° Patient View: “Patient data ends up scattered across many different systems – is there a way to get a complete picture by combining it while ensuring HIPAA compliance?”
      Regulatory Compliance: “The move to EMR combined with the strict regulations means we need to keep at least 7 years of data online – how can we afford to do that and make it searchable and available for analysis?”
      Maximize Medical Efficacy: “We invest hundreds of millions in new equipment every year. How can we judge the long-term efficacy for patient outcomes, and make smarter investment decisions?”
      Operational Efficiency: “Our EDW infrastructure is being overwhelmed with data and workloads; we are running into capacity limits, and the annual costs of expansion are in the tens of millions. What can we do?”
      Stakeholder labels, in slide order: VP Operations, Chief of Compliance; VP Operations; Chief Medical Officer; CFO; Chief Medical Officer; CIO
  • 12. 12
  • 13. 13
  • 14. Mike Olson, @mikeolson, mike.olson@cloudera.com
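
Slide 4 mentions “schema on read” agility alongside a full-fidelity active archive. The following is a minimal Python sketch of that idea, not Cloudera's implementation: raw records are kept exactly as received, and a schema (field selection, type casts, defaults) is applied only when the data is read. The sample events, the schema dictionaries, and the names `read_with_schema` and `total_spend_by_user` are hypothetical and exist only for illustration.

```python
import json

# Raw events stored exactly as received (full fidelity); nothing is enforced on write.
# These sample records are invented for illustration.
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.99", "channel": "web"}',
    '{"user": "b2", "amount": "5.00"}',
    '{"user": "a1", "amount": "42.50", "channel": "store", "store_id": 7}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: select fields, cast types, fill defaults.

    `schema` maps a field name to a (cast_function, default_value) pair.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else default
            for field, (cast, default) in schema.items()
        }


# Two different views over the same raw data, defined only when reading.
SPEND_SCHEMA = {"user": (str, None), "amount": (float, 0.0)}
CHANNEL_SCHEMA = {"user": (str, None), "channel": (str, "unknown")}


def total_spend_by_user(raw_lines):
    """Sum the amount per user using the spend view of the raw events."""
    totals = {}
    for row in read_with_schema(raw_lines, SPEND_SCHEMA):
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]
    return totals


if __name__ == "__main__":
    print(total_spend_by_user(RAW_EVENTS))
    print([row["channel"] for row in read_with_schema(RAW_EVENTS, CHANNEL_SCHEMA)])
```

Because the raw lines are never rewritten, a new question only needs a new schema dictionary rather than a new load job, which is one plausible reading of the agility the slide claims over up-front modeling.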