The Future of Data Management: The Enterprise Data Hub


Published in: Technology

  1. The Future of Data Management: The Enterprise Data Hub
      Clarke Patterson | Sr. Director, Cloudera
      ©2014 Cloudera, Inc. All rights reserved.
  2. Data Potential is Out There
  3. An Environment of Change: Consumption, Instrumentation, Value, Exploration
  9. IT’S ALL (BIG) DATA: 10TB to 10PB
  10. What Infrastructure Have You Augmented with Big Data Solutions?
      Bar chart categories: Mainframe; Enterprise Data Warehouse; Storage; Analytic Databases; ETL Processing
      Source: King Research, 3,922 Respondents
  11. Complications of Status Quo: Structure, Storage, Network, Silos
      Stages: INGEST, STORE, EXPLORE, PROCESS, ANALYZE, SERVE
  12. How Important are These Capabilities in Your Selection of a Big Data Vendor?
      Bar chart categories: Open Source Software; Technically Superior Product; Cost; Integration with Other Systems; Secure Technology; Reliable / Trusted Vendor; Flexibility; Performance; Scalability
      Source: King Research, 3,922 Respondents
  14. What are the Primary Benefits You’ve Seen Doing a Big Data Product with an EDH?
      Bar chart categories: Gain Competitive Advantage; Improve Efficiency; Increase Business Value from Data; Make Better Decisions, Faster; Improved Data Processing; Improved Data Analytics
      Source: King Research, 3,922 Respondents
  15. What are Your Big Data Applications?
      Bar chart categories: Operational Improvement; Customer Experience Analysis; Market Targeting; Customer Insights; Behavioral Analysis; Research / Innovation
      Source: King Research, 3,922 Respondents
  16. Expanding Data Requires A New Approach
      Then: Bring Data to Compute. Process-centric businesses use:
        • Structured data mainly
        • Internal data only
        • “Important” data only
      Now: Bring Compute to Data. Information-centric businesses use all data: multi-structured, internal & external data of all types
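The contrast between the two approaches on this slide can be sketched in a few lines. This is a minimal illustration, not how Hadoop itself is implemented: the partitions, their values, and both function names are made up, and "network traffic" is approximated by counting how many values leave the storage nodes.

```python
# Hypothetical partitions of one dataset, one list per storage node.
PARTITIONS = [
    [3, 8, 1, 9],
    [4, 4, 7],
    [10, 2, 6, 5, 1],
]


def data_to_compute(partitions):
    """Then: ship every record to a central compute node and aggregate there.

    Returns (result, number of values moved off the storage nodes).
    """
    shipped = [value for part in partitions for value in part]
    return sum(shipped), len(shipped)


def compute_to_data(partitions):
    """Now: aggregate on each storage node and ship one partial sum per node.

    Returns (result, number of values moved off the storage nodes).
    """
    partials = [sum(part) for part in partitions]
    return sum(partials), len(partials)


if __name__ == "__main__":
    print("data to compute:", data_to_compute(PARTITIONS))
    print("compute to data:", compute_to_data(PARTITIONS))
```

Both functions return the same total; only the amount of data that has to move differs, and that gap grows with the size of each partition.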
  17. Hadoop Changes the Game: Storage and Compute on One Platform
      The Old Way: $30,000+ per TB. Expensive & Unattainable
        • Hard to scale
        • Network is a bottleneck
        • Only handles relational data
        • Difficult to add new fields & data types
        Expensive, Special purpose, “Reliable” Servers; Expensive Licensed Software
        Components: Network; Data Storage (SAN, NAS); Compute (RDBMS, EDW)
      The Hadoop Way: $300-$1,000 per TB. Affordable & Attainable
        • Scales out forever
        • No bottlenecks
        • Easy to ingest any data
        • Agile data access
        Commodity “Unreliable” Servers; Hybrid Open Source Software
        Components: Compute (CPU); Memory; Storage (Disk)
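To make the per-terabyte figures on this slide concrete, the arithmetic can be sketched as below. The two price points come from the slide; the capacities in the loop and the function names are assumptions chosen for illustration, and "$30,000+" is treated as a lower bound.

```python
# Per-terabyte figures quoted on the slide (US dollars).
OLD_WAY_PER_TB = 30_000        # "$30,000+ per TB", treated as a lower bound
HADOOP_PER_TB = (300, 1_000)   # "$300-$1,000 per TB"


def storage_cost(capacity_tb, price_per_tb):
    """Return the total cost of storing capacity_tb at a flat per-TB price."""
    return capacity_tb * price_per_tb


def compare(capacity_tb):
    """Return (old_way_cost, hadoop_low, hadoop_high) for one capacity."""
    low, high = HADOOP_PER_TB
    return (
        storage_cost(capacity_tb, OLD_WAY_PER_TB),
        storage_cost(capacity_tb, low),
        storage_cost(capacity_tb, high),
    )


if __name__ == "__main__":
    for capacity_tb in (10, 100, 1_000):  # hypothetical capacities
        old, low, high = compare(capacity_tb)
        print(f"{capacity_tb:>5} TB: old way >= ${old:,}; Hadoop ${low:,} to ${high:,}")
```

At the quoted prices the old way costs at least 30 times the upper Hadoop figure and at least 100 times the lower one, regardless of capacity.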
  18. Hadoop Changes the Game: Storage and Compute on One Platform
      The Old Way: Expensive & Unattainable
      The Hadoop Way: Affordable & Attainable
  19. The Old Way: Bringing Data to Compute
      Complex Architecture
        • Many special-purpose systems
        • Moving data around
        • No complete views
      Missing Data
        • Leaving data behind
        • Risk and compliance
        • High cost of storage
      Time to Data
        • Up-front modeling
        • Transforms slow
        • Transforms lose data
      Cost of Analytics
        • Existing systems strained
        • No agility
        • “BI backlog”
      Systems shown: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
      Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES
  20. The New Way: Bringing Compute to Data
      1. Active Compliance Archive
        • Full fidelity original data
        • Indefinite time, any source
        • Lowest cost storage
      2. Persistent Staging
        • One source of data for all analytics
        • Persist state of transformed data
        • Significantly faster & cheaper
      3. Self-Service Exploratory BI
        • Simple search + BI tools
        • “Schema on read” agility
        • Reduce BI user backlog requests
      4. Diverse Analytic Platform
        • Bring applications to data
        • Combine different workloads on common data (e.g., SQL + Search)
        • True analytic agility
      Systems shown: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
      Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES
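The “schema on read” idea named on this slide can be illustrated with a small sketch. It uses only the Python standard library rather than any Hadoop component; the sample records, field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Raw events are stored exactly as they arrived (full fidelity); no schema is
# enforced at write time. These records are made up for illustration.
RAW_LINES = [
    '{"user": "a17", "amount": "19.99", "channel": "web"}',
    '{"user": "b42", "amount": "5", "coupon": "SPRING"}',
]


def read_with_schema(lines, schema):
    """Parse each raw line and project it onto `schema` at read time.

    `schema` maps a field name to a converter. A field missing from a record
    comes back as None instead of causing the record to be rejected.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: convert(record[field]) if field in record else None
            for field, convert in schema.items()
        }


# Two consumers read the same raw data with different schemas.
billing_view = list(read_with_schema(RAW_LINES, {"user": str, "amount": float}))
marketing_view = list(read_with_schema(RAW_LINES, {"user": str, "coupon": str}))

if __name__ == "__main__":
    print(billing_view)
    print(marketing_view)
```

Because the raw lines are never rewritten, a new field such as `coupon` becomes queryable as soon as a reader asks for it, with no up-front modeling step.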
  21. Hadoop and The Enterprise Data Hub
      Checklist: Open Source, Scalable, Flexible, Cost-Effective ✔; Managed ✖; Open Architecture ✖; Secure and Governed ✖; ✔ ✔ ✔
      CLOUDERA’S ENTERPRISE DATA HUB
        3RD PARTY APPS
        Workloads: BATCH PROCESSING; ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING
        WORKLOAD MANAGEMENT
        STORAGE FOR ANY TYPE OF DATA (UNIFIED, ELASTIC, RESILIENT, SECURE): FILESYSTEM; ONLINE NOSQL
        DATA MANAGEMENT; SYSTEM MANAGEMENT
  22. The Power of the EDH: THE OLD WAY, EDH
  23. Transformative Applications Drive Revenue
      What are your Big Data Applications?
      Bar chart categories: Research / innovation; Behavioral analysis; Customer insights; Marketing targeting /; Customer experience; Operations improvement; Fraud prevention and; Pricing analytics and choice; Risk Modeling /; Network monitoring; Service quality; Customer lifecycle; Capacity forecasting; Inventory management; eDiscovery / document
      Source: King Research survey, September 2013, 3,922 Respondents
  24. So How Do We Get There?
  25. The Typical Enterprise Data Analytics Stack
      Layers: Business Intelligence / Applications; RDBMS; ETL Processing; Staging / Storage; Collection
  26. Step 1: EDH for Storage/Staging/Active Archive
      Layers: Business Intelligence / Applications; RDBMS; ETL Processing; EDH for Storage Active Archive; Collection
  27. Step 2: EDH for Data Collection (Sqoop/Flume)
      Layers: Business Intelligence / Applications; RDBMS; ETL Processing; EDH for Collection & Storage
  28. Step 3: EDH for ETL Processing Acceleration
      Layers: Business Intelligence / Applications; RDBMS; ETL / Data Integration Tools; EDH for Collection, Storage & ETL Processing Acceleration
  29. Step 4: EDH for EDW Optimization (Impala)
      Layers: Business Intelligence / Applications; RDBMS; Rarely Used Data; EDH for Collection, Storage, ETL Processing Acceleration & Historical RDBMS Data/Queries
  30. Step 5: EDH for Agile Exploration
      Layers: RDBMS; BI / Applications; Agile Exploration; EDH for Collection, Storage, ETL Processing Acceleration, Historical RDBMS Data/Queries, and Agile Exploration
  31. Step 6: EDH for Data Science (Not Only SQL)
      Layers: RDBMS; BI / Applications; Agile Exploration; Data Science; EDH for Collection, Storage, ETL Processing Acceleration, Historical RDBMS Data/Queries & Generic Data Computation
  32. Step 7: Converged Analytics - Apps Come to Data
      Layers: RDBMS; BI; Explore; Data Science; SAS, R, Spark; Informatica, SyncSort, Pentaho; Hunk; ...
      EDH for Collection, Storage, ETL Processing Acceleration, Historical RDBMS Data/Queries, Generic Data Computation, and Multiple Workloads
  33. A High Level View of the Journey
      Stages: Cheap Storage; ETL Acceleration; EDW Optimization; Agile Exploration; Data Science; Converged Analytics
      Groupings: Operational Efficiency (Faster, Bigger, Cheaper); Transformative Applications (New Business Value)
      Axis labels: Business, IT
  34. The Modern Information Architecture
      Sources: SYS LOGS; WEB LOGS; FILES; RDBMS
      Hub: ENTERPRISE DATA HUB
      Connected systems: WEB/MOBILE APPLICATIONS; ONLINE SERVING SYSTEM; ENTERPRISE DATA WAREHOUSE; ENTERPRISE REPORTING; BI / ANALYTICS; MACHINE LEARNING; CONVERGED APPLICATIONS; CLOUDERA MANAGER; META DATA / ETL TOOLS
      Users: Data Architects; System Operators; Engineers; Data Scientists; Analysts; Business Users; Customers & End Users
  35. Enabling The App Store of Big Data
      Partner categories: Software (BI, Analytics, & Data Integration); System Integration; Cloud & MSP; Hardware; Database
      Note: Display Cloudera Connect Platinum and Gold partners only
  36. Customer Success Across Industries
      Industries: Financial & Business Services; Telecom & Technology; Healthcare & Life Sciences; Media & Information; Retail & Consumer; Energy & Public Sector
  37. Enterprise Data Hub: A Complete Big Data Solution
        • Efficient Data Management System
        • Consolidated Silos for Truly Big Data
        • Accelerated Time to Insight
        • Diverse Business User Capabilities
        • Full-Fidelity Active Archive
        • Enterprise-Grade Data Security, Lineage, Auditing, Governance
        • High Option Value for Exploration, Data Science, Consolidated 360° View
        • Complete Platform for Converged Analytics
  38. Thank You!