
Introduction to Cloudera's Unique Architecture & Competitive Advantages

"Introduction to Cloudera's Unique Architecture & Competitive Advantages" by Nuno Barreto - Associate Partner & Big Data Lead @Xpand IT on the event Cloudera & Big Data Ecosystem

Published in: Technology


  1. Unique Architecture: Cloudera. Nuno Barreto, Associate Partner & Big Data Lead, nuno.barreto@xpand-it.com. Proprietary & Confidential. www.xpand-it.com
  2. BIG DATA
  3. THE 3Vs: Volume, Variety, Velocity
  4. AND THE 4th AND 5th Vs: Veracity, Value
  5. HADOOP “RECAP”
  6. THE BIGINNING
  7. TODAY – THE ULTIMATE DATA TOOLKIT
  8. DATA DEFINITIONS: STRUCTURED, SEMI-STRUCTURED, UNSTRUCTURED
  9. DATA LOCALITY: X86 X86 X86 X86 (NODE1, NODE2, NODE3, ..., NODEN)
  10. CLOUDERA ON THE ENTERPRISE: Sensor Data, Blogs, Emails, Web Logs, Docs (e.g. PDF), Images, Videos, CRM, ERP, Legacy, 3rd Party; Extract (includes File Transfer), Transform and Load; Scale-out Distributed Database; Visualization (Reporting, Exploration and Sandboxing); Raw Data Sources; Operational Systems; DW & Data Marts
  11. HADOOP MYTHS
  12. MYTH 1: HADOOP is only good for SEMI-STRUCTURED DATA
  13. HADOOP IS FOR ALL DATA: INDEED GREAT FOR SEMI-STRUCTURED DATA; GREAT FOR SQL AT SCALE (HIVE/IMPALA); EVEN GREAT FOR UNSTRUCTURED DATA; FAST QUERYING (IMPALA/HBASE/KUDU)
  14. MYTH 2: HADOOP is only good for BATCH
  15. REALTIME END TO END: KAFKA, SPARK, DEVELOPERS, FAST ANALYTICS
  16. QUERY LATENCY: BATCH SQL, 20 min to 20 hours (large ETL, data mining); OPERATIONAL SQL, <100 ms (indexed queries); INTERACTIVE SQL, 100 ms to 20 minutes (interactive queries, reporting)
  17. INTERACTIVE SQL ENGINES: BI & SQL ANALYTICS, SPARK DEVELOPERS
  18. (REALLY) FAST QUERY: HBASE NOSQL, FAST ANALYTICS, BI & SQL ANALYTICS
  19. MYTH 3: HADOOP is going to be replaced by SPARK
  20. HADOOP IS NOT (ONLY) MAP REDUCE. HADOOP IS A COMMON DESIGNATION FOR THE ENTIRE STACK. ONE PLATFORM INITIATIVE BY CLOUDERA: UNITING SPARK AND HADOOP
  21. MYTH 4: HADOOP is not SECURE
  22. KERBEROS AUTHENTICATION: MIT KERBEROS
  23. AUTHORIZATION for and and ACLs for
  24. ENCRYPTION
  25. PCI-DSS COMPLIANT IS
  26. THE STACK
  27. HADOOP DISTRO KEY FEATURES: MANAGEABLE; RICH ENOUGH TOOL SET; OPEN STANDARDS FIRST; SECURITY AND COMPLIANCE; NOT FOR ONLY ONE USE CASE
  28. CLOUDERA DATA HUB
  29. QUESTIONS?
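
The query latency tiers on slide 16 can be read as a simple classification rule. The sketch below is only an illustration of that reading: the three thresholds come from the slide, while the function name `latency_tier`, the tier labels, the handling of exact boundary values, and the "beyond batch" label are assumptions made for this example and are not part of the deck.

```python
from datetime import timedelta

# Thresholds taken from slide 16 of the deck.
OPERATIONAL_MAX = timedelta(milliseconds=100)  # operational SQL: < 100 ms
INTERACTIVE_MAX = timedelta(minutes=20)        # interactive SQL: 100 ms to 20 min
BATCH_MAX = timedelta(hours=20)                # batch SQL: 20 min to 20 hours


def latency_tier(latency: timedelta) -> str:
    """Map a query latency to one of the slide's three SQL tiers.

    Assumption: a value exactly on a boundary is placed in the faster tier's
    upper neighbor for 100 ms, and in the faster tier for 20 min and 20 hours.
    The slide does not say how boundaries are assigned.
    """
    if latency < timedelta(0):
        raise ValueError("latency must be non-negative")
    if latency < OPERATIONAL_MAX:
        return "operational"
    if latency <= INTERACTIVE_MAX:
        return "interactive"
    if latency <= BATCH_MAX:
        return "batch"
    # Hypothetical label: the slide does not cover latencies above 20 hours.
    return "beyond batch"


if __name__ == "__main__":
    for sample in (timedelta(milliseconds=50), timedelta(minutes=5), timedelta(hours=2)):
        print(sample, "->", latency_tier(sample))
```

Under this reading, an indexed lookup answering in 50 ms falls in the operational tier, a 5-minute reporting query in the interactive tier, and a 2-hour ETL job in the batch tier.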
