Leveraging the Cloud for Big Data Analytics 12.11.18

Learn how organizations are deriving unique customer insights, improving product and service efficiency, and reducing business risk with a modern big data architecture powered by Cloudera on AWS. In this webinar, you will see how fast and easy it is to deploy a modern data management platform in your cloud, on your terms.

Published in: Technology

  1. BIG DATA ANALYTICS IN THE CLOUD. Sushant Rao, Cloud Product Marketing @ Cloudera; Rohit Pujari, Solutions Architect @ Amazon Web Services
  2. © Cloudera, Inc. All rights reserved. Primary Advantages for Cloud
     ● Agility
       ○ Speed of making changes to meet business / technical needs
     ● Scalable & Elastic
       ○ Scale up and down quickly
     ● Reliable
       ○ Multiple options to ensure infrastructure / services are available
     ● Cost effectiveness
       ○ Pay for what you use (but may not be cheaper than on-prem)
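The "pay for what you use (but may not be cheaper than on-prem)" point on slide 2 can be made concrete with a little arithmetic. The sketch below is illustrative only: the hourly rate, node count, and fixed monthly on-prem cost are hypothetical figures, not numbers from the deck or from any price list.

```python
def monthly_cloud_cost(hourly_rate: float, nodes: int, hours_used: float) -> float:
    """Pay-as-you-go cost: billed only for the hours the nodes actually run."""
    return hourly_rate * nodes * hours_used


def break_even_hours(fixed_monthly_cost: float, hourly_rate: float, nodes: int) -> float:
    """Hours of use per month at which the cloud bill equals the fixed cost."""
    return fixed_monthly_cost / (hourly_rate * nodes)


# Hypothetical inputs, chosen only to illustrate the comparison.
HOURLY_RATE = 0.50         # assumed price per node-hour
NODES = 10                 # assumed cluster size
ON_PREM_MONTHLY = 2000.0   # assumed fixed monthly cost of an equivalent on-prem cluster

intermittent = monthly_cloud_cost(HOURLY_RATE, NODES, hours_used=100)
always_on = monthly_cloud_cost(HOURLY_RATE, NODES, hours_used=720)
threshold = break_even_hours(ON_PREM_MONTHLY, HOURLY_RATE, NODES)

print(f"intermittent workload: {intermittent:.2f}")
print(f"always-on workload:    {always_on:.2f}")
print(f"break-even hours:      {threshold:.0f}")
```

Under these assumed figures, a workload that runs intermittently costs less in the cloud, while an always-on cluster costs more than the fixed alternative. That is the trade-off the slide hints at.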
  3. Big Data Use Cases for Cloud
     ● Corporate directive to leverage the cloud
       ○ C-level has decided to utilize the cloud more
     ● Disaster Recovery “location” in the cloud
       ○ Back up all data to the cloud, without a second “physical” location
     ● On-demand data mart / data engineering
       ○ Separate environment for new, production workloads
       ○ Ad-hoc workloads that run intermittently
     ● Sandbox environment for workloads
       ○ Environment to test queries and algorithms
  4. Cloudera’s Solution for Data Analytics / Engineering in Cloud
     • The modern platform for machine learning and analytics
       ○ Numerous functions for all types of jobs and queries
     • with multiple deployment options
       ○ On-premises, public cloud (including multi-cloud), and hybrid
     • and one shared data experience
       ○ Framework for consistent security, governance, and metadata management across applications and deployments
  5. The Modern Platform for Machine Learning & Analytics
     ● DATA ENGINEERING: DATA PROCESSING
       ○ Cost efficient
       ○ Reliable
       ○ Scalable
       ○ Based on Spark, MapReduce, Hive & Pig
       ○ Supported by Workload Analytics
     ● DATA WAREHOUSE: FAST BI & SQL
       ○ Flexibility
       ○ Elastic scale
       ○ Go beyond SQL
       ○ Based on Impala & Hive
       ○ SQL dev environment
       ○ Supported by Workload Analytics
     ● DATA SCIENCE: MACHINE LEARNING
       ○ Fast dev to production
       ○ Secure self-serve
       ○ Based on Python, R, and Spark
       ○ ML dev environment (CDSW)
     ● OPERATIONAL DATABASE: ONLINE & REAL-TIME
       ○ High throughput, low latency
       ○ Strongly consistent
       ○ Based on HBase, Kudu & Spark Streaming
  6. Cloudera’s Vision for AI and Machine Learning: Modern Enterprise Platform, Tools, and Expert Guidance to help you Unlock Business Value with ML / AI
     ● Agile platform to build, train, and deploy scalable ML applications
     ● Enterprise data science tools to accelerate team productivity
     ● Expert guidance, services & training to fast track value & scale
  7. With Multiple Deployment Options [Diagram. Infrastructure options: Traditional Infrastructure (combined storage and compute); Cloud Infrastructure (decoupled storage and compute), shown twice. Delivery labels: via Cloudera Altus Director (infrastructure services); via Cloudera Altus Services. Workloads shown: Operational Database, Data Engineering, Data Warehouse, Data Science]
  8. Cloudera Enterprise with SDX
     Benefits for IT infra & ops
       ● Central control and security
       ● Focus on curating not firefighting
     Benefits for users
       ● Value from single source of truth
       ● Bring the best tools for each job
     [Diagram labels. WORKLOADS: 3rd Party Services, Data Engineering, Data Science, Data Warehouse, Operational Database. COMMON SERVICES: Data Catalog, Governance, Security, Lifecycle Management. STORAGE: HDFS, Kudu, Amazon S3, Microsoft ADLS]
  9. Many Options for Data Analytics / Engineering in the Cloud [Diagram labels: Existing On-Prem Deployment; Altus Director; Altus Services]
  10. Many Options for Data Analytics / Engineering in the Cloud [Diagram labels: Existing On-Prem Deployment; Starting New Deployment; Altus Director; Altus Services]
  11. Journey from On-Prem Cluster to Cloud [Diagram, stage 0 - ON PREMISES: bare metal; persistent Cloudera cluster; compute (Data Engineering, Data Warehouse, Data Science); data context (Security, Metadata, Governance); storage (HDFS)]
  12. Journey from On-Prem Cluster to Cloud [Diagram adds stage 1 - LIFT AND SHIFT: customer cloud; persistent Cloudera cluster with the same compute and data context; storage (HDFS)]
  13. Journey from On-Prem Cluster to Cloud [Diagram adds stage 2 - OBJECT STORAGE: customer cloud; persistent Cloudera cluster with the same compute and data context; storage (cloud object store)]
  14. Journey from On-Prem Cluster to Cloud [Diagram adds stage 3 - CLOUD NATIVE ARCHITECTURES: customer cloud with a persistent Cloudera cluster (Director) and transient Cloudera clusters (Altus) for Data Engineering and Data Warehouse; data context; storage (cloud object store); Cloudera Altus control plane in the Cloudera cloud]
  15. Customer Examples: Many Cloudera customers (Global 5K) use the public cloud
      • Online retailer
        ○ Over 2,000 nodes with ~2PB of data on AWS running in an active-active configuration
        ○ Transforming data with Spark and then analyzing with Apache Hive
      • German chain of coffee retailers and cafés
        ○ 30+ nodes with 50TB of data on AWS
        ○ Modern Cloudera platform with an Impala data warehouse
      • Global information company
        ○ 50+ nodes on Microsoft Azure and 20+ nodes on AWS
        ○ Replaced Netezza with Hadoop and is leveraging both Impala and Spark for analytics
  16. Security Use Case: Cloudera is using the cloud as well. The Altus-based solution saved more than 50% in cost compared to the initial implementation
  17. Cloudera Altus Key Differentiators
      • Multi-function: Unified platform for data engineering, data warehouse, and data science
      • Multi-cloud: Options for on-premises, public cloud (including multi-cloud), and hybrid
      • SDX: Integrated shared data experience across multi-function clusters
  18. Rohit Pujari, Solutions Architect AWS Security & Compliance
  19. Why is security traditionally so hard? Lack of visibility; low degree of automation
  20. Before… Move fast OR Stay secure. Now… Move fast AND Stay secure
  21. Making life easier: Choosing security does not mean giving up on convenience or introducing complexity
  22. The most sensitive workloads run on AWS
      “We can be even more secure in the AWS cloud than in our own datacenters.” (Tom Soderstrom, CTO, NASA JPL)
      “We knew the cloud was the only way to get the scalability, speed, and security our customers expect from 3M.” (Rick Austin, 3M Health Information Systems)
      “We determined that security in AWS is superior to our on-premises data center across several dimensions, including patching, encryption, auditing and logging, entitlements, and compliance.” (John Brady, CISO, FINRA, the Financial Industry Regulatory Authority)
  23. Benefits of a Data Lake - All Data is in One Place. Analyze all of your data, from all of your sources, in one stored location. “Why is the data distributed in many locations? Where is the single source of truth?”
  24. Why Amazon S3 for a Data Lake?
      Durable
        ▪ Designed for 11 9s of durability
      Available
        ▪ Designed for 99.99% availability
      High performance
        ▪ Multipart upload
        ▪ Range GET
        ▪ Scalable throughput
      Scalable
        ▪ Store as much as you need
        ▪ Scale storage and compute independently
        ▪ No minimum usage commitments
      Integrated Partner Tools
        ▪ Cloudera EDH
        ▪ Cloudera Altus
        ▪ Cloudera Impala
      Easy to use
        ▪ Simple REST API
        ▪ AWS SDKs
        ▪ Simple management tools
        ▪ Event notification
        ▪ Lifecycle policies
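To give a feel for the "11 9s of durability" and "99.99% availability" design targets on slide 24, here is a small arithmetic sketch. It assumes the durability figure is read as an annual per-object loss probability of 10^-11 and that availability is averaged over a 365-day year. These are design targets, not guarantees, and the function names are hypothetical.

```python
from fractions import Fraction


def expected_annual_object_loss(object_count: int, nines: int = 11) -> Fraction:
    """Expected number of objects lost per year, assuming each object has an
    independent annual loss probability of 10**-nines (an interpretation of the
    'designed for N nines' target, not a guarantee)."""
    loss_probability = Fraction(1, 10 ** nines)
    return object_count * loss_probability


def allowed_downtime_minutes_per_year(availability_percent: str) -> Fraction:
    """Minutes per 365-day year a service may be unavailable at the given target."""
    unavailable_fraction = 1 - Fraction(availability_percent) / 100
    return unavailable_fraction * 365 * 24 * 60


loss = expected_annual_object_loss(10_000_000)
downtime = allowed_downtime_minutes_per_year("99.99")

print(f"expected objects lost per year (10M objects): {float(loss)}")
print(f"years per expected single-object loss:        {float(1 / loss):.0f}")
print(f"allowed downtime, minutes per year:           {float(downtime):.2f}")
```

Exact fractions are used instead of floats so that subtracting values very close to 1 does not introduce rounding noise.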
  25. Data Ingestion into Amazon S3: AWS Direct Connect, AWS Snowball, ISV Connectors, Kafka/Flume, Amazon Kinesis Firehose, Amazon S3 Transfer Acceleration, AWS Storage Gateway
  26. Strong Security Controls
      Security
        ▪ Identity and Access Management (IAM) policies
        ▪ Bucket policies
        ▪ Access Control Lists (ACLs)
        ▪ Private VPC endpoints to Amazon S3
        ▪ Amazon S3 object tagging to manage access policies
      Encryption
        ▪ SSL endpoints
        ▪ Server-side encryption (SSE-S3)
        ▪ S3 server-side encryption with provided keys (SSE-C, SSE-KMS)
        ▪ Client-side encryption
      Compliance
        ▪ Bucket access logs
        ▪ Lifecycle management policies
        ▪ Access Control Lists (ACLs)
        ▪ Versioning and MFA deletes
        ▪ Certifications: HIPAA, PCI, SOC 1/2/3, etc.
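Two of the controls listed on slide 26, bucket policies and SSL endpoints combined with SSE-KMS, are often expressed together in a single bucket policy. The following is a minimal sketch that only builds the policy document as JSON. The bucket name `example-data-lake` and the function name are hypothetical, nothing here calls AWS, and any real policy should be checked against the current AWS documentation before use.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Build a bucket policy document that (1) denies any request not made
    over TLS and (2) denies uploads that do not request SSE-KMS encryption."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # aws:SecureTransport is "false" when the request is not sent over TLS
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                # Reject PutObject requests that do not ask for SSE-KMS
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
        ],
    }


# "example-data-lake" is a placeholder bucket name used only for illustration.
policy = build_bucket_policy("example-data-lake")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Explicit `Deny` statements are used because, in IAM policy evaluation, an explicit deny overrides any allow granted elsewhere.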
  27. Move to AWS: Strengthen your security posture
      ▪ Automate with deeply integrated security tools and services
      ▪ Inherit global security and compliance controls
      ▪ Highest standards for privacy and data security
      ▪ Largest network of security partners and solutions
      ▪ Scale with superior visibility and control that satisfies the most risk-sensitive orgs
  28. Highest standards for privacy
      ▪ Encrypt data in transit and at rest with keys managed by AWS Key Management Service (KMS), or manage your own encryption keys with AWS CloudHSM using FIPS 140-2 Level 3 validated HSMs
      ▪ Meet data residency requirements: choose an AWS Region, and AWS will not replicate your data elsewhere unless you choose to do so
      ▪ Access services and tools that enable you to build GDPR-compliant infrastructure on top of AWS
      ▪ Comply with local data privacy laws by controlling who can access content, its lifecycle, and its disposal
  29. Inherit global security and compliance controls
  30. Data Analytics / Engineering with Cloudera
      • Lower risk of data breach
      • Analysts more productive on jobs
      • Self-service (no shadow IT) and more productive
      • IT more strategic, less admin time
      • Deployment choices and no lock-in
      • Same solution as on-premises and multi-cloud
      • Eliminate data copies
      • Single security framework with universally shared metadata
      • Easy to track data lineage
      • Unified services
      [Diagram labels: CLOUDERA ADVANTAGES + BUSINESS VALUE ($)]
  31. Ready to try Data Analytics / Engineering in the Cloud?
      Have an existing cluster for DW / DE
        • Up to $2K Free AWS Credits*
        • Email: awsoffer@cloudera.com
      Don’t have an existing cluster
        • Free Altus DE / DW Trial
        • https://sso.cloudera.com/register.html
      *Must work with AWS and Cloudera account managers on POC to be eligible for offer
  32. THANK YOU
  33. APPENDIX
  34. Cloudera Pricing / Acquisition
      • Acquisition Options
        ○ Pay-as-you-go usage-based pricing
        ○ Node-based license subscription
        ○ Free 30-day trial
        ○ Pre-pay of cloud credits
        ○ Free version that can be deployed in the cloud
      • Pricing - https://www.cloudera.com/products/pricing.html
