Amazon Redshift as your Data Warehouse Solution


Published on

An introduction to the speakers, what BlazeClan does as an AWS Advanced Consulting Partner, and how it has evolved. Varoon, our Solution Architect specializing in Amazon Redshift, talks about the key differentiators of Amazon Redshift. Learn why and how Redshift can save you time and effort and cut costs to one-tenth of the cost of a traditional data warehouse solution. A demo of Amazon Redshift in action, processing 2 billion records in a matter of seconds! A case study of one of our products, Cloudlytics, and how it makes extensive use of Amazon Redshift.

We conducted a webinar on Amazon Redshift. You can also view the video of the webinar, along with the Q&A, at the end of the SlideShare.

Published in: Technology

Amazon Redshift as your Data Warehouse Solution

  1. AWS Redshift: Your Data Warehouse Solution. Cloud IT Better
  2. Agenda • Introduction to Amazon Redshift • Economics of Amazon Redshift • Redshift Demo • Case Study • How BlazeClan can help your organization with Redshift
  3. Introduction to Amazon Redshift
  4. Amazon Redshift • Fully managed, petabyte-scale data warehouse • Provision in minutes • Pay as you go, no upfront costs • Extremely fast with low prices • Supports SQL • Allows JDBC & ODBC connections
  5. Amazon Redshift – Key Differentiators • Columnar Storage • Data Compression • Redshift parallelizes everything • Massively Parallel Processing (MPP) architecture • Redshift drastically reduces I/O • Encryption • Amazon VPC • Automated backups • Built-in security
  6. We’re off to a good start! Some happy feedback!
  7. Amazon Redshift Reduces I/O drastically • Column Storage • Data Compression • Zone Maps • Direct-attached Storage • Large data block sizes
  8. Amazon Redshift Reduces I/O drastically: Column Storage (figure: Typical Row Storage compared with Columnar Storage in Redshift)
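
The row-versus-column figure on the slide above can be illustrated with a small sketch. It is a toy model, not Redshift's storage engine: the table, its column names, and the "values read" counter are hypothetical and stand in for disk I/O.

```python
# Toy table: each row is (order_id, customer, amount). All names are hypothetical.
rows = [
    (1, "alice", 120.0),
    (2, "bob", 75.5),
    (3, "carol", 210.0),
    (4, "dave", 42.25),
]


def sum_amount_row_store(rows):
    """Row layout: every full row is read even though only `amount` is needed."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole row is touched
        total += row[2]
    return total, values_read


def to_columns(rows):
    """Column layout: one contiguous list per column."""
    order_ids, customers, amounts = zip(*rows)
    return {
        "order_id": list(order_ids),
        "customer": list(customers),
        "amount": list(amounts),
    }


def sum_amount_column_store(columns):
    """Column layout: only the `amount` column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(to_columns(rows))
print(row_total, row_reads)
print(col_total, col_reads)
```

Both layouts return the same total, but the column layout touches only the values of the one column the query needs, which is the I/O saving the slide refers to.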
  9. Amazon Redshift Reduces I/O drastically: Data Compression • Data compression reduces storage • Reduces I/O, which improves query performance • Less memory utilization, allowing more memory for query processing
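
As a rough illustration of why compression works well on columns, here is a minimal run-length encoding sketch. Run-length encoding is chosen only because it is simple; the slides do not say which encodings Redshift applies, and the `country` column is made up.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of repeated values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(encoded):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in encoded:
        out.extend([value] * count)
    return out


# A column stores values of one type together, so long runs are common.
country = ["IN"] * 5 + ["US"] * 3 + ["SG"] * 2
encoded = rle_encode(country)
print(len(country), "values stored as", len(encoded), "pairs")
```

Fewer stored values means fewer blocks to read from disk and less memory spent holding them, which matches the three points on the slide.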
  10. Amazon Redshift Reduces I/O drastically: Zone Maps • Keep track of the minimum & maximum value of each block • Skip over blocks that don’t contain the data needed for a query • Minimize unnecessary I/O
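
The zone-map idea on the slide above can be sketched in a few lines. This is a simplified model under stated assumptions: the `Block` class, the block size of 10, and the sample data are hypothetical, and real block sizes and metadata layout are not described in the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    values: List[int]
    min_value: int  # zone map entry: smallest value in the block
    max_value: int  # zone map entry: largest value in the block


def build_blocks(values, block_size):
    """Split a column into fixed-size blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_range(blocks, low, high):
    """Return values in [low, high], skipping blocks the zone map rules out."""
    matches = []
    blocks_read = 0
    for block in blocks:
        if block.max_value < low or block.min_value > high:
            continue  # no value in this block can match, so it is never read
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


values = list(range(1, 101))       # a sorted column holding 1..100
blocks = build_blocks(values, 10)  # 10 blocks of 10 values each
matches, blocks_read = scan_range(blocks, 42, 57)
print(len(matches), "matches from", blocks_read, "of", len(blocks), "blocks")
```

Skipping works best when the column is sorted, because each block then covers a narrow range of values.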
  11. Amazon Redshift Reduces I/O drastically: Direct-attached Storage & Large data block sizes • Use direct-attached storage to maximize throughput • Hardware optimized for high-performance data processing • Large block sizes to make the most of each read • Amazon Redshift manages durability for you
  12. Amazon Redshift Architecture • Leader Node: manages communication with clients and compute nodes; creates execution plans; compiles code based on the execution plan; distributes the load to multiple compute nodes according to the execution plan • Compute Node: executes compiled code received from the leader node; each node has dedicated compute, memory and storage capacity • Clusters can be scaled based on processing requirements
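
The leader/compute split described above can be sketched as a scatter-and-gather computation. This is only an analogy, not how Redshift is implemented: the function names are hypothetical, threads stand in for compute nodes, rows are distributed by a simple modulo on an integer key, and no query compilation takes place.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, num_nodes):
    """Leader step: assign each (key, amount) row to a node by its integer key."""
    slices = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        slices[key % num_nodes].append((key, amount))
    return slices


def compute_node_sum(node_rows):
    """Compute-node step: aggregate only the rows held locally."""
    return sum(amount for _, amount in node_rows)


def leader_total(rows, num_nodes):
    """Leader step: fan the work out, then combine the partial results."""
    slices = distribute(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(compute_node_sum, slices))
    return sum(partials), partials


rows = [(i, i * 10) for i in range(1, 9)]  # keys 1..8, amounts 10..80
total, partials = leader_total(rows, 4)
print(total, partials)
```

Each node works only on its own slice, so adding nodes shrinks the slice each one has to process.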
  13. Redshift is Secure • Amazon Redshift has security built in • SSL to secure data in transit • Encryption to secure data at rest: AES-256; all blocks on disk and in Amazon S3 are encrypted • No direct access to compute nodes • Amazon VPC support
  14. Continuous Backup and Recovery • Replication within the cluster and backup to Amazon S3 maintain multiple copies of data at all times • Backups to Amazon S3 are continuous, automatic and incremental • S3 is designed for eleven nines of durability • Continuous monitoring and automated recovery from failures of drives and nodes • Able to restore snapshots to any Availability Zone within a region
  15. Redshift Distributes & Parallelizes everything • Query • Load • Backup • Resize • Restore
  16. Redshift Distributes & Parallelizes everything • Query • Load • Backup • Restore • Resize
  17. Redshift Distributes & Parallelizes everything: Load • Load in parallel from Amazon S3 & Amazon DynamoDB • Data automatically distributed & sorted • Scales linearly with number of nodes
  18. Redshift Distributes & Parallelizes everything: Backup & Restore • Backs up data automatically to Amazon S3 • Backups are continuous and incremental • Configurable system snapshot retention period • Take user snapshots on demand • Streaming restores enable you to resume querying faster
  19. Redshift Distributes & Parallelizes everything: Resize • Scale up without any downtime • Provision a new cluster in the background • Copy data in parallel from node to node • Only charged for source cluster • Automatic SQL endpoint switchover via DNS • Decommission source cluster
  20. Economics of Amazon Redshift
  21. Traditional Data Warehouses • Expensive hardware & software licensing • Upfront investments • Large team of skilled, highly paid DBAs to manage • Tuning & administration is expensive
  22. Traditional Data Warehouses • Large Enterprises: YoY data growth is more than 50%; data warehousing is not growing at the same rate; most of the data generated is not put into data warehouses; losing competitive edge as not all data is analyzed • Small Enterprises: cannot afford the current solutions; limited access to the expensive talent pool needed to implement them
  23. Amazon Redshift Pricing • No upfront charges • Pay-as-you-go • Priced to analyze all your data • Less than $1 per hour at on-demand prices • On-demand annual cost per TB = $3,723 • 3-year reserved annual cost per TB = $999
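
The per-TB figures above can be related to the "less than $1 per hour" claim with a little arithmetic. The sketch below assumes the figures refer to a node with 2 TB of storage (the HS1.XL size quoted in the deck) running all 8,760 hours of a year; the slides do not state this, so treat the hourly rates as implied values rather than quoted prices.

```python
HOURS_PER_YEAR = 24 * 365  # 8760
TB_PER_NODE = 2            # assumption: one HS1.XL node with 2 TB of storage


def implied_hourly_rate(annual_cost_per_tb, tb_per_node=TB_PER_NODE):
    """Hourly node price implied by an annual cost per TB."""
    return annual_cost_per_tb * tb_per_node / HOURS_PER_YEAR


on_demand_hourly = implied_hourly_rate(3723)  # on-demand figure from the slide
reserved_hourly = implied_hourly_rate(999)    # 3-year reserved figure from the slide
savings = 1 - 999 / 3723                      # fraction saved by reserving

print(f"on-demand: ${on_demand_hourly:.2f}/hour")
print(f"reserved:  ${reserved_hourly:.3f}/hour")
print(f"savings:   {savings:.0%}")
```

Under these assumptions the on-demand figure works out to about $0.85 per node-hour, which is consistent with the "less than $1 per hour" bullet.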
  24. Amazon Redshift Configurations • HS1.XL: 2 Cores; 6 GiB Memory; 3 disk drives with 2 TB local compressed storage • HS1.8XL: 16 Cores; 128 GiB Memory; 24 disk drives with 16 TB local storage; 2 GB/second scan rate • You can start with a Single Node instance
  25. Amazon Redshift works with your existing Analysis tools
  26. Case Study
  27. CLOUDLYTICS Case Study
  28. Cloudlytics: Analyze your Amazon S3 & CloudFront Logs • Detailed analysis of your S3 & CloudFront access patterns • Scalable & reliable service built using Amazon EMR & Redshift • Dynamic graphs to get a 360-degree perspective • Pay as you go
  29. How can BlazeClan help you with Redshift?
  30. End to End Data Warehouse Consulting • Requirement Analysis • Training & Knowledge Transfer • Data modeling • Initial Data Migration • Capacity Planning & Redshift Setup • Managed Services • Design & Build ETL process • BI Integration
  31. Thank you