AWS Redshift
Your Data Warehouse Solution

Cloud IT Better
Agenda
Introduction to Amazon Redshift

Economics of Amazon Redshift

Redshift Demo

Cloudlytics.com Case Study
How BlazeClan can help your
organization with Redshift?

Blazeclan

Introduction to Amazon Redshift

Image courtesy: datacenterknowledge.com

Amazon Redshift
• Fully managed, petabyte-scale data warehouse
• Provision in minutes
• Pay as you go, no upfront costs
• Extremely fast with low prices
• Supports SQL
• Allows JDBC & ODBC connections

Image courtesy: datacenterdynamics.com

Amazon Redshift: Key Differentiators
• Redshift drastically reduces I/O
  • Columnar storage
  • Data compression
• Redshift parallelizes everything
  • Massively Parallel Processing (MPP) architecture
• Built-in security
  • Encryption
  • Amazon VPC
  • Automated backups

We’re off to a good start!
Some happy feedback!

Amazon Redshift Reduces I/O drastically
• Column Storage
• Data Compression
• Zone Maps
• Direct-attached Storage
• Large data block sizes

Amazon Redshift Reduces I/O drastically
• Column Storage
  • [Figure: typical row storage compared with columnar storage in Redshift]
• Data Compression
• Zone Maps
• Direct-attached Storage
• Large data block sizes

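The row-versus-column comparison can be sketched in a few lines of Python. The table, the column names, and the "values read" counter are illustrative assumptions; the sketch does not reproduce Redshift's on-disk format, only the idea that a query on one column touches less data in a columnar layout.

```python
# Illustrative only: a tiny table held in row-oriented and column-oriented form.
rows = [
    (1, "alice", 120.0),
    (2, "bob", 75.5),
    (3, "carol", 310.25),
]
column_names = ("order_id", "customer", "amount")


def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def sum_amount_row_store(rows):
    """Row layout: every field of every row is read to reach 'amount'."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)
        total += row[2]
    return total, values_read


def sum_amount_column_store(columns):
    """Column layout: only the 'amount' column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


columns = to_columns(rows, column_names)
row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(columns)
```

Both functions return the same total, but the column layout reads one value per row instead of every field in the row.
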
Amazon Redshift Reduces I/O drastically
• Column Storage
• Data Compression
  • Data compression reduces storage requirements
  • Reduces I/O, improves query performance
  • Less memory utilization, allowing more memory for query processing
• Zone Maps
• Direct-attached Storage
• Large data block sizes

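Run-length encoding is one simple way to see why column data compresses well: values in a single column tend to repeat. The sketch below is an illustration under that assumption. The slide does not say which encoding Redshift applies to a given column, and the sample column is made up.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical 'country' column with long runs of repeated values.
country = ["IN"] * 5 + ["US"] * 3 + ["IN"] * 2
encoded = rle_encode(country)
```

Ten stored values shrink to three pairs, and `rle_decode(encoded)` restores the original column.
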
Amazon Redshift Reduces I/O drastically
• Column Storage
• Data Compression
• Zone Maps
  • Keep track of the minimum & maximum value of each block
  • Skip over blocks that don’t contain the data needed for a query
  • Minimize unnecessary I/O
• Direct-attached Storage
• Large data block sizes

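The zone map idea above can be sketched as follows. This is a minimal illustration, assuming small in-memory lists stand in for disk blocks; the function names are hypothetical and not part of any Redshift API.

```python
def build_zone_map(blocks):
    """Record the (min, max) of every block of a column."""
    return [(min(block), max(block)) for block in blocks]


def scan_equal(blocks, zone_map, target):
    """Return matching values and how many blocks were actually read."""
    matches = []
    blocks_read = 0
    for block, (lo, hi) in zip(blocks, zone_map):
        if target < lo or target > hi:
            continue  # the block cannot contain the target, so skip it
        blocks_read += 1
        matches.extend(v for v in block if v == target)
    return matches, blocks_read


# Three hypothetical blocks of a sorted integer column.
blocks = [[1, 3, 5, 7], [8, 9, 12, 15], [16, 20, 21, 30]]
zone_map = build_zone_map(blocks)
matches, blocks_read = scan_equal(blocks, zone_map, 12)
```

Only the block whose range covers the target is read. The other two are skipped using the stored minimum and maximum alone, which is why zone maps work best on sorted data.
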
Amazon Redshift Reduces I/O drastically
• Column Storage
• Data Compression
• Zone Maps
• Direct-attached Storage
  • Use direct-attached storage to maximize throughput
  • Hardware optimized for high performance data processing
  • Amazon Redshift manages durability for you
• Large data block sizes
  • Large block sizes to make the most of each read

Amazon Redshift Architecture
• Leader Node
  • Manages communication with client applications and compute nodes
  • Creates execution plans
  • Compiles code based on the execution plan
  • Distributes the workload across multiple compute nodes based on the execution plan
• Compute Node
  • Executes compiled code received from the leader node
  • Each node has dedicated compute, memory, and storage capacity
  • Clusters can be scaled based on processing requirements

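The leader/compute split can be sketched as a scatter-gather pattern. In this minimal Python illustration, threads stand in for compute nodes and a plain sum stands in for a compiled query plan; both are assumptions made for the example, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node_sum(slice_of_rows):
    """Work done by one compute node: aggregate its local slice."""
    return sum(slice_of_rows)


def leader_node_sum(rows, node_count):
    """Split the rows across nodes, run them in parallel, merge the partials."""
    slices = [rows[i::node_count] for i in range(node_count)]
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_sum, slices))
    return sum(partials), partials


total, partials = leader_node_sum(list(range(1, 101)), node_count=4)
```

Each worker sees only its own slice, and the leader only combines the partial results, which is the division of labour the slide describes.
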
Redshift is Secure
• Amazon Redshift has security built-in
• SSL to secure data in transit
• Encryption to secure data at rest
  • AES-256
  • All blocks on disk and in Amazon S3 are encrypted
• No direct access to compute nodes
• Amazon VPC support

Continuous Backup and Recovery
• Replication within the cluster and backup to Amazon S3 maintain multiple copies of data at all times
• Backups to Amazon S3 are continuous, automatic and incremental
  • S3 is designed for eleven nines (99.999999999%) of durability
• Continuous monitoring and automated recovery from failures of drives and nodes
• Able to restore snapshots to any Availability Zone within a region

Redshift Distributes & Parallelizes everything
• Query
• Load
• Backup
• Restore
• Resize

Redshift Distributes & Parallelizes everything
• Query
• Load
  • Load in parallel from Amazon S3 & Amazon DynamoDB
  • Data automatically distributed & sorted
  • Scales linearly with number of nodes
• Backup
• Restore
• Resize

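"Automatically distributed & sorted" can be pictured with a small sketch: each incoming row is assigned to a node by hashing a key, and each node keeps its rows in sort order. Everything here is an assumption made for illustration, including the CRC32 hash, the key and sort columns, and the function name. Redshift's real distribution logic is not described on the slide.

```python
import zlib


def distribute_and_sort(rows, key_index, sort_index, node_count):
    """Assign each row to a node by hashing its key, then sort each node's rows."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        key_bytes = str(row[key_index]).encode("utf-8")
        node_id = zlib.crc32(key_bytes) % node_count  # stand-in hash, not Redshift's
        nodes[node_id].append(row)
    for node_rows in nodes:
        node_rows.sort(key=lambda row: row[sort_index])
    return nodes


# Hypothetical (user_id, timestamp) rows loaded onto two nodes.
rows = [("u1", 30), ("u2", 10), ("u1", 20), ("u3", 40), ("u2", 5)]
nodes = distribute_and_sort(rows, key_index=0, sort_index=1, node_count=2)
```

Rows sharing a key always land on the same node, so each node can work on its share independently during a load or a query.
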
Redshift Distributes & Parallelizes everything
• Query
• Load
• Backup
  • Backs up data automatically to Amazon S3
  • Backups are continuous and incremental
  • Configurable system snapshot retention period
  • Take user snapshots on demand
• Restore
  • Streaming restores enable you to resume querying faster
• Resize

Redshift Distributes & Parallelizes everything
• Query
• Load
• Backup
• Restore
• Resize
  • Scale up without any downtime
  • Provision a new cluster in the background
  • Copy data in parallel from node to node
  • Only charged for source cluster
  • Automatic SQL endpoint switchover via DNS
  • Decommission source cluster

Economics of Amazon Redshift

Image courtesy: dataversity.net

Traditional Data Warehouses
• Expensive Hardware &
Software Licensing

• Upfront investments
• Large team of skilled, highly
paid DBAs to manage

• Tuning & Administration is expensive

Image courtesy: clker.com

Traditional Data Warehouses
• Large Enterprises
• YoY data growth is more than 50%
• Data warehousing is not growing at the
same rate
• Most of the data generated is not put into
data warehouses
• Losing competitive edge as not all data is
analyzed

• Small Enterprises
• Cannot afford the current solutions
• Limited access to the expensive talent
pool to implement

Amazon Redshift Pricing
• No upfront charges
• Pay-as-you-go
• Priced to analyze all your data
• On-demand prices start at less than $1 per hour
• On Demand Annual Cost per TB = $3723
• 3 Year Reserved Annual Cost per TB = $999
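
The per-TB figures above can be checked with simple arithmetic. The sketch below assumes an hourly on-demand rate of $0.85 for a node with 2 TB of storage. That rate is not stated on the slide and is chosen only because it reproduces the $3,723 figure; the $999 reserved figure is taken from the slide as given.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def annual_cost_per_tb(hourly_rate_usd, node_storage_tb):
    """Annual cost of running one node all year, divided by its storage."""
    return hourly_rate_usd * HOURS_PER_YEAR / node_storage_tb


# Assumed inputs (not on the slide): $0.85 per hour for a node with 2 TB.
on_demand = annual_cost_per_tb(hourly_rate_usd=0.85, node_storage_tb=2.0)

# Figure quoted on the slide for the 3-year reserved option.
reserved = 999.0
savings_fraction = 1 - reserved / on_demand
```

Under these assumptions the reserved price per TB is roughly a quarter of the on-demand price.
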

Amazon Redshift Configurations
• HS1.XL:
  • 2 Cores
  • 6 GiB Memory
  • 3 disk drives with 2 TB local compressed storage
• HS1.8XL:
  • 16 Cores
  • 128 GiB Memory
  • 24 disk drives with 16 TB local storage
  • 2 GB/second scan rate
• You can start with a Single Node instance

Amazon Redshift works with your existing
Analysis tools

Content referenced from:
http://www.slideshare.net/AmazonWebServices/buildingfault-tolerant-applications-in-the-cloud-aws-summit-2012-nyc

CLOUDLYTICS
Case Study

Cloudlytics.com
Cloudlytics: analyze your Amazon S3 & CloudFront logs
• Detailed analysis of your S3 & CloudFront access patterns
• Scalable & reliable service built using Amazon EMR & Redshift
• Dynamic graphs to get a 360-degree perspective
• Pay as you go

How BlazeClan can help you with Redshift?

End to End Data Warehouse Consulting
• Requirement Analysis
• Training & Knowledge Transfer
• Data Modeling
• Initial Data Migration
• Capacity Planning & Redshift Setup
• Managed Services
• Design & Build ETL Process
• BI Integration

Thank you
Our Blog: http://blog.blazeclan.com/
Contact us: info@blazeclan.com
www.blazeclan.com
