23 October 2013 - AWS 201 - A Walk through the AWS Cloud: Introduction to Amazon Redshift

Amazon Redshift is the new data warehouse service from Amazon Web Services. Redshift offers you fast query performance when analyzing data sets from a few hundred gigabytes to over a petabyte at a fraction of the cost of traditional solutions. In this webinar, we will take a detailed look at Redshift, including a live demonstration. This webinar is ideal for anyone looking to gain deeper insight into their data, without the usual challenges of time, cost and effort.

  1. Amazon Redshift
  2. Agenda chart (axes: technical content vs. time). Topics: Security, Backup, Loading, Demo, Pricing, Customers, Use Cases, Definition, Architecture.
  3. Amazon Redshift: a fast and powerful, petabyte-scale data warehouse. Delivered as a managed service.
  4. Where does it fit?
  5. AWS stack diagram: Deployment & Administration; Application Services; Compute, Storage, Database; Networking; AWS Global Infrastructure.
  6. AWS stack diagram with services highlighted: Amazon Elastic MapReduce (hosted Hadoop service); Amazon Redshift (data warehouse service); Amazon DynamoDB (NoSQL data store); Amazon RDS (MySQL, Oracle and SQL Server); Amazon S3 (object storage).
  7. Positioning chart (axes: Structure, low to high; Size, small to large): Hadoop (EMR); MPP DW (Redshift); NoSQL (DynamoDB); Traditional DW (RDS).
  8. What are the benefits?
  9. • Easy to provision and scale up massively • Pay as you go • Price-performance • Standards-based
  10. How are customers using it?
  11. 1. Replace: ETL; Application; OLTP Database; Data Warehouse; Reporting and BI.
  12. 1. Replace: ETL; Application; OLTP Database; Amazon Redshift; Reporting and BI.
  13. 2. Assist: ETL; Application; OLTP Database; Data Warehouse; Reporting and BI.
  14. 2. Assist: Amazon Redshift; Application; Reporting and BI; OLTP Database; Data Warehouse.
  15. 3. New Warehouse: Application; OLTP Database; Reporting and BI.
  16. 3. New Warehouse: Application; OLTP Database; Amazon Redshift; Reporting and BI.
  17. 4. Log Analysis: Amazon S3; Web / Application Servers; Amazon Redshift; Reporting and BI.
  18. Not designed for: transactional workloads; very small data sets; sub-second response time.
  19. How does it work?
  20. Architecture diagram: SQL clients/BI tools connect over JDBC/ODBC to a leader node, which sits in front of three compute nodes. Every node is shown with 128 GB RAM, 16 cores and 16 TB disk.
  21. Same diagram with a six-row table (ID, Name) at the client: 1 John Smith; 2 Jane Jones; 3 Peter Black; 4 Pat Partridge; 5 Sarah Cyan; 6 Brian Snail. The same six rows are shown again at the compute nodes.
  22. Same diagram with "SQL" and "Results" labels at each node, over the same six rows.
  23. Do I have a choice of nodes?
  24. Compute node choice, XL: 16 GB RAM, 2 cores, 2 TB disk. Single node (2 TB), or cluster of 2-32 nodes (4 TB – 64 TB).
  25. Compute node choice, 8XL: 128 GB RAM, 16 cores, 16 TB disk. Cluster of 2-100 nodes (32 TB – 1.6 PB).
  26. How do I run queries?
  27. JDBC/ODBC query tools for Redshift: DbVisualizer, SQL Workbench.
  28. BI Tools
  29. Do I need to performance tune?
  30. Performance = Parallelism + Columnar + Compression + Zone Maps
  31. Massively Parallel Processing: SQL clients/BI tools connect over JDBC/ODBC to a leader node and three compute nodes (each node 128 GB RAM, 16 cores, 16 TB disk), linked by 10 GigE. Choose good distribution keys. Example rows: 1 John Smith; 2 Jane Jones; 3 Peter Black; 4 Pat Partridge; 5 Sarah Cyan; 6 Brian Snail.
  32. Data storage: row-based vs. columnar (row storage, column storage). Example table (ID, Age, State): (123, 20, CA); (345, 25, WA); (678, 40, FL).
  33. Compression: Raw encoding (RAW); Byte-dictionary (BYTEDICT); Delta encoding (DELTA / DELTA32K); Mostly encoding (MOSTLY8 / MOSTLY16 / MOSTLY32); Runlength encoding (RUNLENGTH); Text encoding (TEXT255 / TEXT32K). Average: 4-8x.
  34. DDL: CREATE TABLE orders ( orderkey int8 NOT NULL DISTKEY, custkey int8 NOT NULL, orderstatus char(1) NOT NULL, totalprice numeric(12,2) NOT NULL, orderdate date NOT NULL SORTKEY, orderpriority char(15) NOT NULL, clerk char(15) NOT NULL, shippriority int4 NOT NULL, comment varchar(79) NOT NULL );
  35. How do I get data in?
  36. S3 to Redshift: copy events from 's3://mybucket/data/allevents_pipe.txt' credentials 'aws_access_key_id=<>;aws_secret_access_key=<>' maxerror 5 delimiter '|' timeformat 'YYYY-MM-DD HH:MI:SS';
  37. DynamoDB to Redshift: copy favoritemovies from 'dynamodb://ProductCatalog' credentials 'aws_access_key_id=<>;aws_secret_access_key=<>' READRATIO 50;
  38. ETL tools: Source Systems; ETL; Amazon Redshift.
  39. How do I back it up?
  40. Backup diagram: leader node and three compute nodes (each 128 GB RAM, 16 cores, 16 TB disk) with Amazon S3. Backup is • Automatic • Incremental.
  41. Can I stop a cluster?
  42. Snapshot diagram: leader node and three compute nodes (each 128 GB RAM, 16 cores, 16 TB disk); Snapshot to Amazon S3.
  43. Snapshot and Restore diagram: leader node and three compute nodes (each 128 GB RAM, 16 cores, 16 TB disk); Snapshot in Amazon S3; Restore.
  44. How do I resize it?
  45. Resize diagram: BI tools; two leader nodes and six compute nodes, each shown with 128 GB RAM, 16 cores and 48 TB disk.
  46. Resize diagram: SQL clients/BI tools; one leader node and four compute nodes, each shown with 128 GB RAM, 16 cores and 48 TB disk.
  47. Is it secure?
  48. Security diagram: SQL clients/BI tools in the Customer VPC connect over SSL to the cluster in the Internal VPC (leader node and three compute nodes, each 128 GB RAM, 16 cores, 16 TB disk); SSL to Amazon S3. Labels: Encrypted Data at Rest; Encrypted Data in Transit; Security Groups; DB Security; VPC; Access Management.
  49. How do I get started?
  50. http://aws.amazon.com/redshift: Detail; FAQ; Pricing; Docs (Getting Started Guide); Forums. Youtube.com: search for "Amazon Redshift Best Practices".
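
The row-based vs. columnar comparison on slide 32 can be made concrete with a small Python sketch. It is only an illustration of the two layouts using the three example rows from that slide; the function names `row_layout` and `column_layout` are hypothetical, and nothing here reflects how Redshift actually stores blocks on disk.

```python
# Example rows from slide 32: (ID, Age, State).
rows = [
    (123, 20, "CA"),
    (345, 25, "WA"),
    (678, 40, "FL"),
]


def row_layout(records):
    """Row storage: all values of one record sit next to each other."""
    return [value for record in records for value in record]


def column_layout(records):
    """Column storage: all values of one column sit next to each other."""
    return [value for column in zip(*records) for value in column]


row_store = row_layout(rows)
column_store = column_layout(rows)

# An aggregate over Age has to skip through every record in the row layout,
# but reads one contiguous slice in the column layout (Age is column index 1).
n = len(rows)
ages = column_store[n:2 * n]
average_age = sum(ages) / len(ages)

print(row_store)
print(column_store)
print(average_age)
```

Keeping values of one type together is also what lets the encodings listed on slide 33 work on each column separately.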
