AWS Webcast - Introducing Amazon Redshift

This webinar is aimed at older portfolio companies that may have started when AWS wasn't as strong as it is today. Redshift is a great way to use the cloud and to bring data to the cloud, where other cloud services (such as EMR) can consume it.

AWS Webcast - Introducing Amazon Redshift Presentation Transcript

  • 1. © 2011 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.
    Introducing Amazon Redshift
    Amazon’s Data Warehouse as a Service
    Ben Butler, Solutions Architect, Worldwide Public Sector
    butlerb@amazon.com
  • 2. What is Amazon Web Services?
    AWS Global Infrastructure, Application Services, Networking, Deployment & Administration, Database, Storage, Compute
  • 3. What is Amazon Web Services?
    AWS Global Infrastructure, Application Services, Networking, Deployment & Administration, Storage, Compute, Database
  • 4. AWS Database Services
    Fully managed SQL database service for OLTP workloads
    Fully managed NoSQL service for massively scalable, high throughput, low latency workloads
    Fully managed, fast and powerful, petabyte-scale data warehouse service
    Fully managed Memcached-compliant in-memory caching service
  • 6. Traditional data warehousing is expensive and complicated
    Expensive Hardware and Software
    Complex Tuning and Admin
    Enterprises average between 3 and 4 DBAs per data warehouse
    Source: Oracle technology global price list 11/1/2012; Gartner: Critical factors in calculating the data warehouse TCO, July 2009
  • 7. Customers Aren’t Happy with Today’s Solutions
    Large Companies: Expensive; Hard to scale
    Small Companies: Can’t afford to have a data warehouse
  • 8. Data warehousing done the AWS way
    • Pay as you go, no up front costs
    • Fast, cheap, easy to use
    • SQL
    • Provision in minutes
  • 9. Introducing Amazon Redshift: Data Warehousing the AWS Way
    Easily and rapidly analyze petabytes of data
    1/10 the cost of traditional data warehouses
    Automated deployment & administration
    Compatible with popular BI tools
  • 10. Most data never makes it to a data warehouse
    Chart: The Data Analysis Gap, 1990 to 2020 (Enterprise Data vs. Data in Warehouse)
    Enterprise Data is growing at over 50% yearly
    Data Warehousing growing at less than 10% yearly
    Most data is left on the floor
    Sources: Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
  • 11. We set out to build…
    A fast and powerful, petabyte-scale data warehouse that is: A Lot Faster, A Lot Cheaper, A Lot Simpler
    Amazon Redshift
    Delivered as a Managed Service
  • 12. Data warehousing performance is all about IO
  • 13. Amazon Redshift dramatically reduces I/O
    Data compression, Zone maps, Direct-attached storage, Large data block sizes
    ID  | Age | State | Amount
    123 | 20  | CA    | 500
    345 | 25  | WA    | 250
    678 | 40  | FL    | 125
    957 | 37  | WA    | 375
    • With row storage you do unnecessary I/O
    • To get total amount, you have to read everything
  • 14. Amazon Redshift dramatically reduces I/O
    Data compression, Zone maps, Direct-attached storage, Large data block sizes
    ID  | Age | State | Amount
    123 | 20  | CA    | 500
    345 | 25  | WA    | 250
    678 | 40  | FL    | 125
    957 | 37  | WA    | 375
    • With column storage, you only read the data you need
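The row-versus-column contrast on slides 13 and 14 can be sketched in a few lines of Python. This is an illustrative toy, not how Amazon Redshift is implemented: the four rows come from the slide's table, while the function names and the "values read" counter are assumptions introduced here to stand in for I/O.

```python
# Toy model of row vs. column storage using the table from the slides.
# "values_read" is a stand-in for I/O; all names here are hypothetical.

COLUMN_NAMES = ("id", "age", "state", "amount")
ROWS = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]

# Column layout: one list per column, built from the same rows.
COLUMNS = {
    name: [row[i] for row in ROWS] for i, name in enumerate(COLUMN_NAMES)
}


def total_amount_row_store(rows):
    """Sum the amount column when every read returns a whole row."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)  # the full row is read, even unused fields
        total += row[3]
    return total, values_read


def total_amount_column_store(columns):
    """Sum the amount column when only that column needs to be read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)
```

Both functions return the same total, but the row-store version touches all 16 values in the table while the column-store version touches only the 4 values in the Amount column.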
  • 15. Amazon Redshift dramatically reduces I/O
    Column storage, Data compression, Zone maps, Direct-attached storage, Large data block sizes
    • Columnar compression saves space & reduces I/O
    • Amazon Redshift analyzes and compresses your data
    analyze compression listing;
    Table   | Column         | Encoding
    --------+----------------+----------
    listing | listid         | delta
    listing | sellerid       | delta32k
    listing | eventid        | delta32k
    listing | dateid         | bytedict
    listing | numtickets     | bytedict
    listing | priceperticket | delta32k
    listing | totalprice     | mostly32
    listing | listtime       | raw
  • 16. Amazon Redshift dramatically reduces I/O
    Column storage, Data compression, Direct-attached storage, Large data block sizes
    • Keep track of the minimum and maximum value for each block
    • Skip over blocks that don’t contain the data needed for a given query
    • Minimize unnecessary I/O
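The min/max block tracking described on slide 16 (the "zone maps" item in the other slides' feature list) can be illustrated with a small sketch. It assumes an in-memory list of values split into fixed-size blocks; the block size, the dictionary layout, and the function names are hypothetical and only meant to show why skipping blocks reduces reads.

```python
# Toy zone map: each block records its min and max so a range query can
# skip blocks that cannot contain matching values. Names are hypothetical.


def build_blocks(values, block_size):
    """Split values into blocks and record the min/max of each block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return blocks


def scan_range(blocks, lo, hi):
    """Return values in [lo, hi] and the number of blocks actually read."""
    matches = []
    blocks_read = 0
    for block in blocks:
        # The zone map check: skip the block if its range cannot overlap.
        if block["max"] < lo or block["min"] > hi:
            continue
        blocks_read += 1
        matches.extend(v for v in block["values"] if lo <= v <= hi)
    return matches, blocks_read


# Twelve sorted values in three blocks: [1..4], [5..8], [9..12].
blocks = build_blocks(list(range(1, 13)), block_size=4)
result, blocks_read = scan_range(blocks, 6, 7)
```

With sorted data, the query for values between 6 and 7 reads one block out of three. On unsorted data the min/max ranges overlap more, so fewer blocks can be skipped.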
  • 17. Amazon Redshift dramatically reduces I/O
    Column storage, Data compression, Zone maps, Direct-attached storage, Large data block sizes
    • Use direct-attached storage to maximize throughput
    • Hardware optimized for high performance data processing
    • Large block sizes to make the most of each read
    • Amazon Redshift manages durability for you
  • 18. Amazon Redshift architecture
    Leader Node
    • SQL endpoint
    • Stores metadata
    • Coordinates query execution
    Compute Nodes
    • Local, columnar storage
    • Execute queries in parallel
    • Load, backup, restore via Amazon S3
    • Parallel load from Amazon DynamoDB
    Single node version available
    Diagram labels: JDBC/ODBC; 10 GigE (HPC); Ingestion, Backup, Restore
  • 19. Amazon Redshift runs on optimized hardware
    HS1.8XL: 128 GB RAM, 16 Cores, 24 Spindles, 16 TB compressed user storage, 2 GB/sec scan rate
    HS1.XL: 16 GB RAM, 2 Cores, 3 Spindles, 2 TB compressed customer storage
    Optimized for I/O intensive workloads
    High disk density
    Runs in HPC - fast network
    HS1.8XL available on Amazon EC2
  • 20. Amazon Redshift parallelizes and distributes everything
    Query, Load, Backup/Restore, Resize
  • 21. Amazon Redshift parallelizes and distributes everything
    Query, Load, Backup/Restore, Resize
    • Load in parallel from Amazon S3 or Amazon DynamoDB
    • Data automatically distributed and sorted
    • Scales linearly with number of nodes
  • 22. Amazon Redshift parallelizes and distributes everything
    Query, Load, Backup/Restore, Resize
    • Backups to Amazon S3 are automatic, continuous and incremental
    • Configurable system snapshot retention period
    • Take user snapshots on-demand
    • Streaming restores enable you to resume querying faster
  • 23. Amazon Redshift parallelizes and distributes everything
    Query, Load, Backup/Restore, Resize
    • Resize while remaining online
    • Provision a new cluster in the background
    • Copy data in parallel from node to node
    • Only charged for source cluster
  • 24. Amazon Redshift parallelizes and distributes everything
    Query, Load, Backup/Restore, Resize
    • Automatic SQL endpoint switchover via DNS
    • Decommission the source cluster
    • Simple operation via AWS Console or API
  • 25. Amazon Redshift lets you start small and grow big
    Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
    • Single Node (2 TB)
    • Cluster 2-32 Nodes (4 TB – 64 TB)
    Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
    • Cluster 2-100 Nodes (32 TB – 1.6 PB)
    Note: Nodes not to scale
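The capacity ranges on slide 25 follow directly from per-node storage multiplied by the allowed node counts. The sketch below reproduces that arithmetic from the slide's figures. The `nodes_needed` helper is an assumption added for illustration: it sizes a cluster purely by compressed data volume and ignores headroom, growth, and performance, so treat it as a rough starting point rather than sizing guidance.

```python
import math

# Per-node storage and cluster size limits as stated on the slide.
NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "cluster_nodes": (2, 32)},
    "HS1.8XL": {"tb_per_node": 16, "cluster_nodes": (2, 100)},
}


def cluster_capacity_range_tb(node_type):
    """Return (min_tb, max_tb) for a multi-node cluster of this node type."""
    spec = NODE_TYPES[node_type]
    min_nodes, max_nodes = spec["cluster_nodes"]
    return min_nodes * spec["tb_per_node"], max_nodes * spec["tb_per_node"]


def nodes_needed(data_tb, node_type):
    """Hypothetical helper: smallest cluster that holds data_tb of data."""
    spec = NODE_TYPES[node_type]
    min_nodes, max_nodes = spec["cluster_nodes"]
    nodes = max(min_nodes, math.ceil(data_tb / spec["tb_per_node"]))
    if nodes > max_nodes:
        raise ValueError(f"{data_tb} TB exceeds the {node_type} cluster limit")
    return nodes
```

For HS1.XL the range works out to 4 TB to 64 TB, and for HS1.8XL to 32 TB to 1,600 TB (1.6 PB), matching the slide.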
  • 26. Amazon Redshift is priced to let you analyze all your data
                       | Price Per Hour for HS1.XL Single Node | Effective Hourly Price Per TB | Effective Annual Price per TB
    On-Demand          | $0.850                                | $0.425                        | $3,723
    1 Year Reservation | $0.500                                | $0.250                        | $2,190
    3 Year Reservation | $0.228                                | $0.114                        | $999
    Simple Pricing
    Number of Nodes x Cost per Hour
    No charge for Leader Node
    No upfront costs
    Pay as you go
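The per-TB figures in the pricing table can be derived from the hourly node price, assuming the 2 TB of storage per HS1.XL node stated elsewhere in the deck and 8,760 hours in a year. The sketch below redoes that arithmetic with the slide's 2013 prices. The function names are hypothetical, and the prices are historical values from the slide, not current AWS pricing.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760
HS1_XL_TB_PER_NODE = 2     # compressed storage per HS1.XL node (from the deck)

# Hourly price per HS1.XL node, as listed on the slide.
PRICE_PER_NODE_HOUR = {
    "On-Demand": 0.850,
    "1 Year Reservation": 0.500,
    "3 Year Reservation": 0.228,
}


def effective_prices(price_per_node_hour, tb_per_node=HS1_XL_TB_PER_NODE):
    """Return (hourly price per TB, annual price per TB)."""
    per_tb_hour = price_per_node_hour / tb_per_node
    per_tb_year = per_tb_hour * HOURS_PER_YEAR
    return per_tb_hour, per_tb_year


def cluster_cost_per_hour(num_nodes, price_per_node_hour):
    """Simple pricing from the slide: number of nodes x cost per hour."""
    return num_nodes * price_per_node_hour


for plan, price in PRICE_PER_NODE_HOUR.items():
    per_tb_hour, per_tb_year = effective_prices(price)
    print(f"{plan}: ${per_tb_hour:.3f}/TB/hour, ${per_tb_year:,.0f}/TB/year")
```

Rounded to whole dollars, the annual figures come out to $3,723, $2,190, and $999 per TB, which agrees with the table.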
  • 27. Amazon Redshift is easy to use
    Provision in minutes
    Monitor query performance
    Point and click resize
    Built in security
    Automatic backups
  • 28. Provision a data warehouse in minutes
  • 29. Monitor query performance
  • 30. Deep dive analysis
  • 31. Point and click resize
  • 32. Amazon Redshift has security built-in
    SSL to secure data in transit
    Encryption to secure data at rest
    • AES-256; hardware accelerated
    • All blocks on disks and in Amazon S3 encrypted
    No direct access to compute nodes
    Amazon VPC support
    Diagram labels: JDBC/ODBC; Customer VPC; Internal VPC; 10 GigE (HPC); Ingestion, Backup, Restore
  • 33. Amazon Redshift continuously backs up your data and recovers from failures
    Replication within the cluster and backup to Amazon S3 to maintain multiple copies of data at all times
    Backups to Amazon S3 are continuous, automatic, and incremental
    • Designed for eleven nines of durability
    Continuous monitoring and automated recovery from failures of drives and nodes
    Able to restore snapshots to any Availability Zone within a region
  • 34. Amazon Redshift integrates with multiple data sources
    Amazon DynamoDB, Amazon Elastic MapReduce, Amazon Simple Storage Service (S3), Amazon Elastic Compute Cloud (EC2), AWS Storage Gateway Service, Corporate Data Center, Amazon Relational Database Service (RDS), Amazon Redshift
  • 35. Amazon Redshift provides multiple data loading options
    Upload to Amazon S3
    AWS Import/Export
    AWS Direct Connect
    Work with a partner: Data Integration, Systems Integrators
  • 36. Amazon Redshift works with your existing analysis tools
    Diagram labels: JDBC/ODBC; Amazon Redshift
  • 38. Pilot results have been dramatic
    Tested 2 Billion row data set, 6 representative queries on a 2-node Amazon Redshift cluster
    Queries ran between 12x and 150x faster
    Current environment: 32 nodes, 128 CPUs, 4.2 TB RAM, 1.6 PB disk
  • 39. Reporting Warehouse
    Accelerated operational reporting
    Support for short-time use cases
    Data compression, index redundancy
    Diagram labels: RDBMS; Redshift; OLTP; ERP; Reporting and BI
  • 40. On-Premises Integration
    Data Integration Partners*
    Diagram labels: RDBMS; Redshift; OLTP; ERP; Reporting and BI
    * as of 3/14/2013
  • 41. Live Archive for (Structured) Big Data
    Direct integration with copy command
    High velocity data ages into Redshift
    Low cost, high scale option for new apps
    Diagram labels: DynamoDB; Redshift; OLTP; Web Apps; Reporting and BI
  • 42. Cloud ETL for Big Data
    Maintain online SQL access to historical logs
    Transformation and enrichment with EMR
    Longer history ensures better insight
    Diagram labels: Redshift; Reporting and BI; EMR; S3
  • 43. Resources & Questions
    Ben Butler | butlerb@amazon.com
    Redshift on AWS - http://aws.amazon.com/redshift
    Marketplace - https://aws.amazon.com/marketplace/redshift/
    Documentation/User Guide - http://aws.amazon.com/documentation/redshift/
    Best Practices
    • http://docs.aws.amazon.com/redshift/latest/dg/c_designing-tables-best-practices.html
    • http://docs.aws.amazon.com/redshift/latest/dg/c_loading-data-best-practices.html
  • 44. Introducing Amazon Redshift
    Amazon’s Data Warehouse as a Service
    http://aws.amazon.com/resources/databaseservices/webinars
    Ben Butler, Solutions Architect, Worldwide Public Sector
    butlerb@amazon.com