Amazon Redshift for Business Intelligence

An introduction to Amazon Redshift for business intelligence applications. Presented at MicroStrategy World 2013.

  1. Introducing Amazon Redshift for Business Intelligence. A presentation at MicroStrategy World 2013 by Dr. Matt Wood.
  2. Hello.
  3. Thank you.
  4. I. Data, data everywhere
  5. I. Data, data everywhere; II. Collection & storage
  6. I. Data, data everywhere; II. Collection & storage; III. Data security
  7. I. Data, data everywhere; II. Collection & storage; III. Data security; IV. Data movement
  8. 0. Amazon Web Services; I. Data, data everywhere; II. Collection & storage; III. Data security; IV. Data movement
  9. Building blocks.
  10. Compute, storage & databases.
  11. Retail; Merchant services; Web services
  12. Blinding flash of the obvious.
  13. Available.
  14. Low cost.
  15. Flexible.
  16. Every day, AWS adds enough server capacity to power amazon.com in 2003, when it was a $5B enterprise.
  17. I. Data, data everywhere
  18. Data for competitive advantage.
  19. Customer segmentation, financial modeling, system analysis, line of sight, business intelligence...
  20. Generation; Collection & storage; Analytics & computation; Collaboration & sharing
  21. Cost of data generation is falling.
  22. Devices: Kindle Fire HD, Kindle Fire, Kindle Paperwhite and Kindle hold the top four spots on the Amazon worldwide best seller chart since launch.
  23. Apps and games: Amazon Appstore selection tripled in 2012.
  24. Commerce: Amazon customers purchased more than one toy per second on mobile devices.
  25. Most gifted Kindle book
  26. Generation: lower cost, increased throughput. Collection & storage; Analytics & computation; Collaboration & sharing
  27. Generation. Highly constrained: Collection & storage; Analytics & computation; Collaboration & sharing
  28. Gap.
  29. [Chart: The Data Analysis Gap. Data volume from 1990 to 2020, comparing generated data (enterprise data) with data available for analysis (data in warehouse). Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares.]
  30. Enter AWS.
  31. Utility.
  32. Remove constraints.
  33. Generation. Highly constrained: Collection & storage; Analytics & computation; Collaboration & sharing
  34. Generation; Collection & storage; Analytics & computation; Collaboration & sharing
  35. Full value.
  36. Close the gap.
  37. Reduced time to market.
  38. Identify and meet new business opportunities.
  39. Lower costs.
  40. II. Collection & storage
  41. One schema to rule them all.
  42. One schema to rule them all.
  43. Lots of data. Lots of users. Lots of uses. Lots of locations.
  44. Cost.
  45. Multipliers.
  46. Object storage.
  47. 99.999999999% durability
  48. Relational databases.
  49. NoSQL data stores.
  50. HDFS based stores.
  51. Undifferentiated heavy lifting.
  52. Lower costs. Ease of use.
  53. Lower costs: only pay for what you use, no capital investment, pay as you go, no subscriptions. Ease of use.
  54. Lower costs. Ease of use: programmable, integrate with existing tools, easy to configure, zero admin.
  55. Data warehousing.
  56. Expensive. Complicated.
  57. Enterprises average between 3 and 4 DBAs per data warehouse. Source: Gartner, Critical factors in calculating the data warehouse TCO, July 2009
  58. Source: Oracle technology global price list 11/1/2012
  59. Expensive. Complicated.
  60. Unobtainable.
  61. Amazon Redshift.
  62. Fast. Powerful. Petabyte scale.
  63. Managed service.
  64. Automated deployment & configuration.
  65. SQL access and BI tool integration.
  66. Parallel execution.
  67. Leader Node
  68. Leader Node; Compute Node, Compute Node, Compute Node
  69. Leader Node; Compute Node, Compute Node, Compute Node
  70. 10 GigE full bisection network.
  71. Leader Node; Compute Node, Compute Node, Compute Node
  72. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node
  73. Certified for use with MicroStrategy.
  74. Data compression.
  75. Automated backup to S3.
  76. Data encrypted in transit & at rest.
  77. Streaming recovery.
  78. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node
  79. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node
  80. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node
  81. Elastic.
  82. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node
  83. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node, Compute Node, Compute Node
  84. Common BI Tools; JDBC/ODBC; Leader Node; Compute Node, Compute Node, Compute Node
  85. Data warehouse node types.
  86. High Storage Extra Large (XL): 15 GB RAM, 2 TB local attached storage, 3 drives, 2 virtual cores
  87. High Storage Extra Large (XL): 15 GB RAM, 2 TB local attached storage, 3 drives, 2 virtual cores. High Storage Eight Extra Large (8XL): 120 GB RAM, 16 TB local attached storage, 24 drives, 16 virtual cores
  88. Pay as you go.
  89. Hourly prices (2 TB nodes / 16 TB nodes): On-demand $0.850 / $6.80; 1 Year Reservation $0.50 / $4.00; 3 Year Reservation $0.228 / $1.824
  90. Hourly prices (2 TB nodes / 16 TB nodes): On-demand $0.850 / $6.80; 1 Year Reservation $0.50 / $4.00; 3 Year Reservation $0.228 / $1.824
  91. $999 per TB
  92. Don’t pay for the leader node.
  93. No additional storage charge for backups of active clusters.
  94. VPC ready.
  95. Low cost. Easy to use.
  96. Focus on analysis.
  97. Private beta today.
  98. Available early this year.
  99. aws.amazon.com/redshift
  100. 2 billion row dataset. 6 representative queries.
  101. Amazon Redshift: 2 instance cluster. Compared to 32 nodes, 128 CPUs, 4.2 TB RAM, 1.6 PB storage. 2 billion row data set. 12x to 150x faster.
  102. 29 minutes 58 seconds down to 12 seconds
  103. III. Data security.
  104. Security is our number one priority.
  105. Shared responsibility.
  106. Choose your region.
  107. Availability zones.
  108. SOC 2, ISAE 3402, FISMA Moderate, PCI DSS, FIPS 140-2, ISO 27001, ITAR, HIPAA, MPAA
  109. “You basically turn yourself into a polymorphic surface to which the attack guy has a much tougher time getting at. That, ultimately, is the real key advantage to drive security and make things much better for us across the board.” Gus Hunt, CTO, Central Intelligence Agency
  110. Virtual Private Cloud.
  111. Network isolated environment.
  112. Public and private subnets.
  113. Redshift, relational databases, Hadoop can run inside the VPC.
  114. Extend your VPN.
  115. Identity and access federation.
  116. Identity and access management.
  117. IV. Data movement.
  118. “How do I get my data into the cloud?”
  119. Generated and stored in the AWS cloud.
  120. Inbound transfer is free.
  121. Multipart upload.
  122. Aspera, iRODS.
  123. Physical media.
  124. AWS Direct Connect.
  125. 1Gbps or 10Gbps
  126. Built in AZ replication.
  127. Regional replication.
  128. “How do I integrate my data?”
  129. Amazon S3, Amazon RDS, Amazon DynamoDB, Amazon Redshift, HDFS (Amazon EMR), On Premise
  130. AWS Data Pipeline
  131. Data-intensive orchestration & automation.
  132. Reliable, scheduled data movement and analytics.
  133. aws.amazon.com/datapipeline
  134. aws.amazon.com
  135. I. Data, data everywhere
  136. I. Data, data everywhere; II. Collection & storage
  137. I. Data, data everywhere; II. Collection & storage; III. Data security
  138. I. Data, data everywhere; II. Collection & storage; III. Data security; IV. Data movement
  139. Thank you.
  140. Introducing Amazon Redshift for Business Intelligence. Get in touch: matthew@amazon.com or @mza. aws.amazon.com
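
The pricing on slides 89 to 91 and the benchmark on slides 101 and 102 can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not an official pricing calculator. It assumes a year of 8,760 hours, that the reserved hourly prices are effective rates, and that the "$999 per TB" figure is an annual cost; none of this is stated on the slides. The function names are hypothetical.

```python
# Sanity check of the figures quoted on the pricing and benchmark slides.
# Assumptions (not stated in the slides): a year is 8,760 hours and the
# reserved hourly prices are effective rates for the whole term.

HOURS_PER_YEAR = 24 * 365  # 8,760

# Hourly prices from slide 89, keyed by (pricing term, node storage in TB).
HOURLY_PRICES = {
    ("on-demand", 2): 0.850,
    ("on-demand", 16): 6.80,
    ("1-year", 2): 0.50,
    ("1-year", 16): 4.00,
    ("3-year", 2): 0.228,
    ("3-year", 16): 1.824,
}


def cost_per_tb_year(hourly_price: float, node_tb: int) -> float:
    """Annual cost of one node divided by its local storage in TB."""
    return hourly_price * HOURS_PER_YEAR / node_tb


def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the original query time to the new query time."""
    return before_seconds / after_seconds


if __name__ == "__main__":
    for (term, node_tb), price in HOURLY_PRICES.items():
        annual = cost_per_tb_year(price, node_tb)
        print(f"{term:>9}, {node_tb:>2} TB node: ${annual:,.2f} per TB per year")
    # Slide 102: 29 minutes 58 seconds down to 12 seconds.
    print(f"speedup: {speedup(29 * 60 + 58, 12):.1f}x")
```

Under these assumptions the 3-year reserved rate comes to about $998.64 per TB per year for both node sizes, in line with the $999 figure on slide 91. The 29 minutes 58 seconds to 12 seconds improvement is a ratio of roughly 150x, which matches the upper end of the range on slide 101 if both slides describe the same benchmark.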
